Application and Deployment of Optical Modules in the Smart Computing Center
Core Applications of Optical Modules in the Smart Computing Center
Application Scenarios and Demand Analysis
A smart computing center is built around large-scale GPU clusters and places extremely high demands on network bandwidth, latency, and reliability. As the core interconnection component, optical modules are mainly applied in the following scenarios:
- Server-to-switch interconnection. Supports high-speed data transmission (e.g., an NVIDIA DGX H100 cluster connects to Quantum-2 switches, which use 800G OSFP 2xDR4/2xVR4 modules, via 400G OSFP DR4/VR4 modules on the server side).
- Long-distance switch-to-switch interconnection. For example, single-mode optical modules (with 100m/500m/2km/10km reaches) link data centers to enable distributed training.
- Storage and high-performance computing networks. Support RoCE (RDMA over Converged Ethernet) or InfiniBand (IB) protocols to ensure lossless, low-latency transmission (e.g., an IB network delivers 200Gbps per port with QSFP56 HDR optical modules).
Technology Evolution and Mainstream Selection
- Rate Upgrade. 800G optical modules (QSFP-DD/OSFP packages) are gradually becoming mainstream; demand for 1.6T modules, driven by NVIDIA GB200 and GB300 servers in smart computing centers, is beginning to emerge; and 3.2T is expected to move toward CPO (co-packaged optics).
- Packaging Format
QSFP-DD. Supports 8x100Gbps, is backward compatible with QSFP28/QSFP56, and suits high-density, low-power scenarios (e.g., Huawei CloudEngine 16800 switches).
OSFP. A larger form factor that supports higher power consumption (e.g., NVIDIA Quantum-2 switches with dual-port OSFP modules).
- Technology Route
LPO (Linear-drive Pluggable Optics) removes the DSP chip, reducing power consumption by about 27% and latency by about 17%, which makes it a preferred choice for smart computing scenarios.
Key Performance Indicators
- Bandwidth and Rate. The per-lane rate has been upgraded from 25G (NRZ) to 100G/200G (PAM4), with support for multimode (MMF, short-reach) and single-mode (SMF, long-reach) fibers.
- Latency and Reliability. The bit error rate (BER) should stay below 1E-12 (a RoCEv2 requirement; see the verification-time sketch after this list).
- Power Consumption and Heat Dissipation. An 800G module consumes roughly 16-20W (OSFP); liquid cooling can lower module temperature by about 15℃ and must be matched to the rack cooling design.
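A useful rule of thumb for the BER target: demonstrating BER below a threshold at roughly 95% confidence requires observing about 3/BER error-free bits. The Python sketch below is only an illustration of that arithmetic (not a vendor test procedure), applied to an 800G link and the 1E-12 target above.

```python
import math

def error_free_time_for_ber(target_ber: float, line_rate_bps: float,
                            confidence: float = 0.95) -> float:
    """Seconds of error-free traffic needed to claim BER < target_ber
    at the given confidence level (standard -ln(1 - CL) / BER rule)."""
    bits_needed = -math.log(1.0 - confidence) / target_ber
    return bits_needed / line_rate_bps

# Example: 800G link, BER target of 1E-12 (the RoCEv2 requirement above)
seconds = error_free_time_for_ber(1e-12, 800e9)
print(f"~{seconds:.1f} s of error-free traffic needed")  # roughly 3.7 s
```

In practice the test runs much longer than this minimum to cover temperature drift and varying traffic patterns.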
Optical Module Deployment Considerations
Selection and Compatibility
Rate matching. The optical module rate must match the device port rate (e.g., an 800G switch port needs an 800G module, and a 400G NIC port connecting to a single switch port must also run at 400G) to avoid rate downshifts or alarms.
Packaging Compatibility
- In IB networks, NVIDIA Quantum-2 switches require dual-port OSFP modules (800G OSFP 2xDR4/2xVR4), while ConnectX-7 NICs support OSFP 400G DR4/VR4 or QSFP112 DR4/VR4 modules.
- In RoCE networks, Huawei CloudEngine switches fitted with QSFP-DD 800G modules need interoperability verification against NVIDIA ConnectX NICs.
Protocols and Standards
- IB networks need to follow IBTA standards, and modules need to pass interoperability tests.
- RoCE networks rely on PFC (Priority Flow Control) and ECN (Explicit Congestion Notification), and optical modules need to support the IEEE 802.3ck standard.
Signal Integrity and Link Optimization
Optical Power and Loss
- Multimode fiber (MMF, 850nm) is suitable for short distances within 100m (e.g., OSFP SR8 module).
- Single-mode fiber (SMF, 1310/1550nm) supports 500m to 2km transmission (e.g., DR8 modules); the received optical power must fall within the module's receiver sensitivity range (e.g., -9 to -3dBm).
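To make the sensitivity check concrete, the hedged sketch below computes a simple link power budget (launch power minus fiber and connector losses) and tests whether the result lands inside the example -9 to -3dBm window. The launch power and loss figures are illustrative assumptions, not values from any module datasheet.

```python
def received_power_dbm(tx_power_dbm: float, fiber_km: float,
                       fiber_loss_db_per_km: float = 0.35,
                       connector_loss_db: float = 0.5,
                       connectors: int = 2) -> float:
    """Estimate received optical power after fiber attenuation and connector losses."""
    return (tx_power_dbm
            - fiber_km * fiber_loss_db_per_km
            - connectors * connector_loss_db)

# Illustrative values: -2.0 dBm launch power over 2 km of SMF
rx = received_power_dbm(tx_power_dbm=-2.0, fiber_km=2.0)
within_window = -9.0 <= rx <= -3.0   # example sensitivity window from the text
print(f"Rx power = {rx:.2f} dBm, within receiver window: {within_window}")
```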
Signal Compensation and Equalization
LPO modules rely on dynamic compensation by the switch ASIC, so the auto-tuning capability of modules from different vendors must be verified.
Reliability and O&M
- Testing and certification. Optical modules must pass BER, extinction ratio, eye diagram, and aging tests, and ship with CNAS/CMA quality inspection reports.
- Environment simulation testing. Build a lab that replicates the actual deployment scenario and run long-duration tests at small scale.
- Monitoring and early warning. Monitor temperature, power consumption, optical power, and other parameters in real time via DDM (Digital Diagnostics Monitoring), and set threshold alarms (e.g., trigger an alarm when temperature exceeds 85℃). Deploy an intelligent O&M system to achieve minute-level fault localization.
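A minimal monitoring sketch follows, assuming DDM readings have already been collected into a dictionary (e.g., scraped from the switch CLI or a telemetry agent). The 85℃ temperature limit mirrors the example above; the optical-power windows are illustrative placeholders, not datasheet values.

```python
from typing import Dict, List

# Illustrative thresholds; real limits come from the module datasheet / switch DDM alarms
THRESHOLDS = {
    "temperature_c": (0.0, 85.0),    # alarm above 85 C, as in the text
    "rx_power_dbm": (-9.0, -3.0),    # example receiver window
    "tx_power_dbm": (-5.0, 4.0),     # assumed transmit window
}

def check_ddm(readings: Dict[str, float]) -> List[str]:
    """Return alarm strings for any DDM value outside its threshold window."""
    alarms = []
    for key, (low, high) in THRESHOLDS.items():
        value = readings.get(key)
        if value is not None and not (low <= value <= high):
            alarms.append(f"{key}={value} outside [{low}, {high}]")
    return alarms

# Example reading from a hypothetical telemetry collector
print(check_ddm({"temperature_c": 88.5, "rx_power_dbm": -4.2}))
```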
Power Consumption and Heat Dissipation Management
- Power consumption optimization. An LPO module consumes roughly 50% less power than a traditional DSP-based module (e.g., 9 pJ/bit vs. 18 pJ/bit), cutting whole-system power by 25%-40% (see the energy-per-bit sketch after this list).
- Liquid cooling. Optical modules can be immersed directly in the coolant (e.g., MPO-interface modules); sealing and material compatibility must be verified.
- Thermal design. High-density racks (e.g., fully populated with 800G modules) need higher fan speeds or a liquid-cooled backplane to prevent module overheating, which raises BER.
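To make the energy-per-bit figures concrete, the short sketch below converts pJ/bit into watts at a given line rate. The 9 and 18 pJ/bit values come from the text; the resulting watt figures are plain arithmetic, not measured module power.

```python
def module_power_watts(energy_pj_per_bit: float, line_rate_gbps: float) -> float:
    """Power = energy per bit x bits per second (1 pJ/bit at 1 Gbps = 1 mW)."""
    return energy_pj_per_bit * 1e-12 * line_rate_gbps * 1e9

for label, pj_per_bit in [("LPO", 9.0), ("DSP-based", 18.0)]:
    print(f"{label}: {module_power_watts(pj_per_bit, 800):.1f} W at 800G")
# LPO: 7.2 W, DSP-based: 14.4 W -- roughly the 50% saving quoted above
```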
Matching Strategies for Optical Modules and Devices in IB Networking
Core Device and Optical Module Selection
- Switch
NVIDIA Quantum-2 (supports NDR 400G/800G) requires dual-port OSFP optical modules (e.g., MMS4X00-NM, 1310nm, 500m).
- NIC
ConnectX-7 supports OSFP or QSFP112 modules (e.g., a flat-top 400G single-port OSFP), while the BlueField-3 DPU supports QSFP112 only.
- Cables
Multimode Fiber (MMF): 50/125 μm for short distance (e.g., 3-50m) with MPO-12/APC connector.
Single Mode Fiber (SMF): 9/125μm, supports 500m to 2km, paired with MPO-12/APC connectors or Duplex LC/UPC connectors.

Topology and Rate Matching
Fat-Tree/Dragonfly+ topology. Achieves microsecond-level latency with 800G OSFP modules and supports scaling to thousands of nodes (e.g., NVIDIA clusters).
Rate Correspondence
- EDR (100Gbps): QSFP28 modules, 25G NRZ per lane (e.g., ConnectX-5 NICs).
- HDR (200Gbps): QSFP56 modules, 50G PAM4 per lane (e.g., ConnectX-6 NICs, QM8700 switches).
- NDR (400G/800G): OSFP or QSFP112 modules, 100G PAM4 per lane (e.g., 400G for ConnectX-7 NICs, 800G for QM9700 switches).
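This rate correspondence can be captured as a small lookup table, which is handy for sanity-checking a bill of materials. The sketch below encodes only the figures listed here; it is not an exhaustive compatibility matrix.

```python
from dataclasses import dataclass

@dataclass
class IBGeneration:
    port_rate_gbps: int
    lane_signaling: str
    form_factor: str
    example_devices: str

IB_GENERATIONS = {
    "EDR": IBGeneration(100, "4 x 25G NRZ", "QSFP28", "ConnectX-5 NIC"),
    "HDR": IBGeneration(200, "4 x 50G PAM4", "QSFP56", "ConnectX-6 NIC, QM8700 switch"),
    "NDR": IBGeneration(400, "4 x 100G PAM4", "OSFP / QSFP112",
                        "ConnectX-7 NIC (400G), QM9700 switch (800G ports)"),
}

def describe(generation: str) -> str:
    gen = IB_GENERATIONS[generation]
    return (f"{generation}: {gen.port_rate_gbps}G per port, {gen.lane_signaling}, "
            f"{gen.form_factor} ({gen.example_devices})")

print(describe("NDR"))
```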
Compatibility Verification and Configuration
- Interoperability testing. Optical modules should be validated with the target switches and NICs per IBTA requirements (e.g., the NVIDIA LinkX solution) to ensure successful link training.
- Third-party modules must be verified for compatibility with NVIDIA devices.
- Firmware and drivers. Update the switch firmware (e.g., NVIDIA QM9700) and NIC driver (e.g., OFED) to the latest version to support the NDR protocol and automatic recognition of optical modules.
- Link parameter configuration. Set the correct cable type (DAC/ACC/AOC) and transmission distance to avoid excessive BER (e.g. AOC cable should be configured in “active” mode).
Matching Strategies for Optical Modules and Devices in RoCE Networking
Core Device and Optical Module Selection
- Switch
Huawei CloudEngine 16800 (supports 800G QSFP-DD modules) and NVIDIA (Mellanox) Spectrum SN4700/SN5600 series (compatible with QSFP-DD/OSFP).
- NIC
NVIDIA ConnectX-6/7 (RoCEv2 support, QSFP112/OSFP form factors), Intel E810 (requires DCB and PFC configuration).
- Cables
Multimode MPO: MMF, 50/125μm, for short-reach links.
Single-mode MPO: SMF, supports DR4 (500m), LR4 (10km), or ER4 (40km), matching QSFP-DD/OSFP DR8, LR8, and ER8 modules.
Protocol and Traffic Control
RoCEv2 Configuration
- PFC (Priority Flow Control). The switch needs to assign priority queues (such as queue 3) for RoCE traffic and enable PFC deadlock prevention.
- ECN (Explicit Congestion Notification). Switch ports are configured with ECN thresholds to mark packets when queue occupancy exceeds the threshold, triggering end-to-end congestion control.
- DCB (Data Center Bridging). Allocate bandwidth via ETS (Enhanced Transmission Selection) to ensure RoCE traffic is prioritized.
- MTU & Jumbo frames. Set MTU to 9214 bytes to reduce packet fragmentation and improve transmission efficiency.
Compatibility and Performance Optimization
- Cross-vendor adaptation. Huawei switches and NVIDIA NICs must be verified for PFC/ECN interworking to ensure lossless transmission (e.g., the ICBC RoCE cluster).
- Third-party optical modules need to be certified against the switch vendor's compatibility list.
- Drivers and firmware. Install the latest NIC driver (e.g., NVIDIA MLNX_OFED) and switch firmware to support RoCEv2 and optical module plug-and-play.
- Performance testing. Verify throughput and latency with tools such as iperf3 or netperf, ensuring BER stays below 1E-12 and P90 latency stays under 1μs.
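P90 latency is simply the 90th percentile of the measured latency distribution. The sketch below computes it from a list of per-message latencies (for example, exported from a benchmark run) and checks it against the 1μs goal; the sample values are made up for illustration.

```python
import statistics

def p90_latency_us(samples_us: list) -> float:
    """90th-percentile latency of a list of microsecond samples."""
    # quantiles(n=10) returns the 9 decile cut points; index 8 is P90
    return statistics.quantiles(samples_us, n=10)[8]

# Hypothetical per-message latencies (microseconds) from a benchmark run
samples = [0.71, 0.69, 0.75, 0.80, 0.92, 0.68, 0.73, 0.88, 0.95, 0.98, 0.70, 0.74]
p90 = p90_latency_us(samples)
print(f"P90 latency = {p90:.2f} us, target met: {p90 < 1.0}")
```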
Deployment and Verification Process
Optical Module Arrival and Acceptance
Check whether the package, rate, wavelength, and transmission distance are consistent with the order (e.g., QSFP-DD/OSFP 800G DR8, 1310nm, 500m).
Verify the DDM information. Read the module temperature, optical power, BER and other parameters through the switch CLI to ensure that they are within the specification.
Link Connection and Initialization
Physical Connection
Connect the optical modules and devices according to the topology diagram, and clean fiber end faces with an MPO cleaning tool to prevent contamination-induced loss.
Link Training
IB network: the switch automatically negotiates the rate and link width (e.g., 4x/12x); verify the status with the “ibstatus” command.
RoCE network: configure the switch port in “RoCE mode”, enable PFC/ECN, and check the NIC's RDMA-related offload features with “ethtool -k”.
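As a small automation example, the sketch below shells out to the ibstatus command mentioned above and flags ports that are not ACTIVE. It assumes the usual “Infiniband device '...' port N status:” / “state: ... ACTIVE” line layout of ibstatus output; it is only a convenience wrapper, not an official NVIDIA tool.

```python
import re
import subprocess

def inactive_ib_ports() -> list:
    """Run ibstatus and return device/port names whose state is not ACTIVE."""
    out = subprocess.run(["ibstatus"], capture_output=True, text=True,
                         check=True).stdout
    problems, current = [], "unknown"
    for line in out.splitlines():
        header = re.search(r"device '(\S+)' port (\d+)", line)
        if header:
            current = f"{header.group(1)} port {header.group(2)}"
        elif "state:" in line and "phys state" not in line and "ACTIVE" not in line:
            problems.append(current)
    return problems

if __name__ == "__main__":
    print(inactive_ib_ports() or "all IB ports ACTIVE")
```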
Function and Performance Test
Basic Connectivity
Use ping or ibping to verify end-to-end communication and confirm there is no packet loss.
Throughput Test
IB network: run “ib_write_bw” to test unidirectional/bidirectional bandwidth; the target is ≥90% of line rate.
RoCE network: use the RDMA perftest tools to test Read/Write performance and verify zero-copy transmission.
Stress Test
Simulate full load with a traffic generator, and monitor switch queue depth, optical module temperature, and BER.
Long-term O&M and Optimization
- Real-time monitoring. Deploy a network management system to collect metrics such as optical module status, link utilization, and PFC pause-frame counts.
- Troubleshooting. For abnormal optical power, check the fiber connections, clean the end faces, or replace the module.
- Elevated BER. Troubleshoot signal integrity, heat dissipation, or firmware issues, and downgrade the rate or replace the module if necessary.
- Capacity planning. Reserve 20% of optical module ports and link bandwidth to support cluster expansion based on business growth forecasts.
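The 20% headroom rule is straightforward arithmetic when planning port counts. The sketch below rounds the reserve up to whole ports and then to whole switches; the 64-port switch size is an assumption used purely for illustration.

```python
import math

def ports_with_headroom(active_ports: int, headroom: float = 0.20,
                        ports_per_switch: int = 64):
    """Total module ports to provision (active + reserve) and switches needed."""
    total_ports = math.ceil(active_ports * (1 + headroom))
    switches = math.ceil(total_ports / ports_per_switch)
    return total_ports, switches

ports, switches = ports_with_headroom(1920)   # e.g., the DGX H100 case below
print(f"Provision {ports} ports across {switches} x 64-port switches")
```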
Typical Case Reference
NVIDIA DGX H100 Cluster (IB Network)
Configuration: 1,920 800G OSFP DR8 optical modules connected to Quantum-2 switches in a Fat-Tree topology.
Advantages: Achieves ultra-high-speed GPU-to-GPU interconnection (500m reach), with distributed training performance exceeding 95% of a centralized setup.
Huawei CloudEngine 16800 RoCE Network
Configuration
288 x 800G QSFP-DD modules with NVIDIA ConnectX-7 NICs, supporting PFC/ECN and intelligent lossless networking.
For cross-data-center links, 1.2T optical modules (S+C+L band extension) and hollow-core fiber (10km) are used, with OCS (all-optical switching) reducing latency.
Advantages
40% improvement in large-model parameter synchronization efficiency; PUE ≤ 1.14 (combined with liquid cooling and LPO technology).
Distributed Smart Computing Cluster (Multi-Data-Center Interconnection)
Verification
A 10-billion-parameter model was trained across three equipment rooms spanning 140km with a performance loss below 5%.
Summary and Recommendations
Selection Recommendations
IB networks. Prioritize NVIDIA-certified OSFP/QSFP112 modules (e.g., MMS4X00-NM) to ensure seamless compatibility with Quantum-2 switches and ConnectX NICs.
RoCE networks. Select QSFP-DD modules (e.g., Huawei CloudEngine-compatible models) that support the IEEE 802.3ck standard, and combine them with PFC/ECN to achieve lossless transmission.
Technology Trends
LPO/LRO technology is expected to become mainstream; pay attention to switch-side dynamic compensation capability (e.g., Xinhua San's intelligent tuning algorithm). Liquid-cooled optical modules and CPO technology are also coming on stream, so heat dissipation and power supply systems should be planned in advance.
Risk Avoidance
Avoid mixing IB modules from different vendors, and prioritize the use of NVIDIA LinkX or Mellanox-compatible solutions. In RoCE networks, strictly configure PFC/ECN parameters to prevent deadlocks and congestion, and conduct regular traffic simulation tests.