Dear All,

We have a cluster running Lustre-2.12.4. We occasionally encounter a serious I/O slowdown, and I am writing to ask how to fix this problem.
There is one MDT server and one OST server. Since our operating system is Debian-9.12, we installed Lustre by compiling from source:

- Operating system: Debian-9.12
- Linux kernel: 4.19.123
- InfiniBand software: MLNX_OFED_SRC-debian-4.6-1.0.1.1
- InfiniBand hardware: FDR
- MDT: spl-0.7.13 + zfs-0.7.13 + (InfiniBand software) + Lustre-2.12.4
- OST: spl-0.7.13 + zfs-0.7.13 + (InfiniBand software) + Lustre-2.12.4
- Client: (InfiniBand software) + Lustre-2.12.4

Some clients connect to the MDT/OST through gigabit Ethernet (because they have no InfiniBand card); the others connect through InfiniBand. The InfiniBand clients go through InfiniBand exclusively, since on those clients we set in /etc/modprobe.d/lustre.conf:

options lnet networks="o2ib0(ib0)"

In the following we discuss only the clients connected through InfiniBand.

With this configuration, we occasionally observe an abnormal I/O slowdown from one of the clients to the Lustre file system. When that happens, all the other clients are normal, and there is almost no load on the whole cluster. We have run the tests below.

1. Timings for the normal and abnormal cases (datafile size: 577 MB):

   # time cat /lustre/filesystem/datafile > /dev/null
   normal:   0.265s
   abnormal: 0.560s

   # time cp /lustre/filesystem/datafile /lustre/another_dir/
   normal:   1.0s
   abnormal: 60s or longer

2. We checked dmesg on the MDT, the OST, and the clients. There are no messages at all.

3. We checked the InfiniBand I/O performance with "ib_write_bw". In both the normal and the abnormal situations, the InfiniBand bandwidth from the client to the OST is almost the same:

************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 CQ Moderation   : 100
 Mtu             : 2048[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address:  LID 0x09 QPN 0x025a PSN 0x8ad56b RKey 0x10010200 VAddr 0x007f179e3f7000
 remote address: LID 0x08 QPN 0x021b PSN 0x8f4018 RKey 0x8010200 VAddr 0x00149df8dc3000
---------------------------------------------------------------------------------------
 #bytes  #iterations  BW peak[MB/sec]  BW average[MB/sec]  MsgRate[Mpps]
 65536   5000         5975.01          5974.74             0.095596
---------------------------------------------------------------------------------------

So it seems this is not a problem of the InfiniBand connection, but probably a problem of the Lustre file system.

4. We tested the LNet performance in the abnormal case following:
https://wiki.lustre.org/LNET_Selftest#Appendix:_LNET_Selftest_Wrapper

Our parameters are:

========================================================================
# Output file
ST=lst-output-$(date +%Y-%m-%d-%H:%M:%S)
# Concurrency
CN=64
# Size
SZ=1M
# Length of time to run test (secs)
TM=30
# Which BRW test to run (read or write)
BRW=read
# Checksum calculation (simple or full)
CKSUM=simple
# The LST "from" list -- e.g. Lustre clients. Space separated list of NIDs.
LFROM="192.168.12.1@o2ib"
# The LST "to" list -- e.g. Lustre servers. Space separated list of NIDs.
LTO="192.168.12.141@o2ib"
========================================================================

The last five output samples are:

[LNet Rates of lto]
[R] Avg: 5497 RPC/s Min: 5497 RPC/s Max: 5497 RPC/s
[W] Avg: 10994 RPC/s Min: 10994 RPC/s Max: 10994 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5497.74 MiB/s Min: 5497.74 MiB/s Max: 5497.74 MiB/s
[LNet Rates of lfrom]
[R] Avg: 11018 RPC/s Min: 11018 RPC/s Max: 11018 RPC/s
[W] Avg: 5509 RPC/s Min: 5509 RPC/s Max: 5509 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5508.93 MiB/s Min: 5508.93 MiB/s Max: 5508.93 MiB/s
[W] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[LNet Rates of lto]
[R] Avg: 5508 RPC/s Min: 5508 RPC/s Max: 5508 RPC/s
[W] Avg: 11015 RPC/s Min: 11015 RPC/s Max: 11015 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5507.83 MiB/s Min: 5507.83 MiB/s Max: 5507.83 MiB/s
[LNet Rates of lfrom]
[R] Avg: 10974 RPC/s Min: 10974 RPC/s Max: 10974 RPC/s
[W] Avg: 5487 RPC/s Min: 5487 RPC/s Max: 5487 RPC/s
[LNet Bandwidth of lfrom]
[R] Avg: 5487.16 MiB/s Min: 5487.16 MiB/s Max: 5487.16 MiB/s
[W] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[LNet Rates of lto]
[R] Avg: 5488 RPC/s Min: 5488 RPC/s Max: 5488 RPC/s
[W] Avg: 10974 RPC/s Min: 10974 RPC/s Max: 10974 RPC/s
[LNet Bandwidth of lto]
[R] Avg: 0.84 MiB/s Min: 0.84 MiB/s Max: 0.84 MiB/s
[W] Avg: 5487.36 MiB/s Min: 5487.36 MiB/s Max: 5487.36 MiB/s

5. Finally, we found that if we unplug and re-plug the InfiniBand cable on the client side, the I/O performance returns to normal.

What else can we do to fix this problem? Any suggestions are very much appreciated.

Thank you very much.

T.H.Hsieh
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
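P.P.S. Since physically re-seating the cable (point 5) restores performance, a software bounce of the link might have the same effect and would be easier to automate. The following is an untested sketch of what we would try, not something we have verified; the interface name ib0 is from our lustre.conf and the NID is the OST from the selftest parameters.

   # Hypothetical software equivalent of re-seating the cable (untested):
   # bounce the IPoIB interface, then verify the LNet path to the OST.
   ip link set ib0 down
   ip link set ib0 up
   lctl ping 192.168.12.141@o2ib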