On 09/12/2021 16:04, Douglas O'flaherty wrote:
Though not directly about your design, our work with NVIDIA on GPUDirect
Storage and SuperPOD has shown how sensitive RDMA (IB & RoCE) can be to
both MOFED and firmware version compatibility.
I would suggest anyone debugging RDMA issues look at those closely.
May I ask what the alleged benefits of using RDMA in GPFS are?
I can see there would be lower latency than with a plain IP Ethernet or
IPoIB solution, but surely disk latency is going to swamp that?
I guess SSD drives might change that calculation but I have never seen
proper benchmarks comparing the two, or even better yet all four
connection options.
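
As a rough way to frame the question, here is a minimal back-of-envelope sketch of how much of a single I/O's latency the network hop would account for. Every number in it is a made-up, order-of-magnitude assumption rather than a measurement, the names are hypothetical, and reading "all four connection options" as TCP over Ethernet, IPoIB, RoCE and native IB RDMA is also an assumption. It ignores throughput, CPU overhead, queueing and caching entirely, so it is no substitute for a proper benchmark.

```python
# Back-of-envelope only: all latencies are illustrative assumptions,
# not benchmark results for GPFS or any particular hardware.

def network_share(network_us: float, storage_us: float) -> float:
    """Fraction of one I/O's total latency spent on the network hop."""
    return network_us / (network_us + storage_us)

# Hypothetical network round-trip latencies, in microseconds.
NETWORK_US = {
    "tcp_ethernet": 50.0,
    "ipoib": 30.0,
    "roce": 6.0,
    "ib_rdma": 4.0,
}

# Hypothetical storage device service times, in microseconds.
STORAGE_US = {
    "spinning_disk": 5000.0,
    "nvme_ssd": 100.0,
}

def share_table() -> dict:
    """Network share of total latency for every transport/device pair."""
    return {
        (net, dev): network_share(net_us, dev_us)
        for net, net_us in NETWORK_US.items()
        for dev, dev_us in STORAGE_US.items()
    }

if __name__ == "__main__":
    for (net, dev), share in share_table().items():
        print(f"{net:>12} + {dev:<13}: network is {share:6.1%} of total latency")
```

With assumptions like these, the transport barely registers against a spinning disk but becomes a sizeable share against flash, which is exactly why real measurements of each option would be worth having.
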
Just seems a lot of complexity and fragility for very little gain to me.
JAB.
--
Jonathan A. Buzzard Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss