I still have the guide from that system, and I saved some of the routing 
scripts and whatnot. But really, it wasn't much more complicated than Ethernet 
routing.
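
For what it's worth, a routing node in that setup is basically just doing 
plain IP forwarding between its two fabric interfaces. A minimal sketch of 
that piece (hypothetical, not the actual config from the guide; in practice 
you'd set this persistently in sysctl.conf):

#!/usr/bin/env python3
"""Sketch: enable IPv4 forwarding on a routing node so traffic can
cross between the Omnipath-side and InfiniBand-side interfaces.
Hypothetical illustration only; normally a one-line sysctl setting."""

from pathlib import Path

# Equivalent to: sysctl -w net.ipv4.ip_forward=1
Path("/proc/sys/net/ipv4/ip_forward").write_text("1\n")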

The routing nodes, I guess obviously, had both Omnipath and InfiniBand 
interfaces. The compute nodes themselves, I believe, used a supervisord 
script, if I'm remembering that name right, to try to balance out which 
routing node each one would use as a gateway (a rough sketch of the idea is 
below). There were two as it was configured when I got to it, but a larger 
number was possible.
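
The gateway-balancing piece was roughly like this sketch. To be clear, this 
is a hypothetical reconstruction, not the actual script — the addresses, the 
subnet, and the hostname-hash trick are all my assumptions. The idea is just 
that each compute node deterministically picks one of the routing nodes and 
points its route to the storage subnet at it:

#!/usr/bin/env python3
"""Hypothetical gateway-balancing script, run once at boot under
supervisord (e.g. with autorestart=false) on each compute node.
Not the real script from that cluster -- just a sketch of the idea."""

import socket
import subprocess
import zlib

# Assumed compute-facing addresses of the two routing nodes.
ROUTING_NODES = ["10.0.0.1", "10.0.0.2"]

# Assumed storage-side subnet reached through the routing nodes.
STORAGE_SUBNET = "10.1.0.0/16"


def pick_gateway() -> str:
    """Spread nodes across routing nodes with a stable hostname hash."""
    index = zlib.crc32(socket.gethostname().encode()) % len(ROUTING_NODES)
    return ROUTING_NODES[index]


def main() -> None:
    gateway = pick_gateway()
    # Point the route for the storage subnet at the chosen routing node.
    subprocess.run(
        ["ip", "route", "replace", STORAGE_SUBNET, "via", gateway],
        check=True,
    )


if __name__ == "__main__":
    main()

Adding a third or fourth routing node would just mean adding to that list.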

It seems to me that there was probably a better way to do that, but it did 
work. The read/write rates were not as fast as on our fully InfiniBand 
clusters, but they were fast enough.

The cluster was Caliburn, which was on the Top500 list for some time, so 
there may be some papers and whatnot written on it before we inherited it. If 
there's something specific you want to know, I could probably dig it up.

Sent from my iPhone

On Aug 21, 2023, at 14:48, Kidger, Daniel <daniel.kid...@hpe.com> wrote:


Ryan,

This sounds very interesting.
Do you have more details or references on how they connected together, and 
what the pain points were?

Daniel


From: gpfsug-discuss <gpfsug-discuss-boun...@gpfsug.org> On Behalf Of Ryan 
Novosielski
Sent: 21 August 2023 19:07
To: gpfsug main discussion list <gpfsug-discuss@gpfsug.org>
Cc: gpfsug-disc...@spectrumscale.org
Subject: Re: [gpfsug-discuss] Joining RDMA over different networks?

If I understand what you're asking correctly, we used to have a cluster that 
did this. GPFS was on InfiniBand, some of the compute nodes were too, and the 
rest were on Omnipath. There were routers in between with both types of 
interfaces.
Sent from my iPhone


On Aug 21, 2023, at 13:55, Kidger, Daniel <daniel.kid...@hpe.com> wrote:


I know in the Lustre world that LNET routers are used to provide RDMA over 
heterogeneous networks.

Is there an equivalent for Storage Scale?
e.g. if an ESS uses InfiniBand to connect directly to Cluster A, could that 
InfiniBand RDMA fabric be "routed" to Cluster B, which has RoCE connecting 
all its nodes together, and hence have the filesystem mounted?

P.S. The same question would apply to other usually incompatible RDMA 
networks like Omnipath, Slingshot, Cornelis, … ?

Daniel

Daniel Kidger
HPC Storage Solutions Architect, EMEA
daniel.kid...@hpe.com

+44 (0)7818 522266

hpe.com


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
