Paul Kapinos <kapi...@rz.rwth-aachen.de> writes:

> Jeff, I would turn the question the other way around:
>
> - are there any penalties when using KNEM?

Bull should be able to comment on that -- they turn it on by default in
their proprietary OMPI derivative -- but I doubt I can get much of a
story on it. Mellanox ship it now too, but I don't know if their
distribution defaults to using it.

I expect to use knem on hardware that's essentially the same as Mark's.
If any issues appear in production, I'll be surprised and will report
them.

> We have a couple of Really Big Nodes (128 cores) with non-huge memory
> bandwidth (because coupled of 4x standalone nodes with 4 sockets
> each).

I was hoping to have some results for just such a setup, but haven't
been able to spend any time on it this week. If there are any
suggestions for OMPI tuning on it I'd be interested.

> So cutting the bandwidth in halves on these nodes sound like
> Very Good Thing.
>
> But otherwise we've 1500+ nodes with 2 sockets and 24GB memory only
> and we do not wanna to disturb the production on these nodes.... (and
> different MPI versions for different nodes are doofy).

Why would you need that? Our horribly heterogeneous cluster just has a
node group-specific openmpi-mca-params.conf, and SGE parallel
environments keep jobs in specific host groups with basically the same
CPU speed and interconnect.

> Best
>
> Paul
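
As a rough sketch of what a node group-specific openmpi-mca-params.conf might look like for the big nodes: the values below are assumptions for illustration, not a tested or recommended configuration. btl_sm_use_knem is the shared-memory BTL's knem switch in Open MPI builds that include knem support; check ompi_info on your own installation before relying on it.

```
# $prefix/etc/openmpi-mca-params.conf on the large-node host group
# (illustrative only; the grouping and the value are assumptions)

# Let the sm BTL use knem for large intra-node messages.
btl_sm_use_knem = 1

# On the host group where production should stay undisturbed, the same
# file would instead carry:
#   btl_sm_use_knem = 0
```

With one Open MPI installation per host group (or a per-group etc directory), the MPI version stays the same everywhere and only this file differs.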