On Jan 26, 2009, at 4:46 PM, Jeff Squyres wrote:

Note that I did not say that. I specifically stated that OMPI failed, and that it did so because we customize for the individual hardware devices. To be clear: this is an OMPI issue. I'm asking (at the request of the IWG) if anyone cares about fixing it.


I should clarify something in this discussion: Open MPI is *capable* of running across heterogeneous OpenFabrics hardware (assuming IB <--> IB and iWARP <--> iWARP, of course -- not IB <--> iWARP), as long as it is configured to use the same verbs/hardware configuration on all of the hardware. Depending on the hardware, Open MPI may not be configured this way by default, because it may choose to customize differently for different HCAs/RNICs.

However, if you manually configure Open MPI to use the same verbs/hardware configuration values across all of the HCAs/RNICs in your cluster, Open MPI will probably work fine. If Open MPI doesn't work in this kind of configuration, it may indicate some kind of vendor HCA/RNIC incompatibility.

Case in point: I regression test "limited heterogeneous" scenarios on my MPI testing cluster at Cisco every night. Specifically, I have several different models of Mellanox HCAs, and they all interoperate just fine across 2 air-gapped IB subnets. I don't know if anyone has tested wildly different HCAs/RNICs using lowest-common-denominator verbs/hardware configuration values (i.e., some set of values that is supported by all of the HCAs/RNICs) to see if OMPI works. I don't immediately see why that wouldn't work, but I haven't tested it myself.
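To make "lowest common denominator" a bit more concrete, here is a minimal sketch (not part of OMPI; just an illustration against the standard libibverbs API) that queries each local device and reports the smallest active MTU, max_sge, and max_qp_wr it finds. The attribute choices are only examples of values one might level across devices; in a real heterogeneous run you would also need to gather these values from every node and then hand the common values to the openib BTL yourself -- this only covers the local-node half of that.

#include <limits.h>
#include <stdio.h>
#include <stdint.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0;
    struct ibv_device **devs = ibv_get_device_list(&num_devices);
    if (devs == NULL || num_devices == 0) {
        fprintf(stderr, "No OpenFabrics devices found\n");
        return 1;
    }

    /* Start from the largest possible values and shrink toward the
       smallest capabilities seen across all local devices/ports. */
    enum ibv_mtu min_mtu = IBV_MTU_4096;
    int min_sge = INT_MAX;
    int min_qp_wr = INT_MAX;

    for (int i = 0; i < num_devices; ++i) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        if (ctx == NULL)
            continue;

        struct ibv_device_attr dev_attr;
        if (ibv_query_device(ctx, &dev_attr) == 0) {
            if (dev_attr.max_sge < min_sge)
                min_sge = dev_attr.max_sge;
            if (dev_attr.max_qp_wr < min_qp_wr)
                min_qp_wr = dev_attr.max_qp_wr;

            /* Each port can report a different active MTU. */
            for (uint8_t p = 1; p <= dev_attr.phys_port_cnt; ++p) {
                struct ibv_port_attr port_attr;
                if (ibv_query_port(ctx, p, &port_attr) == 0 &&
                    port_attr.active_mtu < min_mtu)
                    min_mtu = port_attr.active_mtu;
            }
        }
        ibv_close_device(ctx);
    }
    ibv_free_device_list(devs);

    printf("Lowest common denominator (this node only):\n");
    printf("  active MTU (enum ibv_mtu value): %d\n", (int)min_mtu);
    printf("  max_sge:                         %d\n", min_sge);
    printf("  max_qp_wr:                       %d\n", min_qp_wr);
    return 0;
}

(Compile with something like "gcc lcd.c -libverbs"; the file name is arbitrary.)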

Out of the box, however, Open MPI is not necessarily configured to use the same verbs/hardware configuration for each device, which is why it may fail by default.

--
Jeff Squyres
Cisco Systems
