Guys, I might be misunderstanding, but:

> 1. test case works when using verbs using Mellanox only
> 2. test case works ok when we use PSM on Qlogic only
> 3. test case fails when using verbs between Mellanox and Qlogic
> 4. test case fails when using verbs on Qlogic

Doesn't this show that verbs / MPI fail full stop with Qlogic? That
doesn't look like a vendor interoperability problem so much as a bug
with verbs and Qlogic hardware.

Ross
(Infiniband newbie)

On Sun, Jun 28, 2009 at 7:03 PM, Scott A. Friedman<[email protected]> wrote:
> We have had several tickets submitted by users since we have started adding
> Qlogic 7240 cards into our cluster which is mostly Mellanox (we have a
> couple different cards). We have looked at the codes (MPI based) and they do
> run fine when the Qlogic cards are excluded. Qlogic suggests using PSM or
> IPoIB on our cluster - both of which seem like a punt to us as PSM doesn't
> make sense with Mellanox and IPoIB is not a solution.
>
> Right now, we are trying to figure out where the problem is - it is not at
> the application level as we have distilled down to a specific case which
> will cause a problem (MPI all-to-all, for example). However, some things
> seem clearer to us.
>
> 1. test case works when using verbs using Mellanox only
> 2. test case works ok when we use PSM on Qlogic only
> 3. test case fails when using verbs between Mellanox and Qlogic
> 4. test case fails when using verbs on Qlogic
>
> Is this a verb level issue with the ipath stuff or an mpi problem? Or, is
> the issue someplace else? There had been some discussion of a mixed
> environment early this year on the OMPI list but the thread petered out.
>
> We would be happy to share our failing test case with whomever does the
> interop testing - if it could shed some light on the problem we see.
>
> The point is that we would like to know that different IB cards work
> together (like ethernet) so we can have a choice.
>
> Sean Hefty wrote:
>>>
>>> Is a mixed HCA environment cluster not ready for prime time - yet?
>>
>> Are the crashes in the kernel or userspace? Is there a specific HCA on
>> the nodes that crash?
>>
>> Interop testing is done, but I do not know the details of the
>> configurations and tests that are run.

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
