Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Sayantan Sur wrote:

> Hello Roger,
>
> I'm just CC-ing this to openib-general for the community. Thanks for giving us access. I have verified that the `ibv_get_device_list' verb is indeed *missing* from the OpenIB install. I'm afraid that given this Red Hat rpm, it is difficult to get mvapich to work (without patching it). As Roland and others have indicated, perhaps the best way is for you to upgrade to at least the 1.0 branch. That should be the most stable OpenIB release yet.
>
> https://openib.org/svn/gen2/branches/1.0/src/userspace/
>
> You should be able to keep the kernel stuff intact and just upgrade the user level support (management, libibverbs, libmthca). You may skip upgrading management; however, it'll be best to upgrade it too, lest you face any OpenSM issues.
>
> Thanks, Sayantan.

I now have the machines running RHEL4U3 + kernel.org 2.6.16.5 + the OpenIB 1.0 userspace. The RPM spec files worked for the OpenIB tools, which made things pretty simple, and I have a reasonable set of rpms and tar files to execute the kernel+userspace update.

I have succeeded in getting OpenMPI to compile and execute HPL under raw IB, and so far I am getting reasonable results and no corruption.

Mvapich compiles but appears not to have made the mpirun version for Infiniband, and yells about that when attempting to start HPL. I have not yet looked at that in detail to see what the nature of the failure is.

Roger

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Hi Roger,

> Mvapich compiles but appears to not have made the mpirun version for Infiniband, and yells about that when attempting to start HPL, I have not yet looked at that in detail to see what the nature of the failure is.

Thanks for reporting this. In fact, just today we fixed this in the MVAPICH trunk. This problem was reported by another user on mvapich-discuss.

http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2006-April/98.html

If this was the error you got, we'll be glad if you could just `svn up' your tree and give it a shot. Please let us know if this worked for you.

Thanks, Sayantan.

> Roger

--
http://www.cse.ohio-state.edu/~surs
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Sayantan Sur wrote:

> Hi Roger,
>
> > Mvapich compiles but appears to not have made the mpirun version for Infiniband, and yells about that when attempting to start HPL, I have not yet looked at that in detail to see what the nature of the failure is.
>
> Thanks for reporting this. In fact, just today we fixed this in the MVAPICH trunk. This problem was reported by another user on mvapich-discuss.
>
> http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2006-April/98.html
>
> If this was the error you got, we'll be glad if you could just `svn up' your tree and give it a shot. Please let us know if this worked for you.
>
> Thanks, Sayantan.

Yep, that is what I saw. I will try the newer version Monday.

Roger
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Sayantan Sur wrote:

> Hello Roger,
>
> > With mvapich-0.9.7 it errors out in the building stage with an error ibv_free_device_list/ibv_get_device_list missing, I cannot find any of the ib libraries on RHEL4U3 that appear to contain those symbols.
>
> Thanks for trying out MVAPICH-0.9.7. Currently, we don't have any machine with RHEL4U3. We are installing two machines with RHEL4U3 and we will try out MVAPICH on that as soon as possible. The verb `ibv_get_device_list' was introduced before the 1.0 branch. So, if you have either OpenIB installed from the trunk or from the 1.0 branch, you _should_ be able to see this verb in the library. I am wondering if you are trying out the default versions of the OpenIB rpms on RHEL4U3?

Yes, I am trying the default version of RHEL4U3. A lot of our customers would much rather use unmodified RHEL, though I can probably talk them out of it with a bit of work. They have some strange ideas that RHEL is somehow guaranteed to work right, and from what I can tell it won't completely work just because RH did not include an IB MPI variant, at least not one that I can find.

> > Using the mvapich-gen2-1.src.rpm from openib.org results in these errors (on the first thing it tries to compile).
> >
> >     viainit.c: In function `create_cq':
> >     viainit.c:118: error: too few arguments to function `ibv_create_cq'
>
> This is also due to a verb change made a while back to ibv_create_cq. I believe this version of the mvapich-gen2 source rpm was created against the version of userspace support which is present in the very same .src.rpm (you may install those if you want, though they are a little old now). The userspace verbs changed after this src rpm was created.
>
> > I have verified that the include file prototype has more arguments than are contained in viainit.c.
>
> Yes, it seems that the RPM you have installed is from somewhere in between the ibv_create_cq verb change and the later introduction of the ibv_get_device_list verb.
>
> I'm wondering if you could try it out with the latest 1.0 branch of OpenIB? In addition, we will get back to you asap with our testing on RHEL4U3.
>
> Thanks, Sayantan.

Do you know if it would be possible to just replace the userspace section and not mess with the kernel part of OpenIB? I am guessing from what I have read that this is very possible, and only requires me to remove the already existing RHEL rpms for OpenIB userspace support.

Thank you very much. If you guys need access, I have 2 test machines that I can give access to, to do whatever testing is needed.

Roger
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Hello Roger,

> Do you know if it would be possible to just replace the userspace section and not mess with the kernel part of OpenIB? I am guessing from what I have read that this is very possible, and only requires me to remove the already existing RHEL rpms for OpenIB userspace support.

IMHO, it should be possible. However, the OpenIB userspace and kernel module authors should be able to answer this question exactly. Roland, any thoughts on which SVN version of userspace support may work with the RHEL default RPMs?

> Thank you very much. If you guys need access I have 2 test machines that I can give access to to do whatever testing is needed.

That's great! You can send the login information to me.

Thanks, Sayantan.

--
http://www.cse.ohio-state.edu/~surs
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Sayantan> Roland, any thoughts on which SVN version of userspace
Sayantan> support may work with the RHEL default RPMs?

Any version should work. It might be simpler to use stable releases such as libibverbs-1.0.2 and libmthca-1.0.1.

 - R.
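For anyone scripting this check, here is a minimal sketch (not from the thread) that compares a package string against the two releases named above. Treating those releases as a floor is an assumption; the message only offers them as convenient stable picks. The helper names `parse_release` and `at_least_suggested` are hypothetical.

```python
import re


def parse_release(package):
    """Split a 'name-x.y.z' string into (name, (x, y, z))."""
    match = re.fullmatch(
        r"(?P<name>[A-Za-z0-9_]+)-(?P<version>\d+(?:\.\d+)*)", package
    )
    if match is None:
        raise ValueError(f"unrecognised package string: {package!r}")
    version = tuple(int(part) for part in match.group("version").split("."))
    return match.group("name"), version


# Releases named in the message above; using them as a minimum is an assumption.
SUGGESTED = dict(
    parse_release(p) for p in ("libibverbs-1.0.2", "libmthca-1.0.1")
)


def at_least_suggested(installed):
    """Return True if `installed` is a listed package at or above the suggested release."""
    name, version = parse_release(installed)
    return name in SUGGESTED and version >= SUGGESTED[name]


if __name__ == "__main__":
    for candidate in ("libibverbs-1.0.2", "libibverbs-1.0.1", "libmthca-1.0.1"):
        print(candidate, at_least_suggested(candidate))
```

The comparison works on integer tuples rather than strings, so a component such as 10 sorts after 9.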
RE: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
> Yes, I am trying the default version of RHEL4U3, a lot of our customers would much rather use unmodified RHEL, though I can probably talk them out of it with a bit of work. They have some strange ideas that RHEL is somehow guaranteed to work right, and from what I can tell it won't completely work just because RH did not include an IB MPI variant, at least not one that I can find.

I didn't try MVAPICH, but I had no luck getting Open MPI 1.0.1 to work with the RHEL4 U3 OpenIB code. The RHEL4 U3 relnotes are pretty clear that its included OpenIB is a technology preview, not for production environments, and that the APIs are subject to change (which they already did, comparing RHEL4 U3 to OF 1.0). I think you are much better off trying the OF 1.0 code.

Scott Weitzenkamp
SQA Manager
Cisco Systems
RE: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Scott wrote,

> I didn't try MVAPICH, but I had no luck getting Open MPI 1.0.1 to work with the RHEL4 U3 OpenIB code.

Not sure if you are interested in a commercial MPI or not, but we did test Intel MPI with the RHEL4-U3 code and it worked fine, except on Mellanox DDR cards.

woody
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Hello Roger,

I'm just CC-ing this to openib-general for the community. Thanks for giving us access. I have verified that the `ibv_get_device_list' verb is indeed *missing* from the OpenIB install. I'm afraid that given this Red Hat rpm, it is difficult to get mvapich to work (without patching it). As Roland and others have indicated, perhaps the best way is for you to upgrade to at least the 1.0 branch. That should be the most stable OpenIB release yet.

https://openib.org/svn/gen2/branches/1.0/src/userspace/

You should be able to keep the kernel stuff intact and just upgrade the user level support (management, libibverbs, libmthca). You may skip upgrading management; however, it'll be best to upgrade it too, lest you face any OpenSM issues.

Thanks, Sayantan.

* On Apr,4 Sayantan Sur[EMAIL PROTECTED] wrote :
> Hello Roger,
>
> > Do you know if it would be possible to just replace the userspace section and not mess with the kernel part of OpenIB? I am guessing from what I have read that this is very possible, and only requires me to remove the already existing RHEL rpms for OpenIB userspace support.
>
> IMHO, it should be possible. However, OpenIB userspace and kernel module authors should be able to exactly answer this question. Roland, any thoughts on which SVN version of userspace support may work with the RHEL default RPMs?
>
> > Thank you very much. If you guys need access I have 2 test machines that I can give access to to do whatever testing is needed.
>
> That's great! You can send the login information to me.
>
> Thanks, Sayantan.

--
http://www.cse.ohio-state.edu/~surs
[openib-general] Trying to compile mvapich RHEL4U3 for ib.
I am not having much luck with the default RHEL4U3 setup. I have IP over IB running and apparently working, but I am unable to get any MPI variants to work directly with IB. I do have it working over TCP with ch_p4.

With mvapich-0.9.7 it errors out in the building stage with an error that ibv_free_device_list/ibv_get_device_list are missing. I cannot find any of the IB libraries on RHEL4U3 that appear to contain those symbols.

Using the older mvapich-0.9.6 there is no option to make an ib_gen2 version, and there does not appear to be any ch_gen2 device code.

Using the mvapich-gen2-1.src.rpm from openib.org results in these errors (on the first thing it tries to compile).

    viainit.c: In function `create_cq':
    viainit.c:118: error: too few arguments to function `ibv_create_cq'

I have verified that the include file prototype has more arguments than are contained in viainit.c.

Trying to use openmpi produced different but still failing results. Everything compiled and linked, and HPL would start but never produced any output from HPL itself. It did produce some things that looked like internal openmpi errors.

Any suggestion on what I am missing, or if there is another version that will work? It looks like there must be a lot of API differences between the different variants that I have.

Roger
Re: [openib-general] Trying to compile mvapich RHEL4U3 for ib.
Hello Roger,

> With mvapich-0.9.7 it errors out in the building stage with an error ibv_free_device_list/ibv_get_device_list missing, I cannot find any of the ib libraries on RHEL4U3 that appear to contain those symbols.

Thanks for trying out MVAPICH-0.9.7. Currently, we don't have any machine with RHEL4U3. We are installing two machines with RHEL4U3 and we will try out MVAPICH on that as soon as possible. The verb `ibv_get_device_list' was introduced before the 1.0 branch. So, if you have either OpenIB installed from the trunk or from the 1.0 branch, you _should_ be able to see this verb in the library. I am wondering if you are trying out the default versions of the OpenIB rpms on RHEL4U3?

> Using the mvapich-gen2-1.src.rpm from openib.org results in these errors (on the first thing it tries to compile).
>
>     viainit.c: In function `create_cq':
>     viainit.c:118: error: too few arguments to function `ibv_create_cq'

This is also due to a verb change made a while back to ibv_create_cq. I believe this version of the mvapich-gen2 source rpm was created against the version of userspace support which is present in the very same .src.rpm (you may install those if you want, though they are a little old now). The userspace verbs changed after this src rpm was created.

> I have verified that the include file prototype has more arguments than are contained in viainit.c.

Yes, it seems that the RPM you have installed is from somewhere in between the ibv_create_cq verb change and the later introduction of the ibv_get_device_list verb.

I'm wondering if you could try it out with the latest 1.0 branch of OpenIB? In addition, we will get back to you asap with our testing on RHEL4U3.

Thanks, Sayantan.

--
http://www.cse.ohio-state.edu/~surs
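One way to confirm whether an installed libibverbs exports the verbs discussed in this thread is to try resolving the symbols at run time. The sketch below is not from the thread; it uses Python's ctypes, and the helper name `library_exports` is hypothetical. It only reports whether a symbol exists, not whether its signature matches what MVAPICH expects, so it would not catch the `ibv_create_cq' argument-count change.

```python
import ctypes
import ctypes.util


def library_exports(library, symbol):
    """Return True if the shared library can be loaded and exports `symbol`."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        # Library is not installed or cannot be loaded on this machine.
        return False
    # ctypes looks the symbol up on attribute access; a missing symbol
    # raises AttributeError, which hasattr() turns into False.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # "ibverbs" is the short name for libibverbs; find_library returns
    # None when no matching library is found.
    path = ctypes.util.find_library("ibverbs")
    if path is None:
        print("libibverbs not found")
    else:
        for verb in ("ibv_get_device_list", "ibv_free_device_list", "ibv_create_cq"):
            status = "present" if library_exports(path, verb) else "MISSING"
            print(f"{verb}: {status}")
```

On an install like the one described here, the expectation would be that `ibv_get_device_list` and `ibv_free_device_list` are reported as missing, which matches the build error.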