The CUDA support in the 1.7 series has been evolving: a number of patches have been applied since 1.7.3 was released, and I see another (for optimization) scheduled.
You might try the 1.7.4 nightly tarball and see if the problem has been fixed.

On Nov 24, 2013, at 7:11 AM, Jörg Bornschein <j...@capsec.org> wrote:

> On 23.11.2013, at 22:56, Dmitry N. Mikushin <maemar...@gmail.com> wrote:
>
>> VT is getting out of sync with CUDA from time to time, this already
>> happened before.
>
> Yes, that’s what I thought and that’s why I didn’t mention it as my main issue.
>
> I’m rather stuck because cuda support and ob1 don’t seem to fit together, at
> least on my systems.
>
>
> j
>
>> - D.
>>
>> 2013/11/24 Jörg Bornschein <j...@capsec.org>:
>>> On 23.11.2013, at 21:42, Jörg Bornschein <j...@capsec.org> wrote:
>>>
>>> Sorry,
>>>
>>>> I’m typically compiling with
>>>>
>>>> ./configure --with-cuda
>>>
>>> I’m actually compiling with
>>>
>>> ./configure --with-cuda --disable-vt
>>>
>>> because otherwise I get a compile time error:
>>>
>>> make[5]: Entering directory
>>> `/u/bornj/software-old/src/openmpi-1.7.3/ompi/contrib/vt/vt/vtlib'
>>> CC libvt_la-vt_cudart.lo
>>> CC libvt_mpi_la-vt_pform_linux.lo
>>> CC libvt_mpi_la-vt_thrd.lo
>>> CC libvt_mpi_la-vt_trc.lo
>>> CC libvt_mpi_la-vt_user_comment.lo
>>> CC libvt_mpi_la-vt_user_control.lo
>>> CC libvt_mpi_la-vt_user_count.lo
>>> CC libvt_mpi_la-vt_user_marker.lo
>>> vt_cudart.c: In function 'cudaLaunch':
>>> vt_cudart.c:2725:15: error: 'vt_cupti_events_enabled' undeclared (first use
>>> in this function)
>>> vt_cudart.c:2725:15: note: each undeclared identifier is reported only once
>>> for each function it appears in
>>>
>>>
>>> j
>>>
>>>> but I tried combining it with various other options. OMPI builds fine, but
>>>> when I try to run programs compiled against it I always get:
>>>>
>>>> /a.out: symbol lookup error: /usr/local/lib/openmpi/mca_pml_ob1.so:
>>>> undefined symbol: progress_one_cuda_htod_event
>>>>
>>>> That error even seems to make sense, because the code in ompi/mca/pml/ob1/
>>>> refers to common_cuda.[ch], but it does not
>>>> seem to link against its dynamic binary.
>>>>
>>>> Am I missing something?
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> jb
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
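
One way to check the suspicion in the quoted report (that mca_pml_ob1.so refers to progress_one_cuda_htod_event without anything providing it) is to list the symbols the module leaves undefined. The sketch below is only an illustration: it assumes the text comes from `nm -D` run on the module, the function name `undefined_symbols` is made up here, and the sample listing is hand-written rather than captured from a real build.

```python
def undefined_symbols(nm_output: str) -> set:
    """Return the names that an `nm -D` listing marks as undefined (type 'U')."""
    names = set()
    for line in nm_output.splitlines():
        parts = line.split()
        # Undefined entries carry no address column, so they split into
        # exactly two fields: the type letter "U" and the symbol name.
        if len(parts) == 2 and parts[0] == "U":
            # Drop a symbol-version suffix such as "@GLIBC_2.14" if present.
            names.add(parts[1].split("@")[0])
    return names


# Hypothetical listing, written by hand for illustration only; the address
# and the set of entries are assumptions, not output from a real library.
SAMPLE = """\
                 U progress_one_cuda_htod_event
                 U memcpy@GLIBC_2.14
0000000000004a10 T mca_pml_ob1_send
"""

missing = undefined_symbols(SAMPLE)
print("progress_one_cuda_htod_event" in missing)
```

If the symbol shows up as undefined in the real module, the next question is whether any library that the module pulls in (as reported by `ldd`) actually defines it; if none does, that would be consistent with the lookup error at run time.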