Seems to be fixed.
On 7/14/08, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
>
> ../configure --with-memory-manager=ptmalloc2 --with-openib
>
> I guess not. I always use the same configure line, and only recently I
> started to see this error.
>
> On 7/13/08, Jeff Squyres <jsquy...@cisco.com> wrote:
>>
>> I think you said opposite things: Lenny's command line did not
>> specifically ask for ibcm, but it was used anyway. Lenny -- did you
>> explicitly request it somewhere else (e.g., env var or MCA param file)?
>>
>> I suspect that you did not; I suspect (without looking at the code again)
>> that ibcm tried to select itself and failed on the ibcm_listen() call, so it
>> fell back to oob. This might have to be another workaround in OMPI, perhaps
>> something like this:
>>
>> if (ibcm_listen() fails)
>>     if (ibcm explicitly requested)
>>         print_warning()
>>     fail to use ibcm
>>
>> Has this been filed as a bug at openfabrics.org? I don't think that I
>> filed it when Brad and I were testing on RoadRunner -- it would probably be
>> good if someone filed it.
>>
>> On Jul 13, 2008, at 8:56 AM, Lenny Verkhovsky wrote:
>>
>> Pasha is right, I didn't disable it.
>>>
>>> On 7/13/08, Pavel Shamis (Pasha) <pa...@dev.mellanox.co.il> wrote:
>>>
>>> Jeff Squyres wrote:
>>> Brad and I did some scale testing of IBCM and saw this error sometimes.
>>> It seemed to happen with higher frequency when you increased the number of
>>> processes on a single node.
>>>
>>> I talked to Sean Hefty about it, but we never figured out a definitive
>>> cause or solution. My best guess is that there is something wonky about
>>> multiple processes simultaneously interacting with the IBCM kernel driver
>>> from userspace; but I don't know jack about kernel stuff, so that's a total
>>> SWAG.
>>>
>>> Thanks for reminding me of this issue; I admit that I had forgotten about
>>> it. :-( Pasha -- should IBCM not be the default?
>>>
>>> It is not the default. I guess Lenny configured it explicitly, didn't he?
>>>
>>> Pasha.
>>>
>>> On Jul 13, 2008, at 7:08 AM, Lenny Verkhovsky wrote:
>>>
>>> Hi,
>>>
>>> I am getting this error sometimes.
>>>
>>> /home/USERS/lenny/OMPI_COMP_PATH/bin/mpirun -np 100 -hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hostfile
>>> /home/USERS/lenny/TESTS/COMPILERS/hello
>>> [witch24][[32428,1],96][../../../../../ompi/mca/btl/openib/connect/btl_openib_connect_ibcm.c:769:ibcm_component_query]
>>> failed to ib_cm_listen 10 times: rc=-1, errno=22
>>> Hello world! I'm 0 of 100 on witch2
>>>
>>> Best Regards
>>>
>>> Lenny.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>>
>
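
For reference, the fallback Jeff sketched in pseudocode in the quoted thread could look roughly like this in C. This is only an illustration under stated assumptions: select_ibcm(), cpc_choice_t and the warnings counter are hypothetical names, not the actual Open MPI code, and the listen result is passed in as a plain return code instead of calling the real listen routine.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result of trying to select the ibcm connection method. */
typedef enum { CPC_USE_IBCM, CPC_FALL_BACK } cpc_choice_t;

/*
 * Sketch of the proposed workaround (all names are assumptions):
 *   listen_rc            - result of the listen attempt, 0 means success
 *   explicitly_requested - true if the user asked for ibcm by name
 *   warnings             - optional counter, bumped when a warning is printed
 *
 * On a listen failure ibcm is never used; a warning is printed only
 * when the user explicitly asked for it, so the default case stays quiet.
 */
cpc_choice_t select_ibcm(int listen_rc, bool explicitly_requested, int *warnings)
{
    if (listen_rc != 0) {
        if (explicitly_requested) {
            fprintf(stderr,
                    "warning: ibcm was requested but listen failed (rc=%d)\n",
                    listen_rc);
            if (warnings != NULL) {
                ++*warnings;
            }
        }
        return CPC_FALL_BACK;   /* fail to use ibcm either way */
    }
    return CPC_USE_IBCM;
}
```

With this shape, a run like the one in the original report (ibcm not requested, listen returns -1) would fall back silently, while the same failure after an explicit request would still fall back but print one warning.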