[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-04-01 Thread Hongjia Cao
I cannot reproduce the problem. It seems that there is a buffer overflow in the PMI2 client of MPICH. On Sun, 2013-03-31 at 18:08 -0600, Christoph Sprenger wrote: > sorry... here is the complete trace: > > *** buffer overflow detected ***: > /vol/bob/check/csprenger/linux64/opt/bin/mpi_hello_world terminate

[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-04-01 Thread Christoph Sprenger
To follow up: after fixing an issue in the MPICH2 source file simple2pmi.c (which overruns an snprintf buffer), the spawn interface started to work. However, other things started to break (e.g. singleton mode, when no srun is provided). Pavan Balaji directed me to these steps, which works
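The actual change to simple2pmi.c is not shown in this thread. As a minimal sketch of the general pattern only: glibc's fortify check aborts with "buffer overflow detected" when snprintf is given a size larger than the destination buffer, so the usual remedy is to pass the real buffer size and check the return value for truncation. The helper name format_kv and the "key=value;" layout below are hypothetical and are not taken from MPICH.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not MPICH code): writes "key=value;" into buf.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the arguments are invalid or the output did not fit. */
int format_kv(char *buf, size_t bufsize, const char *key, const char *value)
{
    if (buf == NULL || bufsize == 0 || key == NULL || value == NULL)
        return -1;

    /* Pass the true size of buf, never a larger constant. */
    int n = snprintf(buf, bufsize, "%s=%s;", key, value);

    if (n < 0)
        return -1;              /* encoding error */
    if ((size_t)n >= bufsize)
        return -1;              /* output was truncated to fit bufsize */
    return n;
}
```

A caller would pass sizeof on the destination array (for example format_kv(cmd, sizeof cmd, "cmd", "spawn")) and treat -1 as an error instead of sending a truncated message.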

[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-03-31 Thread Christoph Sprenger
Sorry... here is the complete trace: *** buffer overflow detected ***: /vol/bob/check/csprenger/linux64/opt/bin/mpi_hello_world terminated === Backtrace: = /lib/libc.so.6(__fortify_fail+0x37)[0x7f194dc19217] /lib/libc.so.6(+0xfe0d0)[0x7f194dc180d0] /lib/libc.so.6(+0xfd7cb)[0x7f194d

[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-03-29 Thread Hongjia Cao
Could you please paste the complete output/error messages? On Thu, 2013-03-28 at 14:59 -0600, Christoph Sprenger wrote: > pich.so.10(PMI2_Init+0x7ff)[0x7f5daff7806f] > /tech/home/csprenger/mpich-3.0.2_SLURM//lib/libmpich.so.10(MPID_Init > +0xac)[0x7f5daff371ac] > /tech/home/csprenger/mpich-3.0.2_SLURM//lib/l

[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-03-28 Thread Christoph Sprenger
Hi Yiannis, thanks for your reply, but unfortunately I still seem to be having issues. I've rebuilt mpich2-3.0.2 with: ./configure --with-slurm=/local1/slurm-2.5.4_INSTALL/ --with-pmi=pmi2 --enable-pmiport --prefix=/local1/mpich-3.0.2_SLURM/ --enable-shared --enable-cxx ; now I'm crashing right away i

[slurm-dev] Re: MPI_comm_spawn with MPICH2 via slurm pm

2013-03-26 Thread yiannis georgiou
Hi Christoph, you need to use the PMI2 version of Slurm to test the MPI_Comm_spawn primitive of MPICH2. In more detail, you have to rebuild your MPICH2, adding the following flags to your configure: --enable-pmiport --with-pmi=pmi2 --with-slurm=$YOUR_SLURM and when you run jobs with slurm y