[OMPI users] Behaviour of MPI_Cancel when using 'large' messages

2010-06-07 Thread Gijsbert Wiesenekker
The following code tries to send a message, but if it takes too long the message is cancelled: #define DEADLOCK_ABORT (30.0) MPI_Isend(message, count, MPI_BYTE, comm_id, MPI_MESSAGE_TAG, MPI_COMM_WORLD, &request); t0 = time(NULL); cancelled = FALSE; while(TRUE) { //do som

Re: [OMPI users] [sge::tight-integration] slot scheduling and resources handling

2010-06-07 Thread Eloi Gaudry
Hi Reuti, I've been unable to reproduce the issue so far. Sorry for the inconvenience, Eloi On Tuesday 25 May 2010 11:32:44 Reuti wrote: > Hi, > > Am 25.05.2010 um 09:14 schrieb Eloi Gaudry: > > I do not reset any environment variable during job submission or job > > handling. Is there a simple wa

[OMPI users] ompi-restart, ompi-ps problem

2010-06-07 Thread Nguyen Kim Son
Hello, I'm trying to get functions like orte-checkpoint, orte-restart,... to work, but there are some errors that I don't have any clue about. BLCR (0.8.2) apparently works fine, and I have installed openmpi 1.4.2 from source with the blcr option. The command mpirun -np 4 -am ft-enable-cr ./checkpoint_

[OMPI users] ompi-restart failed

2010-06-07 Thread Nguyen Toan
Hello everyone, I'm using OpenMPI 1.4.2 with BLCR 0.8.2 to test checkpointing on 2 nodes, but it failed to restart (Segmentation fault). Here are the details concerning my problem: + OS: CentOS 5.4 + OpenMPI configure: ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads \ --with-blcr=

Re: [OMPI users] ompi-restart failed

2010-06-07 Thread Nguyen Toan
Sorry, I just want to add 2 more things: + I tried configuring with and without --enable-ft-thread, but nothing changed + I also applied this patch for OpenMPI and reinstalled, but I got the same error https://svn.open-mpi.org/trac/ompi/raw-attachment/ticket/2139/v1.4-preload-part1.diff Somebody

Re: [OMPI users] Segmentation fault in MPI_Finalize with IB hardware and memory manager.

2010-06-07 Thread Jeff Squyres
George -- Scott's patch was different than the one you applied. Apparently, his fixes this user's problem (I don't know if Guillaume tested yours). Which one wins? On Jun 3, 2010, at 9:49 AM, Scott Atchley wrote: > On Jun 3, 2010, at 8:54 AM, guillaume ranquet wrote: > > > granquet@bordepl

Re: [OMPI users] Process doesn't exit on remote machine when using hostfile

2010-06-07 Thread Shiqing Fan
Hi, The hostfile seems to work for me on my Windows XP machines, but it should be the same on Windows 7. The problem you had looks to me more like a synchronization problem. Could you send me your test program? Regards, Shiqing On 2010-5-25 11:41 AM, Rajnesh Jindel wrote: disabled the fir

Re: [OMPI users] Behaviour of MPI_Cancel when using 'large' messages

2010-06-07 Thread Jovana Knezevic
Hello Gijsbert, I had the same problem a few months ago. I could not even cancel the messages for which I did not have a matching receive on the other side (thus, they could not have been received! :-)). I was really wondering what was going on... I have some experience with MPI, but I am not an exp

[OMPI users] segfault with -pernode on 1.4.2

2010-06-07 Thread S. Levent Yilmaz
Dear All, I recently installed version 1.4.2, and am having a problem specific to this version only (or so it seems). Before I lay out the details, please note that I am building 1.4.2 *exactly* the same as I built 1.4.1: same compiler options, same OpenIB and other system libraries, same configur

Re: [OMPI users] segfault with -pernode on 1.4.2

2010-06-07 Thread Ralph Castain
Thanks for reporting this. -pernode is just -npernode 1 - see the following ticket. Not sure when a fix will come out. https://svn.open-mpi.org/trac/ompi/ticket/2431 On Jun 7, 2010, at 4:27 PM, S. Levent Yilmaz wrote: > Dear All, > > I recently installed 1.4.2 version, and am having a probl