Re: [OMPI devel] application hangs with multiple dup

2009-09-15 Thread Thomas Ropars
nd and but the data is not in any of the queues on the receiver side), which seems to be consistent with two other bug reports currently being discussed on the mailing list. I could reproduce the hang with both sm and tcp, so it's probably not a btl issue but somewhere higher. Thanks Edgar Tho

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
, there is no difference. I don't know if it can help but : I've first had the problem when launching bt.A.4 and sp.A.4 of the NAS Parallel Benchmarks (3.3 version). Thomas Thanks Edgar Thomas Ropars wrote: Ashley Pittman wrote: On Wed, 2009-09-09 at 17:44 +0200, Thomas Ropars wrote:

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
Ashley Pittman wrote: On Wed, 2009-09-09 at 17:44 +0200, Thomas Ropars wrote: Thank you. I think you missed the top three lines of the output but that doesn't matter. main() at ?:? PMPI_Comm_dup() at pcomm_dup.c:62 ompi_comm_dup() at communicator/comm.

Re: [OMPI devel] application hangs with multiple dup

2009-09-09 Thread Thomas Ropars
Ashley Pittman wrote: On Tue, 2009-09-08 at 15:00 +0200, Thomas Ropars wrote: Hi, I'm working on r21949 of the trunk. When I run this simple program, which calls MPI_Comm_dup twice, on a single node with 4 processes, the processes hang from time to time in the 2nd dup. I

[OMPI devel] application hangs with multiple dup

2009-09-08 Thread Thomas Ropars
Hi, I'm working on r21949 of the trunk. When I run this simple program, which calls MPI_Comm_dup twice, on a single node with 4 processes, the processes hang from time to time in the 2nd dup. int main(int argc, char *argv[]) { MPI_Comm comm,comm2; MPI_Init(&argc, &argv); MPI_Comm_dup(MPI_CO

[OMPI devel] segmentation fault when trying to connect processes from different jobs (r20888 of the trunk)

2009-03-29 Thread Thomas Ropars
base_full_modex (grpcomm_base_modex.c:201) ==3031== by 0x7B30678: modex (grpcomm_bad_module.c:381) ==3031== by 0x936FE2B: connect_accept (dpm_orte.c:377) But if the machinefile contains exactly the number of machines needed by the application, it works. Best regards, Thomas Ropars

[OMPI devel] bug in odls_base_default_fns.c

2009-01-22 Thread Thomas Ropars
Hi, I can't run any application with r20318 of the trunk :( I always get the following message: [[24867,0],0] ORTE_ERROR_LOG: Value out of bounds in file base/odls_base_default_fns.c at line 1223 It seems that the modification of odls_base_default_fns.c in r20312 introduces some pro

Re: [OMPI devel] problem compiling r20196

2009-01-06 Thread Thomas Ropars
longer need that include file anyway - so I have removed it. Hopefully, that should let you build. Yes it's ok now. Thanks, Thomas Ralph On Jan 5, 2009, at 12:08 PM, Jeff Squyres wrote: Is there some other file that should be included instead? On Jan 5, 2009, at 1:16 PM, Thomas Ropars

[OMPI devel] problem compiling r20196

2009-01-05 Thread Thomas Ropars
Hi, I can't compile the code from svn r20196. I get the following error: pstat_linux_module.c:34:73: error: asm/page.h: No such file or directory make[2]: *** [pstat_linux_module.lo] Error 1 It seems that this is because new Linux kernels no longer install asm/page.h (I use a 2.6.2
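
For readers who hit the same asm/page.h build error in their own code: user-space programs that included that header usually only wanted the page size, which POSIX exposes at run time through sysconf(_SC_PAGESIZE). The sketch below shows that general alternative. It is not the change made in Open MPI (the follow-up in this thread says the include was simply removed because it was no longer needed), and the helper names query_page_size and round_up_to_pages are hypothetical, introduced here only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the OS for the page size at run time
 * instead of relying on PAGE_SIZE from <asm/page.h>.
 * Returns 0 if sysconf() reports an error. */
size_t query_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    return (sz > 0) ? (size_t)sz : 0;
}

/* Hypothetical helper: round a byte count up to a whole number of pages.
 * The caller must pass a non-zero page size. */
size_t round_up_to_pages(size_t bytes, size_t page)
{
    assert(page > 0);
    return ((bytes + page - 1) / page) * page;
}
```

A caller would typically query the page size once at start-up, check that the result is non-zero, and pass it to round_up_to_pages when sizing buffers.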