Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
Ashley Pittman wrote: On Wed, 2009-09-09 at 17:44 +0200, Thomas Ropars wrote: Thank you. I think you missed the top three lines of the output but that doesn't matter. main() at ?:? PMPI_Comm_dup() at pcomm_dup.c:62 ompi_comm_dup() at communicator/comm.c:661 -

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Edgar Gabriel
Two short questions: do you have any Open MPI MCA parameters set in a file or at runtime? And second, is there any difference if you disable the hierarch coll module (which does communicate additionally as well?) e.g. mpirun --mca coll ^hierarch -np 4 ./mytest Thanks Edgar Thomas Ropars wrot

Re: [OMPI devel] XML request

2009-09-10 Thread Jeff Squyres
On Sep 9, 2009, at 12:17 PM, Ralph Castain wrote: Hmmm... I never considered the possibility of output-filename being used that way. Interesting idea! That feels way weird to me -- for example, how do you know that you're actually outputting to a tty? FWIW: +1 on the idea of writing to nu

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Thomas Ropars
Edgar Gabriel wrote: Two short questions: do you have any Open MPI MCA parameters set in a file or at runtime? No And second, is there any difference if you disable the hierarch coll module (which does communicate additionally as well?) e.g. mpirun --mca coll ^hierarch -np 4 ./mytest No, the

Re: [OMPI devel] XML request

2009-09-10 Thread Jeff Squyres
Thinking about this a little more ... This all seems like Open MPI-specific functionality for Eclipse. If that's the case, don't we have an ORTE tools communication library that could be used? IIRC, it pretty much does exactly what you want and would be far less clumsy than trying to jury

Re: [OMPI devel] XML request

2009-09-10 Thread Greg Watson
Hi Jeff, The problem is that I'm not running the command from java (which has its own issues), but rather the command is started by the ssh shell/exec service. Unfortunately ssh only provides stdin, stdout, and stderr forwarding on fd's 0-2. There is no mechanism to do anything else. It

Re: [OMPI devel] XML request

2009-09-10 Thread Greg Watson
The most appealing thing about the XML option is that it just works "out of the box." Using a library API invariably requires compiling an agent or distributing pre-compiled binaries with all the associated complications. We tried that in the dim past and it was pretty unworkable. The other

[OMPI devel] Error while writing more than 2GB data at once to file

2009-09-10 Thread Markus Wittmann
Hello, at RRZE we tried to write more than 2 GB of data (per process) at once to a file with MPI_File_write_at(_all). The function then returns error code 35. Attached you will find the compressed output of "ompi_info --all" and a test program (large_mpi_test.F90) with which the problem can be repro
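The preview above is cut off before any cause is given, so the following is only a sketch under an assumption: that the failure is tied to the count argument of MPI_File_write_at being a C int, which cannot express more than 2^31 - 1 elements. Under that assumption, one possible workaround is to split a large write into several smaller ones. The helper name chunk_plan is hypothetical and not part of Open MPI; it only computes the (offset, length) pairs, each of which would then go to its own MPI_File_write_at call with MPI_BYTE.

```python
# Sketch only: plans how to split one large write into pieces whose
# byte counts each fit in a 32-bit signed C int. No MPI call is made here.
INT_MAX = 2**31 - 1  # largest value a 32-bit signed C int can hold


def chunk_plan(total_bytes, offset=0, max_chunk=INT_MAX):
    """Yield (file_offset, length) pairs covering total_bytes bytes.

    chunk_plan is a hypothetical helper name used for illustration.
    Every yielded length is at most max_chunk.
    """
    if total_bytes < 0 or max_chunk <= 0:
        raise ValueError("total_bytes must be >= 0 and max_chunk > 0")
    written = 0
    while written < total_bytes:
        length = min(max_chunk, total_bytes - written)
        yield offset + written, length
        written += length


# Example: plan a 5 GiB write starting at file offset 0.
plan = list(chunk_plan(5 * 2**30))
for file_offset, length in plan:
    print(file_offset, length)
```

File offsets are MPI_Offset values and are not limited to 32 bits, which is why only the per-call length is capped in this sketch.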

[OMPI devel] Fwd: [all-osl-users] OSL server reboots

2009-09-10 Thread Jeff Squyres
FYI -- "milliways" is www.open-mpi.org and "sourcehaven" is svn.open-mpi.org. So all web, SVN, Mercurial, and Trac services will be disrupted at this time (read: all Open MPI services). I'm guessing it'll only be a few minutes outage at the time indicated below. Begin forwarded message:

Re: [OMPI devel] XML request

2009-09-10 Thread Jeff Squyres
Greg and I chatted on the phone about this. I now understand much better about what he is trying to do (short version: Eclipse is running on one machine, it is opening an ssh session to a remote machine and launching mpirun on that remote machine). Results of the phone conversation (for th

Re: [OMPI devel] application hangs with multiple dup

2009-09-10 Thread Edgar Gabriel
So I can confirm that I can reproduce the hang, and we (George, Rainer and I) have looked into that and are continuing to dig. I hate to say that, but it looked to us as if messages were 'lost' (sender clearly called send, but the data is not in any of the queues on the receiver side), whic

Re: [OMPI devel] XML request

2009-09-10 Thread Greg Watson
Hi Jeff, I think that sums up the situation nicely. For item #2, I wonder if it would be better to still use "ssh mpirun ...", but have mpirun fork itself "under the covers"? Not having an extra executable in your distribution would probably make long-term maintenance easier. If Ralph ca

Re: [OMPI devel] XML request

2009-09-10 Thread Jeff Squyres
I filed ticket #2019 pointing to this email thread in case someone ever wants to implement it. FWIW: I don't think it matters much whether it's implemented as part of mpirun or a new executable; I suspect that whatever implementation is easiest will be fine. On Sep 10, 2009, at 9:28 PM,