[OMPI devel] Crash when using MPI_REAL8

2009-12-03 Thread Sylvain Jeaugey

Hi list,

I hope this time I won't be the only one to suffer this bug :)

It is very simple to reproduce: just perform an allreduce with MPI_REAL8 
(Fortran) and you should get a crash at ompi/op/op.h:411. Tested with 
trunk and v1.5; v1.3 works fine.


From what I understand, in the trunk MPI_REAL8 now has a fixed id (in 
ompi/datatype/ompi_datatype_internal.h), but operations do not have an 
index going as far as 54 (0x36), leading to a crash when looking up 
op->o_func.intrinsic.fns[ompi_op_ddt_map[ddt->id]] in ompi_op_is_valid() 
(or, if I disable mpi_param_check, in ompi_op_reduce()).
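
For illustration only, here is a stripped-down sketch of the lookup pattern that 
goes wrong, together with the kind of guard that would avoid the crash. The names 
and sizes below are made up (only ompi_op_ddt_map and the fns table are mentioned 
above), so this is not the real OMPI code, just the shape of the unchecked double 
lookup:

#include <stdio.h>

/* Hypothetical stand-ins: sizes and most names are invented for this sketch. */
#define OP_FN_TABLE_SIZE 52   /* assumed: the op function tables stop before 0x36 */
#define DDT_ID_MAX       70   /* assumed: predefined datatype ids go higher       */

static int   ddt_to_op_index[DDT_ID_MAX];   /* stand-in for ompi_op_ddt_map       */
static void *reduce_fns[OP_FN_TABLE_SIZE];  /* stand-in for o_func.intrinsic.fns  */

int main(void)
{
    int ddt_id = 0x36;                      /* MPI_REAL8's fixed datatype id      */
    int idx    = ddt_to_op_index[ddt_id];   /* the map may not cover this id ...  */

    /* ... so without a bounds check, reduce_fns[idx] can be an out-of-bounds
     * read, similar to what is reported at op.h:411. */
    if (idx < 0 || idx >= OP_FN_TABLE_SIZE || NULL == reduce_fns[idx]) {
        printf("no reduction function registered for datatype id 0x%x\n", ddt_id);
        return 1;
    }
    printf("would call reduce_fns[%d]\n", idx);
    return 0;
}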


Here is a reproducer, just in case:
program main
 use mpi
 implicit none
 integer ierr
 real(8) myreal, realsum
 myreal = 1.d0
 call MPI_INIT(ierr)
 ! crashes in ompi_op_is_valid() (or in ompi_op_reduce() without mpi_param_check)
 call MPI_ALLREDUCE(myreal, realsum, 1, MPI_REAL8, MPI_SUM, MPI_COMM_WORLD, &
                    ierr)
 call MPI_FINALIZE(ierr)
end program main

Does anyone have an idea of how to fix this? Or am I doing something wrong?

Thanks for any help,
Sylvain




Re: [OMPI devel] Deadlocks with new (routed) orted launch algorithm

2009-12-03 Thread Sylvain Jeaugey
Too bad. But no problem, that's very nice of you to have spent so much 
time on this.


I wish I knew why our experiments are so different; maybe we will find out 
eventually ...


Sylvain

On Wed, 2 Dec 2009, Ralph Castain wrote:


I'm sorry, Sylvain - I simply cannot replicate this problem (tried yet another 
slurm system):

./configure --prefix=blah --with-platform=contrib/platform/iu/odin/debug

[rhc@odin ~]$ salloc -N 16 tcsh
salloc: Granted job allocation 75294
[rhc@odin mpi]$ mpirun -pernode ./hello
Hello, World, I am 1 of 16
Hello, World, I am 7 of 16
Hello, World, I am 15 of 16
Hello, World, I am 4 of 16
Hello, World, I am 13 of 16
Hello, World, I am 3 of 16
Hello, World, I am 5 of 16
Hello, World, I am 8 of 16
Hello, World, I am 0 of 16
Hello, World, I am 9 of 16
Hello, World, I am 12 of 16
Hello, World, I am 2 of 16
Hello, World, I am 6 of 16
Hello, World, I am 10 of 16
Hello, World, I am 14 of 16
Hello, World, I am 11 of 16
[rhc@odin mpi]$ setenv ORTE_RELAY_DELAY 1
[rhc@odin mpi]$ mpirun -pernode ./hello
[odin.cs.indiana.edu:15280] [[28699,0],0] delaying relay by 1 seconds
[odin.cs.indiana.edu:15280] [[28699,0],0] delaying relay by 1 seconds
[odin.cs.indiana.edu:15280] [[28699,0],0] delaying relay by 1 seconds
[odin.cs.indiana.edu:15280] [[28699,0],0] delaying relay by 1 seconds
Hello, World, I am 2 of 16
Hello, World, I am 0 of 16
Hello, World, I am 3 of 16
Hello, World, I am 1 of 16
Hello, World, I am 4 of 16
Hello, World, I am 10 of 16
Hello, World, I am 7 of 16
Hello, World, I am 12 of 16
Hello, World, I am 6 of 16
Hello, World, I am 8 of 16
Hello, World, I am 5 of 16
Hello, World, I am 13 of 16
Hello, World, I am 11 of 16
Hello, World, I am 14 of 16
Hello, World, I am 9 of 16
Hello, World, I am 15 of 16
[odin.cs.indiana.edu:15280] [[28699,0],0] delaying relay by 1 seconds
[rhc@odin mpi]$ setenv ORTE_RELAY_DELAY 2
[rhc@odin mpi]$ mpirun -pernode ./hello
[odin.cs.indiana.edu:15302] [[28781,0],0] delaying relay by 2 seconds
[odin.cs.indiana.edu:15302] [[28781,0],0] delaying relay by 2 seconds
[odin.cs.indiana.edu:15302] [[28781,0],0] delaying relay by 2 seconds
[odin.cs.indiana.edu:15302] [[28781,0],0] delaying relay by 2 seconds
Hello, World, I am 2 of 16
Hello, World, I am 3 of 16
Hello, World, I am 4 of 16
Hello, World, I am 7 of 16
Hello, World, I am 6 of 16
Hello, World, I am 0 of 16
Hello, World, I am 1 of 16
Hello, World, I am 10 of 16
Hello, World, I am 5 of 16
Hello, World, I am 9 of 16
Hello, World, I am 8 of 16
Hello, World, I am 14 of 16
Hello, World, I am 13 of 16
Hello, World, I am 12 of 16
Hello, World, I am 11 of 16
Hello, World, I am 15 of 16
[odin.cs.indiana.edu:15302] [[28781,0],0] delaying relay by 2 seconds
[rhc@odin mpi]$

Sorry I don't have more time to continue pursuing this. I have no idea what is 
going on with your system(s), but it clearly is something peculiar to what you 
are doing or the system(s) you are running on.

Ralph


On Dec 2, 2009, at 1:56 AM, Sylvain Jeaugey wrote:


Ok, so I tried with RHEL5 and I get the same behaviour (even at 6 nodes): when setting 
ORTE_RELAY_DELAY to 1, I systematically get the deadlock with the typical stack.

Without my "reproducer patch", 80 nodes was the lower bound to reproduce the 
bug (and you needed a couple of runs to get it). But since this is a race condition, your 
mileage may vary on a different cluster.

With the patch, however, I hit it every time. I'll continue to try different 
configurations (e.g. without slurm ...) to see if I can reproduce it on more 
common configurations.
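
For the record, the idea behind the ORTE_RELAY_DELAY patch is simply to stall the 
daemon before it relays the launch message whenever the environment variable is set, 
which widens the race window enough to hit the deadlock every time. Below is a rough 
standalone sketch of that idea (hypothetical function name; the real patch sits in 
the ORTE relay path and is not reproduced here):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical sketch: read ORTE_RELAY_DELAY and sleep before relaying the
 * launch message, so the race between the relay and the local launch becomes
 * easy to reproduce.  In the real patch this logic lives in the orted/HNP
 * relay code, not in a standalone program. */
static void maybe_delay_relay(void)
{
    const char *env = getenv("ORTE_RELAY_DELAY");
    if (NULL != env && atoi(env) > 0) {
        int delay = atoi(env);
        fprintf(stderr, "delaying relay by %d seconds\n", delay);
        sleep((unsigned int) delay);
    }
}

int main(void)
{
    maybe_delay_relay();
    printf("relay sent\n");   /* placeholder for the actual relay */
    return 0;
}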

Sylvain

On Mon, 30 Nov 2009, Sylvain Jeaugey wrote:


Ok. Maybe I should try on a RHEL5 then.

About the compilers, I've tried with both gcc and intel and it doesn't seem to 
make a difference.

On Mon, 30 Nov 2009, Ralph Castain wrote:


Interesting. The only difference I see is the FC11 - I haven't seen anyone 
running on that OS yet. I wonder if that is the source of the trouble? Do we 
know that our code works on that one? I know we had problems in the past with 
FC9, for example, that required fixes.
Also, what compiler are you using? I wonder if there is some optimization issue 
here, or some weird interaction between FC11 and the compiler.
On Nov 30, 2009, at 8:48 AM, Sylvain Jeaugey wrote:

Hi Ralph,
I'm also puzzled :-)
Here is what I did today:
* download the latest nightly build (openmpi-1.7a1r22241)
* untar it
* patch it with my "ORTE_RELAY_DELAY" patch
* build it directly on the cluster (running FC11) with:
./configure --platform=contrib/platform/lanl/tlcc/debug-nopanasas --prefix=
make && make install
* deactivate oob_tcp_if_include=ib0 in openmpi-mca-params.conf (IPoIB is broken 
on my machine) and run with:
salloc -N 10 mpirun ./helloworld
And ... still the same behaviour: OK by default, deadlock with the typical 
stack when setting ORTE_RELAY_DELAY to 1.
About my previous e-mail, I was wrong about all components having a 0 priority: it was 
based on default parameters reported by "ompi_info -a | grep routed". It seems 
that the 

Re: [OMPI devel] OPEN-MPI Fault-Tolerance for GASNet

2009-12-03 Thread Chang IL Yoon
Dear Josh and Paul.

First of all, thank you very much for your interest in my problem.

1) I tested it again with MPIRUN_CMD set to 'mpirun -am ft-enable-cr -np %N %P',
   but the checkpoint still did not work.

2) Here is more information on my MPI configuration.
 - What version of Open MPI are you using?
   >> I am using Open-MPI ver 1.3.3 with BLCR ver 0.8.2

 - How did you configure Open MPI?
   >> ./configure --enable-ft-thread --with-ft=cr --enable-mpi-threads
--with-blcr={BLCR_DIR} --with-blcr-libdir={BLCR_LIBDIR}
--prefix={OPENMPI_DIR}

 - What arguments are being passed to 'mpirun' when running with GASNet?
   >> mpirun -am ft-enable-cr --machinefile ./machinefile -np 1 ./personal
   >> personal is the same program as my-app.c, except that it uses gasnet_init()
and gasnet_exit() instead of MPI_Init() and MPI_Finalize() (a rough sketch of this
substitution is shown after this list).
   >> my-app.c is from http://osl.iu.edu/research/ft/ompi-cr/examples.php.
   >> gasnet_init() and gasnet_exit() call MPI_Init() and MPI_Finalize() internally.

 - Do you have any environment variables/MCA parameters set for Open MPI?
   >> yes
   $HOME/.openmpi/mca-params.conf
   # Local snapshot directory (not used in this scenario)
   crs_base_snapshot_dir=${HOME}/temp

   # Remote snapshot directory (globally mounted file system)
   snapc_base_global_snapshot_dir=${HOME}/checkpoints

 - My network interconnect is InfiniBand/OpenIB (IP over IB).
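
For clarity, here is roughly what 'personal' looks like. This is a sketch from
memory, not the exact source: the step count, the output format, the sleep and
the GASNET_SEQ define are only placeholders standing in for the compute loop of
my-app.c; only the gasnet_init()/gasnet_exit() bracketing is the point.

/* GASNet-over-MPI-conduit version of the my-app.c step loop (approximate). */
#define GASNET_SEQ 1          /* a threading mode must be chosen before gasnet.h */
#include <stdio.h>
#include <unistd.h>
#include <gasnet.h>

int main(int argc, char **argv)
{
    gasnet_init(&argc, &argv);   /* with the MPI conduit this calls MPI_Init()     */

    for (int step = 0; step < 100; step++) {
        printf("%d] %d) Step %d\n", (int) gasnet_mynode(), (int) getpid(), step);
        fflush(stdout);
        sleep(1);                /* stands in for the real computation             */
    }

    gasnet_exit(0);              /* with the MPI conduit this calls MPI_Finalize() */
    return 0;
}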

3) If there is anything more I can do to help solve this problem, please let me know
without any hesitation.

Thank you again for reading.

Sincerely


On Tue, Dec 1, 2009 at 1:49 PM, Paul H. Hargrove  wrote:

> Thomas,
>
> In connection with Josh's question about mpirun arguments, I suggest you try
> setting
>MPIRUN_CMD='mpirun -am ft-enable-cr -np %N %P %A'
> in your environment before launching the GASNet application.  This will
> instruct GASNet's wrapper around mpirun to include the flag Josh mentioned.
>
> -Paul
>
>
> Josh Hursey wrote:
>
>> Thomas,
>>
>> I have not tried to use the checkpoint/restart feature with GASNet over
>> MPI, so I cannot comment directly on how they interact. However, the
>> combination should work as long as the proper arguments (-am ft-enable-cr)
>> are passed along to the mpirun command, and Open MPI is configured properly.
>>
>> The error message that you copied seems to indicate that the local daemon
>> on one of the nodes failed to start a checkpoint of the target application.
>> Often this is caused by one of two things:
>>  - Open MPI was not configured with the fault tolerance thread, and the
>> application is waiting for a long time in a computation loop (not entering
>> the MPI library).
>>  - The '-am ft-enable-cr' flag was not provided to the mpirun process, so
>> the MPI application did not activate the C/R specific code paths and is
>> therefore denying the request to checkpoint.
>>
>> Can you send me a bit more information:
>>  - What version of Open MPI are you using?
>>  - How did you configure Open MPI?
>>  - What arguments are being passed to 'mpirun' when running with GASNet?
>>  - Do you have any environment variables/MCA parameters set for Open MPI?
>>
>> -- Josh
>>
>> On Nov 22, 2009, at 7:13 PM, Thomas CI Yoon wrote:
>>
>>  Dear all.
>>>
>>> Thanks to the developers of Open MPI's fault tolerance support, I can use the
>>> checkpoint/restart function very well for my MPI applications.
>>> But checkpointing does not work for my GASNet applications, which use the
>>> MPI conduit.
>>> Is there anyone who can help me?
>>> I wrote some code with the GASNet API (Global-Address Space Networking:
>>> http://gasnet.cs.berkeley.edu/) and used the MPI conduit for my GASNet
>>> application, so my program ran well with Open MPI's mpirun. Thus I thought that I
>>> could also use the transparent checkpoint/restart function supported by BLCR
>>> in Open MPI. However, it does not work and shows the following
>>> error message.
>>> --
>>>
>>> Error: The process with PID 13896 is not checkpointable.
>>>   This could be due to one of the following:
>>>- An application with this PID doesn't currently exist
>>>- The application with this PID isn't checkpointable
>>>- The application with this PID isn't an OPAL application.
>>>   We were looking for the named files:
>>> /tmp/opal_cr_prog_write.13896
>>> /tmp/opal_cr_prog_read.13896
>>> --
>>>
>>> 1 more process has sent help message help-opal-checkpoint.txt
>>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help
>>>  0] 13896) Step 53
>>>  0] 15100) Step 53
>>>  0] 13896) Step 54
>>>  0] 15100) Step 54
>>>  0] 13896) Step 55
>>>
>>> In my application, MPI_Initialized() reports that MPI is initialized.
>>>
>>> Thank you for reading, and have a great day.
>>>
>>>