Re: [OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk
On Jun 6, 2014, at 12:50 PM, Rolf vandeVaart wrote:

> Thanks for trying Ralph. Looks like my issue has to do with coll ml
> interaction. If I exclude coll ml, then all my tests pass. Do you know if
> there is a bug for this issue?

There is a known issue with coll ml for intercomm_create - Nathan is working on a fix. It was reported by Gilles (yesterday?)

> If so, then I can run my nightly tests with coll ml disabled and wait for the
> bug to be fixed.
>
> Also, where does simple_spawn and spawn_multiple live?

I have a copy/version in my orte/test/mpi directory that I use - that's where these came from. Note that I left coll ml "on" for those as they weren't having troubles.

> I was running "spawn" and "spawn_multiple" from the ibm/dynamic test suite.
> Your output for spawn_multiple looks different than mine.
>
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Friday, June 06, 2014 3:19 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk
>
> Works fine for me:
> [...]
Re: [OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk
Thanks for trying Ralph. Looks like my issue has to do with coll ml interaction. If I exclude coll ml, then all my tests pass. Do you know if there is a bug for this issue? If so, then I can run my nightly tests with coll ml disabled and wait for the bug to be fixed.

Also, where does simple_spawn and spawn_multiple live? I was running "spawn" and "spawn_multiple" from the ibm/dynamic test suite. Your output for spawn_multiple looks different than mine.

From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Friday, June 06, 2014 3:19 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk

Works fine for me:
[...]
Re: [OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk
Works fine for me:

[rhc@bend001 mpi]$ mpirun -n 3 --host bend001 ./simple_spawn
[pid 22777] starting up!
[pid 22778] starting up!
[pid 22779] starting up!
1 completed MPI_Init
Parent [pid 22778] about to spawn!
2 completed MPI_Init
Parent [pid 22779] about to spawn!
0 completed MPI_Init
Parent [pid 22777] about to spawn!
[pid 22783] starting up!
[pid 22784] starting up!
Parent done with spawn
Parent sending message to child
Parent done with spawn
Parent done with spawn
0 completed MPI_Init
Hello from the child 0 of 2 on host bend001 pid 22783
Child 0 received msg: 38
1 completed MPI_Init
Hello from the child 1 of 2 on host bend001 pid 22784
Child 1 disconnected
Parent disconnected
Parent disconnected
Parent disconnected
Child 0 disconnected
22784: exiting
22778: exiting
22779: exiting
22777: exiting
22783: exiting
[rhc@bend001 mpi]$ make spawn_multiple
mpicc -g --openmpi:linkall spawn_multiple.c -o spawn_multiple
[rhc@bend001 mpi]$ mpirun -n 3 --host bend001 ./spawn_multiple
Parent [pid 22797] about to spawn!
Parent [pid 22798] about to spawn!
Parent [pid 22799] about to spawn!
Parent done with spawn
Parent done with spawn
Parent sending message to children
Parent done with spawn
Hello from the child 0 of 2 on host bend001 pid 22803: argv[1] = foo
Child 0 received msg: 38
Hello from the child 1 of 2 on host bend001 pid 22804: argv[1] = bar
Child 1 disconnected
Parent disconnected
Parent disconnected
Parent disconnected
Child 0 disconnected
[rhc@bend001 mpi]$ mpirun -n 3 --host bend001 -mca coll ^ml ./intercomm_create
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 3]
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 4]
b: MPI_Intercomm_create( intra, 0, intra, MPI_COMM_NULL, 201, &inter) [rank 5]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 3]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 4]
c: MPI_Intercomm_create( MPI_COMM_WORLD, 0, intra, 0, 201, &inter) [rank 5]
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 3, 201, &inter) (0)
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 3, 201, &inter) (0)
a: MPI_Intercomm_create( ab_intra, 0, ac_intra, 3, 201, &inter) (0)
b: intercomm_create (0)
b: barrier on inter-comm - before
b: barrier on inter-comm - after
b: intercomm_create (0)
b: barrier on inter-comm - before
b: barrier on inter-comm - after
c: intercomm_create (0)
c: barrier on inter-comm - before
c: barrier on inter-comm - after
c: intercomm_create (0)
c: barrier on inter-comm - before
c: barrier on inter-comm - after
a: intercomm_create (0)
a: barrier on inter-comm - before
a: barrier on inter-comm - after
c: intercomm_create (0)
c: barrier on inter-comm - before
c: barrier on inter-comm - after
a: intercomm_create (0)
a: barrier on inter-comm - before
a: barrier on inter-comm - after
a: intercomm_create (0)
a: barrier on inter-comm - before
a: barrier on inter-comm - after
b: intercomm_create (0)
b: barrier on inter-comm - before
b: barrier on inter-comm - after
a: intercomm_merge(0) (0) [rank 2]
c: intercomm_merge(0) (0) [rank 8]
a: intercomm_merge(0) (0) [rank 0]
a: intercomm_merge(0) (0) [rank 1]
c: intercomm_merge(0) (0) [rank 7]
b: intercomm_merge(1) (0) [rank 4]
b: intercomm_merge(1) (0) [rank 5]
c: intercomm_merge(0) (0) [rank 6]
b: intercomm_merge(1) (0) [rank 3]
a: barrier (0)
b: barrier (0)
c: barrier (0)
a: barrier (0)
c: barrier (0)
b: barrier (0)
a: barrier (0)
c: barrier (0)
b: barrier (0)
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 0
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 0
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 3
dpm_base_disconnect_init: error -12 in isend to process 1
dpm_base_disconnect_init: error -12 in isend to process 3
[rhc@bend001 mpi]$

On Jun 6, 2014, at 11:26 AM, Rolf vandeVaart wrote:

> [...]
[OMPI devel] Strange intercomm_create, spawn, spawn_multiple hang on trunk
I am seeing an interesting failure on trunk. intercomm_create, spawn, and spawn_multiple from the IBM tests hang if I explicitly list the hostnames to run on. For example:

Good:
$ mpirun -np 2 --mca btl self,sm,tcp spawn_multiple
Parent: 0 of 2, drossetti-ivy0.nvidia.com (0 in init)
Parent: 1 of 2, drossetti-ivy0.nvidia.com (0 in init)
Child: 0 of 4, drossetti-ivy0.nvidia.com (this is job 1) (1 in init)
Child: 1 of 4, drossetti-ivy0.nvidia.com (this is job 1) (1 in init)
Child: 2 of 4, drossetti-ivy0.nvidia.com (this is job 2) (1 in init)
Child: 3 of 4, drossetti-ivy0.nvidia.com (this is job 2) (1 in init)
$

Bad:
$ mpirun -np 2 --mca btl self,sm,tcp -host drossetti-ivy0,drossetti-ivy0 spawn_multiple
Parent: 0 of 2, drossetti-ivy0.nvidia.com (1 in init)
Parent: 1 of 2, drossetti-ivy0.nvidia.com (1 in init)
Child: 0 of 4, drossetti-ivy0.nvidia.com (this is job 1) (1 in init)
Child: 1 of 4, drossetti-ivy0.nvidia.com (this is job 1) (1 in init)
Child: 2 of 4, drossetti-ivy0.nvidia.com (this is job 2) (1 in init)
Child: 3 of 4, drossetti-ivy0.nvidia.com (this is job 2) (1 in init)
[..and we are hung here...]

I see the exact same behavior for spawn and spawn_multiple. Ralph, any thoughts? Open MPI 1.8 is fine. I can provide more information if needed, but I assume this is reproducible.

Thanks,
Rolf

---
This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
---
[OMPI devel] iallgather failures with coll ml
On the trunk, I am seeing failures of the ibm tests iallgather and iallgather_in_place. Is this a known issue?

$ mpirun --mca btl self,sm,tcp --mca coll ml,basic,libnbc --host drossetti-ivy0,drossetti-ivy0,drossetti-ivy1,drossetti-ivy1 -np 4 iallgather
[**ERROR**]: MPI_COMM_WORLD rank 0, file iallgather.c:77: bad answer (0) at index 1 of 4 (should be 1)
[**ERROR**]: MPI_COMM_WORLD rank 1, file iallgather.c:77: bad answer (0) at index 1 of 4 (should be 1)

Interestingly, there is an MCA param to disable it in coll ml which allows the test to pass.

$ mpirun --mca coll_ml_disable_allgather 1 --mca btl self,sm,tcp --mca coll ml,basic,libnbc --host drossetti-ivy0,drossetti-ivy0,drossetti-ivy1,drossetti-ivy1 -np 4 iallgather
$ echo $?
0
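For reference, the invariant the failing test checks can be sketched outside MPI. This is a plain-Python stand-in for MPI_Allgather semantics, not the actual iallgather.c test code: when each of the 4 ranks contributes its own rank number, index i of every rank's receive buffer must hold i. The coll ml run above instead saw 0 at index 1.

```python
def allgather(contributions):
    """Model of MPI_Allgather/MPI_Iallgather semantics: every rank ends up
    with all ranks' contributions concatenated in rank order."""
    gathered = [value for contrib in contributions for value in contrib]
    return [list(gathered) for _ in contributions]

# Each of 4 ranks contributes its own rank number, as the ibm test does;
# every rank's buffer must come back as [0, 1, 2, 3].
buffers = allgather([[rank] for rank in range(4)])
for buf in buffers:
    assert buf == [0, 1, 2, 3]
```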
Re: [OMPI devel] Intermittent hangs when exiting with error
On Jun 6, 2014, at 7:11 AM, Jeff Squyres (jsquyres) wrote:

> Looks like Ralph's simpler solution fit the bill.

Yeah, but I still am unhappy with it. It's about the stupidest connection model you can imagine. What happens is this:

* a process constructs its URI - this is done by creating a string with the IP:PORT for each subnet the proc is listening on. The URI is constructed in alphabetical order (well, actually in kernel index order - but that tends to follow the alphabetical order of the interface names). This then gets passed to the other process

* the sender breaks the URI into its component parts and creates a list of addresses for the target. This list gets created in the order of the components - i.e., we take the first IP:PORT out of the URI, and that is our first address.

* when the sender initiates a connection, it takes the first address in the list (which means the alphabetically first name in the target's list of interfaces) and initiates the connection on that subnet. If it succeeds, then that is the subnet we use for all subsequent messages.

So if the first subnet can reach the target, even if it means bouncing all over the Internet, we will use it - even though the second subnet in the URI might have provided a direct connection! It solves Gilles' problem because "ib" comes after "eth", and it matches what was done in the original OOB (before my rewrite) - but it sure sounds to me like a bad, inefficient solution for general use.

> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14987.php
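The three bullets above condense to a few lines of logic. This is an illustrative sketch with made-up helper names, not the actual oob/tcp component code:

```python
def build_uri(interfaces):
    """A process advertises one ip:port per subnet it listens on, ordered by
    kernel index (which usually tracks alphabetical interface-name order).
    Each interface is given as (kernel_index, ip, port)."""
    parts = ["%s:%d" % (ip, port) for _, ip, port in sorted(interfaces)]
    return ";".join(parts)

def first_address(uri):
    """The sender tries addresses in URI order, so whichever subnet was
    listed first wins - even if a later entry would be a direct route."""
    return uri.split(";")[0]

# A peer listening on eth0 (kernel index 2) and ib0 (kernel index 3):
# eth0's address sorts first, so once the initial connection on that
# subnet succeeds, all subsequent messages ride it.
uri = build_uri([(3, "192.168.1.5", 1025), (2, "10.0.0.5", 1024)])
assert first_address(uri) == "10.0.0.5:1024"
```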
Re: [OMPI devel] Intermittent hangs when exiting with error
On Jun 5, 2014, at 9:16 PM, Gilles Gouaillardet wrote:

> i work on a 4k+ nodes cluster with a very decent gigabit ethernet
> network (reasonable oversubscription + switches
> from a reputable vendor you are familiar with ;-) )
> my experience is that IPoIB can be very slow at establishing a
> connection, especially if the arp table is not populated
> (as far as i understand, this involves the subnet manager and
> performance can be very random especially if all nodes issue
> arp requests at the same time)
> on the other hand, performance is much more stable when using the
> subnetted IP network.

Got it.

>> As a simple solution, there could be a TCP oob MCA param that says
>> "regardless of peer IP address, I can connect to them" (i.e., assume IP
>> routing will make everything work out ok).
> +1 and/or an option to tell oob mca "do not discard the interface simply
> because the peer IP is not in the same subnet"

Looks like Ralph's simpler solution fit the bill.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
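The check being debated amounts to something like the following toy sketch. The `assume_routable` flag stands in for the proposed MCA param, which does not exist under that name; the real decision is made inside the oob/tcp component:

```python
import ipaddress

def peer_usable(local_ip, prefix_len, peer_ip, assume_routable=False):
    """Default behaviour: only use an interface for a peer whose address
    falls inside that interface's subnet. The proposed param would skip
    the subnet test and trust IP routing to deliver the connection."""
    if assume_routable:
        return True
    subnet = ipaddress.ip_network("%s/%d" % (local_ip, prefix_len), strict=False)
    return ipaddress.ip_address(peer_ip) in subnet

# Two nodes whose eth0 interfaces sit on different /16 subnets:
# rejected by the subnet test, accepted with the proposed override.
assert not peer_usable("10.1.0.5", 16, "10.2.0.7")
assert peer_usable("10.1.0.5", 16, "10.2.0.7", assume_routable=True)
```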
Re: [OMPI devel] MPI_Comm_spawn affinity and coll/ml
I fixed the binding algorithm so it shifts the location to be more of what you expected. However, we still won't bind the final spawn if there aren't enough free cores to support those procs.

On Jun 5, 2014, at 7:12 AM, Hjelm, Nathan T wrote:

> Coll/ml does disqualify itself if processes are not bound. The problem here
> is there is an inconsistency between the two sides of the intercommunicator.
> I can write a quick fix for 1.8.2.
>
> -Nathan
>
> From: devel [devel-boun...@open-mpi.org] on behalf of Gilles Gouaillardet [gilles.gouaillar...@gmail.com]
> Sent: Thursday, June 05, 2014 1:20 AM
> To: Open MPI Developers
> Subject: [OMPI devel] MPI_Comm_spawn affinity and coll/ml
>
> Folks,
>
> on my single socket four cores VM (no batch manager), i am running the
> intercomm_create test from the ibm test suite.
>
> mpirun -np 1 ./intercomm_create
> => OK
>
> mpirun -np 2 ./intercomm_create
> => HANG :-(
>
> mpirun -np 2 --mca coll ^ml ./intercomm_create
> => OK
>
> basically, the first two tasks will call MPI_Comm_spawn(2 tasks) twice,
> followed by MPI_Intercomm_merge,
> and the 4 spawned tasks will call MPI_Intercomm_merge followed by
> MPI_Intercomm_create
>
> i dug a bit into that issue and found two distinct issues:
>
> 1) binding:
> tasks [0-1] (launched with mpirun) are bound on cores [0-1] => OK
> tasks [2-3] (first spawn) are bound on cores [0-1] => ODD, i would have
> expected [2-3]
> tasks [4-5] (second spawn) are not bound at all => ODD again, could have made
> sense if tasks [2-3] were bound on cores [2-3]
> i observe the same behaviour with the --oversubscribe mpirun parameter
>
> 2) coll/ml
> coll/ml hangs when -np 2 (total 6 tasks, including 2 unbound tasks)
> i suspect coll/ml is unable to handle unbound tasks.
> if i am correct, should coll/ml detect this and simply automatically
> disqualify itself?
>
> Cheers,
>
> Gilles
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14980.php
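Ralph's description of the fixed placement - shift each spawned job past the cores already claimed, and launch a job unbound when not enough free cores remain - can be sketched like this (illustrative Python only, not the actual ORTE mapping code):

```python
def bind_jobs(num_cores, job_sizes):
    """Assign each job the next free cores in order; a job that does not
    fit in the remaining free cores is left unbound (None)."""
    next_core = 0
    placements = []
    for nprocs in job_sizes:
        if next_core + nprocs <= num_cores:
            placements.append(list(range(next_core, next_core + nprocs)))
            next_core += nprocs
        else:
            placements.append(None)  # not enough free cores: run unbound
    return placements

# Gilles's 4-core VM: mpirun -np 2 plus two MPI_Comm_spawn(2) calls.
# With the shifted binding, the first spawn lands on cores [2-3] and the
# second spawn (no free cores left) stays unbound.
assert bind_jobs(4, [2, 2, 2]) == [[0, 1], [2, 3], None]
```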
Re: [OMPI devel] Intermittent hangs when exiting with error
Kewl - thanks!

On Jun 5, 2014, at 9:28 PM, Gilles Gouaillardet wrote:

> Ralph,
>
> sorry for my poor understanding ...
>
> i tried r31956 and it solved both issues:
> - MPI_Abort does not hang any more if nodes are on different eth0 subnets
> - MPI_Init does not hang any more if hosts have different number of IB ports
>
> this likely explains why you are having trouble replicating it ;-)
>
> Thanks a lot!
>
> Gilles
>
> [...]

_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14984.php
Re: [OMPI devel] Intermittent hangs when exiting with error
Ralph,

sorry for my poor understanding ...

i tried r31956 and it solved both issues:
- MPI_Abort does not hang any more if nodes are on different eth0 subnets
- MPI_Init does not hang any more if hosts have different number of IB ports

this likely explains why you are having trouble replicating it ;-)

Thanks a lot!

Gilles


On Fri, Jun 6, 2014 at 11:45 AM, Ralph Castain wrote:

> I keep explaining that we don't "discard" anything, but there really isn't
> any point to continuing trying to explain the system. With the announced
> intention of completing the move of the BTLs to OPAL, I no longer need the
> multi-module complexity in the OOB/TCP. So I have removed it and gone back
> to the single module that connects to everything.
>
> Try r31956 - hopefully will resolve your connectivity issues.
>
> Still looking at the MPI_Abort hang as I'm having trouble replicating it.
>
>
> On Jun 5, 2014, at 7:16 PM, Gilles Gouaillardet <gilles.gouaillar...@iferc.org> wrote:
>
> > Jeff,
> >
> > as pointed by Ralph, i do wish using eth0 for oob messages.
> >
> > i work on a 4k+ nodes cluster with a very decent gigabit ethernet
> > network (reasonable oversubscription + switches
> > from a reputable vendor you are familiar with ;-) )
> > my experience is that IPoIB can be very slow at establishing a
> > connection, especially if the arp table is not populated
> > (as far as i understand, this involves the subnet manager and
> > performance can be very random especially if all nodes issue
> > arp requests at the same time)
> > on the other hand, performance is much more stable when using the
> > subnetted IP network.
> >
> > as Ralph also pointed out, i can imagine some architects neglect their
> > ethernet network (e.g. highly oversubscribed + low end switches)
> > and in this case ib0 is a best fit for oob messages.
> >
> >> As a simple solution, there could be a TCP oob MCA param that says
> >> "regardless of peer IP address, I can connect to them" (i.e., assume IP
> >> routing will make everything work out ok).
> > +1 and/or an option to tell oob mca "do not discard the interface simply
> > because the peer IP is not in the same subnet"
> >
> > Cheers,
> >
> > Gilles
> >
> > On 2014/06/05 23:01, Ralph Castain wrote:
> >> Because Gilles wants to avoid using IB for TCP messages, and using eth0
> >> also solves the problem (the messages just route)
> >>
> >> On Jun 5, 2014, at 5:00 AM, Jeff Squyres (jsquyres) wrote:
> >>
> >>> Another random thought for Gilles situation: why not oob-TCP-if-include
> >>> ib0? (And not eth0)
> >>>
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14982.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14983.php