Try manually specifying the collective component: "-mca coll tuned".
You seem to be using the "sync" collective component; are there any stale
MCA parameter files lying around?
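
For example, something along these lines (untested, and I've added basic
and self to the component list as well, since tuned alone may not provide
every collective):

  mpirun -np 64 -machinefile voltairenodes \
      -mca coll tuned,basic,self -mca btl sm,self,openib \
      imb/src-1.4.2/IMB-MPI1 gather -npmin 64

To check for stale parameter files, the usual locations (assuming a
default install layout) are:

  $HOME/.openmpi/mca-params.conf
  <install prefix>/etc/openmpi-mca-params.conf

"ompi_info --param coll all" should also dump the coll parameters and,
IIRC, show where each value came from.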

--Nysal

On Tue, Jan 11, 2011 at 6:28 PM, Doron Shoham <doron.o...@gmail.com> wrote:

> Hi
>
> All machines in the setup are iDataPlex nodes with Nehalem processors, 12
> cores per node and 24GB of memory.
>
>
>
> *Problem 1 - OMPI 1.4.3 hangs in gather:*
>
>
>
> I'm trying to run the IMB gather benchmark with OMPI 1.4.3 (vanilla).
>
> The hang happens when np >= 64 and the message size exceeds 4k:
>
> mpirun -np 64 -machinefile voltairenodes -mca btl sm,self,openib
> imb/src-1.4.2/IMB-MPI1 gather -npmin 64
>
>
>
> voltairenodes consists of 64 machines.
>
>
>
> #----------------------------------------------------------------
> # Benchmarking Gather
> # #processes = 64
> #----------------------------------------------------------------
>        #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>             0         1000         0.02         0.02         0.02
>             1          331        14.02        14.16        14.09
>             2          331        12.87        13.08        12.93
>             4          331        14.29        14.43        14.34
>             8          331        16.03        16.20        16.11
>            16          331        17.54        17.74        17.64
>            32          331        20.49        20.62        20.53
>            64          331        23.57        23.84        23.70
>           128          331        28.02        28.35        28.18
>           256          331        34.78        34.88        34.80
>           512          331        46.34        46.91        46.60
>          1024          331        63.96        64.71        64.33
>          2048          331       460.67       465.74       463.18
>          4096          331       637.33       643.99       640.75
>
> This is the padb output:
>
> padb -A -x -Ormgr=mpirun -tree:
>
> Warning, remote process state differs across ranks
> state : ranks
> R (running) : [1,3-6,8,10-13,16-20,23-28,30-32,34-42,44-45,47-49,51-53,56-59,61-63]
> S (sleeping) : [0,2,7,9,14-15,21-22,29,33,43,46,50,54-55,60]
>
> Stack trace(s) for thread: 1
> -----------------
> [0-63] (64 processes)
> -----------------
> main() at ?:?
>   IMB_init_buffers_iter() at ?:?
>     IMB_gather() at ?:?
>       PMPI_Gather() at pgather.c:175
>         mca_coll_sync_gather() at coll_sync_gather.c:46
>           ompi_coll_tuned_gather_intra_dec_fixed() at coll_tuned_decision_fixed.c:714
>             -----------------
>             [0,3-63] (62 processes)
>             -----------------
>             ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:248
>               mca_pml_ob1_recv() at pml_ob1_irecv.c:104
>                 ompi_request_wait_completion() at ../../../../ompi/request/request.h:375
>                   opal_condition_wait() at ../../../../opal/threads/condition.h:99
>             -----------------
>             [1] (1 processes)
>             -----------------
>             ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:302
>               mca_pml_ob1_send() at pml_ob1_isend.c:125
>                 ompi_request_wait_completion() at ../../../../ompi/request/request.h:375
>                   opal_condition_wait() at ../../../../opal/threads/condition.h:99
>             -----------------
>             [2] (1 processes)
>             -----------------
>             ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:315
>               ompi_request_default_wait() at request/req_wait.c:37
>                 ompi_request_wait_completion() at ../ompi/request/request.h:375
>                   opal_condition_wait() at ../opal/threads/condition.h:99
>
> Stack trace(s) for thread: 2
> -----------------
> [0-63] (64 processes)
> -----------------
> start_thread() at ?:?
>   btl_openib_async_thread() at btl_openib_async.c:344
>     poll() at ?:?
>
> Stack trace(s) for thread: 3
> -----------------
> [0-63] (64 processes)
> -----------------
> start_thread() at ?:?
>   service_thread_start() at btl_openib_fd.c:427
>     select() at ?:?
>
> When running padb again after a couple of minutes, I can see that the
> number of processes at each position stays the same, but different
> processes are at different positions.
>
> For example, this is the diff between two padb outputs:
>
> Warning, remote process state differs across ranks
> state : ranks
> -R (running) : [0,2-4,6-13,16-18,20-21,28-31,33-36,38-56,58,60,62-63]
> -S (sleeping) : [1,5,14-15,19,22-27,32,37,57,59,61]
> +R (running) : [2,5-14,16-23,25,28-40,42-48,50-51,53-58,61,63]
> +S (sleeping) : [0-1,3-4,15,24,26-27,41,49,52,59-60,62]
> Stack trace(s) for thread: 1
> -----------------
> [0-63] (64 processes)
> @@ -13,21 +13,21 @@
> mca_coll_sync_gather() at coll_sync_gather.c:46
> ompi_coll_tuned_gather_intra_dec_fixed() at coll_tuned_decision_fixed.c:714
> -----------------
> - [0,3-63] (62 processes)
> + [0-5,8-63] (62 processes)
> -----------------
> ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:248
> mca_pml_ob1_recv() at pml_ob1_irecv.c:104
> ompi_request_wait_completion() at ../../../../ompi/request/request.h:375
> opal_condition_wait() at ../../../../opal/threads/condition.h:99
> -----------------
> - [1] (1 processes)
> + [6] (1 processes)
> -----------------
> ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:302
> mca_pml_ob1_send() at pml_ob1_isend.c:125
> ompi_request_wait_completion() at ../../../../ompi/request/request.h:375
> opal_condition_wait() at ../../../../opal/threads/condition.h:99
> -----------------
> - [2] (1 processes)
> + [7] (1 processes)
> -----------------
> ompi_coll_tuned_gather_intra_linear_sync() at coll_tuned_gather.c:315
> ompi_request_default_wait() at request/req_wait.c:37
>
> *Choosing a different gather algorithm seems to bypass the hang.*
>
> I used the following MCA parameters:
>
> --mca coll_tuned_use_dynamic_rules 1
>
> --mca coll_tuned_gather_algorithm 1
>
> Actually, both dec_fixed and basic_linear work, while binomial and
> linear_sync don't.
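>
> For reference, the full workaround command line is (roughly) the same
> gather run as above, just with those two parameters added:
>
> mpirun -np 64 -machinefile voltairenodes -mca btl sm,self,openib \
>     -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_gather_algorithm 1 \
>     imb/src-1.4.2/IMB-MPI1 gather -npmin 64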
>
> With OMPI 1.5 it doesn't hang (with any of the gather algorithms) and it
> is much faster (the number of repetitions is much higher):
>
> #----------------------------------------------------------------
> # Benchmarking Gather
> # #processes = 64
> #----------------------------------------------------------------
>        #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>             0         1000         0.02         0.03         0.02
>             1         1000        18.50        18.55        18.53
>             2         1000        18.17        18.25        18.22
>             4         1000        19.04        19.10        19.07
>             8         1000        19.60        19.67        19.64
>            16         1000        21.39        21.47        21.43
>            32         1000        24.83        24.91        24.87
>            64         1000        27.35        27.45        27.40
>           128         1000        33.23        33.34        33.29
>           256         1000        41.24        41.39        41.32
>           512         1000        52.62        52.81        52.71
>          1024         1000        73.20        73.46        73.32
>          2048         1000       416.36       418.04       417.22
>          4096         1000       638.54       640.70       639.65
>          8192         1000       506.26       506.97       506.63
>         16384         1000       600.63       601.40       601.02
>         32768         1000       639.52       640.34       639.93
>         65536          640       914.22       916.02       915.13
>        131072          320      2287.37      2295.18      2291.35
>        262144          160      4041.36      4070.58      4056.27
>        524288           80      7292.35      7463.27      7397.14
>       1048576           40     13647.15     14107.15     13905.29
>       2097152           20     30625.00     32635.45     31815.36
>       4194304           10     63543.01     70987.49     68680.48
>
> *Problem 2 - segmentation fault with OMPI 1.4.3/1.5 and IMB gather, np=768:*
>
> When trying to run the same command but with np=768, I get a segmentation
> fault:
>
> openmpi-1.4.3/bin/mpirun -np 768 -machinefile voltairenodes -mca btl
> sm,self,openib -mca coll_tuned_use_dynamic_rules 1 -mca
> coll_tuned_gather_algorithm 1 imb/src/IMB-MPI1 gather -npmin 768 -mem 1.6
>
>
>
> This happens with both OMPI 1.4.3 and 1.5.
>
>
>
> [compa163:20249] *** Process received signal ***
> [compa163:20249] Signal: Segmentation fault (11)
> [compa163:20249] Signal code: Address not mapped (1)
> [compa163:20249] Failing at address: 0x2aab4a204000
> [compa163:20249] [ 0] /lib64/libpthread.so.0 [0x366aa0e7c0]
> [compa163:20249] [ 1] /gpfs/asrc/home/voltaire/install//openmpi-1.4.3/lib/libmpi.so.0(ompi_convertor_unpack+0x15f) [0x2b077882282e]
> [compa163:20249] [ 2] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_pml_ob1.so [0x2b077b9e1672]
> [compa163:20249] [ 3] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_pml_ob1.so [0x2b077b9dd0b6]
> [compa163:20249] [ 4] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_btl_sm.so [0x2b077c459d87]
> [compa163:20249] [ 5] /gpfs/asrc/home/voltaire/install//openmpi-1.4.3/lib/libopen-pal.so.0(opal_progress+0xbe) [0x2b0778d845b8]
> [compa163:20249] [ 6] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_pml_ob1.so [0x2b077b9d6d62]
> [compa163:20249] [ 7] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_pml_ob1.so [0x2b077b9d6ba7]
> [compa163:20249] [ 8] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_pml_ob1.so [0x2b077b9d6a90]
> [compa163:20249] [ 9] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_coll_tuned.so [0x2b077d298dc5]
> [compa163:20249] [10] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_coll_tuned.so [0x2b077d2990d3]
> [compa163:20249] [11] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_coll_tuned.so [0x2b077d286e9b]
> [compa163:20249] [12] /gpfs/asrc/home/voltaire/install/openmpi-1.4.3/lib/openmpi/mca_coll_sync.so [0x2b077d07e96c]
> [compa163:20249] [13] /gpfs/asrc/home/voltaire/install//openmpi-1.4.3/lib/libmpi.so.0(PMPI_Gather+0x55e) [0x2b077883ec9a]
> [compa163:20249] [14] imb/src/IMB-MPI1(IMB_gather+0xe8) [0x40a088]
> [compa163:20249] [15] imb/src/IMB-MPI1(IMB_init_buffers_iter+0x28a) [0x405baa]
> [compa163:20249] [16] imb/src/IMB-MPI1(main+0x30f) [0x40362f]
> [compa163:20249] [17] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3669e1d994]
> [compa163:20249] [18] imb/src/IMB-MPI1 [0x403269]
> [compa163:20249] *** End of error message ***
>
>
> Any ideas? More debugging tips?
>
> Thanks,
> Doron
>
