Re: [OMPI users] Multi-threaded MPI communication

2017-09-21 Thread Jeff Hammond
Can you pad data so you can use MPI_Gather instead? It's possible that Gatherv doesn't use recursive doubling. Or you can implement your own aggregation tree to work around the in-cast problem. Jeff On Thu, Sep 21, 2017 at 2:01 PM saiyedul islam wrote: > Hi all, > > I am working on paralleliza
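A minimal sketch of the padding idea from the reply above, assuming each rank contributes a variable number of doubles. The helper names (max_count, pad_block, unpack_slots) are hypothetical and not part of Open MPI. The sketch only shows the buffer bookkeeping around the collective, not the MPI calls themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: largest per-rank element count. Every rank pads
 * its block to this size so a fixed-count MPI_Gather can be used. */
size_t max_count(const size_t *counts, size_t nranks)
{
    size_t m = 0;
    for (size_t r = 0; r < nranks; ++r)
        if (counts[r] > m)
            m = counts[r];
    return m;
}

/* Sender side: copy n real elements into a slot of `slot` elements and
 * zero-fill the remainder. */
void pad_block(const double *src, size_t n, double *dst, size_t slot)
{
    assert(n <= slot);
    if (n > 0)
        memcpy(dst, src, n * sizeof *dst);
    for (size_t i = n; i < slot; ++i)
        dst[i] = 0.0;
}

/* Root side: `gathered` holds nranks slots of `slot` elements each, in
 * rank order. Copy only the real elements into `out`, contiguously, and
 * return how many were written. */
size_t unpack_slots(const double *gathered, const size_t *counts,
                    size_t nranks, size_t slot, double *out)
{
    size_t total = 0;
    for (size_t r = 0; r < nranks; ++r) {
        assert(counts[r] <= slot);
        if (counts[r] > 0)
            memcpy(out + total, gathered + r * slot,
                   counts[r] * sizeof *out);
        total += counts[r];
    }
    return total;
}
```

In an actual MPI program, the slot size could be agreed on with MPI_Allreduce using MPI_MAX, and the padded blocks collected with MPI_Gather using the same count on every rank. The root also needs the per-rank counts, for example from a separate MPI_Gather of one integer per rank. Padding sends extra bytes, so this is likely to pay off only when the per-rank sizes are close to each other.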

Re: [OMPI users] Multi-threaded MPI communication

2017-09-21 Thread saiyedul islam
Thank you very much for the reply, Sir. Yes, I can observe this pattern by eyeballing the network communication graph of my cluster (through the Ganglia cluster monitor: http://ganglia.sourceforge.net/). During this loop execution, the master is receiving data at ~100 MB/sec (of a theoretical 125 MB/sec in Gi

Re: [OMPI users] Multi-threaded MPI communication

2017-09-21 Thread George Bosilca
All your processes send their data to a single destination at the same time. Clearly you are reaching the capacity of your network, and your data transfers will be bound by this. This is a physical constraint that you can only overcome by adding network capacity to your cluster. At the software level

[OMPI users] Multi-threaded MPI communication

2017-09-21 Thread saiyedul islam
Hi all, I am working on the parallelization of a data clustering algorithm in which I am following the MPMD pattern of MPI (i.e. 1 master process and p slave processes in the same communicator). It is an iterative algorithm where the two loops inside each iteration are parallelized separately. The first loop is paralle

Re: [OMPI users] multi-threaded MPI

2007-11-08 Thread Brian Budge
Sorry for the noise. I found MPI_Init_thread and installed 1.2.4. Seems to be fine now! Thanks for the great work on the multi-threaded MPI codes! Brian On Nov 7, 2007 8:04 PM, Brian Budge wrote: > Hi All - > > I am working on a networked cache for an out-of-core application, and > currently

[OMPI users] multi-threaded MPI

2007-11-07 Thread Brian Budge
Hi All - I am working on a networked cache for an out-of-core application, and currently I have it set up where I have several worker threads, and one "request" thread per node. The worker threads check the cache on their own node first, and if there's a miss, they make a request to the other nod