Re: [OMPI users] MPI_Bcast performance doesn't improve after enabling tree implementation

2017-10-17 Thread Gilles Gouaillardet
If you use the rsh tree spawn mechanism, then yes, any node must be able to SSH password-less to any other node. This is only used to spawn one orted per node. When the number of nodes is large, a tree spawn is faster and avoids having all the SSH connections issued and maintained from the node ru

Re: [OMPI users] MPI_Bcast performance doesn't improve after enabling tree implementation

2017-10-17 Thread Konstantinos Konstantinidis
Thanks for clarifying that, Gilles. Now I have seen that omitting "-mca plm_rsh_no_tree_spawn 1" requires establishing passwordless SSH among the machines, but this is not required for setting "--mca coll_tuned_bcast_algo". Is this correct or am I missing something? Also, among all possible broadca

Re: [OMPI users] MPI_Bcast performance doesn't improve after enabling tree implementation

2017-10-17 Thread Gilles Gouaillardet
Konstantinos, I am afraid there is some confusion here. The plm_rsh_no_tree_spawn option is only used at startup time (e.g., when remotely launching one orted daemon on every node but the one running mpirun). It has zero impact on the performance of MPI communications such as MPI_Bcast(). The coll/t

[OMPI users] MPI_Bcast performance doesn't improve after enabling tree implementation

2017-10-16 Thread Konstantinos Konstantinidis
I have implemented some algorithms in C++ whose performance is greatly affected by the shuffling time among nodes, which is done by some broadcast calls. Up to now, I have been testing them by running something like mpirun -mca btl ^openib -mca plm_rsh_no_tree_spawn 1 ./my_test which I think makes MPI_Bcast wo
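A minimal C sketch of the kind of microbenchmark that separates MPI_Bcast performance from the startup question above. The message size and iteration count are arbitrary assumptions; the same binary can be rerun under different coll_tuned settings (see the coll_tuned_use_dynamic_rules and coll_tuned_bcast_algorithm parameters discussed further down this archive):

    /* bcast_bench.c -- time repeated MPI_Bcast calls (rough sketch, not a rigorous benchmark) */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int count = 1 << 20;   /* 1 Mi doubles (~8 MB); adjust to your shuffle size */
        const int iters = 50;
        double *buf = malloc(count * sizeof(double));

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++)
            MPI_Bcast(buf, count, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Barrier(MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("%d ranks, %d doubles: %.3f ms per MPI_Bcast\n",
                   size, count, 1000.0 * (t1 - t0) / iters);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Timing the collective directly like this makes it clear whether a change in mpirun options affected the broadcast itself or only the job launch.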

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-25 Thread George Bosilca
> On Apr 25, 2016, at 11:33 , Dave Love wrote: > > George Bosilca writes: > >> Dave, >> >> You are absolutely right, the parameters are now 6-7 years old, >> gathered on interconnects long gone. Moreover, several discussions in >> this mailing list indicated that they do not match current net

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-25 Thread Dave Love
George Bosilca writes: > Dave, > > You are absolutely right, the parameters are now 6-7 years old, > gathered on interconnects long gone. Moreover, several discussions in > this mailing list indicated that they do not match current network > capabilities. > > I have recently reshuffled the tuned

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-22 Thread George Bosilca
Dave, You are absolutely right, the parameters are now 6-7 years old, gathered on interconnects long gone. Moreover, several discussions in this mailing list indicated that they do not match current network capabilities. I have recently reshuffled the tuned module to move all the algorithms in

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-22 Thread Dave Love
George Bosilca writes: > Matthieu, > > If you are talking about how Open MPI selects between different broadcast > algorithms you might want to read [1]. We have implemented a dozen > different broadcast algorithms and have run a set of tests to measure their > performance. I'd been meaning to

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-19 Thread George Bosilca
- > *From:* users [users-boun...@open-mpi.org] on behalf of George Bosilca [ > bosi...@icl.utk.edu] > *Sent:* Tuesday, April 19, 2016 2:03 PM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] MPI_Bcast implementations in OpenMPI > > Matthieu, > > If you are talkin

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-19 Thread Dorier, Matthieu
Users Subject: Re: [OMPI users] MPI_Bcast implementations in OpenMPI Matthieu, If you are talking about how Open MPI selects between different broadcast algorithms you might want to read [1]. We have implemented a dozen different broadcast algorithms and have run a set of tests to measure their

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-19 Thread George Bosilca
Matthieu, If you are talking about how Open MPI selects between different broadcast algorithms you might want to read [1]. We have implemented a dozen different broadcast algorithms and have run a set of tests to measure their performance. We then used a quad-tree classification algorithm to minim
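As a purely illustrative aside (not Open MPI's actual decision function, and with invented cutoffs), the run-time selection amounts to a rule mapping communicator size and message size to one of the implemented algorithms, along these lines:

    #include <stddef.h>

    enum bcast_alg { BCAST_BINOMIAL, BCAST_PIPELINE, BCAST_SCATTER_ALLGATHER };

    /* Toy decision rule in the spirit of the tuned component's fixed rules.
     * The cutoffs below are made up for this sketch. */
    static enum bcast_alg pick_bcast_alg(int comm_size, size_t msg_bytes)
    {
        if (msg_bytes < 4096)            /* small messages are latency-bound: use a tree */
            return BCAST_BINOMIAL;
        if (comm_size <= 8)              /* few ranks: a simple pipeline is enough */
            return BCAST_PIPELINE;
        return BCAST_SCATTER_ALLGATHER;  /* large messages on many ranks */
    }

The measured results described above are what turn such hand-picked cutoffs into empirically derived ones.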

Re: [OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-19 Thread Jeff Squyres (jsquyres)
On Apr 15, 2016, at 9:18 AM, Dorier, Matthieu wrote: > > I'd like to know how OpenMPI implements MPI_Bcast. And if different > implementations are provided, how one is selected. This is a fairly complicated topic. This old paper is the foundation for how Open MPI works (it's a bit different t

[OMPI users] MPI_Bcast implementations in OpenMPI

2016-04-15 Thread Dorier, Matthieu
Hi, I'd like to know how OpenMPI implements MPI_Bcast, and if different implementations are provided, how one is selected. Thanks, Matthieu Dorier

[OMPI users] mpi_bcast

2013-09-06 Thread Huangwei
Hello there, In my Fortran code, I used mpi_bcast to broadcast an array Q(21, 51, 14) (its size is 150,000,000) from the root to all the nodes. I found that when I use this bcast subroutine, the code becomes very slow and sometimes hangs. Once I commented out this array, the code speed

Re: [OMPI users] MPI_Bcast hanging after some amount of data transferred on Infiniband network

2013-07-26 Thread Jeff Squyres (jsquyres)
1.4.3 is fairly ancient. Can you upgrade to 1.6.5? On Jul 26, 2013, at 3:15 AM, Dusan Zoric wrote: > > I am running an application that performs some transformations of large matrices > on a 7-node cluster. Nodes are connected via QDR 40 Gbit Infiniband. Open MPI > 1.4.3 is installed on the syste

[OMPI users] MPI_Bcast hanging after some amount of data transferred on Infiniband network

2013-07-26 Thread Dusan Zoric
I am running an application that performs some transformations of large matrices on a 7-node cluster. Nodes are connected via QDR 40 Gbit Infiniband. Open MPI 1.4.3 is installed on the system. The given matrix transformation requires a large data exchange between nodes, in such a way that at each algorithm st

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-16 Thread Jeff Squyres
A few points to add to this discussion... 1. In the new (proposed) MPI-3 Fortran bindings (i.e., the "use mpi_f08" module), array subsections will be handled properly by MPI. However, we'll have to wait for the Fortran compilers to support F08 features first (i.e., both the MPI Forum and the F

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-15 Thread Patrick Begou
Thanks all for your converging points of view about my problem. Portability is also an important point for this code, so there is only one solution: using a user-defined data type. In my mind, this was more for C or C++ code, which lacks the Fortran subarray behavior, but I was wrong. The problem is

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-14 Thread Gustavo Correa
When it comes to intrinsic Fortran-90 functions, or to libraries provided by the compiler vendor [e.g. MKL in the case of Intel], I do agree that they *should* be able to parse the array-section notation and use the correct memory layout. However, for libraries that are not part of Fortran-90, s

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-14 Thread David Warren
Actually, subarray passing is part of the F90 standard (at least according to every document I can find), and not an Intel extension. So if it doesn't work you should complain to the compiler company. One of the reasons for using it is that the compiler should be optimized for whatever method

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-14 Thread Gustavo Correa
Hi Patrick, From my mere MPI and Fortran-90 user point of view, I think that the solution offered by the MPI standard [at least up to MPI-2] to address the problem of non-contiguous memory layouts is to use MPI user-defined types, as I pointed out in my previous email. I like this solution becaus

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-14 Thread Patrick Begou
Thanks all for your answers. Yes, I understand well that it is a non-contiguous memory access problem, as MPI_BCAST expects a pointer to a valid contiguous memory zone. But I'm surprised that with the MPI module usage Fortran does not hide this discontinuity in a contiguous temporary copy of the

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-12 Thread Edmund Sumbar
The interface to MPI_Bcast does not specify an assumed-shape-array dummy first argument. Consequently, as David points out, the compiler makes a contiguous temporary copy of the array section to pass to the routine. If using ifort, try the "-check arg_temp_created" compiler option to verify creation

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-12 Thread David Warren
What FORTRAN compiler are you using? This should not really be an issue with the MPI implementation, but with the FORTRAN compiler. This is legitimate usage in FORTRAN 90 and the compiler should deal with it. I do similar things using ifort and it creates temporary arrays when necessary and it all works

Re: [OMPI users] MPI_BCAST and fortran subarrays

2011-12-12 Thread Gustavo Correa
Hi Patrick I think tab(i,:) is not contiguous in memory, but has a stride of nbcpus. Since the MPI type you are passing is just the barebones MPI_INTEGER, MPI_BCAST expects the four integers to be contiguous in memory, I guess. The MPI calls don't have any idea of the Fortran90 memory layout, and
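A C sketch of the user-defined-datatype approach discussed in this thread. The original code is Fortran; the names nbcpus, n and i mirror it, and a column-major integer array is assumed. MPI_Type_vector describes the strided row so MPI_BCAST reads the right elements without a contiguous temporary copy:

    #include <mpi.h>

    /* Broadcast row i of a column-major nbcpus-by-n integer array, i.e. the
     * analogue of tab(i,:), without copying it into a contiguous buffer. */
    void bcast_row(int *tab, int nbcpus, int n, int i, int root, MPI_Comm comm)
    {
        MPI_Datatype row;
        MPI_Type_vector(n, 1, nbcpus, MPI_INT, &row);  /* n blocks of 1 int, stride nbcpus */
        MPI_Type_commit(&row);
        MPI_Bcast(&tab[i], 1, row, root, comm);        /* starts at element (i,0) */
        MPI_Type_free(&row);
    }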

[OMPI users] MPI_BCAST and fortran subarrays

2011-12-12 Thread Patrick Begou
I've got a strange problem with Fortran90 and an MPI_BCAST call in a large application. I've isolated the problem in this short program sample. With Fortran we can use subarrays in function calls. For example, passing a subarray to the "change" procedure: MODULE mymod IMPLICIT NONE CONTAINS S

Re: [OMPI users] MPI_Bcast vs. per worker MPI_Send?

2010-12-14 Thread Eugene Loh
David Mathog wrote: For the receive I do not see how to use a collective. Each worker sends back a data structure, and the structures are of varying size. This is almost always the case in Bioinformatics, where what is usually coming back from each worker is a count M of the number of signi
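For the varying-size receive, one standard collective pattern is to gather the per-worker counts first and then the payloads with MPI_Gatherv. A C sketch, with my_data and my_count standing in for whatever each worker actually produces:

    #include <mpi.h>
    #include <stdlib.h>

    /* Collect a variable-length int array from every rank onto rank 0. */
    void gather_results(int *my_data, int my_count, MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        int *counts = NULL, *displs = NULL, *all = NULL, total = 0;
        if (rank == 0) counts = malloc(size * sizeof(int));

        /* Step 1: everyone reports how much it will send. */
        MPI_Gather(&my_count, 1, MPI_INT, counts, 1, MPI_INT, 0, comm);

        if (rank == 0) {
            displs = malloc(size * sizeof(int));
            for (int r = 0; r < size; r++) { displs[r] = total; total += counts[r]; }
            all = malloc(total * sizeof(int));
        }

        /* Step 2: the variable-sized payloads land at the right offsets on rank 0. */
        MPI_Gatherv(my_data, my_count, MPI_INT,
                    all, counts, displs, MPI_INT, 0, comm);

        free(counts); free(displs); free(all);
    }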

Re: [OMPI users] MPI_Bcast vs. per worker MPI_Send?

2010-12-14 Thread David Mathog
So the 2/2 consensus is to use the collective. That is straightforward for the send part of this, since all workers are sent the same data. For the receive I do not see how to use a collective. Each worker sends back a data structure, and the structures are of varying size. This is almost al

Re: [OMPI users] MPI_Bcast vs. per worker MPI_Send?

2010-12-13 Thread David Zhang
Unless your cluster has some weird connection topology and you're trying to take advantage of that, a collective is the best bet. On Mon, Dec 13, 2010 at 4:26 PM, Eugene Loh wrote: > David Mathog wrote: > > Is there a rule of thumb for when it is best to contact N workers with >> MPI_Bcast vs. wh

Re: [OMPI users] MPI_Bcast vs. per worker MPI_Send?

2010-12-13 Thread Eugene Loh
David Mathog wrote: Is there a rule of thumb for when it is best to contact N workers with MPI_Bcast vs. when it is best to use a loop which cycles N times and moves the same information with MPI_Send to one worker at a time? The rule of thumb is to use a collective whenever you can. The ra

[OMPI users] MPI_Bcast vs. per worker MPI_Send?

2010-12-13 Thread David Mathog
Is there a rule of thumb for when it is best to contact N workers with MPI_Bcast vs. when it is best to use a loop which cycles N times and moves the same information with MPI_Send to one worker at a time? For that matter, other than the coding semantics, is there any real difference between the t
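For concreteness, a C sketch of the two alternatives being compared (buffer and count are placeholders). The collective is a single call whose internal algorithm the library chooses; the hand-rolled loop makes the root send N-1 times, so its cost grows linearly with the number of workers:

    #include <mpi.h>

    /* Alternative A: one collective call; the library picks the algorithm. */
    void distribute_bcast(double *buf, int count, MPI_Comm comm)
    {
        MPI_Bcast(buf, count, MPI_DOUBLE, 0, comm);
    }

    /* Alternative B: the loop of point-to-point sends from the question. */
    void distribute_sends(double *buf, int count, MPI_Comm comm)
    {
        int rank, nprocs;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &nprocs);
        if (rank == 0) {
            for (int dest = 1; dest < nprocs; dest++)
                MPI_Send(buf, count, MPI_DOUBLE, dest, 0, comm);
        } else {
            MPI_Recv(buf, count, MPI_DOUBLE, 0, 0, comm, MPI_STATUS_IGNORE);
        }
    }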

Re: [OMPI users] MPI_Bcast() Vs paired MPI_Send() & MPI_Recv()

2010-09-01 Thread David Zhang
MPI send and recv are blocking, while you can exit bcast even if other processes haven't received the bcast yet. A general rule of thumb is that MPI calls are optimized and almost always perform better than if you were to manage the communication yourself. On 9/1/10, ananda.mu...@wipro.com wrote: > Hi

[OMPI users] MPI_Bcast() Vs paired MPI_Send() & MPI_Recv()

2010-09-01 Thread ananda.mudar
Hi, If I replace MPI_Bcast() with paired MPI_Send() and MPI_Recv() calls, what kind of impact does it have on the performance of the program? Are there any benchmarks of MPI_Bcast() vs. paired MPI_Send() and MPI_Recv()? Thanks, Ananda

Re: [OMPI users] MPI_Bcast issue

2010-08-12 Thread Randolph Pullen
f Squyres wrote: From: Jeff Squyres Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Friday, 13 August, 2010, 3:03 AM Dick / all -- I just had a phone call with Ralph Castain who has had some additional off-list mails with Randolph. Apparently, none of us u

Re: [OMPI users] MPI_Bcast issue

2010-08-12 Thread Jeff Squyres

Re: [OMPI users] MPI_Bcast issue

2010-08-12 Thread Richard Treumann

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
Interesting point. --- On Thu, 12/8/10, Ashley Pittman wrote: From: Ashley Pittman Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Thursday, 12 August, 2010, 12:22 AM On 11 Aug 2010, at 05:10, Randolph Pullen wrote: > Sure, but broadcasts are faster -

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
question is why. --- On Wed, 11/8/10, Richard Treumann wrote: From: Richard Treumann Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Wednesday, 11 August, 2010, 11:34 PM Randolf I am confused about using multiple, concurrent mpirun operations. If there are

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Ashley Pittman
On 11 Aug 2010, at 05:10, Randolph Pullen wrote: > Sure, but broadcasts are faster - less reliable apparently, but much faster > for large clusters. Going off-topic here but I think it's worth saying: If you have a dataset that requires collective communication then use the function call that

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 11, 2010, at 12:10 AM, Randolph Pullen wrote: > Sure, but broadcasts are faster - less reliable apparently, but much faster > for large clusters. Just to be totally clear: MPI_BCAST is defined to be "reliable", in the sense that it will complete or invoke an error (vs. unreliable data

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 11, 2010, at 9:54 AM, Jeff Squyres wrote: > (I'll say that OMPI's ALLGATHER algorithm is probably not well optimized for > massive data transfers like you describe) Wrong wrong wrong -- I should have checked the code before sending. I made the incorrect assumption that OMPI still only h

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Jeff Squyres
On Aug 10, 2010, at 10:09 PM, Randolph Pullen wrote: > Jeff thanks for the clarification, > What I am trying to do is run N concurrent copies of a 1 to N data movement > program to affect an N to N solution. The actual mechanism I am using is to > spawn N copies of mpirun from PVM across the cl

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Richard Treumann
Randolf I am confused about using multiple, concurrent mpirun operations. If there are M uses of mpirun and each starts N tasks (carried out under pvm or any other way) I would expect you to have M completely independent MPI jobs with N tasks (processes) each. You could have some root in eac

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Randolph Pullen
wrote: From: Terry Frankcombe Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Wednesday, 11 August, 2010, 1:57 PM On Tue, 2010-08-10 at 19:09 -0700, Randolph Pullen wrote: > Jeff thanks for the clarification, > What I am trying to do is run N concurre

Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Terry Frankcombe
On Tue, 2010-08-10 at 19:09 -0700, Randolph Pullen wrote: > Jeff thanks for the clarification, > What I am trying to do is run N concurrent copies of a 1 to N data > movement program to affect an N to N solution. I'm no MPI guru, nor do I completely understand what you are doing, but isn't this an

Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Randolph Pullen
rom: Jeff Squyres Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Wednesday, 11 August, 2010, 6:24 AM +1 on Eugene's comment that I don't fully understand what you are trying to do. Can you send a short example code? Some random points: - Edgar alre

Re: [OMPI users] MPI_Bcast issue

2010-08-10 Thread Jeff Squyres
rote: > The install was completely vanilla - no extras, a plain .configure command line > (on FC10 x86_64 Linux) > > Are you saying that all broadcast calls are actually implemented as serial > point to point calls? > > > --- On Tue, 10/8/10, Ralph Castain wrote: >

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Randolph Pullen
st is implemented with multicast calls but does it use any actual broadcast calls at all? I know I'm scraping the edges here looking for something but I just can't get my head around why it should fail where it has. --- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Su

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Ralph Castain
presume that bcast is implemented with multicast calls but does it use any > actual broadcast calls at all? > I know I'm scraping the edges here looking for something but I just can't get > my head around why it should fail where it has. > > --- On Mon, 9/8/10, Ralph Castain

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Eugene Loh

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Richard Treumann
I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n n

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Richard Treumann
I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I requi

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Edgar Gabriel
o or more copies are run at [exactly] the > same time. > > Has anyone else seen similar behavior in concurrently running > programs that perform lots of broadcasts perhaps? > > Randolph > > > --- On Sun, 8/8/10, David Zhang wrote: > > From: David Zhang Subject: R

Re: [OMPI users] MPI_Bcast issue

2010-08-09 Thread Randolph Pullen
why it should fail where it has. --- On Mon, 9/8/10, Ralph Castain wrote: From: Ralph Castain Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users" Received: Monday, 9 August, 2010, 1:32 PM Hi Randolph Unless your code is doing a connect/accept between the copies, there

Re: [OMPI users] MPI_Bcast issue

2010-08-08 Thread Ralph Castain
ram waits on broadcast reception forever when two or > more copies are run at [exactly] the same time. > > Has anyone else seen similar behavior in concurrently running programs that > perform lots of broadcasts perhaps? > > Randolph > > > --- On Sun, 8/8/10, David

Re: [OMPI users] MPI_Bcast issue

2010-08-08 Thread Randolph Pullen
copies are run at [exactly] the same time. Has anyone else seen similar behavior in concurrently running programs that perform lots of broadcasts perhaps? Randolph --- On Sun, 8/8/10, David Zhang wrote: From: David Zhang Subject: Re: [OMPI users] MPI_Bcast issue To: "Open MPI Users"

Re: [OMPI users] MPI_Bcast issue

2010-08-07 Thread David Zhang
In particular, intercommunicators On 8/7/10, Aurélien Bouteiller wrote: > You should consider reading about communicators in MPI. > > Aurelien > -- > Aurelien Bouteiller, Ph.D. > Innovative Computing Laboratory, The University of Tennessee. > > Envoyé de mon iPad > > Le Aug 7, 2010 à 1:05, Randol

Re: [OMPI users] MPI_Bcast issue

2010-08-07 Thread Aurélien Bouteiller
You should consider reading about communicators in MPI. Aurelien -- Aurelien Bouteiller, Ph.D. Innovative Computing Laboratory, The University of Tennessee. Sent from my iPad On Aug 7, 2010, at 1:05, Randolph Pullen wrote: > I seem to be having a problem with MPI_Bcast. > My massive I/O int

[OMPI users] MPI_Bcast issue

2010-08-07 Thread Randolph Pullen
I seem to be having a problem with MPI_Bcast. My massive I/O intensive data movement program must broadcast from n to n nodes. My problem starts because I require 2 processes per node, a sender and a receiver and I have implemented these using MPI processes rather than tackle the complexities of
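The replies in this thread point toward communicators and intercommunicators. A hedged C sketch of the general idea, assuming purely for illustration that senders and receivers are distinguished by even and odd ranks, so each role broadcasts only within its own communicator:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        int role = rank % 2;                 /* 0 = sender, 1 = receiver (assumption) */
        MPI_Comm role_comm;
        MPI_Comm_split(MPI_COMM_WORLD, role, rank, &role_comm);

        double payload = 0.0;
        /* Each group broadcasts among itself, rooted at its own rank 0,
         * so the two roles never share a broadcast context. */
        MPI_Bcast(&payload, 1, MPI_DOUBLE, 0, role_comm);

        MPI_Comm_free(&role_comm);
        MPI_Finalize();
        return 0;
    }

Note, however, that separately launched mpirun jobs are independent MPI worlds; a communicator only spans processes that were started or connected together, which is the point made elsewhere in this thread.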

[OMPI users] MPI_Bcast hangs on with multiple nodes

2010-01-29 Thread Paul Wolfgang
I have just created a small cluster consisting of three nodes: bellhuey AMD 64 with 4 cores wolf1 AMD 64 with 2 cores wolf2 AMD 64 with 2 cores The host file is: bellhuey slots=4 wolf1 slots=2 wolf2 slots=2 bellhuey is the master and wolf1 and wolf2 share the /usr and /home file

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-24 Thread shan axida
, April 24, 2009 2:16:22 PM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI Right. So, baseline performance seems reasonable, but there is an odd spike that seems difficult to explain. This is annoying, but again: how important is it to resolve that mystery? You can spend a few days trying to

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-24 Thread Eugene Loh
Second cluster has almost the same features as the previous one. From: Eugene Loh To: Open MPI Users Sent: Friday, April 24, 2009 1:26:14 AM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI So, the remaining mystery is the 6x or so spike at 128 MByte. Dunno. How important is it to resolve that mystery?

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread shan axida
, CentOS 4.6. Second cluster: 2.8 GHz Intel Xeon, 3 GB memory, Fedora Core 5. OpenMPI 1.3 is used in both clusters. From: Eugene Loh To: Open MPI Users Sent: Friday, April 24, 2009 1:26:14 AM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI Okay. So, going back to

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread Eugene Loh
ble) but 131072 KB. It means around 128 MB. From: Jeff Squyres To: Open MPI Users Sent: Thursday, April 23, 2009 8:23:52 PM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI Very strange; 6 seconds for a 1MB broadcast over 64 processes is *way* too long. Even 2.5 sec at 2MB seems too long

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread shan axida
Sorry, I had a mistake in calculation. Not 131072 (double) but 131072 KB. It means around 128 MB. From: Jeff Squyres To: Open MPI Users Sent: Thursday, April 23, 2009 8:23:52 PM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI Very strange; 6 seconds for a

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread shan axida
To: Open MPI Users Sent: Thursday, April 23, 2009 8:23:52 PM Subject: Re: [OMPI users] MPI_Bcast from OpenMPI Very strange; 6 seconds for a 1MB broadcast over 64 processes is *way* too long. Even 2.5 sec at 2MB seems too long -- what is your network speed? I'm not entirely sure what you me

Re: [OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread Jeff Squyres
Very strange; 6 seconds for a 1MB broadcast over 64 processes is *way* too long. Even 2.5 sec at 2MB seems too long -- what is your network speed? I'm not entirely sure what you mean by "4 link" on your graph. Without more information, I would first check your hardware setup to see if the

[OMPI users] MPI_Bcast from OpenMPI

2009-04-23 Thread shan axida
Hi, One more question: I have executed MPI_Bcast() with 64 processes on a 16-node Ethernet cluster with multiple links. The result is shown in the file attached to this e-mail. What is going on at the 131072-double message size? I have executed it many times but the result is still the same. THANK YOU!

Re: [OMPI users] MPI_BCast problem on multiple networks.

2008-07-31 Thread David Robson
Sorry I should have given the version number. I'm running openmpi-1.2.4 on Fedora Core 6 Dave Adrian Knoth wrote: On Thu, Jul 31, 2008 at 03:26:09PM +0100, David Robson wrote: It also works if I disable the private interface. Otherwise there are no network problems. I can ping any host

Re: [OMPI users] MPI_BCast problem on multiple networks.

2008-07-31 Thread Adrian Knoth
On Thu, Jul 31, 2008 at 03:26:09PM +0100, David Robson wrote: > It also works if I disable the private interface. Otherwise there > are no network problems. I can ping any host from any other. > openmpi programs without MPI_BCast work OK. Weird. > Has anyone seen anything like this, or have any i

[OMPI users] MPI_BCast problem on multiple networks.

2008-07-31 Thread David Robson
Dear OpenMPI users I have a problem with openmpi codes hanging in MPI_BCast ... All our nodes are connected to one LAN. However, half of them also have an interface to a second private LAN. If the first openMPI process of a job starts on one of the dual-homed nodes, and a second process fr

Re: [OMPI users] MPI_Bcast not broadcast to all processes

2007-12-07 Thread alireza ghahremanian
Dear Jeff, I want to send an integer vector of size 4000. It is a very confusing problem. --- Jeff Squyres wrote: > If you're seeing the same error from 2 entirely > different MPI > implementations, it is possible that it is an error > in your code. > > Ensure that all processes are calling MP

Re: [OMPI users] MPI_Bcast not broadcast to all processes

2007-12-05 Thread Jeff Squyres
If you're seeing the same error from 2 entirely different MPI implementations, it is possible that it is an error in your code. Ensure that all processes are calling MPI_Bcast with the same arguments (e.g., count, datatype, root, etc.), even on that 4000th iteration. How big are the block
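A C sketch of the consistency being described: every rank must issue the same MPI_Bcast with matching count, datatype and root, even on the last iteration. Broadcasting the (possibly data-dependent) size first is one way to guarantee that; the names below are placeholders:

    #include <mpi.h>
    #include <stdlib.h>

    /* Root knows *n and *block; the other ranks learn the count first,
     * then every rank makes an identical MPI_Bcast call for the data. */
    void bcast_block(int **block, int *n, int root, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);

        MPI_Bcast(n, 1, MPI_INT, root, comm);        /* agree on the count */
        if (rank != root)
            *block = malloc((size_t)(*n) * sizeof(int));
        MPI_Bcast(*block, *n, MPI_INT, root, comm);  /* same count/type/root everywhere */
    }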

[OMPI users] MPI_Bcast not broadcast to all processes

2007-12-05 Thread alireza ghahremanian
Dear Friends, I am writing a matrix multiplication program with MPI. MPI_Bcast does not broadcast to all processes in the last iteration when the block size is greater than a specific size. I tested it with both MPICH and OpenMPI. I have 12 processes, 7 of which reach MPI_Bcast, but when the master (ran

Re: [OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-30 Thread Doug Gregor
On Jun 29, 2006, at 11:16 PM, Graham E Fagg wrote: On Thu, 29 Jun 2006, Doug Gregor wrote: When I use algorithm 6, I get: [odin003.cs.indiana.edu:14174] *** An error occurred in MPI_Bcast [odin005.cs.indiana.edu:10510] *** An error occurred in MPI_Bcast Broadcasting integers from root 0...[od

Re: [OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-29 Thread Graham E Fagg
On Thu, 29 Jun 2006, Doug Gregor wrote: When I use algorithm 6, I get: [odin003.cs.indiana.edu:14174] *** An error occurred in MPI_Bcast [odin005.cs.indiana.edu:10510] *** An error occurred in MPI_Bcast Broadcasting integers from root 0...[odin004.cs.indiana.edu:11752] *** An error occurred in

Re: [OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-29 Thread Graham E Fagg
On Thu, 29 Jun 2006, Doug Gregor wrote: Are there other settings I can tweak to try to find the algorithm that it's deciding to use at run-time? Yes, just: -mca coll_base_verbose 1 will show what's being decided at run time, i.e. [reliant:25351] ompi_coll_tuned_bcast_intra_dec_fixed [reliant:25

Re: [OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-29 Thread Doug Gregor
On Jun 29, 2006, at 5:23 PM, Graham E Fagg wrote: Hi Doug wow, looks like some messages are getting lost (or even delivered to the wrong peer on the same node.. ) Could you also try with: -mca coll_base_verbose 1 -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_bcast_algorithm <1,2,3,

Re: [OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-29 Thread Graham E Fagg
Hi Doug, wow, looks like some messages are getting lost (or even delivered to the wrong peer on the same node..) Could you also try with: -mca coll_base_verbose 1 -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_bcast_algorithm <1,2,3,4,5,6> The values 1-6 control which topology/algorithm

[OMPI users] MPI_Bcast/MPI_Finalize hang with Open MPI 1.1

2006-06-29 Thread Doug Gregor
I am running into a problem with a simple program (which performs several MPI_Bcast operations) hanging. Most processes hang in MPI_Finalize, the others hang in MPI_Bcast. Interestingly enough, this only happens when I oversubscribe the nodes. For instance, using IU's Odin cluster, I take 4