[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM has retired

2011-01-01 Thread Richard Treumann
I am out of the office until 03/31/2011. I will be out of the office from now on - I will not see email or check phone messages. Good luck to you all. Contact my former team leader: Charles J. Archer, E-mail: arch...@us.ibm.com, Phone: 553-0346 / 1-507-253-0346, OR my former manager: Carl F.

Re: [OMPI users] Method for worker to determine its "rank" on a single machine?

2010-12-10 Thread Richard Treumann
It seems to me the MPI_Get_processor_name description is too ambiguous to make this 100% portable. I assume most MPI implementations simply use the hostname so all processes on the same host will return the same string. The suggestion would work then. However, it would also be reasonable for
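
A minimal sketch of the hostname-comparison approach discussed in the thread (the function name is illustrative, not from the original post); it assumes MPI_Get_processor_name returns identical strings for all processes on the same host:

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    /* Count the lower-ranked processes that report the same processor
       name; this count serves as a per-node "local rank". */
    int local_rank_by_hostname(MPI_Comm comm)
    {
        char name[MPI_MAX_PROCESSOR_NAME];
        char *all;
        int len, rank, size, i, local = 0;

        memset(name, 0, sizeof(name));      /* zero-pad so strcmp is safe */
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        MPI_Get_processor_name(name, &len);

        all = malloc((size_t)size * MPI_MAX_PROCESSOR_NAME);
        MPI_Allgather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                      all,  MPI_MAX_PROCESSOR_NAME, MPI_CHAR, comm);

        for (i = 0; i < rank; i++)
            if (strcmp(all + (size_t)i * MPI_MAX_PROCESSOR_NAME, name) == 0)
                local++;
        free(all);
        return local;
    }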

Re: [OMPI users] curious behavior during wait for broadcast: 100% cpu

2010-12-08 Thread Richard Treumann
Also - HPC clusters are commonly dedicated to running parallel jobs with exactly one process per CPU. HPC is about getting computation done and letting a CPU time slice among competing processes always has overhead (CPU time not spent on the computation). Unless you are trying to run extra

[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM is out of the office until 01/02/2001. (returning 11/01/2010)

2010-10-22 Thread Richard Treumann
I am out of the office until 11/01/2010. I will be out of the office on vacation the last week of Oct. Back Nov 1. I will not see any email. Note: This is an automated response to your message "[OMPI users] OPEN MPI data transfer error" sent on 10/22/10 15:19:05. This is the only

Re: [OMPI users] busy wait in MPI_Recv

2010-10-20 Thread Richard Treumann
Brian Most HPC applications are run with one processor and one working thread per MPI process. In this case, the node is not being used for other work so if the MPI process does release a processor, there is nothing else important for it to do anyway. In these applications, the blocking MPI

Re: [OMPI users] a question about [MPI]IO on systems without network filesystem

2010-10-19 Thread Richard Treumann
On Thu, Sep 30, 2010 at 09:00:31AM -0400, Richard Treumann wrote: > It is possible for MPI-IO to be implemented in a way that l

Re: [OMPI users] Shared memory

2010-10-06 Thread Richard Treumann
When you use MPI message passing in your application, the MPI library decides how to deliver the message. The "magic" is simply that when sender process and receiver process are on the same node (shared memory domain) the library uses shared memory to deliver the message from process to

Re: [OMPI users] a question about [MPI]IO on systems without network filesystem

2010-09-30 Thread Richard Treumann
I will add to what Terry said by mentioning that the MPI implementation has no awareness of ordinary POSIX or Fortran disk I/O routines. It cannot help on those. Any automated help the MPI implementation can provide would only apply to MPI_File_xxx disk I/O. These are implemented by the MPI

Re: [OMPI users] "self scheduled" work & mpi receive???

2010-09-24 Thread Richard Treumann
Amb It sounds like you have more workers than you can keep fed. Workers are finishing up and requesting their next assignment but sit idle because there are so many other idle workers too. Load balance does not really matter if the choke point is the master. The work is being done as fast
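
A minimal self-scheduling master loop of the kind discussed here (tag names and the protocol of workers sending an int before each assignment are illustrative assumptions, not from the thread):

    #include <mpi.h>

    #define TAG_WORK 1
    #define TAG_DONE 2

    /* Master: hand out one work unit per request until none remain,
       then tell each requesting worker to stop. */
    void master(int nworkers, int ntasks)
    {
        int next = 0, incoming;
        MPI_Status st;

        while (nworkers > 0) {
            MPI_Recv(&incoming, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);            /* a worker is idle */
            if (next < ntasks) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_DONE,
                         MPI_COMM_WORLD);             /* nothing left to do */
                nworkers--;
            }
        }
    }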

Re: [OMPI users] Question about Asynchronous collectives

2010-09-23 Thread Richard Treumann
Sorry Richard, what is CC issue order on the communicator? In particular, what does "CC" mean? 2010/9/23 Richard Treumann <treum...@us.ibm.com> request_1 and request_2 are j

Re: [OMPI users] Question about Asynchronous collectives

2010-09-23 Thread Richard Treumann
request_1 and request_2 are just local variable names. The only thing that determines matching order is CC issue order on the communicator. At each process, some CC is issued first and some CC is issued second. The first issued CC at each process will try to match the first issued CC at the

Re: [OMPI users] send and receive buffer the same on root

2010-09-16 Thread Richard Treumann
Tony You are depending on luck. The MPI Standard allows the implementation to assume that send and recv buffers are distinct unless MPI_IN_PLACE is used. Any MPI implementation may have more than one algorithm for a given MPI collective communication operation and the policy for switching
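
A sketch of the distinction being made (buffer and function names are illustrative): at the root, the sanctioned way to reduce "in place" is MPI_IN_PLACE, never passing the same pointer as both send and receive buffer.

    #include <mpi.h>

    void reduce_in_place(double *data, int n, int root, MPI_Comm comm)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);
        if (rank == root)
            /* root's contribution is read from data, result left in data */
            MPI_Reduce(MPI_IN_PLACE, data, n, MPI_DOUBLE, MPI_SUM, root, comm);
        else
            MPI_Reduce(data, NULL, n, MPI_DOUBLE, MPI_SUM, root, comm);
        /* NOT: MPI_Reduce(data, data, ...) -- aliased buffers are prohibited */
    }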

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann
Richard Treumann wrote: > Hi Ashley > I understand the problem with descriptor flooding can be serious in > an application with unidirectional data dependency. Perhaps we have

Re: [OMPI users] MPI_Reduce performance

2010-09-10 Thread Richard Treumann
On 9 Sep 2010, at 21:40, Richard Treumann wrote: > Ashley > Can you provide

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley Can you provide an example of a situation in which these semantically redundant barriers help? I may be missing something but my statement for the text book would be "If adding a barrier to your MPI program makes it run faster, there is almost certainly a flaw in it that is better

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
I was pointing out that most programs have some degree of elastic synchronization built in. Tasks (or groups or components in a coupled model) seldom only produce data; they also consume what other tasks produce, and that limits the potential skew. If step n for a task (or group or coupled

Re: [OMPI users] MPI_Reduce performance

2010-09-09 Thread Richard Treumann
Ashley's observation may apply to an application that iterates on many to one communication patterns. If the only collective used is MPI_Reduce, some non-root tasks can get ahead and keep pushing iteration results at tasks that are nearer the root. This could overload them and cause some extra

[OMPI users] AUTO: Richard Treumann/Poughkeepsie/IBM is out of the office until 01/02/2001. (returning 09/07/2010)

2010-08-30 Thread Richard Treumann
I am out of the office until 09/07/2010. I will be out of the office on vacation the week before Labor Day. I will not see any email. Note: This is an automated response to your message "[OMPI users] random IB failures when running medium core counts" sent on 8/30/10 12:22:19. This is the

Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?

2010-08-23 Thread Richard Treumann
Network saturation could produce arbitrarily long delays, but the total data load we are talking about is really small. It is the responsibility of an MPI library to do one of the following: 1) Use a reliable message protocol for each message (e.g. Infiniband RC or TCP/IP) 2) Detect lost packets and

Re: [OMPI users] IMB-MPI broadcast test stalls for large core counts: debug ideas?

2010-08-23 Thread Richard Treumann
It is hard to imagine how a total data load of 41,943,040 bytes could be a problem. That is really not much data. By the time the BCAST is done, each task (except root) will have received a single half meg message from one sender. That is not much. IMB does shift the root so some tasks may be

Re: [OMPI users] Accessing to the send buffer

2010-08-18 Thread Richard Treumann
As of MPI 2.2 there is no longer a restriction against read access to a live send buffer. The wording was changed to now prohibit the user to "modify". You can look the subsection of Communication Modes in chapter 3 but you will need to compare MPI 2.1 and 2.2 carefully to see the change.
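
An illustration of that wording change (variable names are hypothetical): with a pending MPI_Isend, reading the buffer is permitted under MPI 2.2, while modifying it before completion is still prohibited.

    #include <mpi.h>

    void isend_and_checksum(double *buf, int n, int dest, MPI_Comm comm)
    {
        MPI_Request req;
        double sum = 0.0;
        int i;

        MPI_Isend(buf, n, MPI_DOUBLE, dest, 0, comm, &req);

        for (i = 0; i < n; i++)     /* MPI 2.2: reading the live send buffer is OK */
            sum += buf[i];
        /* buf[0] = 0.0;  <-- modifying it before completion is still an error */

        MPI_Wait(&req, MPI_STATUS_IGNORE);
        (void)sum;
    }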

Re: [OMPI users] MPI_Bcast issue

2010-08-12 Thread Richard Treumann
- yes I know this should not happen, the question is why. --- On Wed, 11/8/10, Richard Treumann <treum...@us.ibm.com> wrote:

Re: [OMPI users] MPI_Bcast issue

2010-08-11 Thread Richard Treumann
Randolf I am confused about using multiple, concurrent mpirun operations. If there are M uses of mpirun and each starts N tasks (carried out under pvm or any other way) I would expect you to have M completely independent MPI jobs with N tasks (processes) each. You could have some root in

Re: [OMPI users] Accessing to the send buffer

2010-08-02 Thread Richard Treumann
For reading the data from an isend buffer to cause problems, the underlying hardware would need to have very unusual characteristics that the MPI implementation is exploiting. People have imagined hardware characteristics that could make reading an Isend buffer a problem but I have never heard

Re: [OMPI users] Partitioning problem set data

2010-07-21 Thread Richard Treumann
The MPI Standard (in my opinion) should have avoided the word "buffer". To me, a "buffer" is something you set aside as scratch space between the application data structures and the communication calls. In MPI, the communication is done directly from/to the application's data structures and

Re: [OMPI users] About the necessity of cancelation of pending communication and the use of buffer

2010-05-25 Thread Richard Treumann
users-boun...@open-mpi.org wrote on 05/25/2010 12:03:11 AM: > [OMPI users] About the necessity of

Re: [OMPI users] Questions about MPI_Isend

2010-05-11 Thread Richard Treumann
The MPI standard requires that when there is a free running task posting isends to a task that is not keeping up on receives, the sending task will switch to synchronous isend BEFORE the receive side runs out of memory and fails. There should be no need for the sender to use MPI_Issend

Re: [OMPI users] Fortran derived types

2010-05-07 Thread Richard Treumann
If someone is deciding whether to use complex datatypes or stick with contiguous ones, they need to look at their own situation. There is no simple answer. The only thing that is fully predictable is that an MPI operation, measured in isolation, will be no slower with contiguous data than with

Re: [OMPI users] MPI_Bsend vs. MPI_Ibsend (2)

2010-05-06 Thread Richard Treumann
Bsend does not guarantee to use the attached buffer, and return from MPI_Ibsend does not guarantee you can modify the application send buffer. Maybe the implementation would try to optimize by scheduling a nonblocking send from the application buffer that bypasses the copy to the attached buffer.

Re: [OMPI users] MPI_Bsend vs. MPI_Ibsend

2010-05-06 Thread Richard Treumann
An MPI send (of any kind), is defined by "local completion semantics". When a send is complete, the send buffer may be reused. The only kind of send that gives any indication whether the receive is posted is the synchronous send. Neither standard send nor buffered send tell the sender if the recv

Re: [OMPI users] Fortran derived types

2010-05-06 Thread Richard Treumann
Assume your data is discontiguous in memory and making it contiguous is not practical (e.g. there is no way to make cells of a row and cells of a column both contiguous.) You have 3 options: 1) Use many small/contiguous messages 2) Allocate scratch space and pack/unpack 3) Use a derived
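
A sketch of option 3 for the column case mentioned above, assuming a row-major n_rows x n_cols array of doubles (the function name is illustrative):

    #include <mpi.h>

    /* Send one column of a row-major double array without packing,
       using a strided (vector) derived datatype. */
    void send_column(double *a, int n_rows, int n_cols, int col,
                     int dest, MPI_Comm comm)
    {
        MPI_Datatype column;
        MPI_Type_vector(n_rows, 1, n_cols, MPI_DOUBLE, &column);
        MPI_Type_commit(&column);
        MPI_Send(&a[col], 1, column, dest, 0, comm);
        MPI_Type_free(&column);
    }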

Re: [OMPI users] Hide Abort output

2010-04-06 Thread Richard Treumann
The MPI standard says that MPI_Abort makes a "best effort". It also says that an MPI implementation is free to lose the value passed into MPI_Abort and deliver some other RC. The standard does not say that MPI_Abort becomes a valid way to end a parallel job if it is passed a zero. To me it

Re: [OMPI users] Hide Abort output

2010-04-05 Thread Richard Treumann
are the messages telling you what the error -might- have been. On Apr 5, 2010, at 7:01 AM, Richard Treumann wrote: Why should any software system offer an option which lets the user hide all distinction between a run that succeeded and one that failed?

Re: [OMPI users] Hide Abort output

2010-04-05 Thread Richard Treumann
Why should any software system offer an option which lets the user hide all distinction between a run that succeeded and one that failed?

Re: [OMPI users] Hide Abort output

2010-03-31 Thread Richard Treumann
I do not know what the OpenMPI message looks like or why people want to hide it. It should be phrased to avoid any implication of a problem with OpenMPI itself. How about something like this: "The application has called MPI_Abort. The application is terminated by OpenMPI as the

Re: [OMPI users] running externalprogram on same processor (Fortran)

2010-03-22 Thread Richard Treumann
abc def The MPI_Barrier call in the parent must be on the intercomm returned by the spawn. The call in the children must be on the intercommunicator returned by MPI_Comm_get_parent. The semantic of an MPI_Barrier call on an intercomm is: No MPI_Barrier caller in the local group returns until all members of the remote group have
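
A sketch of the matching calls (the child command name and process count are placeholders): the parent uses the intercommunicator returned by MPI_Comm_spawn, the children the one returned by MPI_Comm_get_parent.

    #include <mpi.h>

    /* Parent side */
    void spawn_and_sync(void)
    {
        MPI_Comm children;
        MPI_Comm_spawn("child_prog", MPI_ARGV_NULL, 4, MPI_INFO_NULL, 0,
                       MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);
        MPI_Barrier(children);    /* returns once all children have entered */
    }

    /* Child side */
    void sync_with_parent(void)
    {
        MPI_Comm parent;
        MPI_Comm_get_parent(&parent);
        MPI_Barrier(parent);      /* returns once all parents have entered */
    }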

Re: [OMPI users] running externalprogram on same processor (Fortran)

2010-03-17 Thread Richard Treumann
abc def When the parent does a spawn call, it presumably blocks until the child tasks have called MPI_Init. The standard allows some flexibility on this but at least after spawn, the spawn side must be able to issue communication calls involving the children and expect them to work. What you

Re: [OMPI users] Why might MPI_Recv trip PSM_MQ_RECVREQS_MAX ?

2010-03-08 Thread Richard Treumann
The program Jonathan offers as an example is valid use of MPI standard send. With this message size it is fair to assume the implementation is doing standard send with an eager send. The MPI standard is explicit about how eager send, as an undercover option for standard send, must work. When the

Re: [OMPI users] MPI_Init() and MPI_Init_thread()

2010-03-04 Thread Richard Treumann
A call to MPI_Init allows the MPI library to return any level of thread support it chooses. This MPI 1.1 call does not let the application say what it wants and does not let the implementation reply with what it can guarantee. If you are using only one MPI implementation and your code will never

Re: [OMPI users] MPI_Init() and MPI_Init_thread()

2010-03-03 Thread Richard Treumann
On Mar 3, 2010, at 11:35 AM, Richard Treumann wrote: > If the application will make MPI calls from multiple threads and > MPI_INIT_THREAD has returned FUNNELED, the application must be > willing to take the steps that ensure there will never be concurrent > c

Re: [OMPI users] MPI_Init() and MPI_Init_thread()

2010-03-03 Thread Richard Treumann
The caller of MPI_INIT_THREAD says what level of thread safety he would like to get from the MPI implementation. The implementation says what level of thread safety it provides. The implementation is free to provide more or less thread safety than requested. The caller of MPI_INIT_THREAD should
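
A sketch of that request/provided handshake; requesting MPI_THREAD_MULTIPLE and aborting on a lower guarantee is one policy, shown here only as an example.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        if (provided < MPI_THREAD_MULTIPLE) {
            /* The library cannot guarantee what we asked for; stop rather
               than make concurrent MPI calls anyway. */
            fprintf(stderr, "got thread level %d, need MPI_THREAD_MULTIPLE\n",
                    provided);
            MPI_Abort(MPI_COMM_WORLD, 1);
        }
        /* ... multithreaded MPI work ... */
        MPI_Finalize();
        return 0;
    }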

Re: [OMPI users] speed up this problem by MPI

2010-01-29 Thread Richard Treumann
Tim MPI is a library providing support for passing messages among several distinct processes. It offers datatype constructors that let an application describe complex layouts of data in the local memory of a process so a message can be sent from a complex data layout or received into a complex

Re: [OMPI users] How to start MPI_Spawn child processes early?

2010-01-27 Thread Richard Treumann
I cannot resist: Jaison - The MPI_Comm_spawn call specifies what you want to have happen. The child launch is what does happen. If we can come up with a way to have things happen correctly before we know what it is that we want to have happen, the heck with this HPC stuff. Let's get together

Re: [OMPI users] Mimicking timeout for MPI_Wait

2009-12-07 Thread Richard Treumann
The need for a "better" timeout depends on what else there is for the CPU to do. If you get creative and shift from {99% MPI_WAIT , 1% OS_idle_process} to {1% MPI_Wait, 99% OS_idle_process} at a cost of only a few extra microseconds added lag on MPI_Wait, you may be pleased by the CPU load
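
A sketch of that trade-off (function name and sleep interval are arbitrary choices): poll with MPI_Test and hand the CPU back to the OS between polls, accepting a little added completion latency.

    #include <mpi.h>
    #include <unistd.h>   /* usleep */

    /* Wait for a request while leaving the CPU mostly idle; adds up to
       about 1 ms of latency per completion versus a busy MPI_Wait. */
    void lazy_wait(MPI_Request *req, MPI_Status *status)
    {
        int done = 0;
        while (!done) {
            MPI_Test(req, &done, status);
            if (!done)
                usleep(1000);   /* yield to the OS idle or other processes */
        }
    }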

Re: [OMPI users] Program deadlocks, on simple send/recv loop

2009-12-03 Thread Richard Treumann
MPI standard compliant management of eager send requires that this program work. There is nothing that says "unless eager limit is set too high/low." Honoring this requirement in an MPI implementation can be costly. There are practical reasons to pass up this requirement because most applications

Re: [OMPI users] Multi-threading with OpenMPI ?

2009-09-25 Thread Richard Treumann
"you would need > to have a different input communicator for each thread that will > make an MPI_Comm_spawn call", I am confused with the term "single > task communicator" > > Best Regards, > umanga > > Richard Treumann wrote: > It is dangerous to hold a

Re: [OMPI users] Multi-threading with OpenMPI ?

2009-09-18 Thread Richard Treumann
It is dangerous to hold a local lock (like a mutex) across a blocking MPI call unless you can be 100% sure everything that must happen remotely will be completely independent of what is done with local locks & communication dependencies on other tasks. It is likely that an MPI_Comm_spawn call in

Re: [OMPI users] Messages getting lost during transmission (?)

2009-09-09 Thread Richard Treumann
Dennis In MPI, you must complete every MPI_Isend by MPI_Wait on the request handle (or a variant like MPI_Waitall or MPI_Test that returns TRUE). An un-completed MPI_Isend leaves resources tied up. I do not know what symptom to expect from OpenMPI with this particular application error but the
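
The rule stated above, in code form (names are illustrative): every MPI_Isend is eventually completed by a wait or a successful test.

    #include <mpi.h>

    void send_block(int *buf, int n, int dest, MPI_Comm comm)
    {
        MPI_Request req;
        MPI_Isend(buf, n, MPI_INT, dest, 0, comm, &req);

        /* ... overlap computation here ... */

        /* Every MPI_Isend must be completed; otherwise the request and
           any internal resources stay tied up. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }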

Re: [OMPI users] Anyscientific application heavily using MPI_Barrier?

2009-08-24 Thread Richard Treumann
As far as I can see, Jeff's analysis is dead on. The matching order at P2 is based on the order in which the envelopes from P0 and P1 show up at P2. The Barrier does not force an order between the communication paths P0->P2 vs. P1->P2. The MPI standard does not even say what "show up" means

Re: [OMPI users] Interaction of MPI_Send and MPI_Barrier

2009-07-28 Thread Richard Treumann
processes and determine if > any are outstanding. It could be accomplished with a single > MPI_Reduce(sent - received). > > Cheers, > Shaun > > Richard Treumann wrote: > > No - it is not guaranteed. (it is highly probable though) > > > > The return from the MPI_

Re: [OMPI users] Interaction of MPI_Send and MPI_Barrier

2009-07-23 Thread Richard Treumann
No - it is not guaranteed. (it is highly probable though) The return from the MPI_Send only guarantees that the data is safely held somewhere other than the send buffer so you are free to modify the send buffer. The MPI standard does not say where the data is to be held. It only says that once

Re: [OMPI users] Exit Program Without Calling MPI_Finalize ForSpecial Case

2009-06-04 Thread Richard Treumann
Tee Wen Kai - You asked "Just to find out more about the consequences for exiting MPI processes without calling MPI_Finalize, will it cause memory leak or other fatal problem?" Be aware that Jeff has offered you an OpenMPI implementation oriented answer rather than an MPI standard oriented

Re: [OMPI users] Reduce with XOR with MPI_Double

2009-04-21 Thread Richard Treumann
Santolo The MPI standard defines reduction operations where the operand/operation pair has a meaningful semantic. I cannot picture a well defined semantic for: 999.0 BXOR 0.009. Maybe you can but it is not an error that the MPI standard leaves out BXOR on

Re: [OMPI users] MPI_Test without deallocation

2009-03-26 Thread Richard Treumann
You can use MPI_REQUEST_GET_STATUS as a way to "test" without deallocation. I do not understand the reason you would forward the request (as a request) to another function. The data is already in a specific receive buffer by the time an MPI_Test returns TRUE so calling the function and passing
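
A sketch of the call being suggested (MPI_Request_get_status in C), which reports completion without freeing the request; the wrapper name is hypothetical.

    #include <mpi.h>

    /* Returns 1 if req has completed, without deallocating it; the
       request can still be passed to MPI_Wait or MPI_Test later. */
    int peek_complete(MPI_Request req)
    {
        int flag;
        MPI_Status status;
        MPI_Request_get_status(req, &flag, &status);
        return flag;
    }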

Re: [OMPI users] Collective operations and synchronization

2009-03-23 Thread Richard Treumann
There is no synchronization operation in MPI that promises all tasks will exit at the same time. For MPI_Barrier they will exit as close to the same time as the implementation can reasonably support but as long as the application is distributed and there are delays in the interconnect, it is not

Re: [OMPI users] Any scientific application heavily using MPI_Barrier?

2009-03-06 Thread Richard Treumann
Jeff paraphrased an unnamed source as suggesting that: "any MPI program that relies on a barrier for correctness is an incorrect MPI application." . That is probably too strong. How about this assertion? If there are no wildcard receives - every MPI_Barrier call is semantically irrelevant. It

Re: [OMPI users] MPI_Gatherv and 0 size?

2009-02-23 Thread Richard Treumann
Hi George I have run into the argument that in a case where the number of array elements that will be accessed is == 0 it is "obviously" valid to pass NULL as the array address. I recognize the argument has merit but I am not clear that it really requires that an MPI implementation that tries to

Re: [OMPI users] How to quit asynchronous processes

2009-02-23 Thread Richard Treumann
I am not 100% sure I understand your situation. Is it this? Process A has an ongoing stream of inputs. For each input unit, A does some processing and then passes on work to B via a message. B receives the message from A and does some additional work before sending a message to C. C receives

Re: [OMPI users] MPI_Test bug?

2009-02-05 Thread Richard Treumann
One difference is that putting a blocking send before the irecv is a classic "unsafe" MPI program. It depends on eager send buffering to complete the MPI_Send so the MPI_Irecv can be posted. The example with MPI_Send first would be allowed to hang. The original program is correct and safe MPI.
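
A sketch of the two orderings contrasted above, written as a symmetric exchange between two ranks (names are illustrative, not the original example):

    #include <mpi.h>

    /* Safe: post the receive first, then the blocking send. */
    void exchange_safe(int *sendbuf, int *recvbuf, int n, int peer, MPI_Comm comm)
    {
        MPI_Request rreq;
        MPI_Irecv(recvbuf, n, MPI_INT, peer, 0, comm, &rreq);
        MPI_Send(sendbuf, n, MPI_INT, peer, 0, comm);
        MPI_Wait(&rreq, MPI_STATUS_IGNORE);
    }

    /* Unsafe: if both peers do this, each MPI_Send may block waiting for
       a receive that has not been posted yet; it only works while the
       message fits the eager protocol. */
    void exchange_unsafe(int *sendbuf, int *recvbuf, int n, int peer, MPI_Comm comm)
    {
        MPI_Send(sendbuf, n, MPI_INT, peer, 0, comm);
        MPI_Recv(recvbuf, n, MPI_INT, peer, 0, comm, MPI_STATUS_IGNORE);
    }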

Re: [OMPI users] CPU burning in Wait state

2008-09-03 Thread Richard Treumann
Vincent 1) Assume you are running an MPI program which has 16 tasks in MPI_COMM_WORLD, you have 16 dedicated CPUs and each task is single threaded. (a task is a distinct process, a process can contain one or more threads) This is the most common traditional model. In this model, when a task

Re: [OMPI users] MPI_Brecv vs multiple MPI_Irecv

2008-08-27 Thread Richard Treumann
Hi Robert users-boun...@open-mpi.org wrote on 08/27/2008 11:55:58 AM: << snip >> > However from an application point of

Re: [OMPI users] MPI_Brecv vs multiple MPI_Irecv

2008-08-27 Thread Richard Treumann
Robert - A return from a blocking send means the application send buffer is available for reuse. If it is a BSEND, the application buffer could be available because the message data has been copied to the attached buffer or because the data has been delivered to the destination. The

Re: [OMPI users] MPI_Type_struct for structs with dynamic arrays

2008-08-20 Thread Richard Treumann
Hi Jitendra Before you worry too much about the inefficiency of using a contiguous scratch buffer to pack into and send from and a second contiguous scratch buffer to receive into and unpack from, it would be worth knowing how OpenMPI processes a discontiguous datatype on your platform.

Re: [OMPI users] MPI_CANCEL

2008-04-15 Thread Richard Treumann
Hi slimtimmy I have been involved in several of the MPI Forum's discussions of how MPI_Cancel should work and I agree with your interpretation of the standard. By my reading of the standard, the MPI_Wait must not hang and the cancel must succeed. Making an MPI implementation work exactly as the

Re: [OMPI users] openmpi credits for eager messages

2008-02-05 Thread Richard Treumann
Ron's comments are probably dead on for an application like bug3. If bug3 is long running and libmpi is doing eager protocol buffer management as I contend the standard requires, then the producers will not get far ahead of the consumer before they are forced to synchronous send under the covers

Re: [OMPI users] openmpi credits for eager messages

2008-02-05 Thread Richard Treumann
Richard, > > You're absolutely right. What a shame :) If I had spent less time > drawing the boxes around the code I might have noticed the typo. The > Send should be an Isend. > > george. > > On Feb 4, 2008, at 5:32 PM, Richard Treumann wrote: > > > Hi George

Re: [OMPI users] openmpi credits for eager messages

2008-02-05 Thread Richard Treumann
Hi Gleb There is no misunderstanding of the MPI standard or the definition of blocking in the bug3 example. Both bug3 and the example I provided are valid MPI. As you say, blocking means the send buffer can be reused when the MPI_Send returns. This is exactly what bug3 is counting on. MPI is a

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Richard Treumann
break a particular > MPI implementation. It doesn't necessarily make this implementation > non standard compliant. > > george. > > On Feb 4, 2008, at 9:08 AM, Richard Treumann wrote: > > > Is what George says accurate? If so, it sounds to me like OpenMPI > >

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Richard Treumann
> On Mon, Feb 04, 2008 at 09:08:45AM -0500, Richard Treumann wrote: > > To me, the MPI standard is clear that a program like this: > > > > task 0: > > MPI_Init > > sleep(3000); > > start receiving messages > > > > each of tasks 1 to n-
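
A minimal reconstruction of the kind of program quoted above (the message count and sleep length are arbitrary placeholders, not the original bug3 source); the claim in the thread is that a compliant library must throttle the senders rather than exhaust buffer space.

    #include <mpi.h>
    #include <unistd.h>

    #define NMSG 100000   /* enough messages to overflow any eager buffer pool */

    int main(int argc, char **argv)
    {
        int rank, size, i, buf = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            sleep(30);    /* receiver deliberately lags behind the senders */
            for (i = 0; i < (size - 1) * NMSG; i++)
                MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        } else {
            for (i = 0; i < NMSG; i++)
                MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }
        MPI_Finalize();   /* valid MPI; the library must throttle the senders */
        return 0;
    }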

[OMPI users] Fw: openmpi credits for eager messages

2008-02-04 Thread Richard Treumann
Sorry for typo - The reference is MPI 1.1 - Forwarded by Richard Treumann/Poughkeepsie/IBM on 02/04/2008 01:3

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Richard Treumann
Hi Ron - I am well aware of the scaling problems related to the standard send requirements in MPI. It is a very difficult issue. However, here is what the standard says: MPI 1.2, page 32 lines 29-37 === a standard send operation that cannot complete because of lack of buffer space will