Re: [OMPI users] request_get_status: Recheck request status [PATCH]

2010-05-27 Thread Shaun Jackman
Ping. On Tue, 2010-05-04 at 14:06 -0700, Shaun Jackman wrote: > Hi Jeff, > > request_get_status polls request->req_complete before calling > opal_progress. Ideally, it would check req_complete, call opal_progress, > and check req_complete one final time. This patch identic
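
A minimal sketch of the check / progress / recheck ordering the patch describes. The type and helpers below are hypothetical stand-ins for the Open MPI internals named in the message (request->req_complete, opal_progress); the real code in ompi/mpi/c/request_get_status.c differs in detail.

#include <stdbool.h>

/* Hypothetical stand-ins, not the actual OMPI internals. */
typedef struct { volatile bool req_complete; } my_request_t;
static void make_progress(void) { /* stands in for opal_progress() */ }

/* Check, progress, then recheck, so a single call can observe a completion
 * that the progress pass itself produced. */
static bool get_status_sketch(my_request_t *request)
{
    if (request->req_complete)       /* fast path: already complete */
        return true;
    make_progress();                 /* drive the progress engine once */
    return request->req_complete;    /* recheck: the point of the patch */
}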

[OMPI users] request_get_status: Recheck request status [PATCH]

2010-05-04 Thread Shaun Jackman
iscussed this patch on the mailing list previously. I think we both agreed it was a good idea, but it was never applied. Cheers, Shaun 2009-09-14 Shaun Jackman * ompi/mpi/c/request_get_status.c (MPI_Request_get_status): If opal_progress is called then check th

Re: [OMPI users] MPI_Init never returns on IA64

2010-03-30 Thread Shaun Jackman
wrote: > Could you try one of the 1.4.2 nightly tarballs and see if that makes the > issue better? > > http://www.open-mpi.org/nightly/v1.4/ > > > On Mar 29, 2010, at 7:47 PM, Shaun Jackman wrote: > > > Hi, > > > > On an IA64 platform, MPI_Init neve

[OMPI users] MPI_Init never returns on IA64

2010-03-29 Thread Shaun Jackman
Hi, On an IA64 platform, MPI_Init never returns. I fired up GDB and it seems that ompi_free_list_grow never returns. My test program does nothing but call MPI_Init. Here's the backtrace: (gdb) bt #0 0x20075620 in ompi_free_list_grow () from /home/aubjtl/openmpi/lib/libmpi.so.0 #1 0x200
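
A minimal reproducer sketch, assuming (as the report above says) the test program does nothing but initialize and finalize MPI:

#include <mpi.h>

int main(int argc, char **argv)
{
    /* On the IA64 system described above, this call never returns; the
     * backtrace shows the hang inside ompi_free_list_grow. */
    MPI_Init(&argc, &argv);
    MPI_Finalize();
    return 0;
}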

Re: [OMPI users] mca_pml_ob1_send blocks

2009-09-14 Thread Shaun Jackman
Hi Jeff, Jeff Squyres wrote: On Sep 8, 2009, at 1:06 PM, Shaun Jackman wrote: My INBOX has been a disaster recently. Please ping me repeatedly if you need quicker replies (sorry! :-( ). (btw, should this really be on the devel list, not the user list?) It's tending that way.

Re: [OMPI users] mca_pml_ob1_send blocks

2009-09-08 Thread Shaun Jackman
Jeff Squyres wrote: ... Two questions then... 1. If the request has already completed, does it mean that since opal_progress() is not called, no further progress is made? Correct. It's a latency thing; if your request has already completed, we just tell you without further delay (i.e., wit

Re: [OMPI users] mca_pml_ob1_send blocks

2009-08-31 Thread Shaun Jackman
Shaun Jackman wrote: Jeff Squyres wrote: On Aug 26, 2009, at 10:38 AM, Jeff Squyres (jsquyres) wrote: Yes, this could cause blocking. Specifically, the receiver may not advance any other senders until the matching Irecv is posted and is able to make progress. I should clarify something else

Re: [OMPI users] mca_pml_ob1_send blocks

2009-08-31 Thread Shaun Jackman
Jeff Squyres wrote: On Aug 26, 2009, at 10:38 AM, Jeff Squyres (jsquyres) wrote: Yes, this could cause blocking. Specifically, the receiver may not advance any other senders until the matching Irecv is posted and is able to make progress. I should clarify something else here -- for long mess

Re: [OMPI users] mca_pml_ob1_send blocks

2009-08-27 Thread Shaun Jackman
Jeff Squyres wrote: On Aug 26, 2009, at 10:38 AM, Jeff Squyres (jsquyres) wrote: Yes, this could cause blocking. Specifically, the receiver may not advance any other senders until the matching Irecv is posted and is able to make progress. I should clarify something else here -- for long mess

Re: [OMPI users] mca_pml_ob1_send blocks

2009-08-25 Thread Shaun Jackman
Jeff Squyres wrote: On Aug 24, 2009, at 2:18 PM, Shaun Jackman wrote: I'm seeing MPI_Send block in mca_pml_ob1_send. The packet is shorter than the eager transmit limit for shared memory (3300 bytes < 4096 bytes). I'm trying to determine if MPI_Send is blocking due to a deadlock.

Re: [OMPI users] mca_pml_ob1_send blocks

2009-08-24 Thread Shaun Jackman
2a956c56f1 in PMPI_Send () from /home/sjackman/arch/xhost/lib/libmpi.so.0 Frames #0-#3 do return and loop. Frame #4 never returns. Cheers, Shaun Shaun Jackman wrote: Hi, I'm seeing MPI_Send block in mca_pml_ob1_send. The packet is shorter than the eager transmit limit for shared mem

[OMPI users] mca_pml_ob1_send blocks

2009-08-24 Thread Shaun Jackman
Hi, I'm seeing MPI_Send block in mca_pml_ob1_send. The packet is shorter than the eager transmit limit for shared memory (3300 bytes < 4096 bytes). I'm trying to determine if MPI_Send is blocking due to a deadlock. Will MPI_Send block even when sending a packet eagerly? Thanks, Shaun

Re: [OMPI users] Interaction of MPI_Send and MPI_Barrier

2009-07-27 Thread Shaun Jackman
Hi Dick, Okay, it's good to know that even if using MPI_Barrier in this fashion did appear to be working, it's not guaranteed to work. Is there an MPI collective function that has the desired effect: that after all processes call this function, any previously posted MPI_Send is guaranteed to

Re: [OMPI users] Receiving an unknown number of messages

2009-07-24 Thread Shaun Jackman
Eugene Loh wrote: Shaun Jackman wrote: 2 calls MPI_Test. No message is waiting, so 2 decides to send. 2 sends to 0 and does not block (0 has one MPI_Irecv posted) 3 calls MPI_Test. No message is waiting, so 3 decides to send. 3 sends to 1 and does not block (1 has one MPI_Irecv posted) 0 calls

Re: [OMPI users] Receiving an unknown number of messages

2009-07-24 Thread Shaun Jackman
n with the MPI_Irecv/MPI_Test it will serve as an extra proof for the receivers to proceed. Any ideas on that? On Wed, Jul 15, 2009 at 2:15 AM, Eugene Loh <eugene@sun.com> wrote: Shaun Jackman wrote: For my MPI application, each process reads a file and

Re: [OMPI users] Receiving an unknown number of messages

2009-07-24 Thread Shaun Jackman
Eugene Loh wrote: Shaun Jackman wrote: Eugene Loh wrote: Shaun Jackman wrote: For my MPI application, each process reads a file and for each line sends a message (MPI_Send) to one of the other processes determined by the contents of that line. Each process posts a single MPI_Irecv and uses

[OMPI users] Interaction of MPI_Send and MPI_Barrier

2009-07-23 Thread Shaun Jackman
Hi, Two processes run the following program: request = MPI_Irecv MPI_Send (to the other process) MPI_Barrier flag = MPI_Test(request) Without the barrier, there's a race and MPI_Test may or may not return true, indicating whether the message has been received. With the barrier, is it guarante
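
A sketch of the two-process sequence described above (buffer contents, the tag, and the assumption of exactly two ranks are illustrative):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, peer, sendbuf = 42, recvbuf = 0, flag = 0;
    MPI_Request request;

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                     /* assumes exactly two processes */

    MPI_Irecv(&recvbuf, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &request);
    MPI_Send(&sendbuf, 1, MPI_INT, peer, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Test(&request, &flag, MPI_STATUS_IGNORE);

    printf("rank %d: flag = %d\n", rank, flag);  /* not guaranteed to be 1 */
    if (!flag)
        MPI_Wait(&request, MPI_STATUS_IGNORE);
    MPI_Finalize();
    return 0;
}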

Re: [OMPI users] Receiving an unknown number of messages

2009-07-23 Thread Shaun Jackman
Eugene Loh wrote: Shaun Jackman wrote: For my MPI application, each process reads a file and for each line sends a message (MPI_Send) to one of the other processes determined by the contents of that line. Each process posts a single MPI_Irecv and uses MPI_Request_get_status to test for a

[OMPI users] Receiving an unknown number of messages

2009-07-14 Thread Shaun Jackman
Hi, For my MPI application, each process reads a file and for each line sends a message (MPI_Send) to one of the other processes determined by the contents of that line. Each process posts a single MPI_Irecv and uses MPI_Request_get_status to test for a received message. If a message has been
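
A sketch of the receive side of that pattern (the tag, buffer size, and helper name are illustrative assumptions): a single outstanding MPI_Irecv is polled with MPI_Request_get_status, which tests completion without freeing the request, and is reposted after each message is handled.

#include <mpi.h>

#define MAX_LINE 256   /* illustrative line-length limit */

/* Poll the single outstanding receive; if a line has arrived, handle it and
 * repost the MPI_Irecv so exactly one receive is always pending. */
static void poll_for_message(MPI_Request *req, char *line)
{
    int flag = 0;
    MPI_Request_get_status(*req, &flag, MPI_STATUS_IGNORE);
    if (flag) {
        MPI_Wait(req, MPI_STATUS_IGNORE);  /* complete and release the request */
        /* ... dispatch on the contents of `line` ... */
        MPI_Irecv(line, MAX_LINE, MPI_CHAR, MPI_ANY_SOURCE, 0,
                  MPI_COMM_WORLD, req);
    }
}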

Re: [OMPI users] sharing memory between processes

2009-04-28 Thread Shaun Jackman
For what it's worth, the genome assembly software ABySS uses exactly this system that Jody is describing to represent a directed graph. http://www.bcgsc.ca/platform/bioinfo/software/abyss Cheers, Shaun jody wrote: Hi Barnabas As far as I know, Open-MPI is not a shared memory system. Using Op

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-16 Thread Shaun Jackman
Eugene Loh wrote: ... What's the rest? I said the shared-memory area is much smaller, but I was confused about which OMPI release I was using. So, the shared-memory area was 128 Mbyte and it was getting mapped in once for each process, and so it was counted doubly. If there are eight proces

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-14 Thread Shaun Jackman
Eugene Loh wrote: Okay. Attached is a "little" note I wrote up illustrating memory profiling with Sun tools. (It's "big" because I ended up including a few screenshots.) The program has a bunch of one-way message traffic and some user-code memory allocation. I then rerun with the receiver

Re: [OMPI users] Problem with MPI_File_read()

2009-04-14 Thread Shaun Jackman
Hi Jovana, 825307441 is 0x31313131 in base 16 (hexadecimal), which is the string `1111' in ASCII. MPI_File_read reads in binary values (not ASCII) just as the standard functions read(2) and fread(3) do. So, your program is fine; however, your data file (first.dat) is not. Cheers, Shaun Jovan
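
A sketch of the point above (file name taken from the original message; error checking omitted): reading four ASCII '1' characters, the bytes 0x31 0x31 0x31 0x31, as a single binary MPI_INT yields 825307441.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    MPI_File fh;
    int value = 0;

    MPI_File_open(MPI_COMM_SELF, "first.dat", MPI_MODE_RDONLY,
                  MPI_INFO_NULL, &fh);
    MPI_File_read(fh, &value, 1, MPI_INT, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    /* Prints 825307441 if the file begins with the text "1111". */
    printf("%d\n", value);
    MPI_Finalize();
    return 0;
}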

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-14 Thread Shaun Jackman
Eugene Loh wrote: ompi_info -a | grep eager depends on the BTL. E.g., sm=4K but tcp is 64K. self is 128K. Thanks, Eugene. On the other hand, I assume the memory imbalance we're talking about is rather severe. Much more than 2500 bytes to be noticeable, I would think. Is that really the s

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-14 Thread Shaun Jackman
Hi Eugene, Eugene Loh wrote: At 2500 bytes, all messages will presumably be sent "eagerly" -- without waiting for the receiver to indicate that it's ready to receive that particular message. This would suggest congestion, if any, is on the receiver side. Some kind of congestion could, I supp
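
A hedged sketch of the scenario being discussed (the message size comes from the thread; the count and the artificial delay are illustrative): the sender pushes many sub-eager-limit messages before the receiver posts matching receives, so they accumulate on the receiver side as unexpected messages until the transport's resources back up.

#include <mpi.h>
#include <string.h>
#include <unistd.h>

#define MSG_SIZE 2500     /* under the sm eager limit discussed above */
#define NMSGS    10000    /* illustrative count */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    char buf[MSG_SIZE];
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    memset(buf, 0, sizeof buf);

    if (rank == 1) {
        /* Eager sends: data goes out without waiting for matching receives. */
        for (int i = 0; i < NMSGS; ++i)
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        sleep(10);  /* slow receiver: messages arrive before receives are posted */
        for (int i = 0; i < NMSGS; ++i)
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}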

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-09 Thread Shaun Jackman
Eugene Loh wrote: I'm no expert, but I think it's something like this: 1) If the messages are short, they're sent over to the receiver. If the receiver does not expect them (no MPI_Irecv posted), it buffers them up. 2) If the messages are long, only a little bit is sent over to the receiver

Re: [OMPI users] Debugging memory use of Open MPI

2009-04-09 Thread Shaun Jackman
Eugene Loh wrote: I'm no expert, but I think it's something like this: 1) If the messages are short, they're sent over to the receiver. If the receiver does not expect them (no MPI_Irecv posted), it buffers them up. 2) If the messages are long, only a little bit is sent over to the receiver

[OMPI users] Debugging memory use of Open MPI

2009-04-09 Thread Shaun Jackman
When running my Open MPI application, I'm seeing three processors that are using five times as much memory as the others when they should all use the same amount of memory. To start the debugging process, I would like to know if it's my application or the Open MPI library that's using the addit

Re: [OMPI users] Same bug in v1.0.6

2009-03-26 Thread Shaun Jackman
Please ignore the following message. It wasn't intended for the Open MPI list. My apologies. Cheers, Shaun Shaun Jackman wrote: Hi Todd, Back to the drawing board for me. The assertion is stating that all the tips should have been eroded in a single pass (and 2654086 tips were), but

Re: [OMPI users] Same bug in v1.0.6

2009-03-26 Thread Shaun Jackman
Hi Todd, Back to the drawing board for me. The assertion is stating that all the tips should have been eroded in a single pass (and 2654086 tips were), but in a second pass it unexpectedly found 2 more tips. As a workaround until I nail this bug, you can downgrade this error to a warning by r

[OMPI users] Bug in MPI_Request_get_status (1.3.1) [PATCH]

2009-03-26 Thread Shaun Jackman
MPI_Request_get_status fails if the status parameter is passed MPI_STATUS_IGNORE. A patch is attached. Cheers, Shaun 2009-03-26 Shaun Jackman * ompi/mpi/c/request_get_status.c (MPI_Request_get_status): Do not fail if the status argument is NULL, because the application may pass

Re: [OMPI users] MPI_Test without deallocation

2009-03-25 Thread Shaun Jackman
to be probed, but MPI_Test has the side effect of also deallocating the MPI_Request object. Cheers, Shaun Justin wrote: Have you tried MPI_Probe? Justin Shaun Jackman wrote: Is there a function similar to MPI_Test that doesn't deallocate the MPI_Request object? I would like to test if a m

[OMPI users] MPI_Test without deallocation

2009-03-25 Thread Shaun Jackman
Is there a function similar to MPI_Test that doesn't deallocate the MPI_Request object? I would like to test if a message has been received (MPI_Irecv), check its tag, and dispatch the MPI_Request to another function based on that tag. Cheers, Shaun
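
One standard MPI function with this behavior is MPI_Request_get_status, which tests for completion without deallocating the request (the request_get_status threads earlier in this listing use it). A sketch of testing, checking the tag, and dispatching the still-valid request; the tag values and handler functions are hypothetical:

#include <mpi.h>

void handle_tag_a(MPI_Request *req);   /* hypothetical dispatch targets */
void handle_tag_b(MPI_Request *req);

/* Test the pending receive without freeing it; if complete, dispatch the
 * still-valid MPI_Request based on the tag of the message that arrived. */
static void dispatch_if_ready(MPI_Request *req)
{
    int flag = 0;
    MPI_Status status;
    MPI_Request_get_status(*req, &flag, &status);
    if (!flag)
        return;
    if (status.MPI_TAG == 1)
        handle_tag_a(req);
    else
        handle_tag_b(req);
}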

Re: [OMPI users] Collective operations and synchronization

2009-03-25 Thread Shaun Jackman
On Tue, 2009-03-24 at 07:03 -0800, Eugene Loh wrote: I'm not sure I understand this suggestion, so I'll say it the way I understand it. Would it be possible for each process to send an "all done" message to each of its neighbors? Conversely, each process would poll its neighbors for messages,

Re: [OMPI users] Collective operations and synchronization

2009-03-23 Thread Shaun Jackman
Hi Ralph, Thanks for your response. My problem is removing all leaf nodes from a directed graph, which is distributed among a number of processes. Each process iterates over its portion of the graph, and if a node is a leaf (indegree(n) == 0 || outdegree(n) == 0), it removes the node (which i

[OMPI users] Collective operations and synchronization

2009-03-23 Thread Shaun Jackman
I've just read in the Open MPI documentation [1] that collective operations, such as MPI_Allreduce, may synchronize but are not guaranteed to. My algorithm requires a collective operation and synchronization; is there a better (more efficient?) method than simply calling MPI_Allreduce
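
A minimal sketch, assuming the per-pass structure described in the reply above (the graph pass itself is a placeholder): MPI_Allreduce on the number of leaves each rank removed serves as both the collective and the synchronization point between passes. Note it only synchronizes the termination test; it says nothing about point-to-point messages still in flight within a pass.

#include <mpi.h>

/* Placeholder: one pass over the local part of the graph, returning the
 * number of leaf nodes removed locally. */
static int remove_local_leaves(void) { return 0; }

static void erode(MPI_Comm comm)
{
    int global = 1;
    while (global > 0) {
        int local = remove_local_leaves();
        /* Every rank must contribute to and receive the sum, so no rank
         * starts the next pass before all ranks have reached this reduction. */
        MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_SUM, comm);
    }
}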