[OMPI devel] Connect/Accept and Disconnect

2010-12-21 Thread Suraj Prabhakaran

Hello,

This is basically a repost of my previous mail regarding problems with 
connect/accept and disconnect (**this is not related to spawning, 
parent/child**).
I *sometimes* find processes blocking indefinitely at Connect/Accept 
calls or at Disconnect calls. I have an example below.


Process A
{
MPI_Open_port(...);
MPI_Publish_name(...);
MPI_Comm_accept(... &b_comm);  // -> (1)
// Do something1
MPI_Comm_disconnect(&b_comm);  // --> (2)
// Do something2

}

Process B
{
MPI_Lookup_name(...);
MPI_Comm_connect(... &a_comm); // -> (1)
// Do something1
MPI_Comm_disconnect(&a_comm); // --> (2)
// Do something2
}

In the above scenario, in the good case where A reaches (1) without any 
problems, **sometimes** B blocks at its (1) indefinitely. All arguments 
passed to both functions are correct.
Again, **sometimes** one of them blocks indefinitely at (2) while the other 
goes on to do something2. This could be an application-level problem only 
if the process that blocks were always the same one, but it is not. 
Sometimes A blocks while B is busy doing something2, and sometimes A is 
busy doing its something2 while B blocks.


Is this a known issue? Or am I the only person experiencing this, and does 
it work cleanly for others who frequently use connect/accept/disconnect calls?


Thanks,
Suraj


Re: [OMPI devel] Connect/Accept and Disconnect

2010-12-21 Thread Ralph Castain
Are you using ompi-server for pub/sub, or just letting it default to mpirun?

You might want to output the return value from lookup_name and publish_name to 
see if they match. If they are different, then you will definitely hang.


On Dec 21, 2010, at 6:41 AM, Suraj Prabhakaran wrote:

> Hello,
> 
> This is basically a repost of my previous mail regarding problems with 
> connect/accept and disconnect (*this is not related to spawning, 
> parent/child*). 
> I *sometimes* find processes blocking indefinitely at Connect/Accept calls or 
> at Disconnect calls. I have an example below.
> 
> Process A
> {
> MPI_Open_port(...);
> MPI_Publish_name(...);
> MPI_Comm_accept(... &b_comm);  // -> (1)
> // Do something1
> MPI_Comm_disconnect(&b_comm);  // --> (2)
> // Do something2
> 
> }
> 
> Process B
> {
> MPI_Lookup_name(...);
> MPI_Comm_connect(... &a_comm); // -> (1)
> // Do something1
> MPI_Comm_disconnect(&a_comm); // --> (2)
> // Do something2
> }
> 
> In the above scenario, in the good case where A reaches (1) without any 
> problems, *sometimes* B blocks at its (1) indefinitely. All arguments passed 
> to both functions are correct.
> Again, *sometimes* one of them blocks indefinitely at (2) while the other goes 
> on to do something2. This could be an application-level problem only if the 
> process that blocks were always the same one, but it is not. Sometimes A 
> blocks while B is busy doing something2, and sometimes A is busy doing its 
> something2 while B blocks. 
> 
> Is this a known issue? Or am I the only person experiencing this, and does it 
> work cleanly for others who frequently use connect/accept/disconnect calls?
> 
> Thanks,
> Suraj
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] Connect/Accept and Disconnect

2010-12-21 Thread Suraj Prabhakaran


On 12/21/2010 03:12 PM, Ralph Castain wrote:
> Are you using ompi-server for pub/sub, or just letting it default to 
> mpirun?
> 
> You might want to output the return value from lookup_name and 
> publish_name to see if they match. If they are different, then you 
> will definitely hang.


I used ompi-server. I did print the ports and names and they all match!




[OMPI devel] openib btl_openib_async_thread poll question

2010-12-21 Thread Terry Dontje
We're doing some testing with openib btl on a system with Solaris.  It 
looks like Solaris can return POLLIN|POLLRDNORM in revents from a poll 
call.  I looked at the manpages for Linux and it reads like Linux could 
possibly do this too.  However the code in btl_openib_async_thread that 
checks for valid revents is only checking for POLLIN and in the case it 
gets POLLIN|POLLRDNORM the btl ends up throwing an error.  I think 
erroring out on the POLLIN|POLLRDNORM case is a bug.


Does anyone feel otherwise and can explain to me why we should not 
consider POLLIN|POLLRDNORM as a valid condition?  I have the same 
question pertaining to POLLRDBAND too but I don't believe we've seen 
this set.
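
For illustration only, here is a minimal sketch of a revents check that accepts POLLIN|POLLRDNORM. It is not the actual btl_openib_async_thread code; the enum and the function name classify_revents are made up for this example, and treating POLLHUP as an error is an assumed policy.

```c
#define _XOPEN_SOURCE 700   /* makes POLLRDNORM visible on strict compilers */
#include <poll.h>

/* Hypothetical result codes for this sketch. */
enum revents_class {
    REVENTS_NONE,        /* nothing reported */
    REVENTS_READABLE,    /* only "normal data readable" bits are set */
    REVENTS_ERROR,       /* POLLERR, POLLHUP or POLLNVAL is set */
    REVENTS_UNEXPECTED   /* some other bit is set */
};

/* Classify revents with bit tests instead of `revents == POLLIN`, so that
 * POLLIN|POLLRDNORM counts as readable rather than as an error. */
enum revents_class classify_revents(short revents)
{
    const int error_bits = POLLERR | POLLHUP | POLLNVAL;
    const int read_bits  = POLLIN | POLLRDNORM;

    if (revents == 0) {
        return REVENTS_NONE;
    }
    if (revents & error_bits) {
        return REVENTS_ERROR;
    }
    if (revents & ~read_bits) {
        return REVENTS_UNEXPECTED;
    }
    return REVENTS_READABLE;
}
```

With this check, classify_revents(POLLIN | POLLRDNORM) is reported as readable, while a strict equality test against POLLIN would reject it. Whether POLLRDBAND should also be added to read_bits is left open here, as in the question above.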


thanks,
--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 





Re: [OMPI devel] Connect/Accept and Disconnect

2010-12-21 Thread Ralph Castain
You could try configuring with --enable-debug and then setting -mca 
dpm_base_verbose 5 on the command line of the two jobs that are trying to 
connect. That should provide some useful debug info.


BTW: how did you configure OMPI?

On Dec 21, 2010, at 7:33 AM, Suraj Prabhakaran wrote:

> 
> On 12/21/2010 03:12 PM, Ralph Castain wrote:
>> Are you using ompi-server for pub/sub, or just letting it default to mpirun?
>> 
>> You might want to output the return value from lookup_name and publish_name 
>> to see if they match. If they are different, then you will definitely hang.
> 
> I used ompi-server. I did print the ports and names and they all match!
> 
> 




Re: [OMPI devel] openib btl_openib_async_thread poll question

2010-12-21 Thread Terry Dontje
After further inspection I saw that events is being set to POLLIN only.  
Is that supposed to mask out any other bits from being set (like 
POLLRDNORM)?


--td
On 12/21/2010 10:35 AM, Terry Dontje wrote:
> We're doing some testing with openib btl on a system with Solaris.  It 
> looks like Solaris can return POLLIN|POLLRDNORM in revents from a poll 
> call.  I looked at the manpages for Linux and it reads like Linux 
> could possibly do this too.  However the code in 
> btl_openib_async_thread that checks for valid revents is only checking 
> for POLLIN and in the case it gets POLLIN|POLLRDNORM the btl ends up 
> throwing an error.  I think erroring out on the POLLIN|POLLRDNORM case 
> is a bug.
> 
> Does anyone feel otherwise and can explain to me why we should not 
> consider POLLIN|POLLRDNORM as a valid condition?  I have the same 
> question pertaining to POLLRDBAND too but I don't believe we've seen 
> this set.
> 
> thanks,





[OMPI devel] Datatype question

2010-12-21 Thread Barrett, Brian W
All -

I'm trying to follow up on James Dinan's one-sided datatype errors e-mail and 
running into some datatype issues from when the datatype engine was moved to 
OPAL (sigh).  Accumulate needs to get at the underlying datatypes for a 
user-created datatype.  Before the ddt move, one just walked bdt_used and found 
the underlying type.  Now it appears that bdt_used refers to the opal types, 
not the ompi types.  And since there's not a one-to-one mapping between the 
two, I'm at a loss as to how one could find an MPI pre-defined datatype from the 
user-defined datatype.  Can someone point me in the right direction?

Brian

-- 
  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories





Re: [OMPI devel] Datatype question

2010-12-21 Thread George Bosilca
I have a big patch pending that will map all OMPI types, and therefore all ops, 
directly onto the OPAL ddt. The OMPI DDT part is complete, but I am having some 
trouble with the ops. At this point I'm looking into the .m4 files for help 
with mapping Fortran types directly to the POSIX types (int8_t and friends) 
instead of to C types.

I tried a different approach, where I added fields to ompi_datatype_t. 
Unfortunately, this breaks backward compatibility because of the fixed-size 
ompi_predefined_datatype_t structure. The problem is that adding fields to 
ompi_datatype_t makes that structure larger than 512 bytes, which forces us to 
change the size of ompi_predefined_datatype_t.

Anyway, back to your question. The MPI and OPAL datatypes use the same 
indexes for all the OPAL predefined types. Several MPI types map to the same 
underlying OPAL type, such as MPI_INT, MPI_INTEGER, and MPI_INT32_T. All MPI 
types not supported at the OPAL level get their indexes contiguously after the 
OPAL_DATATYPE_MAX_PREDEFINED upper bound (up to 
OMPI_DATATYPE_MPI_MAX_PREDEFINED). Moreover, the OPAL layer has been modified 
to support up to OPAL_DATATYPE_MAX_SUPPORTED datatypes, and this value should 
be modified based on the upper-level requirements (today it is set to 46, as 
this is the total number of MPI supported datatypes, including the Fortran 
ones). bdt_used is currently defined as a uint32_t, so obviously there is not 
enough room to hold all possible MPI datatypes.
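
A small sketch of why the 32-bit field runs out, assuming bdt_used carries one bit per datatype index (an assumption about its layout, inferred from the fact that it is "walked"). The helpers mask32_mark and mask64_mark are hypothetical and are not Open MPI functions. With 46 datatypes the highest index is 45, which has no bit in a uint32_t but fits in a uint64_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not Open MPI code: record "type `index` is used"
 * as one bit in a mask, refusing indexes the mask cannot represent. */
bool mask32_mark(uint32_t *mask, unsigned index)
{
    if (index >= 32) {
        return false;               /* no bit available for this index */
    }
    *mask |= UINT32_C(1) << index;
    return true;
}

bool mask64_mark(uint64_t *mask, unsigned index)
{
    if (index >= 64) {
        return false;               /* no bit available for this index */
    }
    *mask |= UINT64_C(1) << index;
    return true;
}
```

Marking index 45 fails with the 32-bit helper and succeeds with the 64-bit one, which is the whole difference between the two widths for this purpose.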

Solution 1: We can change bdt_used to uint64_t. This requires some work, 
and I would prefer to have some time to see exactly what all the implications are.

Solution 2: Quick and dirty, but not the fastest one. Instead of walking 
bdt_used you can walk the btypes array. If the count at an index is not zero, 
then the MPI datatype corresponding to that index is used.
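
Solution 2 could look roughly like the sketch below. Everything in it is a stand-in: struct sketch_datatype, SKETCH_MAX_TYPES and sketch_single_used_type are hypothetical names, the array length simply reuses the 46 mentioned above, and the sketch assumes the caller wants exactly one underlying predefined type.

```c
#include <stdint.h>

#define SKETCH_MAX_TYPES 46   /* assumption: the "46" quoted in this message */

/* Hypothetical stand-in for a datatype descriptor: btypes[i] counts how many
 * elements of predefined type index i the datatype contains. */
struct sketch_datatype {
    uint32_t btypes[SKETCH_MAX_TYPES];
};

/* Walk the btypes array. Returns the index of the only predefined type that
 * has a non-zero count, or -1 if there is none or more than one. */
int sketch_single_used_type(const struct sketch_datatype *dt)
{
    int found = -1;

    for (int i = 0; i < SKETCH_MAX_TYPES; i++) {
        if (dt->btypes[i] == 0) {
            continue;               /* type i is not used */
        }
        if (found != -1) {
            return -1;              /* a second type is in use */
        }
        found = i;
    }
    return found;
}
```

The walk touches every slot on each call, which matches the "not the fastest one" caveat; the returned index would still have to be mapped back to a predefined MPI datatype by the caller.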

  george.

On Dec 21, 2010, at 12:02 , Barrett, Brian W wrote:

> All -
> 
> I'm trying to follow up on James Dinan's one-sided datatype errors e-mail and 
> running into some datatype issues from when the datatype engine was moved to 
> OPAL (sigh).  Accumulate needs to get at the underlying datatypes for a 
> user-created datatype.  Before the ddt move, one just walked bdt_used and 
> found the underlying type.  Now it appears that bdt_used refers to the opal 
> types, not the ompi types.  And since there's not a one-to-one mapping 
> between the two, I'm at a loss as to how one could find an MPI pre-defined 
> datatype from the user-defined datatype.  Can someone point me in the right 
> direction?
> 
> Brian
> 
> -- 
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 




Re: [OMPI devel] openib btl_openib_async_thread poll question

2010-12-21 Thread Shamis, Pavel
According to the man page, only POLLIN or error bits may be returned in this 
specific case:

The bits returned in revents can include any of those specified in events, or 
one of the values POLLERR, POLLHUP, or POLLNVAL.  (These three bits are 
meaningless in the events field, and will be set in the revents field whenever 
the corresponding condition is true.)

Since POLLRDNORM was not specified in the event mask, it is an "unexpected" 
event that is handled as an error.
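
One way to check what a given platform actually reports is a small probe like the sketch below. The function names probe_pipe_revents and unexpected_bits are hypothetical, and the probe uses a pipe rather than the openib async fd, so it only shows how poll treats the events mask for that kind of descriptor.

```c
#define _XOPEN_SOURCE 700   /* makes POLLRDNORM visible on strict compilers */
#include <poll.h>
#include <unistd.h>

/* Make a pipe readable, poll its read end with the given events mask, and
 * store the reported revents. Returns 0 on success, -1 on any failure. */
int probe_pipe_revents(short events, short *revents_out)
{
    int fds[2];
    struct pollfd pfd;
    char byte = 'x';
    int rc;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (write(fds[1], &byte, 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pfd.fd = fds[0];
    pfd.events = events;
    pfd.revents = 0;
    rc = poll(&pfd, 1, 1000);   /* wait at most one second */

    close(fds[0]);
    close(fds[1]);
    if (rc != 1) {
        return -1;
    }
    *revents_out = pfd.revents;
    return 0;
}

/* Bits in revents that were neither requested in events nor one of the
 * three conditions that may always be reported. */
short unexpected_bits(short events, short revents)
{
    return (short)(revents & ~(events | POLLERR | POLLHUP | POLLNVAL));
}
```

Calling probe_pipe_revents(POLLIN, &re) and then unexpected_bits(POLLIN, re) shows whether the platform reports bits beyond the requested mask; a non-zero result would be the situation reported for Solaris.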


Regards,
Pasha

On Dec 21, 2010, at 11:11 AM, Terry Dontje wrote:

> After further inspection I saw that events is being set to POLLIN only.  Is 
> that supposed to mask out any other bits from being set (like POLLRDNORM)?
> 
> --td
> On 12/21/2010 10:35 AM, Terry Dontje wrote:
>> We're doing some testing with openib btl on a system with Solaris.  It looks 
>> like Solaris can return POLLIN|POLLRDNORM in revents from a poll call.  I 
>> looked at the manpages for Linux and it reads like Linux could possibly do 
>> this too.  However the code in btl_openib_async_thread that checks for valid 
>> revents is only checking for POLLIN and in the case it gets 
>> POLLIN|POLLRDNORM the btl ends up throwing an error.  I think erroring out 
>> on the POLLIN|POLLRDNORM case is a bug.
>> 
>> Does anyone feel otherwise and can explain to me why we should not consider 
>> POLLIN|POLLRDNORM as a valid condition?  I have the same question pertaining 
>> to POLLRDBAND too but I don't believe we've seen this set.
>> 
>> thanks,