Re: [OMPI devel] matching code rewrite in OB1

2007-12-18 Thread Gleb Natapov
On Mon, Dec 17, 2007 at 08:08:02PM -0500, Richard Graham wrote:
>   Needless to say (for the nth time :-) ) that changing this bit of code
>  makes me nervous.
I've noticed it already :)

>   However, it occurred to me that there is a much better way to test
>  this code than setting up an environment that generates some out-of-order
>  events without us being able to specify the order.
>   Since this routine is executed serially, it should be sufficient to set up
>  a test code that would simulate any out-of-order scenario we want.  If one
>  specifies the number of "messages" to be "sent", and "randomly" changes the
>  order they arrive (e.g. scrambles some input vector), one can check and see
>  if the messages are "received" in the correct order.  One could even "drop"
>  messages and see if matching stops.  Using a test code like this, and a
>  code coverage tool, one should be able to get much better testing than we
>  have to date.
While I sometimes do unit testing for code that I write, in this case it is
easy to generate all reasonable corner cases without isolating the code in a
separate unit. I ran this code through several specially constructed MPI
applications and checked code coverage with gcov. Here is the result:

File '/home/glebn/OpenMPI/ompi.stg/ompi/mca/pml/ob1/pml_ob1_recvfrag.c'
Lines executed:97.58% of 124

Only two lines of the code were never executed, both in error cases that
should cause an abort anyway.
The pml_ob1_recvfrag.c.gcov with the coverage results is attached for
whoever is curious enough to look at it. BTW, I doubt that the previous
code passed this level of testing. At least with gcov it is not possible
to generate meaningful results when most of the code is inside macros.


>   What would you think about doing something like this ?   Seems like a few
>  hours of this sort of simulation would be much better than even years of
>  testing and relying on random fluctuations in the run to thoroughly test
>  out-of-order scenarios.
> 
> What do you think ?
I think that the coverage testing I did is enough for this code.
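For reference, the kind of serial driver Rich describes above can be
sketched in a few lines. This is a hypothetical simulation of an in-order
matcher, not the real OB1 code; `deliver` and its state are invented for
illustration:

```python
import random

def deliver(arrival_order):
    """Feed fragments to a simple in-order matcher; return delivery order."""
    expected, pending, delivered = 0, set(), []
    for seq in arrival_order:
        pending.add(seq)
        # Drain every fragment that has become matchable.
        while expected in pending:
            pending.remove(expected)
            delivered.append(expected)
            expected += 1
    return delivered

n = 100
frags = list(range(n))
random.shuffle(frags)                    # scramble the input vector
assert deliver(frags) == list(range(n))  # everything "received" in order

# "Drop" a message and check that matching stops at the gap.
dropped = [s for s in frags if s != 42]
assert deliver(dropped) == list(range(42))
```

Running such a driver under a coverage tool, as suggested, would exercise the
out-of-order paths deterministically rather than relying on network timing.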

> Rich
> 
> 
> On 12/17/07 8:32 AM, "Gleb Natapov"  wrote:
> 
> > On Thu, Dec 13, 2007 at 08:04:21PM -0500, Richard Graham wrote:
> > > Yes, should be a bit more clear.  Need an independent way to verify that
> > > data is matched in the correct order - sending this information as
> > > payload is one way to do this.  So, sending unique data in every
> > > message, and making sure that it arrives in the user buffers in the
> > > expected order, is a way to do this.
> > 
> > Did that. I encoded a sequence number in the payload and sent many eager
> > packets from one rank to another. Many packets were reordered, but the
> > application received everything in the correct order.
> > 
> > --
> > Gleb.
> > 
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > 
> 
> 


--
Gleb.
[Attachment: pml_ob1_recvfrag.c.gcov, the gcov coverage report for
ompi/mca/pml/ob1/pml_ob1_recvfrag.c (8 runs, 1 program); truncated in the
archive.]

Re: [OMPI devel] matching code rewrite in OB1

2007-12-17 Thread Gleb Natapov
On Thu, Dec 13, 2007 at 08:04:21PM -0500, Richard Graham wrote:
> Yes, should be a bit more clear.  Need an independent way to verify that
> data is matched in the correct order - sending this information as payload
> is one way to do this.  So, sending unique data in every message, and
> making sure that it arrives in the user buffers in the expected order, is a
> way to do this.

Did that. I encoded a sequence number in the payload and sent many eager
packets from one rank to another. Many packets were reordered, but the
application received everything in the correct order.

--
Gleb.



Re: [OMPI devel] matching code rewrite in OB1

2007-12-14 Thread Gleb Natapov
On Fri, Dec 14, 2007 at 06:53:55AM -0500, Richard Graham wrote:
> If you have positive confirmation that such things have happened, this will
> go a long way.
I instrumented the code to log all kinds of info about fragment reordering
while I chased a bug in openib that caused the matching logic to malfunction.
Any non-trivial application that uses the OpenIB BTL will have reordered
fragments. (I wish this were not the case, but I don't have a solution yet.)

> I will not trust the code until this has also been done with
> multiple independent network paths. 
I ran IMB over IP and IB simultaneously on more than 80 ranks.

>  I very rarely express such strong
> opinions, even if I don't agree with what is being done, but this is the
> core of correct MPI functionality, and first hand experience has shown that
I agree that this is indeed a very important piece of code, but it certainly
is not more important than the datatype engine, for instance (and it is much
easier to test all corner cases in the matching logic than in the datatype
engine, IMHO). And even if the matching code works perfectly but other parts
of OB1 are buggy, Open MPI will not work properly, so why is this code
chosen to be a sacred cow?

> just thinking through the logic, I can miss some of the race conditions.
That is of course correct, but the more people look at the code the better,
isn't it?

> The code here has been running for 8+ years in two production MPI's running
> on very large clusters, so I am very reluctant to make changes for what
Are you sure about this? I see a number of changes to this code during
Open MPI development, and the current SVN unfortunately does not hold the
full history of this code. Here is the list of commits that I found; some
of them change the code logic quite a bit:
r6770,r7342,r8339,r8352,r8353,r8356,r8946,r11874,r12323,r12582

> seems to amount to people's taste - maintenance is not an issue in this
> case.  Had this not been such a key bit of code, I would not even bat an
Why do you think that maintenance is not an issue? It is for me. Otherwise I
wouldn't even look at this part of the code. All those macros prevent the
use of a debugger, for instance.

(And I see a small latency improvement too :))

> eye.  I suppose if you can go through some formal verification, this would
> also be good - actually better than hoping that one will hit out-of-order
> situations.
> 
> Rich
> 
> 
> On 12/14/07 2:20 AM, "Gleb Natapov"  wrote:
> 
> > On Thu, Dec 13, 2007 at 06:16:49PM -0500, Richard Graham wrote:
> >> The situation that needs to be triggered, just as George has mentioned, is
> >> where we have a lot of unexpected messages, to make sure that when one that
> >> we can match against comes in, all the unexpected messages that can be
> >> matched with pre-posted receives are matched.  Since we attempt to match
> >> only when a new fragment comes in, we need to make sure that we don't leave
> >> other unexpected messages that can be matched in the unexpected queue, as
> >> these (if the out of order scenario is just right) would block any new
> >> matches from occurring.
> >> 
> >> For example:  Say the next expect message is 25
> >> 
> >> Unexpected message queue has:  26 28 29 ..
> >> 
> >> If 25 comes in, and is handled, if 26 is not pulled off the unexpected
> >> message queue, when 27 comes in it won't be able to be matched, as 26 is
> >> sitting in the unexpected queue, and will never be looked at again ...
> > This situation is triggered constantly with the openib BTL. The OpenIB BTL
> > has two ways to receive a packet: over a send queue or over an eager RDMA
> > path. The receiver polls both of them and may reorder packets locally.
> > Actually, currently there is a bug in the openib BTL where one channel may
> > starve the other at the receiver, so if a match fragment with the next
> > sequence number is in the starved path, tens of thousands of fragments can
> > be reordered. The test case attached to ticket #1158 triggers this case,
> > and my patch handles all reordered packets.
> > 
> > And, by the way, the code is much simpler now and can be reviewed easily ;)
> > 
> > --
> > Gleb.
> 

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-14 Thread Richard Graham
If you have positive confirmation that such things have happened, this will
go a long way.  I will not trust the code until this has also been done with
multiple independent network paths.  I very rarely express such strong
opinions, even if I don't agree with what is being done, but this is the
core of correct MPI functionality, and first hand experience has shown that
just thinking through the logic, I can miss some of the race conditions.
The code here has been running for 8+ years in two production MPI's running
on very large clusters, so I am very reluctant to make changes for what
seems to amount to people's taste - maintenance is not an issue in this
case.  Had this not been such a key bit of code, I would not even bat an
eye.  I suppose if you can go through some formal verification, this would
also be good - actually better than hoping that one will hit out-of-order
situations.

Rich


On 12/14/07 2:20 AM, "Gleb Natapov"  wrote:

> On Thu, Dec 13, 2007 at 06:16:49PM -0500, Richard Graham wrote:
>> The situation that needs to be triggered, just as George has mentioned, is
>> where we have a lot of unexpected messages, to make sure that when one that
>> we can match against comes in, all the unexpected messages that can be
>> matched with pre-posted receives are matched.  Since we attempt to match
>> only when a new fragment comes in, we need to make sure that we don't leave
>> other unexpected messages that can be matched in the unexpected queue, as
>> these (if the out of order scenario is just right) would block any new
>> matches from occurring.
>> 
>> For example:  Say the next expect message is 25
>> 
>> Unexpected message queue has:  26 28 29 ..
>> 
>> If 25 comes in, and is handled, if 26 is not pulled off the unexpected
>> message queue, when 27 comes in it won't be able to be matched, as 26 is
>> sitting in the unexpected queue, and will never be looked at again ...
> This situation is triggered constantly with the openib BTL. The OpenIB BTL
> has two ways to receive a packet: over a send queue or over an eager RDMA
> path. The receiver polls both of them and may reorder packets locally.
> Actually, currently there is a bug in the openib BTL where one channel may
> starve the other at the receiver, so if a match fragment with the next
> sequence number is in the starved path, tens of thousands of fragments can
> be reordered. The test case attached to ticket #1158 triggers this case,
> and my patch handles all reordered packets.
> 
> And, by the way, the code is much simpler now and can be reviewed easily ;)
> 
> --
> Gleb.



Re: [OMPI devel] matching code rewrite in OB1

2007-12-13 Thread Richard Graham
Yes, should be a bit more clear.  Need an independent way to verify that
data is matched in the correct order - sending this information as payload
is one way to do this.  So, sending unique data in every message, and
making sure that it arrives in the user buffers in the expected order, is a
way to do this.

Rich


On 12/12/07 5:04 PM, "Jeff Squyres"  wrote:

> Was Rich referring to ensuring that the test codes checked that their
> payloads were correct (and not re-assembled in the wrong order)?
> 
> 
> On Dec 12, 2007, at 4:10 PM, Brian W. Barrett wrote:
> 
> > On Wed, 12 Dec 2007, Gleb Natapov wrote:
> >
> > > On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:
> > > > This is better than nothing, but really not very helpful for looking
> > > > at the specific issues that can arise with this, unless these systems
> > > > have several parallel networks, with tests that will generate a lot of
> > > > parallel network traffic, and be able to self-check for out-of-order
> > > > receives - i.e. this needs to be encoded into the payload for
> > > > verification purposes.  There are some out-of-order scenarios that
> > > > need to be generated and checked.  I think that George may have a
> > > > system that will be good for this sort of testing.
> > >
> > > I am running various tests with multiple networks right now. I use
> > > several IB BTLs and the TCP BTL simultaneously. I see many reordered
> > > messages and all tests were OK till now, but they don't encode the
> > > message sequence in the payload as far as I know. I'll change one of
> > > them to do so.
> >
> > Other than Rich's comment that we need sequence numbers, why add them?
> > We haven't had them for non-matching packets for the last 3 years in
> > Open MPI (ie, forever), and I can't see why we would need them.  Yes, we
> > need sequence numbers for match headers to make sure MPI ordering is
> > correct.  But for the rest of the payload, there's no need with OMPI's
> > datatype engine.  It's just more payload for no gain.
> >
> > Brian
> 
> 
> --
> Jeff Squyres
> Cisco Systems
> 




Re: [OMPI devel] matching code rewrite in OB1

2007-12-13 Thread Richard Graham
The situation that needs to be triggered, just as George has mentioned, is
where we have a lot of unexpected messages, to make sure that when one that
we can match against comes in, all the unexpected messages that can be
matched with pre-posted receives are matched.  Since we attempt to match
only when a new fragment comes in, we need to make sure that we don't leave
other unexpected messages that can be matched in the unexpected queue, as
these (if the out of order scenario is just right) would block any new
matches from occurring.

For example:  Say the next expect message is 25

Unexpected message queue has:  26 28 29 ..

If 25 comes in, and is handled, if 26 is not pulled off the unexpected
message queue, when 27 comes in it won't be able to be matched, as 26 is
sitting in the unexpected queue, and will never be looked at again ...
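The drain step this example calls for can be sketched as follows. This is an
illustrative Python model, not the actual OB1 C code; `on_frag_arrival` and
its `state` dict are invented names:

```python
def on_frag_arrival(seq, state):
    """state = {'next': expected seq, 'unexpected': set of early arrivals}."""
    if seq != state['next']:
        state['unexpected'].add(seq)   # out of order: park it for later
        return []
    matched = [seq]
    state['next'] += 1
    # The crucial drain: keep pulling now-matchable fragments off the
    # unexpected queue, or a later in-order arrival will be blocked forever.
    while state['next'] in state['unexpected']:
        state['unexpected'].remove(state['next'])
        matched.append(state['next'])
        state['next'] += 1
    return matched

state = {'next': 25, 'unexpected': {26, 28, 29}}
print(on_frag_arrival(25, state))  # [25, 26]: 26 is pulled off along with 25
print(on_frag_arrival(27, state))  # [27, 28, 29]: 27 unblocks the rest
```

Without the `while` loop, 26 would stay parked after 25 is handled, and 27
could never match, exactly the failure mode described above.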

Rich


On 12/13/07 2:09 PM, "George Bosilca"  wrote:

> Rich was referring to the fact that the reordering of fragments other
> than the matching ones is irrelevant to Gleb's change. In order to
> trigger the changes we need to force a lot of small unexpected
> messages over multiple networks. The testing environment should have
> multiple similar networks (to make sure the matching fragment is
> distributed evenly across them), and the test should generate a lot of
> unexpected messages. I think the flood test is a good base for this.
> 
>Thanks,
>  george.
> 
> 
> On Dec 12, 2007, at 5:04 PM, Jeff Squyres wrote:
> 
>> Was Rich referring to ensuring that the test codes checked that their
>> payloads were correct (and not re-assembled in the wrong order)?
>> 
>> 
>> On Dec 12, 2007, at 4:10 PM, Brian W. Barrett wrote:
>> 
>>> On Wed, 12 Dec 2007, Gleb Natapov wrote:
>>> 
>>>> On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:
>>>>> This is better than nothing, but really not very helpful for looking
>>>>> at the specific issues that can arise with this, unless these systems
>>>>> have several parallel networks, with tests that will generate a lot
>>>>> of parallel network traffic, and be able to self-check for
>>>>> out-of-order receives - i.e. this needs to be encoded into the
>>>>> payload for verification purposes.  There are some out-of-order
>>>>> scenarios that need to be generated and checked.  I think that George
>>>>> may have a system that will be good for this sort of testing.
>>>> 
>>>> I am running various tests with multiple networks right now. I use
>>>> several IB BTLs and the TCP BTL simultaneously. I see many reordered
>>>> messages and all tests were OK till now, but they don't encode the
>>>> message sequence in the payload as far as I know. I'll change one of
>>>> them to do so.
>>> 
>>> Other than Rich's comment that we need sequence numbers, why add
>>> them?  We
>>> haven't had them for non-matching packets for the last 3 years in
>>> Open MPI
>>> (ie, forever), and I can't see why we would need them.  Yes, we need
>>> sequence numbers for match headers to make sure MPI ordering is
>>> correct.
>>> But for the rest of the payload, there's no need with OMPI's datatype
>>> engine.  It's just more payload for no gain.
>>> 
>>> Brian
>> 
>> 
>> -- 
>> Jeff Squyres
>> Cisco Systems
> 



Re: [OMPI devel] matching code rewrite in OB1

2007-12-13 Thread George Bosilca
Rich was referring to the fact that the reordering of fragments other  
than the matching ones is irrelevant to Gleb's change. In order to
trigger the changes we need to force a lot of small unexpected  
messages over multiple networks. The testing environment should have  
multiple similar networks (to make sure the matching fragment is  
distributed evenly across them), and the test should generate a lot of  
unexpected messages. I think the flood test is a good base for this.


  Thanks,
george.


On Dec 12, 2007, at 5:04 PM, Jeff Squyres wrote:


Was Rich referring to ensuring that the test codes checked that their
payloads were correct (and not re-assembled in the wrong order)?


On Dec 12, 2007, at 4:10 PM, Brian W. Barrett wrote:


On Wed, 12 Dec 2007, Gleb Natapov wrote:


On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:

This is better than nothing, but really not very helpful for
looking at the
specific issues that can arise with this, unless these systems
have several
parallel networks, with tests that will generate a lot of parallel
network
traffic, and be able to self-check for out-of-order receives -
i.e. this
needs to be encoded into the payload for verification purposes.
There are
some out-of-order scenarios that need to be generated and
checked.  I think
that George may have a system that will be good for this sort of
testing.


I am running various tests with multiple networks right now. I use
several IB BTLs and TCP BTL simultaneously. I see many reordered
messages and all tests were OK till now, but they don't encode
message sequence in a payload as far as I know. I'll change one of
them to do so.


Other than Rich's comment that we need sequence numbers, why add
them?  We
haven't had them for non-matching packets for the last 3 years in
Open MPI
(ie, forever), and I can't see why we would need them.  Yes, we need
sequence numbers for match headers to make sure MPI ordering is
correct.
But for the rest of the payload, there's no need with OMPI's datatype
engine.  It's just more payload for no gain.

Brian



--
Jeff Squyres
Cisco Systems






Re: [OMPI devel] matching code rewrite in OB1

2007-12-13 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:10:10PM -0600, Brian W. Barrett wrote:
> On Wed, 12 Dec 2007, Gleb Natapov wrote:
> 
> > On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:
> >> This is better than nothing, but really not very helpful for looking at the
> >> specific issues that can arise with this, unless these systems have several
> >> parallel networks, with tests that will generate a lot of parallel network
> >> traffic, and be able to self-check for out-of-order receives - i.e. this
> >> needs to be encoded into the payload for verification purposes.  There are
> >> some out-of-order scenarios that need to be generated and checked.  I think
> >> that George may have a system that will be good for this sort of testing.
> >>
> > I am running various tests with multiple networks right now. I use
> > several IB BTLs and TCP BTL simultaneously. I see many reordered
> > messages and all tests were OK till now, but they don't encode
> > message sequence in a payload as far as I know. I'll change one of
> > them to do so.
> 
> Other than Rich's comment that we need sequence numbers, why add them?  We 
> haven't had them for non-matching packets for the last 3 years in Open MPI 
> (ie, forever), and I can't see why we would need them.  Yes, we need 
> sequence numbers for match headers to make sure MPI ordering is correct. 
> But for the rest of the payload, there's no need with OMPI's datatype 
> engine.  It's just more payload for no gain.
> 
As I understand it, what Rich proposes is that we need to construct a
special test that checks, at the application layer, that the matching
engine did its job right. In other words, the test should check that the
payload received is the correct one. He is not talking about adding
additional fields to the OB1 header.
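Such an application-layer check can be modeled like this. The sketch below
is a hypothetical simulation, not MPI code: `send_eager`, `network_reorder`,
and `receive_in_order` are invented stand-ins for the sender, the reordering
network paths, and the matching layer:

```python
import random

def send_eager(n):
    """Produce n eager packets, each carrying its sequence number as payload."""
    return [{'seq': i, 'payload': i.to_bytes(4, 'big')} for i in range(n)]

def network_reorder(packets):
    """Model reordering across parallel network paths."""
    pkts = packets[:]
    random.shuffle(pkts)
    return pkts

def receive_in_order(packets):
    """The matching layer delivers by seq; return payloads in delivery order."""
    by_seq = {p['seq']: p['payload'] for p in packets}
    return [by_seq[i] for i in sorted(by_seq)]

sent = send_eager(1000)
received = receive_in_order(network_reorder(sent))
# The application-layer verification: each payload decodes to the expected seq.
assert all(int.from_bytes(pl, 'big') == i for i, pl in enumerate(received))
```

The point is that the check lives entirely in the payload contents, so it
verifies the matching engine end to end without touching the OB1 header.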

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres

Tarballs available at:

http://www.open-mpi.org/~jsquyres/unofficial/



On Dec 12, 2007, at 4:08 PM, Jeff Squyres (jsquyres) wrote:

Heh, ok.  I'll make a tarball against your patch later.  It's against
the trunk?


-jms
Sent from my PDA

 -Original Message-
From:   Gleb Natapov [mailto:gl...@voltaire.com]
Sent:   Wednesday, December 12, 2007 03:54 PM Eastern Standard Time
To: Open MPI Developers
Subject:Re: [OMPI devel] matching code rewrite in OB1

On Wed, Dec 12, 2007 at 03:52:17PM -0500, Jeff Squyres wrote:
> On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote:
>
> >> How about making a tarball with this patch in it that can be  
thrown

> >> at
> >> everyone's MTT? (we can put the tarball on www.open-mpi.org
> >> somewhere)
> > I don't have access to www.open-mpi.org, but I can send you the  
patch.

> > I can send you a tarball too, but I prefer to not abuse email.
>
> Do you have access to staging.openfabrics.org?  I could download it
> from there and put it on www.open-mpi.org.
>
No. I don't :(

--
Gleb.



--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres
Was Rich referring to ensuring that the test codes checked that their  
payloads were correct (and not re-assembled in the wrong order)?



On Dec 12, 2007, at 4:10 PM, Brian W. Barrett wrote:


On Wed, 12 Dec 2007, Gleb Natapov wrote:


On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:
This is better than nothing, but really not very helpful for  
looking at the
specific issues that can arise with this, unless these systems  
have several
parallel networks, with tests that will generate a lot of parallel  
network
traffic, and be able to self-check for out-of-order receives -
i.e. this
needs to be encoded into the payload for verification purposes.   
There are
some out-of-order scenarios that need to be generated and  
checked.  I think
that George may have a system that will be good for this sort of  
testing.



I am running various tests with multiple networks right now. I use
several IB BTLs and TCP BTL simultaneously. I see many reordered
messages and all tests were OK till now, but they don't encode
message sequence in a payload as far as I know. I'll change one of
them to do so.


Other than Rich's comment that we need sequence numbers, why add  
them?  We
haven't had them for non-matching packets for the last 3 years in  
Open MPI

(ie, forever), and I can't see why we would need them.  Yes, we need
sequence numbers for match headers to make sure MPI ordering is  
correct.

But for the rest of the payload, there's no need with OMPI's datatype
engine.  It's just more payload for no gain.

Brian



--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Brian W. Barrett

On Wed, 12 Dec 2007, Gleb Natapov wrote:


On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:

This is better than nothing, but really not very helpful for looking at the
specific issues that can arise with this, unless these systems have several
parallel networks, with tests that will generate a lot of parallel network
traffic, and be able to self-check for out-of-order receives - i.e. this
needs to be encoded into the payload for verification purposes.  There are
some out-of-order scenarios that need to be generated and checked.  I think
that George may have a system that will be good for this sort of testing.


I am running various tests with multiple networks right now. I use
several IB BTLs and TCP BTL simultaneously. I see many reordered
messages and all tests were OK till now, but they don't encode
message sequence in a payload as far as I know. I'll change one of
them to do so.


Other than Rich's comment that we need sequence numbers, why add them?  We 
haven't had them for non-matching packets for the last 3 years in Open MPI 
(ie, forever), and I can't see why we would need them.  Yes, we need 
sequence numbers for match headers to make sure MPI ordering is correct. 
But for the rest of the payload, there's no need with OMPI's datatype 
engine.  It's just more payload for no gain.


Brian


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres (jsquyres)
Heh, ok.  I'll make a tarball against your patch later.  It's against the trunk?

-jms
Sent from my PDA

 -Original Message-
From:   Gleb Natapov [mailto:gl...@voltaire.com]
Sent:   Wednesday, December 12, 2007 03:54 PM Eastern Standard Time
To: Open MPI Developers
Subject:Re: [OMPI devel] matching code rewrite in OB1

On Wed, Dec 12, 2007 at 03:52:17PM -0500, Jeff Squyres wrote:
> On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote:
> 
> >> How about making a tarball with this patch in it that can be thrown  
> >> at
> >> everyone's MTT? (we can put the tarball on www.open-mpi.org  
> >> somewhere)
> > I don't have access to www.open-mpi.org, but I can send you the patch.
> > I can send you a tarball too, but I prefer to not abuse email.
> 
> Do you have access to staging.openfabrics.org?  I could download it  
> from there and put it on www.open-mpi.org.
> 
No. I don't :(

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 03:52:17PM -0500, Jeff Squyres wrote:
> On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote:
> 
> >> How about making a tarball with this patch in it that can be thrown  
> >> at
> >> everyone's MTT? (we can put the tarball on www.open-mpi.org  
> >> somewhere)
> > I don't have access to www.open-mpi.org, but I can send you the patch.
> > I can send you a tarball too, but I prefer to not abuse email.
> 
> Do you have access to staging.openfabrics.org?  I could download it  
> from there and put it on www.open-mpi.org.
> 
No. I don't :(

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres

On Dec 12, 2007, at 3:20 PM, Gleb Natapov wrote:

How about making a tarball with this patch in it that can be thrown  
at
everyone's MTT? (we can put the tarball on www.open-mpi.org  
somewhere)

I don't have access to www.open-mpi.org, but I can send you the patch.
I can send you a tarball too, but I prefer to not abuse email.


Do you have access to staging.openfabrics.org?  I could download it  
from there and put it on www.open-mpi.org.


--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Gleb Natapov
On Wed, Dec 12, 2007 at 11:57:11AM -0500, Jeff Squyres wrote:
> Gleb --
> 
> How about making a tarball with this patch in it that can be thrown at  
> everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere)
I don't have access to www.open-mpi.org, but I can send you the patch.
I can send you a tarball too, but I prefer to not abuse email.

> 
> 
> On Dec 11, 2007, at 4:14 PM, Richard Graham wrote:
> 
> > I will re-iterate my concern.  The code that is there now is mostly  
> > nine
> > years old (with some mods made when it was brought over to Open  
> > MPI).  It
> > took about 2 months of testing on systems with 5-13 way network  
> > parallelism
> > to track down all KNOWN race conditions.  This code is at the center  
> > of MPI
> > correctness, so I am VERY concerned about changing it w/o some very  
> > strong
> > reasons.  Not opposed, just very cautious.
> >
> > Rich
> >
> >
> > On 12/11/07 11:47 AM, "Gleb Natapov"  wrote:
> >
> >> On Tue, Dec 11, 2007 at 08:36:42AM -0800, Andrew Friedley wrote:
> >>> Possibly, though I have results from a benchmark I've written  
> >>> indicating
> >>> the reordering happens at the sender.  I believe I found it was  
> >>> due to
> >>> the QP striping trick I use to get more bandwidth -- if you back  
> >>> down to
> >>> one QP (there's a define in the code you can change), the reordering
> >>> rate drops.
> >> Ah, OK. My assumption was just from looking into code, so I may be
> >> wrong.
> >>
> >>>
> >>> Also I do not make any recursive calls to progress -- at least not
> >>> directly in the BTL; I can't speak for the upper layers.  The  
> >>> reason I
> >>> do many completions at once is that it is a big help in turning  
> >>> around
> >>> receive buffers, making it harder to run out of buffers and drop  
> >>> frags.
> >>>  I want to say there was some performance benefit as well but I  
> >>> can't
> >>> say for sure.
> >> Currently upper layers of Open MPI may call BTL progress function
> >> recursively. I hope this will change some day.
> >>
> >>>
> >>> Andrew
> >>>
> >>> Gleb Natapov wrote:
>  On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
> > Try UD, frags are reordered at a very high rate so should be a  
> > good test.
>  Good idea, I'll try this. BTW I think the reason for such a high rate of
>  reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions (500)
>  and processes them one by one, and if the progress function is called
>  recursively the next 500 completions will be reordered versus the previous
>  completions (reordering happens on the receiver, not the sender).
> 
> > Andrew
> >
> > Richard Graham wrote:
> >> Gleb,
> >>  I would suggest that before this is checked in this be tested  
> >> on a
> >> system
> >> that has N-way network parallelism, where N is as large as you  
> >> can find.
> >> This is a key bit of code for MPI correctness, and out-of-order  
> >> operations
> >> will break it, so you want to maximize the chance for such  
> >> operations.
> >>
> >> Rich
> >>
> >>
> >> On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:
> >>
> >>> Hi,
> >>>
> >>>   I did a rewrite of matching code in OB1. I made it much  
> >>> simpler and 2
> >>> times smaller (which is good, less code - less bugs). I also  
> >>> got rid
> >>> of huge macros - very helpful if you need to debug something.  
> >>> There
> >>> is no performance degradation, actually I even see very small  
> >>> performance
> >>> improvement. I ran MTT with this patch and the result is the  
> >>> same as on
> >>> trunk. I would like to commit this to the trunk. The patch is  
> >>> attached
> >>> for everybody to try.
> >>>
> >>> --
> >>> Gleb.
> >>> ___
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
>  --
>  Gleb.
> >>
> >> --
> >> Gleb.
> >
> > 

Re: [OMPI devel] matching code rewrite in OB1

2007-12-12 Thread Jeff Squyres

Gleb --

How about making a tarball with this patch in it that can be thrown at  
everyone's MTT? (we can put the tarball on www.open-mpi.org somewhere)



On Dec 11, 2007, at 4:14 PM, Richard Graham wrote:

I will re-iterate my concern.  The code that is there now is mostly nine
years old (with some mods made when it was brought over to Open MPI).  It
took about 2 months of testing on systems with 5-13 way network parallelism
to track down all KNOWN race conditions.  This code is at the center of MPI
correctness, so I am VERY concerned about changing it w/o some very strong
reasons.  Not opposed, just very cautious.

Rich


On 12/11/07 11:47 AM, "Gleb Natapov"  wrote:


On Tue, Dec 11, 2007 at 08:36:42AM -0800, Andrew Friedley wrote:
Possibly, though I have results from a benchmark I've written indicating
the reordering happens at the sender.  I believe I found it was due to
the QP striping trick I use to get more bandwidth -- if you back down to
one QP (there's a define in the code you can change), the reordering
rate drops.

Ah, OK. My assumption was just from looking into code, so I may be
wrong.



Also I do not make any recursive calls to progress -- at least not
directly in the BTL; I can't speak for the upper layers.  The reason I
do many completions at once is that it is a big help in turning around
receive buffers, making it harder to run out of buffers and drop frags.
I want to say there was some performance benefit as well but I can't
say for sure.

Currently upper layers of Open MPI may call BTL progress function
recursively. I hope this will change some day.



Andrew

Gleb Natapov wrote:

On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
Try UD, frags are reordered at a very high rate so should be a  
good test.
Good idea, I'll try this. BTW I think the reason for such a high rate of
reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions
(500) and processes them one by one; if the progress function is called
recursively, the next 500 completions will be reordered relative to previous
completions (reordering happens on the receiver, not the sender).


Andrew

Richard Graham wrote:

Gleb,
  I would suggest that before this is checked in this be tested on a
system that has N-way network parallelism, where N is as large as you
can find.  This is a key bit of code for MPI correctness, and
out-of-order operations will break it, so you want to maximize the
chance for such operations.


Rich


On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:


Hi,

  I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even see very small performance
improvement. I ran MTT with this patch and the result is the same as on
trunk. I would like to commit this to the trunk. The patch is attached
for everybody to try.

--
Gleb.


--
Gleb.


--
Gleb.





--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Richard Graham
I will re-iterate my concern.  The code that is there now is mostly nine
years old (with some mods made when it was brought over to Open MPI).  It
took about 2 months of testing on systems with 5-13 way network parallelism
to track down all KNOWN race conditions.  This code is at the center of MPI
correctness, so I am VERY concerned about changing it w/o some very strong
reasons.  Not opposed, just very cautious.

Rich


On 12/11/07 11:47 AM, "Gleb Natapov"  wrote:

> On Tue, Dec 11, 2007 at 08:36:42AM -0800, Andrew Friedley wrote:
>> Possibly, though I have results from a benchmark I've written indicating
>> the reordering happens at the sender.  I believe I found it was due to
>> the QP striping trick I use to get more bandwidth -- if you back down to
>> one QP (there's a define in the code you can change), the reordering
>> rate drops.
> Ah, OK. My assumption was just from looking into code, so I may be
> wrong.
> 
>> 
>> Also I do not make any recursive calls to progress -- at least not
>> directly in the BTL; I can't speak for the upper layers.  The reason I
>> do many completions at once is that it is a big help in turning around
>> receive buffers, making it harder to run out of buffers and drop frags.
>>   I want to say there was some performance benefit as well but I can't
>> say for sure.
> Currently upper layers of Open MPI may call BTL progress function
> recursively. I hope this will change some day.
> 
>> 
>> Andrew
>> 
>> Gleb Natapov wrote:
>>> On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
 Try UD, frags are reordered at a very high rate so should be a good test.
>>> Good idea, I'll try this. BTW I think the reason for such a high rate of
>>> reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions
>>> (500) and processes them one by one; if the progress function is called
>>> recursively, the next 500 completions will be reordered relative to previous
>>> completions (reordering happens on the receiver, not the sender).
>>> 
 Andrew
 
 Richard Graham wrote:
> Gleb,
>   I would suggest that before this is checked in this be tested on a
> system
> that has N-way network parallelism, where N is as large as you can find.
> This is a key bit of code for MPI correctness, and out-of-order operations
> will break it, so you want to maximize the chance for such operations.
> 
> Rich
> 
> 
> On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:
> 
>> Hi,
>> 
>>I did a rewrite of matching code in OB1. I made it much simpler and 2
>> times smaller (which is good, less code - less bugs). I also got rid
>> of huge macros - very helpful if you need to debug something. There
>> is no performance degradation, actually I even see very small performance
>> improvement. I ran MTT with this patch and the result is the same as on
>> trunk. I would like to commit this to the trunk. The patch is attached
>> for everybody to try.
>> 
>> --
>> Gleb.
>>> 
>>> --
>>> Gleb.
> 
> --
> Gleb.



Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Gleb Natapov
On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:
> Try UD, frags are reordered at a very high rate so should be a good test.
mpi-ping works fine with UD BTL and the patch.

> 
> Andrew
> 
> Richard Graham wrote:
> > Gleb,
> >   I would suggest that before this is checked in this be tested on a system
> > that has N-way network parallelism, where N is as large as you can find.
> > This is a key bit of code for MPI correctness, and out-of-order operations
> > will break it, so you want to maximize the chance for such operations.
> > 
> > Rich
> > 
> > 
> > On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:
> > 
> >> Hi,
> >>
> >>I did a rewrite of matching code in OB1. I made it much simpler and 2
> >> times smaller (which is good, less code - less bugs). I also got rid
> >> of huge macros - very helpful if you need to debug something. There
> >> is no performance degradation, actually I even see very small performance
> >> improvement. I ran MTT with this patch and the result is the same as on
> >> trunk. I would like to commit this to the trunk. The patch is attached
> >> for everybody to try.

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Andrew Friedley
Possibly, though I have results from a benchmark I've written indicating 
the reordering happens at the sender.  I believe I found it was due to 
the QP striping trick I use to get more bandwidth -- if you back down to 
one QP (there's a define in the code you can change), the reordering 
rate drops.


Also I do not make any recursive calls to progress -- at least not 
directly in the BTL; I can't speak for the upper layers.  The reason I 
do many completions at once is that it is a big help in turning around 
receive buffers, making it harder to run out of buffers and drop frags. 
 I want to say there was some performance benefit as well but I can't 
say for sure.


Andrew

Gleb Natapov wrote:

On Tue, Dec 11, 2007 at 08:03:52AM -0800, Andrew Friedley wrote:

Try UD, frags are reordered at a very high rate so should be a good test.

Good idea, I'll try this. BTW I think the reason for such a high rate of
reordering in UD is that it polls for MCA_BTL_UD_NUM_WC completions
(500) and processes them one by one; if the progress function is called
recursively, the next 500 completions will be reordered relative to previous
completions (reordering happens on the receiver, not the sender).


Andrew

Richard Graham wrote:

Gleb,
  I would suggest that before this is checked in this be tested on a system
that has N-way network parallelism, where N is as large as you can find.
This is a key bit of code for MPI correctness, and out-of-order operations
will break it, so you want to maximize the chance for such operations.

Rich


On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:


Hi,

   I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even see very small performance
improvement. I ran MTT with this patch and the result is the same as on
trunk. I would like to commit this to the trunk. The patch is attached
for everybody to try.

--
Gleb.


--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Andrew Friedley

Try UD, frags are reordered at a very high rate so should be a good test.

Andrew

Richard Graham wrote:

Gleb,
  I would suggest that before this is checked in this be tested on a system
that has N-way network parallelism, where N is as large as you can find.
This is a key bit of code for MPI correctness, and out-of-order operations
will break it, so you want to maximize the chance for such operations.

Rich


On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:


Hi,

   I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even see very small performance
improvement. I ran MTT with this patch and the result is the same as on
trunk. I would like to commit this to the trunk. The patch is attached
for everybody to try.

--
Gleb.


Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Richard Graham
Gleb,
  I would suggest that before this is checked in this be tested on a system
that has N-way network parallelism, where N is as large as you can find.
This is a key bit of code for MPI correctness, and out-of-order operations
will break it, so you want to maximize the chance for such operations.

Rich


On 12/11/07 10:54 AM, "Gleb Natapov"  wrote:

> Hi,
> 
>I did a rewrite of matching code in OB1. I made it much simpler and 2
> times smaller (which is good, less code - less bugs). I also got rid
> of huge macros - very helpful if you need to debug something. There
> is no performance degradation, actually I even see very small performance
> improvement. I ran MTT with this patch and the result is the same as on
> trunk. I would like to commit this to the trunk. The patch is attached
> for everybody to try.
> 
> --
> Gleb.



Re: [OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Brian W. Barrett

On Tue, 11 Dec 2007, Gleb Natapov wrote:


  I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even see very small performance
improvement. I ran MTT with this patch and the result is the same as on
trunk. I would like to commit this to the trunk. The patch is attached
for everybody to try.


I don't think we can live without those macros :).  Out of curiosity, is 
there any functionality that was removed as a result of this change?


I'll test on a couple systems over the next couple of days...

Brian


[OMPI devel] matching code rewrite in OB1

2007-12-11 Thread Gleb Natapov
Hi,

   I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even see very small performance
improvement. I ran MTT with this patch and the result is the same as on
trunk. I would like to commit this to the trunk. The patch is attached
for everybody to try.

--
Gleb.
diff --git a/ompi/mca/pml/ob1/pml_ob1_recvfrag.c b/ompi/mca/pml/ob1/pml_ob1_recvfrag.c
index d3f7c37..299ae9e 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvfrag.c
+++ b/ompi/mca/pml/ob1/pml_ob1_recvfrag.c
@@ -184,244 +184,159 @@ void mca_pml_ob1_recv_frag_callback( mca_btl_base_module_t* btl,
 }
 }

-/**
- * Try and match the incoming message fragment to a generic
- * list of receives
- *
- * @param hdr Matching data from received fragment (IN)
- *
- * @param generic_receives Pointer to the receive list used for
- * matching purposes. (IN)
- *
- * @return Matched receive
- *
- * This routine assumes that the appropriate matching locks are
- * set by the upper level routine.
- */
-#define MCA_PML_OB1_MATCH_GENERIC_RECEIVES(hdr,generic_receives,proc,return_match) \
-do {   \
-/* local variables */  \
-mca_pml_ob1_recv_request_t *generic_recv = (mca_pml_ob1_recv_request_t *)  \
- opal_list_get_first(generic_receives);\
-mca_pml_ob1_recv_request_t *last_recv = (mca_pml_ob1_recv_request_t *) \
-opal_list_get_end(generic_receives);   \
-register int recv_tag, frag_tag = hdr->hdr_tag;\
-   \
-/* Loop over the receives. If the received tag is less than zero  */   \
-/* enter in a special mode, where we match only our internal tags */   \
-/* (such as those used by the collectives.*/   \
-if( 0 <= frag_tag ) {  \
-for( ; generic_recv != last_recv;  \
- generic_recv = (mca_pml_ob1_recv_request_t *) \
- ((opal_list_item_t *)generic_recv)->opal_list_next) { \
-/* Check for a match */\
-recv_tag = generic_recv->req_recv.req_base.req_tag;\
-if ( (frag_tag == recv_tag) || (recv_tag == OMPI_ANY_TAG) ) {  \
-break; \
-}  \
-}  \
-} else {   \
-for( ; generic_recv != last_recv;  \
- generic_recv = (mca_pml_ob1_recv_request_t *) \
- ((opal_list_item_t *)generic_recv)->opal_list_next) { \
-/* Check for a match */\
-recv_tag = generic_recv->req_recv.req_base.req_tag;\
-if( OPAL_UNLIKELY(frag_tag == recv_tag) ) {\
-break; \
-}  \
-}  \
-}  \
-if( generic_recv != (mca_pml_ob1_recv_request_t *) \
-opal_list_get_end(generic_receives) ) {\
-   \
-/* Match made */   \
-return_match = generic_recv;   \
-   \
-/* remove descriptor from posted specific ireceive list */ \
-opal_list_remove_item(generic_receives,\
-  (opal_list_item_t *)generic_recv);   \
-PERUSE_TRACE_COMM_EVENT (PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q, \
- &(generic_recv->req_recv.req_base),   \
-