RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Kanevsky, Arkady
I am not clear on what you are proposing.
A transport-specific API?

The current proposal provides, on the sending side,
a single post and a single completion in the error-free case.
This commonality simplifies the ULP.

Arkady

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.   Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -Original Message-
> From: Larsen, Roy K [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 06, 2006 6:50 PM
> To: Kanevsky, Arkady; Caitlin Bestler; 
> [EMAIL PROTECTED]; Sean Hefty
> Cc: openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> 
> 
> >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
> >Sent: Monday, February 06, 2006 2:27 PM
> >
> >Roy,
> >comments inline.
> >
> 
> Mine too
> 
> >>
> >> >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
> >> >Roy,
> >> >Can you explain, please?
> >> >
> >> >For IB the operation will be layered properly on Transport
> primitive.
> >> >And on Recv side it will indicate in completion event DTO that it 
> >> >matches RDMA Write with Immediate and that Immediate Data is
> >> in event.
> >> >
> >> >For iWARP I expect initially, it will be layered on RDMA
> >> Write followed
> >> >by Send. The Provider can do post more efficiently than 
> Consumer and 
> >> >guarantee atomicity.
> >> >On Recv side Consumer will get Recv DTO completion in event and 
> >> >Immediate Data inline as specified by Provider Attribute.
> >> >
> >> >From the performance point of view Consumers who program 
> to IB only 
> >> >will have no performance degradation at all. But this API
> >> also allows
> >> >Consumers to write ULP to be transport independent with minimal
> >> >penalty: one binary comparison and extra 4 bytes in recv buffer.
> >>
> >> If the application could be written transport 
> independently, I would 
> >> have no objection at all.  Instead, it must be written in a 
> >> transport-adaptive way and to be able to adapt to all possible 
> >> implementations, the application could not send arbitrary 
> >> "immediate"-sized data as messages because there is no way to 
> >> distinguish between them on the receiving side.  That is 
> HUGE!  It is 
> >> my experience that send/receive is generally used for 
> small messages 
> >> and to take away particular message sizes or to depend on 
> the so the 
> >> application can "adapt" to whatever the immediate size is for a 
> >> particular transport, if even needed, is a very weak facility to 
> >> offer.
> >
> >But the remote side does posts Recv. Since it anticipate 
> that this Recv 
> >will be matched against the RDMA Write with immediate it 
> posts the recv 
> >buffer which fits. Yes, there is an issue for 
> Transport-independent ULP 
> >that it does needs a buffer.
> >For IB it is possible to post 0-size buffer. But if this is the case 
> >Recv end Consumer DOES know that it will be macthed against 
> RDMA Write 
> >so ULP DOES know what it will be matched against.
> >So in the worst case Consumer does have to pay the price of creating 
> >LMR to handle 4 byte buffer to match RDMA Write Immediate data.
> 
> I think you missed my larger point.  The point was that the 
> application must be written in such a way that it could 
> inferred when immediate data arrived for a variety of 
> immediate data sizes and that places a constraint on the 
> application wrt to data it may want to send/receive normally. 
> Where as, if the application embraced the fact that it was 
> responsible for sending a message to indicate a write 
> completion, it is free to send whatever amount of data best 
> met its needs.
> 
> Transports that support true immediate data do not require 
> the ULP to perform buffer matching.  They can post a series 
> of receive buffers that may or may not indicate immediate 
> data.  The ULP does not have to know ahead of time when 
> immediate data will arrive **against other data receives**.  
> The fact that an IB oriented application never needs to back 
> a receive request with a buffer if they were only used to 
> indicate immediate data is orthogonal.
> 
> >
> >>
> >> It also affects interface resource allocation.  Send queue 
> sizes will 

RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Larsen, Roy K


>From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
>Sent: Monday, February 06, 2006 2:27 PM
>
>Roy,
>comments inline.
>

Mine too

>>
>> >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
>> >Roy,
>> >Can you explain, please?
>> >
>> >For IB the operation will be layered properly on Transport
primitive.
>> >And on Recv side it will indicate in completion event DTO that it
>> >matches RDMA Write with Immediate and that Immediate Data is
>> in event.
>> >
>> >For iWARP I expect initially, it will be layered on RDMA
>> Write followed
>> >by Send. The Provider can do post more efficiently than Consumer and
>> >guarantee atomicity.
>> >On Recv side Consumer will get Recv DTO completion in event and
>> >Immediate Data inline as specified by Provider Attribute.
>> >
>> >From the performance point of view Consumers who program to IB only
>> >will have no performance degradation at all. But this API
>> also allows
>> >Consumers to write ULP to be transport independent with minimal
>> >penalty: one binary comparison and extra 4 bytes in recv buffer.
>>
>> If the application could be written transport independently,
>> I would have no objection at all.  Instead, it must be
>> written in a transport-adaptive way and to be able to adapt
>> to all possible implementations, the application could not
>> send arbitrary "immediate"-sized data as messages because
>> there is no way to distinguish between them on the receiving
>> side.  That is HUGE!  It is my experience that send/receive
>> is generally used for small messages and to take away
>> particular message sizes or to depend on the so the
>> application can "adapt" to whatever the immediate size is for
>> a particular transport, if even needed, is a very weak
>> facility to offer.
>
>But the remote side does posts Recv. Since it anticipate that
>this Recv will be matched against the RDMA Write with immediate
>it posts the recv buffer which fits. Yes, there is an issue
>for Transport-independent ULP that it does needs a buffer.
>For IB it is possible to post 0-size buffer. But if this is the case
>Recv end Consumer DOES know that it will be macthed against RDMA
>Write so ULP DOES know what it will be matched against.
>So in the worst case Consumer does have to pay the price of creating
>LMR to handle 4 byte buffer to match RDMA Write Immediate data.

I think you missed my larger point.  The point was that the application
must be written in such a way that it can infer when immediate data has
arrived, for a variety of immediate data sizes, and that places a
constraint on the data the application may want to send/receive
normally.  Whereas, if the application embraced the fact that it was
responsible for sending a message to indicate a write completion, it
would be free to send whatever amount of data best met its needs.

Transports that support true immediate data do not require the ULP to
perform buffer matching.  The ULP can post a series of receive buffers
whose completions may or may not indicate immediate data.  The ULP does
not have to know ahead of time when immediate data will arrive **against
other data receives**.  The fact that an IB-oriented application never
needs to back a receive request with a buffer if it is used only to
indicate immediate data is orthogonal.

>
>>
>> It also affects interface resource allocation.  Send queue
>> sizes will have to adapt to possibly twice there size.
>>
>
>That is correct. We argued about it at the meeting.
>One alternative is to have EP and EVD attr. But this will not
>be efficient since it will double the queue size where
>a smaller increment is possible due to the depth of the RDMA Write
>pipeline outstanding.
>
>> It just dawned on me that the immediate data must be in
>> registered memory to be sent in a message.  This means the
>> API must be amended to pass an LMR or, even worse, the
>> provider would have to register memory in the speed path or
>> create and manipulate its own queue of "immediate"
>> data buffers/LMRs.  Of course, LMRs are not needed and an
>> overhead for transports that provide true immediate data.
>
>No registration on the speed path. It is Consumer responsibility
>to provide Recv Buffer of the right size.
>Yes for IB only ULP this can be avoided.
>But ULP can be written to the proposed API to take full
>advantage of IB performance but that code will not be transport
>independent.

I was referring to the sending side.  The source data of a message send
must come from registered memory.  For transports that emulate this
service with a write/send sequence, the user-specified immediate data
will need to be copied into a provider-managed pool of "immediate" data
buffers/LMRs, or the interface must be changed to specify an LMR.
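
To make the provider-managed pool concrete, here is a minimal sketch in C. It assumes the provider registers one small region at setup time and hands out 4-byte slots from it, so nothing is registered on the speed path. The names immed_pool, immed_pool_get and immed_pool_put are hypothetical and are not part of the DAT API; the registration step itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMMED_POOL_SLOTS 4   /* illustrative; a provider would size this to the send queue */

/* Hypothetical pool of 4-byte "immediate" slots. The slot array is imagined
 * to live inside a single memory region registered once at setup. */
struct immed_pool {
    uint32_t      slot[IMMED_POOL_SLOTS];
    unsigned char in_use[IMMED_POOL_SLOTS];
};

/* Copy the user's immediate value into a free slot.
 * Returns the slot index, or -1 if every slot is still in flight. */
int immed_pool_get(struct immed_pool *p, uint32_t immed)
{
    int i;

    assert(p != NULL);
    for (i = 0; i < IMMED_POOL_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            p->slot[i] = immed;   /* this copy is the emulation overhead */
            return i;
        }
    }
    return -1;
}

/* Release a slot once the emulating Send has completed. */
void immed_pool_put(struct immed_pool *p, int idx)
{
    assert(p != NULL && idx >= 0 && idx < IMMED_POOL_SLOTS && p->in_use[idx]);
    p->in_use[idx] = 0;
}
```

The copy and the slot bookkeeping are work that a transport with true immediate data would not have to do.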

>
>But this API allows to write transport independent code
>albeit with certain price attached.
>
>>
>> Oh, and another thing.  InfiniBand indicates the size of the
>> RDMA write in the receive completion.  That is something that
>> will have to be addressed in a "transport independent" way or
>>

RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Kanevsky, Arkady
Roy,
comments inline.

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.   Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -Original Message-
> From: Larsen, Roy K [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 06, 2006 4:25 PM
> To: Kanevsky, Arkady; Caitlin Bestler; 
> [EMAIL PROTECTED]; Sean Hefty
> Cc: openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> 
> 
> >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
> >Roy,
> >Can you explain, please?
> >
> >For IB the operation will be layered properly on Transport primitive.
> >And on Recv side it will indicate in completion event DTO that it 
> >matches RDMA Write with Immediate and that Immediate Data is 
> in event.
> >
> >For iWARP I expect initially, it will be layered on RDMA 
> Write followed 
> >by Send. The Provider can do post more efficiently than Consumer and 
> >guarantee atomicity.
> >On Recv side Consumer will get Recv DTO completion in event and 
> >Immediate Data inline as specified by Provider Attribute.
> >
> >From the performance point of view Consumers who program to IB only 
> >will have no performance degradation at all. But this API 
> also allows 
> >Consumers to write ULP to be transport independent with minimal 
> >penalty: one binary comparison and extra 4 bytes in recv buffer.
> 
> If the application could be written transport independently, 
> I would have no objection at all.  Instead, it must be 
> written in a transport-adaptive way and to be able to adapt 
> to all possible implementations, the application could not 
> send arbitrary "immediate"-sized data as messages because 
> there is no way to distinguish between them on the receiving 
> side.  That is HUGE!  It is my experience that send/receive 
> is generally used for small messages and to take away 
> particular message sizes or to depend on the so the 
> application can "adapt" to whatever the immediate size is for 
> a particular transport, if even needed, is a very weak 
> facility to offer.

But the remote side does post a Recv. Since it anticipates that
this Recv will be matched against the RDMA Write with Immediate,
it posts a recv buffer that fits. Yes, there is an issue
for a transport-independent ULP: it does need a buffer.
For IB it is possible to post a 0-size buffer. But if this is the case,
the Recv-end Consumer DOES know that it will be matched against an RDMA
Write, so the ULP DOES know what it will be matched against.
So in the worst case the Consumer has to pay the price of creating
an LMR to handle a 4-byte buffer to match the RDMA Write Immediate data.

> 
> It also affects interface resource allocation.  Send queue 
> sizes will have to adapt to possibly twice there size.
> 

That is correct. We argued about it at the meeting.
One alternative is to have EP and EVD attributes. But this will not
be efficient, since it will double the queue size where
a smaller increment is possible given the depth of the outstanding
RDMA Write pipeline.
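
As a worked example of the sizing argument, here is a small sketch in C. It assumes that each emulated RDMA Write with Immediate occupies two send queue entries (the RDMA Write plus the Send), and that the Consumer can bound how many such operations are outstanding at once. The function name and parameters are hypothetical, not part of any EP or EVD attribute in the spec.

```c
#include <assert.h>
#include <stddef.h>

/* Send queue entries needed when each emulated RDMA Write with Immediate
 * takes one extra entry for its trailing Send.
 *
 *   consumer_depth              entries the Consumer asked for
 *   max_write_immed_outstanding assumed bound on in-flight emulated operations
 */
size_t sq_entries_needed(size_t consumer_depth, size_t max_write_immed_outstanding)
{
    size_t extra = max_write_immed_outstanding;

    assert(consumer_depth <= (size_t)-1 / 2);   /* guard against overflow */
    if (extra > consumer_depth)
        extra = consumer_depth;   /* worst case: every request is emulated */
    return consumer_depth + extra;
}
```

With a requested depth of 64, a bound of 8 outstanding emulated operations gives 72 entries, while the unbounded worst case gives the doubled size of 128.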

> It just dawned on me that the immediate data must be in 
> registered memory to be sent in a message.  This means the 
> API must be amended to pass an LMR or, even worse, the 
> provider would have to register memory in the speed path or 
> create and manipulate its own queue of "immediate"
> data buffers/LMRs.  Of course, LMRs are not needed and an 
> overhead for transports that provide true immediate data.

No registration on the speed path. It is the Consumer's responsibility
to provide a Recv buffer of the right size.
Yes, for an IB-only ULP this can be avoided.
A ULP can be written to the proposed API to take full
advantage of IB performance, but that code will not be transport
independent.

But this API does allow transport-independent code to be written,
albeit with a certain price attached.

> 
> Oh, and another thing.  InfiniBand indicates the size of the 
> RDMA write in the receive completion.  That is something that 
> will have to be addressed in a "transport independent" way or 
> dropped as part of the service.

Good point. I will augment Spec accordingly.

> 
> The bottom line here is that it is NOT transport independent. 

The implementation is not transport independent.
But the API allows writing a transport-specific ULP with full performance
as well as a transport-independent ULP with better performance
than without the proposed API, and with a "minimal" performance
penalty for Transports that provide it.

> 
> Now, the atomicity argument between write and send has some 
> credibility.
> If an application chooses to "adapt" to an

RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Kanevsky, Arkady
Good point.
I will add this to the requirements and augment the necessary
transfered_length text.
Arkady

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.   Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -Original Message-
> From: Davis, Arlin R [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 06, 2006 4:17 PM
> To: Kanevsky, Arkady; Sean Hefty
> Cc: [EMAIL PROTECTED]; openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> I just want to get consensus on the requirements before we 
> get too far.
> One thing I forgot is that with Infiniband, the receive with 
> immediate provides the size of the rdma write that just 
> completed. I think we should include this in the requirements 
> since there is ULP value here.
> 
> -arlin
> 
> >-Original Message-
> >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
> >Sent: Monday, February 06, 2006 11:08 AM
> >To: Kanevsky, Arkady; Davis, Arlin R; Sean Hefty
> >Cc: [EMAIL PROTECTED]; openib-general@openib.org
> >Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0
> immediatedataproposal
> >
> >Arlin,
> >It is too strong to state that Consumer should never send a message 
> >equal in size to the size of immediate data.
> >Consumer knows from the context which one it is.
> >it may be based on dedicated connection, or based on ULP protocol 
> >ordering.
> >Arkady
> >
> >Arkady Kanevsky   email: [EMAIL PROTECTED]
> >Network Appliance Inc.   phone: 781-768-5395
> >1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
> >Waltham, MA 02451   central phone: 781-768-5300
> >
> >
> >> -Original Message-----
> >> From: Kanevsky, Arkady
> >> Sent: Monday, February 06, 2006 2:05 PM
> >> To: Davis, Arlin R; Sean Hefty
> >> Cc: [EMAIL PROTECTED]; openib-general@openib.org
> >> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> >> immediatedataproposal
> >>
> >> Arlin,
> >> On Friday we agreed that receiver can not distinguish between
> >> 4 byte of Send or 4 bytes of Immediate data if RDMA Write 
> with Immed 
> >> is implemented as 2 operations:
> >> RDMA Write followed by Send.
> >>
> >> ULP Reciever "expects" Immediate data that is why it posts Recv. 
> >> Depending on Transport capability it MAY complete as Recv or as 
> >> Recv_RDMA_Write_with_Immed_in_event.
> >>
> >> Neither Provider not Consumer can distinguish between the cases 
> >> unless there is additional info.
> >>
> >> Arkady
> >>
> >> Arkady Kanevsky   email: [EMAIL PROTECTED]
> >> Network Appliance Inc.   phone: 781-768-5395
> >> 1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
> >> Waltham, MA 02451   central phone: 781-768-5300
> >>
> >>
> >> > -Original Message-
> >> > From: Davis, Arlin R [mailto:[EMAIL PROTECTED]
> >> > Sent: Monday, February 06, 2006 1:25 PM
> >> > To: Kanevsky, Arkady; Sean Hefty
> >> > Cc: [EMAIL PROTECTED]; openib-general@openib.org
> >> > Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> >> > immediate dataproposal
> >> >
> >> >
> >> > Arkady,
> >> >
> >> > Your requirements are slightly different then the 
> proposed set of 
> >> > requirements.
> >> >
> >> > "iii) DAPL Provider does not provide any identification
> >> that that the
> >> > Receive operation matches remote RDMA Write with Immediate
> >> data if it
> >> > completes as Receive DTO.
> >> >
> >> >  - It is up to an ULP to separate Receive completion of remote
> >> > Send from remote RDMA Write with   Immediate Data."
> >> >
> >> > Tell me how this is possible? How can the application 
> distinguish 
> >> > between a 4 byte message and a 4 byte immediate data
> >> message? We would
> >> > have to add a new requirement... "If the provider supports
> >> immediate
> >> > data in the payload the ULP cannot send a message equal to the 
> >> > immediate data size".
> >> >
> >> > -arlin
> >> >

RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Larsen, Roy K


>From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
>Roy,
>Can you explain, please?
>
>For IB the operation will be layered properly on Transport primitive.
>And on Recv side it will indicate in completion event DTO
>that it matches RDMA Write with Immediate and that Immediate Data
>is in event.
>
>For iWARP I expect initially, it will be layered on RDMA Write
>followed by Send. The Provider can do post more efficiently
>than Consumer and guarantee atomicity.
>On Recv side Consumer will get Recv DTO completion in event
>and Immediate Data inline as specified by Provider Attribute.
>
>From the performance point of view Consumers who program to IB
>only will have no performance degradation at all. But this API also
>allows Consumers to write ULP to be transport independent
>with minimal penalty: one binary comparison and extra 4 bytes in recv
>buffer.

If the application could be written transport independently, I would
have no objection at all.  Instead, it must be written in a
transport-adaptive way, and to be able to adapt to all possible
implementations, the application could not send arbitrary
"immediate"-sized data as messages, because there is no way to
distinguish between them on the receiving side.  That is HUGE!  It is my
experience that send/receive is generally used for small messages, and to
take away particular message sizes, or to depend on the application
being able to "adapt" to whatever the immediate size is for a particular
transport, if that is even needed, is a very weak facility to offer.

It also affects interface resource allocation.  Send queue sizes will
have to adapt to possibly twice their size.

It just dawned on me that the immediate data must be in registered
memory to be sent in a message.  This means the API must be amended to
pass an LMR or, even worse, the provider would have to register memory
in the speed path or create and manipulate its own queue of "immediate"
data buffers/LMRs.  Of course, LMRs are not needed, and are pure
overhead, for transports that provide true immediate data.

Oh, and another thing.  InfiniBand indicates the size of the RDMA write
in the receive completion.  That is something that will have to be
addressed in a "transport independent" way or dropped as part of the
service.

The bottom line here is that it is NOT transport independent. 

Now, the atomicity argument between write and send has some credibility.
If an application chooses to "adapt" to an explicit write/send semantic
for write completion notification in environments that can't provide it
natively, this could be addressed by a generalized combined request API
that can guarantee thread-based atomicity to the send queue.  This seems
much more straightforward to me since, in essence, to adapt to
non-native immediate data services, they would have to allocate
resources and behave in virtually the same way as if they did write/send
explicitly. 

It is obvious that the proposed service is not one of immediate data in
the sense defined by InfiniBand.  Since true immediate data is a
transport specific speed path service, it needs to be implemented as a
transport specific extension.  To allow an application to initiate
multiple request sequences that must be queued sequentially to
explicitly create a write completion notification or any other
order-based sequence, a generalized combined request API should be
defined.
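
A generalized combined request call could look roughly like the sketch below, written in C. All names here (work_request, send_queue, sq_post_combined) are hypothetical and do not come from the DAT specification. The sketch shows only the all-or-nothing, contiguous queuing; a real provider would also hold the send queue lock for the duration of the call so that no other thread's request can land between the combined entries.

```c
#include <assert.h>
#include <stddef.h>

enum wr_op { WR_RDMA_WRITE, WR_SEND };

/* Hypothetical work request; real ones would carry buffers, keys and so on. */
struct work_request {
    enum wr_op    op;
    unsigned long id;
};

#define SQ_DEPTH 8   /* illustrative queue depth */

/* Hypothetical send queue kept as a ring buffer. */
struct send_queue {
    struct work_request slots[SQ_DEPTH];
    size_t head;    /* index of the oldest queued request */
    size_t count;   /* number of queued requests */
};

/* Queue all n requests back to back, or none of them.
 * Returns 0 on success, -1 if the queue lacks room for the whole sequence.
 * Locking is elided: a provider would take the queue lock around this body. */
int sq_post_combined(struct send_queue *sq, const struct work_request *wrs, size_t n)
{
    size_t i;

    assert(sq != NULL && wrs != NULL);
    if (n > SQ_DEPTH - sq->count)
        return -1;                        /* reject without queuing a partial sequence */
    for (i = 0; i < n; i++) {
        size_t tail = (sq->head + sq->count) % SQ_DEPTH;
        sq->slots[tail] = wrs[i];
        sq->count++;
    }
    return 0;
}
```

A write completion notification would then be one call with a two-element array, an RDMA Write followed by a Send, and the same call would serve any other order-based sequence.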

>
>Arkady Kanevsky   email: [EMAIL PROTECTED]
>Network Appliance Inc.   phone: 781-768-5395
>1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
>Waltham, MA 02451   central phone: 781-768-5300
>
>
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote:
> I just want to get consensus on the requirements before we get too
> far. One thing I forgot is that with Infiniband, the receive with
> immediate provides the size of the rdma write that just
> completed. I think we should include this in the requirements
> since there is ULP value here.
> 
> -arlin
> 
That *could* be done; it would be an eight-byte message
over iWARP: 4 bytes for the length and 4 for the message tag.
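
A minimal sketch of such an eight-byte message, in C. The field order (length first, then tag) and the big-endian encoding are assumptions made for illustration; nothing in this thread fixes either, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOTIFY_MSG_SIZE 8   /* 4 bytes of write length + 4 bytes of tag */

/* Store a 32-bit value in big-endian (network) byte order. */
static void put_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Load a 32-bit big-endian value. */
static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Build the message the emulating Send would carry. */
void notify_pack(unsigned char out[NOTIFY_MSG_SIZE], uint32_t write_len, uint32_t tag)
{
    assert(out != NULL);
    put_be32(out, write_len);
    put_be32(out + 4, tag);
}

/* Parse a received message. Returns 0 on success, or -1 if the received
 * size is not the expected eight bytes. */
int notify_unpack(const unsigned char *msg, size_t msg_len,
                  uint32_t *write_len, uint32_t *tag)
{
    assert(msg != NULL && write_len != NULL && tag != NULL);
    if (msg_len != NOTIFY_MSG_SIZE)
        return -1;
    *write_len = get_be32(msg);
    *tag = get_be32(msg + 4);
    return 0;
}
```

The receiver still cannot tell this apart from an ordinary eight-byte Send by looking at the transport alone; the size check only rejects messages that cannot be a notification.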




RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Davis, Arlin R
I just want to get consensus on the requirements before we get too far.
One thing I forgot is that with InfiniBand, the receive with immediate
provides the size of the RDMA Write that just completed. I think we
should include this in the requirements since there is ULP value here.

-arlin

>-Original Message-
>From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
>Sent: Monday, February 06, 2006 11:08 AM
>To: Kanevsky, Arkady; Davis, Arlin R; Sean Hefty
>Cc: [EMAIL PROTECTED]; openib-general@openib.org
>Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0
immediatedataproposal
>
>Arlin,
>It is too strong to state that Consumer should never send a message
>equal in size to the size of immediate data.
>Consumer knows from the context which one it is.
>it may be based on dedicated connection, or based on ULP protocol
>ordering.
>Arkady
>
>Arkady Kanevsky   email: [EMAIL PROTECTED]
>Network Appliance Inc.   phone: 781-768-5395
>1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
>Waltham, MA 02451   central phone: 781-768-5300
>
>
>> -Original Message-
>> From: Kanevsky, Arkady
>> Sent: Monday, February 06, 2006 2:05 PM
>> To: Davis, Arlin R; Sean Hefty
>> Cc: [EMAIL PROTECTED]; openib-general@openib.org
>> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0
>> immediatedataproposal
>>
>> Arlin,
>> On Friday we agreed that receiver can not distinguish between
>> 4 byte of Send or 4 bytes of Immediate data if RDMA Write
>> with Immed is implemented as 2 operations:
>> RDMA Write followed by Send.
>>
>> ULP Reciever "expects" Immediate data that is why it posts
>> Recv. Depending on Transport capability it MAY complete as
>> Recv or as Recv_RDMA_Write_with_Immed_in_event.
>>
>> Neither Provider not Consumer can distinguish between the
>> cases unless there is additional info.
>>
>> Arkady
>>
>> Arkady Kanevsky   email: [EMAIL PROTECTED]
>> Network Appliance Inc.   phone: 781-768-5395
>> 1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
>> Waltham, MA 02451   central phone: 781-768-5300
>>
>>
>> > -Original Message-
>> > From: Davis, Arlin R [mailto:[EMAIL PROTECTED]
>> > Sent: Monday, February 06, 2006 1:25 PM
>> > To: Kanevsky, Arkady; Sean Hefty
>> > Cc: [EMAIL PROTECTED]; openib-general@openib.org
>> > Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0
>> > immediate dataproposal
>> >
>> >
>> > Arkady,
>> >
>> > Your requirements are slightly different then the proposed set of
>> > requirements.
>> >
>> > "iii) DAPL Provider does not provide any identification
>> that that the
>> > Receive operation matches remote RDMA Write with Immediate
>> data if it
>> > completes as Receive DTO.
>> >
>> >- It is up to an ULP to separate Receive completion of remote
>> > Send from remote RDMA Write with Immediate Data."
>> >
>> > Tell me how this is possible? How can the application distinguish
>> > between a 4 byte message and a 4 byte immediate data
>> message? We would
>> > have to add a new requirement... "If the provider supports
>> immediate
>> > data in the payload the ULP cannot send a message equal to the
>> > immediate
>> > data size".
>> >
>> > -arlin
>> >
>> > >-Original Message-
>> > >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
>> > >Sent: Monday, February 06, 2006 8:08 AM
>> > >To: Sean Hefty; Davis, Arlin R
>> > >Cc: [EMAIL PROTECTED]; openib-general@openib.org
>> > >Subject: RE: [dat-discussions] [openib-general] [RFC] DAT
>> > 2.0 immediate
>> > dataproposal
>> > >
>> > >Here are the changes to the existing requirements chapters
>> for RDMA
>> > >Write with Immediate Data.
>> > >
>> > >Feedback please.
>> > >Arkady
>> > >
>> > >Arkady Kanevsky   email: [EMAIL PROTECTED]
>> > >Network Appliance Inc.   phone: 781-768-5395
>> > >1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
>> > >Waltham, MA 02451   central phone: 781-768-5300
>> > >
>> > >
>> > >> -Original Message-
>> > >> From: Sean Hefty [mailto:[EMAIL PROTECTED]
>> > >> Se

RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Kanevsky, Arkady
Roy,
Can you explain, please?

For IB the operation will be layered properly on the Transport primitive.
On the Recv side, the completion event DTO will indicate
that it matches an RDMA Write with Immediate and that the Immediate Data
is in the event.

For iWARP I expect that, initially, it will be layered on an RDMA Write
followed by a Send. The Provider can post these more efficiently
than the Consumer can, and can guarantee atomicity.
On the Recv side the Consumer will get a Recv DTO completion in the event
and the Immediate Data inline, as specified by a Provider Attribute.

From the performance point of view, Consumers who program to IB
only will have no performance degradation at all. But this API also
allows Consumers to write a ULP that is transport independent
with minimal penalty: one binary comparison and an extra 4 bytes in the
recv buffer.
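
The "one binary comparison" on the receive side can be sketched in C as follows. The completion kinds and the recv_event structure are hypothetical stand-ins for whatever the spec ends up defining, and the 4-byte immediate size is taken from this discussion rather than from a real Provider Attribute query.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical completion kinds; the names are not from the DAT spec. */
enum recv_kind {
    RECV_PLAIN,                /* ordinary Recv DTO completion */
    RECV_WRITE_IMMED_IN_EVENT  /* native RDMA Write with Immediate */
};

/* Hypothetical receive completion event. */
struct recv_event {
    enum recv_kind       kind;
    uint32_t             immed_in_event; /* valid only for RECV_WRITE_IMMED_IN_EVENT */
    const unsigned char *buf;            /* the posted receive buffer */
    size_t               transferred;    /* bytes placed in buf */
};

/* Fetch the immediate value from wherever the transport delivered it.
 * Returns 0 on success, or -1 if the completion cannot carry one. */
int get_immediate(const struct recv_event *ev, uint32_t *out)
{
    assert(ev != NULL && out != NULL);
    if (ev->kind == RECV_WRITE_IMMED_IN_EVENT) {   /* the one comparison */
        *out = ev->immed_in_event;
        return 0;
    }
    /* Emulated case: the value arrived inline in the extra 4 bytes.
     * A plain 4-byte Send looks identical here, so the ULP must know
     * from its own context that an immediate was expected. */
    if (ev->transferred == sizeof(uint32_t)) {
        memcpy(out, ev->buf, sizeof(uint32_t));
        return 0;
    }
    return -1;
}
```

The sketch copies the inline value in host byte order, which assumes both ends agree on a representation.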

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.   Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -Original Message-
> From: Larsen, Roy K [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 06, 2006 2:10 PM
> To: Caitlin Bestler; [EMAIL PROTECTED]; 
> Kanevsky, Arkady; Sean Hefty
> Cc: openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> If it is up to the ULP to separate out "normal" receive data 
> from that associated with a write immediate, how is this 
> different from the ULP doing a write followed by a send?  If 
> there is no difference, then what we're really talking about 
> is a convenience to the initiating ULP.
> 
> Perhaps what would be best is to construct an API that allows 
> the ULP to perform standard write/send operations into one 
> call which the underlying provider could optimize into one 
> transaction with the associated interconnect interface. 
> Better yet, a general request combining interface would have 
> even more value, but calling this write/send "immediate" data 
> is a stretch, if not downright silly.  Some transports have 
> true immediate data that provides unique value.  There is 
> nothing unique in a write/send sequence - ULPs do it all the time...
> 
> Roy
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Caitlin Bestler
> Sent: Monday, February 06, 2006 10:48 AM
> To: [EMAIL PROTECTED]; Kanevsky, Arkady; Sean Hefty
> Cc: openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> [EMAIL PROTECTED] wrote:
> > Arkady,
> > 
> > Your requirements are slightly different then the proposed set of 
> > requirements.
> > 
> > "iii) DAPL Provider does not provide any identification 
> that that the 
> > Receive operation matches remote RDMA Write with Immediate 
> data if it 
> > completes as Receive DTO.
> > 
> > - It is up to an ULP to separate Receive completion of remote
> > Send from remote RDMA Write with  Immediate Data."
> > 
> > Tell me how this is possible? How can the application distinguish 
> > between a 4 byte message and a 4 byte immediate data 
> message? We would 
> > have to add a new requirement... "If the provider supports 
> immediate 
> > data in the payload the ULP cannot send a message equal to the 
> > immediate data size".
> > 
> 
> The data sink knows whether the 4 bytes was sent as a message 
> or as an immediate because it is clear in the ULP context.
> Possible methods:
>   The expected completion is an immediate.
>   All 4 byte messages are immediates.
>   All 4 byte messages where the ms-byte is X are immediate.
>   If its Tuesday its an immediate.
>   If it's a prime number its an immediate
>   ...
> 
> But there is no clue from the transport layer.
> 


RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Caitlin Bestler
Larsen, Roy K wrote:
> If it is up to the ULP to separate out "normal" receive data
> from that associated with a write immediate, how is this
> different from the ULP doing a write followed by a send?  If
> there is no difference, then what we're really talking about
> is a convenience to the initiating ULP.
> 
> Perhaps what would be best is to construct an API that allows
> the ULP to perform standard write/send operations into one
> call which the underlying provider could optimize into one
> transaction with the associated interconnect interface.
> Better yet, a general request combining interface would have
> even more value, but calling this write/send "immediate" data
> is a stretch, if not downright silly.  Some transports have
> true immediate data that provides unique value.  There is
> nothing unique in a write/send sequence - ULPs do it all the time...
> 

The data provided is to identify the completion notification
that completes the RDMA Write to the data sink. So, yes, it
is not really an "immediate" value. We could consider a better
name for it, much as we renamed QP to something better.

But the meaning is "the tag value associated with a specific
RDMA Message". It is delivered in order, after that RDMA
Message has fully completed.

What varies by transport is *how* it is delivered. We
are considering identifying it as a single work request
so that transport-specific contraction to a single
wire message is enabled.

But we don't want to change any of the semantics vs.
the application doing Write then Send. The new call
enables an optimization, but should not change the
overall semantics. That could extend as far as having
the receiver recognize the alternate reception.
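
As a rough illustration of the point above (not part of the DAT
proposal), here is how a provider might expand one "write with tag"
work request into wire messages. All names here (xport_kind, wire_op,
wire_msg, post_write_with_tag) are hypothetical and do not come from
DAT, DAPL, or any verbs library. The sketch assumes a transport either
carries the tag natively or needs a 4-byte Send ordered after the
RDMA Write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transport capability; not a DAT or verbs definition. */
typedef enum { XPORT_NATIVE_IMMED, XPORT_WRITE_THEN_SEND } xport_kind;

/* Hypothetical wire operations a provider might emit. */
typedef enum { OP_RDMA_WRITE, OP_RDMA_WRITE_IMMED, OP_SEND } wire_op;

typedef struct {
    wire_op  op;
    size_t   len;   /* payload bytes carried by this wire message */
    uint32_t tag;   /* meaningful for OP_RDMA_WRITE_IMMED and OP_SEND */
} wire_msg;

/*
 * Expand one "write with tag" work request into wire messages.
 * Returns the number of messages stored in out[] (1 or 2).
 * The consumer posts once in both cases; only the wire form differs.
 */
int post_write_with_tag(xport_kind kind, size_t write_len,
                        uint32_t tag, wire_msg out[2])
{
    assert(out != NULL);
    if (kind == XPORT_NATIVE_IMMED) {
        /* Transport contracts the request to a single wire message. */
        out[0] = (wire_msg){ OP_RDMA_WRITE_IMMED, write_len, tag };
        return 1;
    }
    /* Emulation: the tag travels in a Send ordered after the write. */
    out[0] = (wire_msg){ OP_RDMA_WRITE, write_len, 0 };
    out[1] = (wire_msg){ OP_SEND, sizeof tag, tag };
    return 2;
}
```

Either way the caller sees one post, which is the "same semantics as
Write then Send" property described above.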



RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Larsen, Roy K
If it is up to the ULP to separate out "normal" receive data from that
associated with a write immediate, how is this different from the ULP
doing a write followed by a send?  If there is no difference, then what
we're really talking about is a convenience to the initiating ULP.

Perhaps what would be best is to construct an API that allows the ULP to
perform standard write/send operations into one call which the
underlying provider could optimize into one transaction with the
associated interconnect interface. Better yet, a general request
combining interface would have even more value, but calling this
write/send "immediate" data is a stretch, if not downright silly.  Some
transports have true immediate data that provides unique value.  There
is nothing unique in a write/send sequence - ULPs do it all the time...

Roy

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Caitlin Bestler
Sent: Monday, February 06, 2006 10:48 AM
To: [EMAIL PROTECTED]; Kanevsky, Arkady; Sean Hefty
Cc: openib-general@openib.org
Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0
immediatedataproposal

[EMAIL PROTECTED] wrote:
> Arkady,
> 
> Your requirements are slightly different than the proposed set of
> requirements. 
> 
> "iii) DAPL Provider does not provide any identification
> that the Receive operation matches remote RDMA Write with
> Immediate data if it completes as Receive DTO.
> 
>   - It is up to an ULP to separate Receive completion of remote
> Send from remote RDMA Write with Immediate Data."
> 
> Tell me how this is possible? How can the application
> distinguish between a 4 byte message and a 4 byte immediate
> data message? We would have to add a new requirement... "If
> the provider supports immediate data in the payload the ULP
> cannot send a message equal to the immediate
> data size".
> 

The data sink knows whether the 4 bytes were sent as a message
or as an immediate because it is clear in the ULP context.
Possible methods:
  The expected completion is an immediate.
  All 4 byte messages are immediates.
  All 4 byte messages where the ms-byte is X are immediate.
  If it's Tuesday it's an immediate.
  If it's a prime number it's an immediate.
  ...

But there is no clue from the transport layer.
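
A minimal sketch of one of the methods listed above (the "ms-byte is X"
convention), under stated assumptions: the tag arrives in the receive
buffer in network byte order, so the most significant byte is buf[0],
and the marker value 0xA5 is an arbitrary ULP choice. The names
IMMED_SIZE, IMMED_MARKER, and classify_receive are hypothetical and
not part of DAT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMMED_SIZE   4u     /* size of the immediate/tag in bytes */
#define IMMED_MARKER 0xA5u  /* hypothetical ULP-chosen ms-byte value "X" */

typedef enum { RX_MESSAGE, RX_WRITE_TAG } rx_kind;

/*
 * Classify a completed receive using only ULP conventions.
 * The transport gives no hint, so the rule is purely an agreement
 * between the two ULP peers: a 4-byte payload whose first byte is
 * IMMED_MARKER is the tag of a completed RDMA Write.
 */
rx_kind classify_receive(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == IMMED_SIZE && buf[0] == IMMED_MARKER)
        return RX_WRITE_TAG;
    return RX_MESSAGE;
}
```

The cost of this convention is that ordinary 4-byte messages must
never start with the marker byte, which the ULP has to guarantee.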



RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediatedataproposal

2006-02-06 Thread Kanevsky, Arkady
Arlin,
It is too strong to state that the Consumer should never send a message
equal in size to the immediate data.
The Consumer knows from the context which one it is.
That may be based on a dedicated connection or on ULP protocol
ordering.
Arkady 
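
For illustration only, a sketch of the "ULP protocol ordering" case:
the receiver tracks how many RDMA Write notifications it still
expects and accepts the tag from either completion form. The types
cq_kind, completion, ulp_state and the function take_write_tag are
hypothetical, not DAT definitions. The sketch assumes the emulated tag
sits in the receive buffer in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical completion kinds mirroring the two cases discussed. */
typedef enum { CQ_RECV, CQ_RECV_WRITE_IMMED_IN_EVENT } cq_kind;

typedef struct {
    cq_kind        kind;
    uint32_t       immed_in_event; /* valid for ..._IMMED_IN_EVENT */
    const uint8_t *buf;            /* receive buffer, used for CQ_RECV */
    size_t         len;            /* bytes received into buf */
} completion;

/* ULP-side context: outstanding RDMA Writes the peer has promised. */
typedef struct { unsigned writes_expected; } ulp_state;

/*
 * Returns true and stores the tag if this completion ends an RDMA
 * Write; returns false if it is an ordinary message.
 */
bool take_write_tag(ulp_state *st, const completion *c, uint32_t *tag)
{
    assert(st != NULL && c != NULL && tag != NULL);
    if (c->kind == CQ_RECV_WRITE_IMMED_IN_EVENT) {
        *tag = c->immed_in_event;          /* transport identified it */
    } else if (st->writes_expected > 0 && c->len == sizeof *tag) {
        memcpy(tag, c->buf, sizeof *tag);  /* context says it is a tag */
    } else {
        return false;                      /* plain Send payload */
    }
    if (st->writes_expected > 0)
        st->writes_expected--;
    return true;
}
```

With no writes outstanding, a 4-byte Send is treated as a normal
message, so the Consumer is not barred from sending one.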

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance Inc.   phone: 781-768-5395
1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
Waltham, MA 02451   central phone: 781-768-5300
 

> -----Original Message-----
> From: Kanevsky, Arkady 
> Sent: Monday, February 06, 2006 2:05 PM
> To: Davis, Arlin R; Sean Hefty
> Cc: [EMAIL PROTECTED]; openib-general@openib.org
> Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> immediatedataproposal
> 
> Arlin,
> On Friday we agreed that the receiver cannot distinguish between 
> 4 bytes of Send and 4 bytes of Immediate data if RDMA Write 
> with Immed is implemented as 2 operations:
> RDMA Write followed by Send.
> 
> The ULP Receiver "expects" Immediate data; that is why it posts 
> Recv. Depending on Transport capability it MAY complete as 
> Recv or as Recv_RDMA_Write_with_Immed_in_event.
> 
> Neither Provider nor Consumer can distinguish between the 
> cases unless there is additional info.
> 
> Arkady
> 
> Arkady Kanevsky   email: [EMAIL PROTECTED]
> Network Appliance Inc.   phone: 781-768-5395
> 1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
> Waltham, MA 02451   central phone: 781-768-5300
>  
> 
> > -----Original Message-----
> > From: Davis, Arlin R [mailto:[EMAIL PROTECTED]
> > Sent: Monday, February 06, 2006 1:25 PM
> > To: Kanevsky, Arkady; Sean Hefty
> > Cc: [EMAIL PROTECTED]; openib-general@openib.org
> > Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> > immediate dataproposal
> > 
> > 
> > Arkady,
> > 
> > Your requirements are slightly different than the proposed set of 
> > requirements.
> > 
> > "iii) DAPL Provider does not provide any identification that the
> > Receive operation matches remote RDMA Write with Immediate data if it
> > completes as Receive DTO.
> > 
> > - It is up to an ULP to separate Receive completion of remote
> > Send from remote RDMA Write with Immediate Data."
> > 
> > Tell me how this is possible? How can the application distinguish
> > between a 4 byte message and a 4 byte immediate data message? We would
> > have to add a new requirement... "If the provider supports immediate
> > data in the payload the ULP cannot send a message equal to the
> > immediate data size".
> > 
> > -arlin
> > 
> > >-----Original Message-----
> > >From: Kanevsky, Arkady [mailto:[EMAIL PROTECTED]
> > >Sent: Monday, February 06, 2006 8:08 AM
> > >To: Sean Hefty; Davis, Arlin R
> > >Cc: [EMAIL PROTECTED]; openib-general@openib.org
> > >Subject: RE: [dat-discussions] [openib-general] [RFC] DAT 2.0 immediate dataproposal
> > >
> > >Here are the changes to the existing requirements chapters for RDMA
> > >Write with Immediate Data.
> > >
> > >Feedback please.
> > >Arkady
> > >
> > >Arkady Kanevsky   email: [EMAIL PROTECTED]
> > >Network Appliance Inc.   phone: 781-768-5395
> > >1601 Trapelo Rd. - Suite 16.Fax: 781-895-1195
> > >Waltham, MA 02451   central phone: 781-768-5300
> > >
> > >
> > >> -----Original Message-----
> > >> From: Sean Hefty [mailto:[EMAIL PROTECTED]
> > >> Sent: Friday, February 03, 2006 7:30 PM
> > >> To: Davis, Arlin R
> > >> Cc: [EMAIL PROTECTED]; openib-general@openib.org
> > >> Subject: Re: [dat-discussions] [openib-general] [RFC] DAT 2.0 
> > >> immediate dataproposal
> > >>
> > >> Davis, Arlin R wrote:
> > >> > "Applications need an optimized mechanism to notify the receiving
> > >> > end that RDMA write data has completed beyond the two operation
> > >> > method currently used (RDMA write followed by message send). This
> > >> > new RDMA write feature will support 4-bytes of inline data that
> > >> > will be sent
> > >>
> > >> Is there any reason to restrict the size of the immediate data?
> > >> Could you define the API such that the size is variable? I.e. the
> > >> provider can simply