devel-boun...@open-mpi.org wrote:
> Steve Wise wrote:
>> There have been a series of discussions on the ofa general list about
>> this issue, and the conclusion to date is that it cannot be resolved
>> in the rdma-cm or iwarp-cm code of the linux rdma stack. Mainly
>> because sending an RDMA message involves the ULP's work queue and
>> completion queue, so the CM cannot do this under the covers in a
>> manner that doesn't affect the application. Thus, the applications
>> must deal with this.
>
> Why can't uDAPL deal with this? As a uDAPL user, I really
> don't care what API uDAPL is using under the hood to move
> data from one place to another, nor the quirks of that API.
> The whole point of uDAPL is to form a network-agnostic
> abstraction layer. AFAIK, the uDAPL spec doesn't enforce any
> such requirement on RDMA communication either. In my
> opinion, exposing such behavior above uDAPL is incorrect and
> is part of why uDAPL has seen limited adoption -- every
> single uDAPL implementation behaves in different ways, making
> it extremely difficult to write an application to work on any
> uDAPL implementation. Sorry if this sounds harsh, but this
> comes from many hours of banging my head on the wall due to
> working around these sorts of problems :)
>
The simple answer is that uDAPL cannot deal with this. The RDMAC verbs specification was overly focused on client/server usage and therefore did not recognize any harm in requiring that the active side do the first send. DAPL could not rewrite either the RDMAC or the InfiniBand verbs, so it had to come up with the best solution that matched the verbs as they were. One of the explicit ground rules was that DAPL MUST support all RDMA devices that were IBTA or RDMAC compliant.

Given those rules, if the active side does not send a message, the passive side might be held off indefinitely. Sending that message consumes a receive buffer at the receiver, and therefore cannot be made transparent to the uDAPL consumer. Given those constraints, there is literally nothing that either DAPL or OFA can do to work around this problem.
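
To make the constraint concrete, here is a toy model in Python. It is only a sketch and not the verbs or uDAPL API: the Endpoint class, make_pair helper, and field names are all made up for illustration, and it simplifies "might be held off" into "is always held off" until the active side's first message arrives. It shows the two points above: the passive side cannot send first, and the active side's first message uses up a receive buffer that the consumer had to post.

```python
class Endpoint:
    """Hypothetical toy endpoint; not a real verbs or uDAPL object."""

    def __init__(self, name, is_active, recv_buffers):
        self.name = name
        self.recv_buffers = recv_buffers  # receives posted by the consumer
        # Assumption of this model: only the active side may send
        # before any message has arrived.
        self.may_send = is_active
        self.peer = None
        self.delivered = []

    def send(self, payload):
        if not self.may_send:
            raise RuntimeError(
                f"{self.name}: held off until the active side sends first")
        self.peer._receive(payload)

    def _receive(self, payload):
        if self.recv_buffers == 0:
            raise RuntimeError(f"{self.name}: no receive buffer posted")
        # Every inbound send consumes one posted receive buffer,
        # which is why the first message is visible to the consumer.
        self.recv_buffers -= 1
        self.delivered.append(payload)
        # Once a message has arrived, this side is free to send.
        self.may_send = True


def make_pair(recv_buffers):
    """Connect a hypothetical active/passive pair of endpoints."""
    active = Endpoint("active", True, recv_buffers)
    passive = Endpoint("passive", False, recv_buffers)
    active.peer, passive.peer = passive, active
    return active, passive


if __name__ == "__main__":
    active, passive = make_pair(recv_buffers=1)
    try:
        passive.send("early")
    except RuntimeError as err:
        print(err)
    active.send("kick")  # first send from the active side
    print("passive buffers left:", passive.recv_buffers)
    passive.send("reply")  # now permitted
    print("active got:", active.delivered)
```

In this model the "kick" message is what a library would have to send under the covers, and it eats a buffer the application thought it had available for its own traffic.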