Jason Gunthorpe wrote:
I was saying that the point in the rdmacm where the rdma_cm_id is bound to a 
local RDMA device should have only been rdma_resolve_addr and rdma_accept. 
Overloading rdma_bind_addr to both bind to an IP and bind to an RDMA device was 
a bad API choice.
As you wrote, in most cases binding comes into play only for users calling rdma_resolve_addr or rdma_accept. For users that need explicit binding, the rdma-cm provides rdma_bind_addr, and it binds to both the IP and the device. If you can do better, send a patch. Binding can't be removed from the API since it has users, and it makes sense for users to require it.
Sean is right, there may be special cases that require an early binding, but a 
separate API - like IP's SO_BINDTODEVICE - would have been better - and users 
are forewarned that calling it restricts the environments their app will support.
It's just naming. Will
$ sed 's/rdma_bind/rdma_set_opt(RDMA_BINDTODEVICE)/g'
make you happier? Why?

As it stands we have several impossible situations. Sean, Dave, and I were discussing the trade-offs of what this means relative to IP route resolution
Don't tell me that Dave's patches are blocked because you discovered the rdma_bind design and now you don't like it. As I wrote to you, Dave sent a patch to fix the IPv6 support; during the discussion on his patches you came and brought up more and more issues that you consider problems (but I don't) and blocked the patch set. I don't think this is appropriate. Let the patches go in and send your own patches to fix the problems you see. Why does anyone touching a piece of code have to fix the problems you see in that piece?!


- but it affects bonding too. If you rdma_bind_addr to the IP of a bonding 
device, the stack must pick one of the local RDMA ports immediately. If you 
then call rdma_listen there is a problem: incoming connections may target 
either RDMA device, but you are only bound to one of them. An app cannot say 'I 
want to listen on this IP, any RDMA device' with the current API, as you can in 
IP, and that is a shame
An app can say, "I want to listen on that IP and on the RDMA device which is associated with this IP now." When bonding fails over it generates a netevent; the rdma-cm catches this event and generates an address change event, and apps can redo their bind/listen at that point. So far we have never received a user report of a problem; people are probably listening on all IPs, which works perfectly with bonding. Currently the HA mode of bonding responds to ARP on only one of the devices, so connection requests will not target just any RDMA device but only the active one. If this is such a shame, send a fix; spraying mud on the maintainer and/or on someone sending another patch is a shame, isn't it?


Traditionally with ethernet the L2 bonding is really only used for link 
aggregation, L1 failure, and a simple multi-switch HA scheme. It is not 
deployed if you have multiple ethernet domains. Some people prefer to have 
dual, independent ethernet fabrics, and in that case you rely on routing 
features to get the multipath and HA features of bonding.
okay, thanks for the crash course.

Go back on the list and look up the posts from Leo who first discovered this, 
what he was trying to do is kinda the L3 bonding approach.
If Leo has a problem and you want to help him, bring it to the list, debate, send patches; jumping into someone else's patch isn't the constructive way to go.

David has been doing a good job and I am glad he is working on the IPv6 
support. My comments are only intended to clarify how this is all supposed to 
work and why the IP flow is actually still relevant to RDMA connections.
As I see it, your comments block the patches sent by Dave, Sean?

Or.