Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
At 01:09 PM 10/10/2005, Christoph Hellwig wrote: On Mon, Oct 10, 2005 at 12:53:29PM -0700, Michael Krause wrote: standards. There are also the new standard Sockets extension API available today that might be extended sometime in the future to include explicit which is never going to get into linux. one more of these braindead standards people masturbating in a dark room and coming up with a frankenstein bastard cases. Everyone is free to have an opinion. Sockets extensions are not braindead nor created using whatever methods you envision. The extensions were created by Sockets engineers with 20+ years experience. But, hey, why put any faith into people who develop and implement Sockets for a living? One day perhaps you'll learn a bit of professionalism and perhaps open your mind that there are people out in the world besides yourself you don't take a NIH approach to the world and are actually qualified engineers who have a clue. All you get with these constant unprofessional diatribes is a continual loss in credibility. But, hey, that is just an opinion. BTW, do you feel the same way about the people who created IB? How about iWARP? How about PCIe? Are all of the engineers who work on trying to accelerate technology, its performance, etc. who take into account and try to find a balanced approach to problem solving simply all in dark little rooms? All of these specs are created by companies. Those same companies who fund open source efforts and many of the people working here. One last thing, I'm not the only person who feels this way about your unprofessional behavior. There are many others who have simply don't want to bother writing or have simply written you off as whatever. Sad state to be in and I suspect you don't care since you view them all as in dark little rooms anyway. Just something you might want to keep in mind. There is a much larger world out there where people value other people's professional opinions and ideas. They don't simply discount what they produce because it was not done in whatever form you prefer. It is called reality. Get used to it. Mike ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Quoting Sean Hefty [EMAIL PROTECTED]: Subject: [PATCH] [CMA] RDMA CM abstraction module The following patch adds in a basic RDMA connection management abstraction. It is functional, but needs additional work for handling device removal, plus several missing features. I'd like to merge this back into the trunk, and continue working on it from there. This depends on the ib_addr module. Signed-off-by: Sean Hefty [EMAIL PROTECTED] Index: include/rdma/rdma_cm.h === --- include/rdma/rdma_cm.h(revision 0) +++ include/rdma/rdma_cm.h(revision 0) @@ -0,0 +1,201 @@ [... snip ...] + +#if !defined(RDMA_CM_H) +#define RDMA_CM_H + +#include linux/socket.h +#include rdma/ib_addr.h +#include rdma/ib_sa.h + +/* + * Upon receiving a device removal event, users must destroy the associated + * RDMA identifier and release all resources allocated with the device. + */ +enum rdma_event_type { + RDMA_EVENT_ADDR_RESOLVED, + RDMA_EVENT_ADDR_ERROR, + RDMA_EVENT_ROUTE_RESOLVED, + RDMA_EVENT_ROUTE_ERROR, + RDMA_EVENT_CONNECT_REQUEST, + RDMA_EVENT_CONNECT_ERROR, + RDMA_EVENT_UNREACHABLE, + RDMA_EVENT_REJECTED, + RDMA_EVENT_ESTABLISHED, + RDMA_EVENT_DISCONNECTED, + RDMA_EVENT_DEVICE_REMOVAL, +}; + +struct rdma_addr { + struct sockaddr src_addr; + struct sockaddr dst_addr; + union { + struct ib_addr ibaddr; + } addr; +}; + +struct rdma_route { + struct rdma_addr addr; + struct ib_sa_path_rec *path_rec; + int num_paths; +}; + +struct rdma_event { + enum rdma_event_type event; + int status; + void*private_data; + u8 private_data_len; +}; Wouldnt is be a good idea to start names with rdma_cm or rdma_cma or something like that? For example, rdma_event_type is a bit confusing since this actually only includes CM events. Similiar comments apply to other names. +struct rdma_id; I propose renaming this to rdma_connection or something else more specific than just id. Makes sense? +/** + * rdma_event_handler - Callback used to report user events. + * + * Notes: Users may not call rdma_destroy_id from this callback to destroy + * the passed in id, or a corresponding listen id. Returning a + * non-zero value from the callback will destroy the corresponding id. + */ +typedef int (*rdma_event_handler)(struct rdma_id *id, struct rdma_event *event); + +struct rdma_id { + struct ib_device*device; + void*context; + struct ib_qp*qp; + rdma_event_handler event_handler; + struct rdma_routeroute; +}; + +struct rdma_id* rdma_create_id(rdma_event_handler event_handler, void *context); + +void rdma_destroy_id(struct rdma_id *id); + +/** + * rdma_bind_addr - Bind an RDMA identifier to a source address and + * associated RDMA device, if needed. + * + * @id: RDMA identifier. + * @addr: Local address information. Wildcard values are permitted. + * + * This associates a source address with the RDMA identifier before calling + * rdma_listen. If a specific local address is given, the RDMA identifier will + * be bound to a local RDMA device. + */ +int rdma_bind_addr(struct rdma_id *id, struct sockaddr *addr); + +/** + * rdma_resolve_addr - Resolve destination and optional source addresses + * from IP addresses to an RDMA address. If successful, the specified + * rdma_id will be bound to a local device. + * + * @id: RDMA identifier. + * @src_addr: Source address information. This parameter may be NULL. + * @dst_addr: Destination address information. + * @timeout_ms: Time to wait for resolution to complete. + */ +int rdma_resolve_addr(struct rdma_id *id, struct sockaddr *src_addr, + struct sockaddr *dst_addr, int timeout_ms); + +/** + * rdma_resolve_route - Resolve the RDMA address bound to the RDMA identifier + * into route information needed to establish a connection. + * + * This is called on the client side of a connection, but its use is optional. + * Users must have first called rdma_bind_addr to resolve a dst_addr + * into an RDMA address before calling this routine. + */ +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); Not sure I understand what this does, since the only extra parameter is timeout_ms. +/** + * rdma_create_qp - Allocate a QP and associate it with the specified RDMA + * identifier. + */ +int rdma_create_qp(struct rdma_id *id, struct ib_pd *pd, +struct ib_qp_init_attr *qp_init_attr); + +/** + * rdma_destroy_qp - Deallocate the QP associated with the specified RDMA + * identifier. + * + * Users must destroy any QP associated with an RDMA identifier before + * destroying the RDMA ID. + */ +void
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Thanks for the feedback. See below. Michael S. Tsirkin wrote: Wouldnt is be a good idea to start names with rdma_cm or rdma_cma or something like that? For example, rdma_event_type is a bit confusing since this actually only includes CM events. Similiar comments apply to other names. I had that originally, but changed it. I figured that names like rdma_connect() and rdma_listen() were clear enough that they were for connection management. +struct rdma_id; I propose renaming this to rdma_connection or something else more specific than just id. Makes sense? I can change this to rdma_cm_id or rdma_cma or something else... +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); Not sure I understand what this does, since the only extra parameter is timeout_ms. For IB, this results in a path record query based on the GIDs that were set with the rdma_id from rdma_resolve_addr(). The GIDs are in rdma_id.route.addr.ibaddr. The output is saved to rdma_id.route.path_rec. My intent is to make this call optional in the future. +int rdma_create_qp(struct rdma_id *id, struct ib_pd *pd, + struct ib_qp_init_attr *qp_init_attr); + +void rdma_destroy_qp(struct rdma_id *id); Not sure what the intended usage is. When does the user need to call this? The CMA needs to associate a QP with the rdma_id, and CMA will transition the QP through its connection states. The rdma_create_qp() is called to allocate a QP and transition it to the INIT state, so users can post receives to the QP. The destroy call is a pass-through call provided simply for symmetry. +#include linux/in.h +#include linux/in6.h +#include linux/inetdevice.h +#include net/arp.h +#include net/neighbour.h +#include net/route.h +#include rdma/rdma_cm.h +#include rdma/ib_cache.h +#include rdma/ib_cm.h +#include rdma/ib_sa.h Are all of these headers really needed? For example, I dont see arp.h used anywhere. Am I missing something? They were needed at one point, but might not all be needed now. I will see which ones can be removed. Some were only needed for address translation, which was originally part of this file while I worked out its API. What about replacing switch with one case statements with if statements. Like this: if (id-device-node_type == IB_NODE_CA) ret = cma_init_ib_qp(id_priv, qp); else ret = -ENOSYS; I tried to make it easy to modify the code to support iWarp, or some other RDMA device. I'd prefer to leave these checks as switch statements for that reason, or just remove them completely. I also wander why do we really need all these node_type checks. The code above seems to imply that rdma_create_qp will fail on non-CA. Why is that? The code doesn't set the right parameters to INIT for an iWarp QP. +static inline void cma_deref_dev(struct rdma_id_private *id_priv) +{ +// if (atomic_dec_and_test(id_priv-dev_remove)) +// wake_up(id_priv-wait); +// return atomic_dec_and_test(id_priv-dev_remove) ? +//cma_notify_user(id_priv, RDMA_EVENT_DEVICE_REMOVAL, -ENODEV, +//NULL, 0) : 0; +} The above seems to need some cleanup. This has been cleaned up in my latest version. It was part of the initial device removal handling code that didn't work. I decided to just try to get connection establishment working, and then come back to fix device removal. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Quoting Sean Hefty [EMAIL PROTECTED]: Wouldnt is be a good idea to start names with rdma_cm or rdma_cma or something like that? For example, rdma_event_type is a bit confusing since this actually only includes CM events. Similiar comments apply to other names. I had that originally, but changed it. I figured that names like rdma_connect() and rdma_listen() were clear enough that they were for connection management. Yes, fine, but names like rdma_event_type probably do need the prefix, dont they? +struct rdma_id; I propose renaming this to rdma_connection or something else more specific than just id. Makes sense? I can change this to rdma_cm_id or rdma_cma or something else... Maybe rdma_connection (these things encapsulate connectin state)? Or, rdma_sock or rdma_socket, since people are used to the fact that connections are sockets? +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); Not sure I understand what this does, since the only extra parameter is timeout_ms. For IB, this results in a path record query based on the GIDs that were set with the rdma_id from rdma_resolve_addr(). The GIDs are in rdma_id.route.addr.ibaddr. The output is saved to rdma_id.route.path_rec. My intent is to make this call optional in the future. I was trying to say, why doesnt rdma_connect just do this transparently? Why do we need a separate call? +int rdma_create_qp(struct rdma_id *id, struct ib_pd *pd, + struct ib_qp_init_attr *qp_init_attr); + +void rdma_destroy_qp(struct rdma_id *id); Not sure what the intended usage is. When does the user need to call this? The CMA needs to associate a QP with the rdma_id, and CMA will transition the QP through its connection states. The rdma_create_qp() is called to allocate a QP and transition it to the INIT state, so users can post receives to the QP. The destroy call is a pass-through call provided simply for symmetry. What happends on the passive side? May we need more than one qp per rdma_id? Or is a new id created each time a connection request arrives? -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Michael S. Tsirkin wrote: Yes, fine, but names like rdma_event_type probably do need the prefix, dont they? I'll fix this. Maybe rdma_connection (these things encapsulate connectin state)? Or, rdma_sock or rdma_socket, since people are used to the fact that connections are sockets? Any objection to rdma_socket? +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); I was trying to say, why doesnt rdma_connect just do this transparently? Why do we need a separate call? Eventually rdma_connect will call this for the user if a route hasn't been resolved. At some point though, the API will likely need to be expanded to specify some sort of quality of service. What happends on the passive side? May we need more than one qp per rdma_id? Or is a new id created each time a connection request arrives? A new identifier is created each time a connection request arrives. The goal is to support a single listen across multiple devices, so listen id's will not necessarily be bound to an ib_device. The new id will be bound to the device that the connection request was received on. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Sean Hefty wrote: Michael S. Tsirkin wrote: Yes, fine, but names like rdma_event_type probably do need the prefix, dont they? I'll fix this. I've just committed a patch to rename the prefix. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
RE: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
From: Sean Hefty [mailto:[EMAIL PROTECTED] Sent: Monday, October 10, 2005 11:16 AM Michael S. Tsirkin wrote: Maybe rdma_connection (these things encapsulate connectin state)? Or, rdma_sock or rdma_socket, since people are used to the fact that connections are sockets? Any objection to rdma_socket? I don't like rdma_socket, since you can't actually perform any I/O operations on the rdma_socket, unlike normal sockets. We're dealing only with the connection part of the problem, and the name should reflect that. So rdma_connection, rdma_conn, or rdma_cid seem more appropriate. - Fab ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Quoting Sean Hefty [EMAIL PROTECTED]: Maybe rdma_connection (these things encapsulate connectin state)? Or, rdma_sock or rdma_socket, since people are used to the fact that connections are sockets? Any objection to rdma_socket? Fine with me, this makes the intent of bind/listen explicit. +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); I was trying to say, why doesnt rdma_connect just do this transparently? Why do we need a separate call? Eventually rdma_connect will call this for the user if a route hasn't been resolved. At some point though, the API will likely need to be expanded to specify some sort of quality of service. I thought that would also happen at connect time. No? -- MST ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
RE: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
At 12:13 PM 10/10/2005, Fab Tillier wrote: From: Sean Hefty [ mailto:[EMAIL PROTECTED]] Sent: Monday, October 10, 2005 11:16 AM Michael S. Tsirkin wrote: Maybe rdma_connection (these things encapsulate connectin state)? Or, rdma_sock or rdma_socket, since people are used to the fact that connections are sockets? Any objection to rdma_socket? I don't like rdma_socket, since you can't actually perform any I/O operations on the rdma_socket, unlike normal sockets. We're dealing only with the connection part of the problem, and the name should reflect that. So rdma_connection, rdma_conn, or rdma_cid seem more appropriate. Naming should not involve sockets as that is part of existing standards. There are also the new standard Sockets extension API available today that might be extended sometime in the future to include explicit RDMA support should people decide to bypass SDP and go straight to a more robust API definition. The Sockets Extensions already comprehend explicit memory management, async comms, etc. making a significant improvement over the existing sync Sockets as well as going further in solving areas like memory management beyond what was done in Winsocks. Mike ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
On Mon, Oct 10, 2005 at 12:53:29PM -0700, Michael Krause wrote: standards. There are also the new standard Sockets extension API available today that might be extended sometime in the future to include explicit which is never going to get into linux. one more of these braindead standards people masturbating in a dark room and coming up with a frankenstein bastard cases. ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] Re: [PATCH] [CMA] RDMA CM abstraction module
Michael S. Tsirkin wrote: Any objection to rdma_socket? Fine with me, this makes the intent of bind/listen explicit. I have rdma_cm_id right now, and will likely just keep it as that. +int rdma_resolve_route(struct rdma_id *id, int timeout_ms); I was trying to say, why doesnt rdma_connect just do this transparently? Why do we need a separate call? Eventually rdma_connect will call this for the user if a route hasn't been resolved. At some point though, the API will likely need to be expanded to specify some sort of quality of service. I thought that would also happen at connect time. No? I went with the option of exposing the necessary functionality. Folding this into the connect call means that the user cannot view the returned route before deciding to establishing a connection, and the CMA sets the timeout/retry policy for resolving routes. The only benefit of hiding this call is a slight code simplification for the user: case RDMA_CM_EVENT_ADDR_RESOLVED: ret = rdma_resolve_route(cma_id-context, timeout); if (ret) connect_error(); break; case RDMA_CM_EVENT_ROUTE_RESOLVED: connect(cma_id-context); break; becomes: case RDMA_CM_EVENT_ADDR_RESOLVED: connect(cma_id-context); break; To make the API slightly easier to use, I thought of letting rdma_resolve_route() be optional. But, that makes it more difficult to handle device removal, and I'm not sure that it's even worth it. As for QoS, I'm not even sure that it shouldn't be specified when performing the address resolution, so that the correct device can be selected. - Sean ___ openib-general mailing list openib-general@openib.org http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general