Re: [openib-general] librdmacm ABI issues with OFED 1.1
Quoting r. Sean Hefty [EMAIL PROTECTED]:
> If you look at Woody's backport patches, I believe that he moves the RDMA CM files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the abi_version from there.

Maybe the librdmacm part should be merged to svn? So librdmacm could try to read from misc, then from /sys/class/infiniband/rdma_cm, and then assume latest. It's good to have userspace code portable across distros ...

-- 
MST
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
[openib-general] dropped packets
Roland,

I am trying to write a user-level application that receives multicast UD packets. I am seeing about 1-2% packet loss between the send side and the receive side, apparently independent of the packet rate at low rates. (Heavily traced sends and receives at very low rates still drop packets, even though more receive buffers are posted than packets are sent.) I have a couple of questions:

1. Are there any race issues with ibv_get_cq_event? The example code (ud_pingpong) seems to imply that the correct sequence is:

   Start:
     Call ibv_get_cq_event
     Call ibv_ack_cq_event   - anywhere, so long as it happens before destroy_cq
     Call ibv_req_notify_cq
     Call ibv_poll_cq        - just once, not until empty as usual, according to the example
     Goto start

   In the old days we called request notify and then polled until poll was empty, on a notify thread, in order to prevent a race.

2. When I post, say, 500 receive buffers and send, say, 200 send buffers, tagging the sends with a sequence number, I often see one or two sequence numbers missing at the poll_cq interface on the receive side. I have checked at the post and poll interfaces of the send side that all the correct sequence numbers went out. I am not sure how this can be possible regardless of the notification scheme used.

I would love for this to be a programming error in my code, but I can't figure out how I could mess it up between post_send and poll_cq on the receive side. I see the same behavior between systems and with a loopback between two ports on the same HCA.

Please let me know if this rings any bells.

Bob Pearson
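
The sequence-number check described in item 2 can be sketched as a small helper. This is only an illustration under stated assumptions: the function name, the MAX_SEQ limit, and the idea of collecting received sequence numbers into an array are hypothetical and are not taken from the original application or from libibverbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SEQ 1024   /* hypothetical upper bound on tracked sequence numbers */

/*
 * Count the sequence numbers in [0, expected) that never appear in seen[].
 * Duplicates and out-of-range values are ignored. If first_missing is not
 * NULL and at least one number is missing, the lowest missing number is
 * stored there.
 */
size_t count_missing_seq(const uint32_t *seen, size_t n_seen,
                         uint32_t expected, uint32_t *first_missing)
{
    unsigned char got[MAX_SEQ] = {0};
    size_t missing = 0;
    size_t i;
    uint32_t s;

    assert(expected <= MAX_SEQ);

    /* Mark every sequence number that actually arrived. */
    for (i = 0; i < n_seen; i++)
        if (seen[i] < expected)
            got[seen[i]] = 1;

    /* Walk the expected range and count the holes. */
    for (s = 0; s < expected; s++) {
        if (!got[s]) {
            if (missing == 0 && first_missing)
                *first_missing = s;
            missing++;
        }
    }
    return missing;
}
```

Running a check like this on both the sender's completion side and the receiver's poll side would narrow down where a packet disappears, without depending on the notification scheme in use.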
Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp
On Wednesday 23 August 2006 23:25, Roland Dreier wrote:
> > 5. Return the send_cq, receive cq and srq handles. ib_query_qp() needs them (required by IB Spec). ibv_query_qp() overwrites these values in user-space with appropriate user-space values.
> > + qp_init_attr->send_cq = ibqp->send_cq;
> > + qp_init_attr->recv_cq = ibqp->recv_cq;
> > + qp_init_attr->srq = ibqp->srq;
>
> I really disagree with this change. It's silly to do this copying since the consumer already has the ibqp pointer. And it's especially silly to put this in a low-level driver, since there's nothing device-specific about it.

Note that the same thing is done in user-space (albeit in the verbs layer): libibverbs/src/cmd.c, lines 678 ff. (procedure ibv_cmd_query_qp()):

        init_attr->send_cq = qp->send_cq;
        init_attr->recv_cq = qp->recv_cq;
        init_attr->srq = qp->srq;

Either return these values in both kernel and user, or remove them from the user-space verb.

Regarding putting this in the low level kernel driver, I agree. A patch for core/verbs.c is given below.

(P.S. IMHO, the correct thing to do is to remove the above lines from cmd.c, and not return to the user stuff that s/he already has available.)

- Jack

Index: ofed_1_1/drivers/infiniband/core/verbs.c
===================================================================
--- ofed_1_1.orig/drivers/infiniband/core/verbs.c 2006-08-03 14:30:21.0 +0300
+++ ofed_1_1/drivers/infiniband/core/verbs.c 2006-08-24 10:03:51.831333000 +0300
@@ -556,9 +556,17 @@ int ib_query_qp(struct ib_qp *qp,
                 int qp_attr_mask,
                 struct ib_qp_init_attr *qp_init_attr)
 {
-        return qp->device->query_qp ?
-                qp->device->query_qp(qp, qp_attr, qp_attr_mask, qp_init_attr) :
-                -ENOSYS;
+        int rc = -ENOSYS;
+        if (qp->device->query_qp) {
+                rc = qp->device->query_qp(qp, qp_attr,
+                                          qp_attr_mask, qp_init_attr);
+                if(!rc) {
+                        qp_init_attr->recv_cq = qp->recv_cq;
+                        qp_init_attr->send_cq = qp->send_cq;
+                        qp_init_attr->srq = qp->srq;
+                }
+        }
+        return rc;
 }
 EXPORT_SYMBOL(ib_query_qp);
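
The layering argument in this message (let the driver report device-specific state, and let the core wrapper fill in the handles it already owns) can be tried outside the kernel with stand-in types. This is a hedged sketch only: every `mock_*` name and the `max_send_wr` field are hypothetical simplifications and are not the kernel's `struct ib_qp` or `struct ib_device`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct mock_cq  { int id; };
struct mock_srq { int id; };
struct mock_qp;

struct mock_init_attr {
    struct mock_cq  *send_cq;
    struct mock_cq  *recv_cq;
    struct mock_srq *srq;
    int              max_send_wr;   /* example of a device-reported field */
};

struct mock_device {
    /* Optional driver hook; NULL means the driver does not implement it. */
    int (*query_qp)(struct mock_qp *qp, struct mock_init_attr *init_attr);
};

struct mock_qp {
    struct mock_device *device;
    struct mock_cq     *send_cq;
    struct mock_cq     *recv_cq;
    struct mock_srq    *srq;
};

/*
 * Core-layer wrapper: call the driver hook if present and, on success,
 * fill in the handles that are not device-specific.
 */
int mock_query_qp(struct mock_qp *qp, struct mock_init_attr *init_attr)
{
    int rc = -ENOSYS;

    assert(qp != NULL && qp->device != NULL && init_attr != NULL);

    if (qp->device->query_qp) {
        rc = qp->device->query_qp(qp, init_attr);
        if (!rc) {
            init_attr->send_cq = qp->send_cq;
            init_attr->recv_cq = qp->recv_cq;
            init_attr->srq     = qp->srq;
        }
    }
    return rc;
}

/* Example driver hook that only reports device-specific state. */
int example_driver_query_qp(struct mock_qp *qp, struct mock_init_attr *init_attr)
{
    (void)qp;
    init_attr->max_send_wr = 64;
    return 0;
}
```

With this split, each driver hook stays free of the generic copying, which is the objection raised against doing it in the low-level driver.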
Re: [openib-general] dropped packets
Quoting r. Robert Pearson [EMAIL PROTECTED]:
> Subject: dropped packets
>
> Roland, I am trying to write a user level application that receives multicast UD packets at user level. I am seeing about 1-2 % packet loss between the send side and the receive side apparently independent of the packet rate for low rates. (Heavily traced sends and receives with very low rates still drop packets even though there are more packets posted on the receive side than are sent.) I have a couple of questions:

What kind of hardware do you have?

-- 
MST
[openib-general] [PATCH] osm: handle local events
Hi Hal

This patch implements the first item of the OSM todo list. OpenSM opens a thread that listens for events on the SM's port. The events that are handled are IBV_EVENT_DEVICE_FATAL and IBV_EVENT_PORT_ERROR. In case of IBV_EVENT_DEVICE_FATAL, osm is forced to exit. In case of IBV_EVENT_PORT_ERROR, osm initiates a heavy sweep.

Yevgeny

Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED]

Index: include/opensm/osm_sm_mad_ctrl.h
===================================================================
--- include/opensm/osm_sm_mad_ctrl.h (revision 8998)
+++ include/opensm/osm_sm_mad_ctrl.h (working copy)
@@ -109,6 +109,7 @@ typedef struct _osm_sm_mad_ctrl
         osm_mad_pool_t *p_mad_pool;
         osm_vl15_t *p_vl15;
         osm_vendor_t *p_vendor;
+        struct _osm_state_mgr *p_state_mgr;
         osm_bind_handle_t h_bind;
         cl_plock_t *p_lock;
         cl_dispatcher_t *p_disp;
@@ -130,6 +131,9 @@ typedef struct _osm_sm_mad_ctrl
 *       p_vendor
 *               Pointer to the vendor specific interfaces object.
 *
+*       p_state_mgr
+*               Pointer to the state manager object.
+*
 *       h_bind
 *               Bind handle returned by the transport layer.
 *
@@ -233,6 +237,7 @@ osm_sm_mad_ctrl_init(
         IN osm_mad_pool_t* const p_mad_pool,
         IN osm_vl15_t* const p_vl15,
         IN osm_vendor_t* const p_vendor,
+        IN struct _osm_state_mgr* const p_state_mgr,
         IN osm_log_t* const p_log,
         IN osm_stats_t* const p_stats,
         IN cl_plock_t* const p_lock,
@@ -251,6 +256,9 @@ osm_sm_mad_ctrl_init(
 *       p_vendor
 *               [in] Pointer to the vendor specific interfaces object.
 *
+*       p_state_mgr
+*               [in] Pointer to the state manager object.
+*
 *       p_log
 *               [in] Pointer to the log object.
 *
Index: include/vendor/osm_vendor_ibumad.h
===================================================================
--- include/vendor/osm_vendor_ibumad.h (revision 8998)
+++ include/vendor/osm_vendor_ibumad.h (working copy)
@@ -74,6 +74,8 @@ BEGIN_C_DECLS
 #define OSM_UMAD_MAX_CAS 32
 #define OSM_UMAD_MAX_PORTS_PER_CA 2
 
+#define OSM_VENDOR_SUPPORT_EVENTS
+
 /* OpenIB gen2 doesn't support RMPP yet */
 
 /s* OpenSM: Vendor UMAD/osm_ca_info_t
@@ -179,6 +181,10 @@ typedef struct _osm_vendor
         int umad_port_id;
         void *receiver;
         int issmfd;
+        cl_thread_t events_thread;
+        void * events_callback;
+        void * sm_context;
+        struct ibv_context * ibv_context;
 } osm_vendor_t;
 
 #define OSM_BIND_INVALID_HANDLE 0
Index: include/vendor/osm_vendor_api.h
===================================================================
--- include/vendor/osm_vendor_api.h (revision 8998)
+++ include/vendor/osm_vendor_api.h (working copy)
@@ -526,6 +526,110 @@ osm_vendor_set_debug(
 * SEE ALSO
 */
 
+#ifdef OSM_VENDOR_SUPPORT_EVENTS
+
+#define OSM_EVENT_FATAL 1
+#define OSM_EVENT_PORT_ERR 2
+
+/s* OpenSM Vendor API/osm_vend_events_callback_t
+* NAME
+*       osm_vend_events_callback_t
+*
+* DESCRIPTION
+*       Function prototype for the vendor events callback.
+*       The vendor layer calls this function on driver events.
+*
+* SYNOPSIS
+*/
+typedef void
+(*osm_vend_events_callback_t)(
+        IN int events_mask,
+        IN void * const context );
+/*
+* PARAMETERS
+*       events_mask
+*               [in] The received event(s).
+*
+*       context
+*               [in] Context supplied as the sm_context argument in
+*               the osm_vendor_unreg_events_cb call
+*
+* RETURN VALUES
+*       None.
+*
+* NOTES
+*
+* SEE ALSO
+*       osm_vendor_reg_events_cb osm_vendor_unreg_events_cb
+*/
+
+/f* OpenSM Vendor API/osm_vendor_reg_events_cb
+* NAME
+*       osm_vendor_reg_events_cb
+*
+* DESCRIPTION
+*       Registers the events callback function and start the events
+*       thread
+*
+* SYNOPSIS
+*/
+int
+osm_vendor_reg_events_cb(
+        IN osm_vendor_t * const p_vend,
+        IN void * const sm_callback,
+        IN void * const sm_context);
+/*
+* PARAMETERS
+*       p_vend
+*               [in] vendor handle.
+*
+*       sm_callback
+*               [in] Callback function that should be called when
+*               the event is received.
+*
+*       sm_context
+*               [in] Context supplied as the context argument in
+*               the subsequenct calls to the sm_callback function
+*
+* RETURN VALUE
+*       IB_SUCCESS if OK.
+*
+* NOTES
+*
+* SEE ALSO
+*       osm_vend_events_callback_t osm_vendor_unreg_events_cb
+*/
+
+/f* OpenSM Vendor API/osm_vendor_unreg_events_cb
+* NAME
+*       osm_vendor_unreg_events_cb
+*
+* DESCRIPTION
+*       Un-Registers the events callback function and stops the events
+*       thread
+*
+* SYNOPSIS
+*/
+void
+osm_vendor_unreg_events_cb(
+        IN osm_vendor_t * const p_vend);
+/*
+* PARAMETERS
+*       p_vend
+*               [in] vendor handle.
+*
+*
+* RETURN VALUE
+*       None.
+*
+* NOTES
+*
Re: [openib-general] [PATCH] osm: handle local events
I did not see this on the reflector. We did have some mailer problems, so I am resending to the list.

One more thing to add: the only other event we considered was PORT_ACTIVE. But as it turns out, the event is only generated when the port moves into ACTIVE state, which means an SM already handled it...

EZ

Yevgeny Kliteynik wrote:
> Hi Hal
>
> This patch implements first item of the OSM todo list. OpenSM opens a thread that is listening for events on the SM's port. The events that are being taken care of are IBV_EVENT_DEVICE_FATAL and IBV_EVENT_PORT_ERROR. In case of IBV_EVENT_DEVICE_FATAL, osm is forced to exit. in case of IBV_EVENT_PORT_ERROR, osm initiates heavy sweep.
>
> Yevgeny
>
> Signed-off-by: Yevgeny Kliteynik [EMAIL PROTECTED]
>
> Index: include/opensm/osm_sm_mad_ctrl.h
> ===================================================================
> --- include/opensm/osm_sm_mad_ctrl.h (revision 8998)
> +++ include/opensm/osm_sm_mad_ctrl.h (working copy)
> @@ -109,6 +109,7 @@ typedef struct _osm_sm_mad_ctrl
>          osm_mad_pool_t *p_mad_pool;
>          osm_vl15_t *p_vl15;
>          osm_vendor_t *p_vendor;
> +        struct _osm_state_mgr *p_state_mgr;
>          osm_bind_handle_t h_bind;
>          cl_plock_t *p_lock;
>          cl_dispatcher_t *p_disp;
> @@ -130,6 +131,9 @@ typedef struct _osm_sm_mad_ctrl
>  *       p_vendor
>  *               Pointer to the vendor specific interfaces object.
>  *
> +*       p_state_mgr
> +*               Pointer to the state manager object.
> +*
>  *       h_bind
>  *               Bind handle returned by the transport layer.
>  *
> @@ -233,6 +237,7 @@ osm_sm_mad_ctrl_init(
>          IN osm_mad_pool_t* const p_mad_pool,
>          IN osm_vl15_t* const p_vl15,
>          IN osm_vendor_t* const p_vendor,
> +        IN struct _osm_state_mgr* const p_state_mgr,
>          IN osm_log_t* const p_log,
>          IN osm_stats_t* const p_stats,
>          IN cl_plock_t* const p_lock,
> @@ -251,6 +256,9 @@ osm_sm_mad_ctrl_init(
>  *       p_vendor
>  *               [in] Pointer to the vendor specific interfaces object.
>  *
> +*       p_state_mgr
> +*               [in] Pointer to the state manager object.
> +*
>  *       p_log
>  *               [in] Pointer to the log object.
>  *
> Index: include/vendor/osm_vendor_ibumad.h
> ===================================================================
> --- include/vendor/osm_vendor_ibumad.h (revision 8998)
> +++ include/vendor/osm_vendor_ibumad.h (working copy)
> @@ -74,6 +74,8 @@ BEGIN_C_DECLS
>  #define OSM_UMAD_MAX_CAS 32
>  #define OSM_UMAD_MAX_PORTS_PER_CA 2
>
> +#define OSM_VENDOR_SUPPORT_EVENTS
> +
>  /* OpenIB gen2 doesn't support RMPP yet */
>
>  /s* OpenSM: Vendor UMAD/osm_ca_info_t
> @@ -179,6 +181,10 @@ typedef struct _osm_vendor
>          int umad_port_id;
>          void *receiver;
>          int issmfd;
> +        cl_thread_t events_thread;
> +        void * events_callback;
> +        void * sm_context;
> +        struct ibv_context * ibv_context;
>  } osm_vendor_t;
>
>  #define OSM_BIND_INVALID_HANDLE 0
> Index: include/vendor/osm_vendor_api.h
> ===================================================================
> --- include/vendor/osm_vendor_api.h (revision 8998)
> +++ include/vendor/osm_vendor_api.h (working copy)
> @@ -526,6 +526,110 @@ osm_vendor_set_debug(
>  * SEE ALSO
>  */
>
> +#ifdef OSM_VENDOR_SUPPORT_EVENTS
> +
> +#define OSM_EVENT_FATAL 1
> +#define OSM_EVENT_PORT_ERR 2
> +
> +/s* OpenSM Vendor API/osm_vend_events_callback_t
> +* NAME
> +*       osm_vend_events_callback_t
> +*
> +* DESCRIPTION
> +*       Function prototype for the vendor events callback.
> +*       The vendor layer calls this function on driver events.
> +*
> +* SYNOPSIS
> +*/
> +typedef void
> +(*osm_vend_events_callback_t)(
> +        IN int events_mask,
> +        IN void * const context );
> +/*
> +* PARAMETERS
> +*       events_mask
> +*               [in] The received event(s).
> +*
> +*       context
> +*               [in] Context supplied as the sm_context argument in
> +*               the osm_vendor_unreg_events_cb call
> +*
> +* RETURN VALUES
> +*       None.
> +*
> +* NOTES
> +*
> +* SEE ALSO
> +*       osm_vendor_reg_events_cb osm_vendor_unreg_events_cb
> +*/
> +
> +/f* OpenSM Vendor API/osm_vendor_reg_events_cb
> +* NAME
> +*       osm_vendor_reg_events_cb
> +*
> +* DESCRIPTION
> +*       Registers the events callback function and start the events
> +*       thread
> +*
> +* SYNOPSIS
> +*/
> +int
> +osm_vendor_reg_events_cb(
> +        IN osm_vendor_t * const p_vend,
> +        IN void * const sm_callback,
> +        IN void * const sm_context);
> +/*
> +* PARAMETERS
> +*       p_vend
> +*               [in] vendor handle.
> +*
> +*       sm_callback
> +*               [in] Callback function that should be called when
> +*               the event is received.
> +*
> +*       sm_context
> +*               [in] Context supplied as the context argument in
> +*               the subsequenct calls to the sm_callback function
> +*
> +* RETURN VALUE
> +*       IB_SUCCESS if OK.
> +*
> +* NOTES
> +*
> +* SEE ALSO
> +*       osm_vend_events_callback_t
Re: [openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)
This is already done - you can use OFED 1.1-rc2 or 1.0.1

Tziporet

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Troy Telford
Sent: Thursday, August 17, 2006 8:57 PM
To: OPENIB
Subject: Re: [openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)

Quick Question: https://docs.mellanox.com/dm/ibg2/OFED_release_notes_Mellanox.txt states that SLES 9 SP3 is planned for OFED rev 1.1. Is this still something I can look forward to, or has it been pushed back?

Thanks,
-- 
Troy Telford
[openib-general] IPoIB
Hi,

Does IPoIB work across IB subnets? For example, if there are 4 IB subnets and all of them are reachable (meaning there is one host that is in all the IB subnets), is it possible to ping/ssh to hosts in other IB subnets using IPoIB (assuming there is no Ethernet)?

Regards,
John T.
Re: [openib-general] [PATCH] osm: handle local events
Quoting r. Yevgeny Kliteynik [EMAIL PROTECTED]:
> Index: libvendor/osm_vendor_ibumad.c
> ===================================================================
> --- libvendor/osm_vendor_ibumad.c (revision 8998)
> +++ libvendor/osm_vendor_ibumad.c (working copy)
> @@ -72,6 +72,7 @@
>  #include <opensm/osm_log.h>
>  #include <opensm/osm_mad_pool.h>
>  #include <vendor/osm_vendor_api.h>
> +#include <infiniband/verbs.h>
>
>  /s* OpenSM: Vendor AL/osm_umad_bind_info_t
>  * NAME

NAK.

This means that the SM becomes dependent on the uverbs module. I don't think this is a good idea. Let's not go there - SM should depend just on the umad module and libc. In particular, SM should work even on embedded platforms where uverbs do not necessarily work.

Further, hotplug events still do not seem to be handled, even with this patch.

For port events, it seems sane that the umad module could provide a way to listen for them. A recent patch to mthca converts fatal events to hotplug events, so fatal events can and should be handled as part of general hotplug support.

-- 
MST
[openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)
> This is already done - you can use OFED 1.1-rc2 or 1.0.1
>
> Tziporet

Hi,

again, regarding SLES9 SP3: is it also planned to support later kernels than 2.6.5-7.244? I tried to build against 2.6.5-7.276, but failed. I also filed a bug (#192) some time ago, but never got a response, unfortunately.

cheers.
 - Christian
Re: [openib-general] [PATCH] osm: handle local events
Hi MST,

Hotplug event of HCA FATAL is supported and this is the #1 issue we are troubled with.

Regarding other OpenSM vendor implementations (like the switch stack or gen1 stacks): this patch makes OpenSM agnostic to their support of events. If they do support events, OpenSM will register and behave according to the events; if they do not, no harm is done.

Regarding an ibumad implementation of event propagation: once it is supported, I will vote for rewriting the events feature on top of that interface. But it does not exist now, and uverbs does. If required, we could make OSM_VENDOR_SUPPORT_EVENTS depend on the availability of libibverbs. But today it is common practice to have that lib compiled in all distributions.

Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL

-----Original Message-----
From: Michael S. Tsirkin
Sent: Thursday, August 24, 2006 4:29 PM
To: Yevgeny Kliteynik
Cc: OPENIB; Roland Dreier; Hal Rosenstock; Eitan Zahavi
Subject: Re: [PATCH] osm: handle local events

Quoting r. Yevgeny Kliteynik [EMAIL PROTECTED]:
> Index: libvendor/osm_vendor_ibumad.c
> ===================================================================
> --- libvendor/osm_vendor_ibumad.c (revision 8998)
> +++ libvendor/osm_vendor_ibumad.c (working copy)
> @@ -72,6 +72,7 @@
>  #include <opensm/osm_log.h>
>  #include <opensm/osm_mad_pool.h>
>  #include <vendor/osm_vendor_api.h>
> +#include <infiniband/verbs.h>
>
>  /s* OpenSM: Vendor AL/osm_umad_bind_info_t
>  * NAME

NAK.

This means that the SM becomes dependent on the uverbs module. I don't think this is a good idea. Let's not go there - SM should depend just on the umad module and libc. In particular, SM should work even on embedded platforms where uverbs do not necessarily work.

Further, hotplug events still do not seem to be handled, even with this patch.

For port events, it seems sane that umad module could provide a way to listen for them. A recent patch to mthca converts fatal events to hotplug events, so fatal events can and should be handled as part of general hotplug support.

-- 
MST
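
For readers following the thread, the behaviour the patch describes (exit on a fatal device event, heavy sweep on a port error) could look roughly like the callback below on the SM side. This is a sketch under assumptions: the two `OSM_EVENT_*` values and the callback shape come from the posted patch, while `struct sm_state` and its flags are hypothetical stand-ins for whatever OpenSM actually keeps.

```c
#include <assert.h>
#include <stddef.h>

/* Bit values as defined in the posted patch. */
#define OSM_EVENT_FATAL     1
#define OSM_EVENT_PORT_ERR  2

/* Hypothetical stand-in for the SM state that the callback acts on. */
struct sm_state {
    int exit_requested;
    int heavy_sweep_requested;
};

/*
 * Same shape as osm_vend_events_callback_t in the patch (without the IN
 * annotation macro). The mask may carry more than one event.
 */
void sm_events_callback(int events_mask, void *const context)
{
    struct sm_state *sm = context;

    assert(sm != NULL);

    if (events_mask & OSM_EVENT_FATAL)
        sm->exit_requested = 1;          /* device is gone: shut the SM down */
    if (events_mask & OSM_EVENT_PORT_ERR)
        sm->heavy_sweep_requested = 1;   /* port changed: rediscover the fabric */
}
```

Because the callback only sees a mask and an opaque context, the SM core does not need to know whether the events came from uverbs, umad, or some other vendor layer, which is the point both sides of the thread seem to agree on.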
Re: [openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)
Christian Guggenberger wrote:
> is it also planned to support later kernels than 2.6.5-7.244 ? I tried to build against 2.6.5-7.276, but failed. I also filed a bug (#192) some time ago, but never got a response, unfortunately.

To solve this problem in 1.1 you should apply the following change to both build_env.sh and configure:

Replace the check for kernel 2.6.5-7.244* with 2.6.5-7.*

The configure script is located in OFED-1.1/SOURCES/openib-1.1.tgz, at openib-1.1/configure. (You need to unpack the tgz, change the script, and tar it up again.) The build_env.sh script is located in the OFED-1.1 directory.

Note that for OFED 1.1 we will replace the check (this will be done in RC3), since we understand that being so specific is limiting.

Tziporet
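
The effect of widening the pattern can be checked with POSIX `fnmatch()`, which applies shell-style globbing. This is a sketch that assumes the scripts compare the kernel release string against a glob; the helper name `kernel_matches` is hypothetical and the actual check in build_env.sh and configure is not shown in this thread.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if the kernel release string matches the shell-style pattern. */
int kernel_matches(const char *pattern, const char *release)
{
    assert(pattern != NULL && release != NULL);
    return fnmatch(pattern, release, 0) == 0;
}
```

Under that assumption, `kernel_matches("2.6.5-7.244*", "2.6.5-7.276-smp")` is false while `kernel_matches("2.6.5-7.*", "2.6.5-7.276-smp")` is true, so the wider pattern lets the 7.276 kernel past this particular check. It says nothing about later build steps that may also key on the kernel version.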
Re: [openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)
On Thu, Aug 24, 2006 at 06:08:07PM +0300, Tziporet Koren wrote:
> Christian Guggenberger wrote:
> > is it also planned to support later kernels than 2.6.5-7.244 ? I tried to build against 2.6.5-7.276, but failed. I also filed a bug (#192) some time ago, but never got a response, unfortunately.
>
> To solve this problem in 1.1 you should apply the following change on both build_env.sh and the configure: Replace the check for kernel 2.6.5-7.244* with 2.6.5-7.* The configure script is located at: OFED-1.1/SOURCES/openib-1.1.tgz under directory openib-1.1/configure. (Need to open the tgz change it and tar again)

I tried such things before already, but never got any step further. Compilation still aborts, and the cause is that the build system still does not know how to handle, e.g., 2.6.5-7.276-smp.

(excerpt from the logs:
...
patching file drivers/infiniband/hw/mthca/mthca_provider.c
Hunk #1 succeeded at 1289 (offset 2 lines).
Hunk #2 succeeded at 1314 (offset 2 lines).
No Patches found for 2.6.5-7.276-smp kernel
/bin/rm -f /var/tmp/OFEDRPM/BUILD/openib-1.1/configure.cache
...
)

cheers.
 - Christian
Re: [openib-general] [PATCH] osm: handle local events
Quoting r. Eitan Zahavi [EMAIL PROTECTED]:
> Subject: RE: [PATCH] osm: handle local events
>
> Hi MST, Hotplug event of HCA FATAL is supported and this is the #1 issue we are troubled with.

No, I mean real hotplug.

-- 
MST
Re: [openib-general] [PATCH] osm: handle local events
> > Hotplug event of HCA FATAL is supported and this is the #1 issue we are troubled with.
>
> No I mean real hotplug.

[EZ] And umad will provide it?

> -- MST
Re: [openib-general] librdmacm ABI issues with OFED 1.1
Michael S. Tsirkin wrote:
> Maybe the librdmacm part should be merged to svn? So librdmacm could try to read from misc, then from /sys/class/infiniband/rdma_cm, and then assume latest. It's good to have userspace code portable across distros ...

I can go with that.

- Sean
Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM
Michael S. Tsirkin wrote:
> > And even with these proposed changes, there's a race condition where the CM can timeout a connection after data is received over it, but before this event can be processed.
>
> Hmm. And what happens then?

The connection is aborted by the CM. The CM sends a REJ for the connection and bumps the QP into timewait.

- Sean
[openib-general] drop mthca from svn? (was: Rollup patch for ipath and OFED)
    Sean> Why not remove your code from SVN?

Along those lines, how would people feel if I removed the mthca kernel code from svn, and just maintained mthca in kernel.org git trees? I am getting heartily sick of double checkins for every mthca change...

 - R.
Re: [openib-general] dropped packets
I have no idea what your bug is, but as for this part:

> 1. Are there any race issues with ibv_get_cq_event? The example code (ud_pingpong) seems to imply that the correct sequence is
>
>    Start:
>      Call ibv_get_cq_event
>      Call ibv_ack_cq_event - anywhere so long as it happens before destroy_cq
>      Call ibv_req_notify_cq
>      Call ibv_poll_cq - just once not as usual until empty according to the example
>      Goto start

this pingpong example is somewhat of a special case, since we know in advance that no more than 1 send and 1 receive completion can ever be outstanding, so a single ibv_poll_cq suffices. (Although thinking about this now, there may be some theoretical races that could cause problems.)

 - R.
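
For the general case, where more than one completion can be outstanding, the usual approach is to re-arm notification first and then poll until the queue is empty, so that nothing arriving between the last poll and the re-arm is left behind. The sketch below shows only that ordering against an in-memory stand-in: `struct mock_cq` and the `mock_*` functions are hypothetical and are not libibverbs calls, and the event wait and acknowledge steps are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for a completion queue; not libibverbs. */
struct mock_cq {
    int pending;   /* completions waiting to be polled */
    int armed;     /* set to 1 once a notification has been requested */
};

/* Stand-in for a notify request: ask for an event on the next completion. */
void mock_req_notify(struct mock_cq *cq)
{
    cq->armed = 1;
}

/* Stand-in for polling one entry: returns 1 if a completion was taken. */
int mock_poll_one(struct mock_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/*
 * Handle one wakeup: re-arm first, then drain until the queue is empty.
 * Returns the number of completions processed.
 */
int handle_wakeup(struct mock_cq *cq)
{
    int handled = 0;

    assert(cq != NULL);

    mock_req_notify(cq);
    while (mock_poll_one(cq))
        handled++;
    return handled;
}
```

A single poll per wakeup, as in the pingpong example, would process one entry and leave the rest queued until some later event, which is harmless only when at most one completion can be pending.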
Re: [openib-general] librdmacm ABI issues with OFED 1.1
On Thu, 24 Aug 2006 09:18:50 -0700 Sean Hefty [EMAIL PROTECTED] wrote:
> Michael S. Tsirkin wrote:
> > Maybe the librdmacm part should be merged to svn? So librdmacm could try to read from misc, then from /sys/class/infiniband/rdma_cm, and then assume latest. It's good to have userspace code portable across distros ...
>
> I can go with that.
>
> - Sean

Something like this?

Ira

Index: openib/src/userspace/librdmacm/src/cma.c
===================================================================
--- openib/src/userspace/librdmacm/src/cma.c (revision 213)
+++ openib/src/userspace/librdmacm/src/cma.c (revision 220)
@@ -141,9 +141,13 @@
 {
         char value[8];
 
-        if (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+        if ((ibv_read_sysfs_file(ibv_get_sysfs_path(),
                                 "class/misc/rdma_cm/abi_version",
-                                value, sizeof value) < 0) {
+                                value, sizeof value) < 0) &&
+            (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+                                "class/infiniband_ucma/abi_version",
+                                value, sizeof value) < 0)) {
                 /*
                  * Older version of Linux do not have class/misc. To support
                  * backports, assume the most recent version of the ABI. If
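
The lookup order discussed in this thread (try one sysfs location, then another, then assume the latest ABI) can be sketched without libibverbs using plain stdio. The function name, its signature, and the decision to parse the file as a decimal integer are assumptions made for illustration; the library itself uses `ibv_read_sysfs_file()` as shown in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Try each candidate file in order and return the first integer that
 * parses. If none can be read, return fallback (the "assume latest" case).
 */
int read_abi_version(const char *const *paths, size_t n_paths, int fallback)
{
    size_t i;

    assert(paths != NULL || n_paths == 0);

    for (i = 0; i < n_paths; i++) {
        FILE *f = fopen(paths[i], "r");
        int version;
        int ok;

        if (!f)
            continue;   /* this location does not exist; try the next one */
        ok = (fscanf(f, "%d", &version) == 1);
        fclose(f);
        if (ok)
            return version;
    }
    return fallback;
}
```

A caller would pass the two locations named in the patch, prefixed with the sysfs mount point, and the newest ABI version it was built against as the fallback.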
Re: [openib-general] librdmacm ABI issues with OFED 1.1
I committed this change to the librdmacm in svn 9105. It still requires a backport patch for the kernel code.

- Sean
Re: [openib-general] drop mthca from svn?
Roland Dreier wrote:
>     Sean> Why not remove your code from SVN?
>
> Along those lines, how would people feel if I removed the mthca kernel code from svn, and just maintained mthca in kernel.org git trees? I am getting heartily sick of double checkins for every mthca change...

I think this is a fine idea. Both drivers should now use git as the golden source.
Re: [openib-general] [openfabrics-ewg] librdmacm ABI issues with OFED 1.1
Here is a patch that I used in my backport to 2.6.9 for RedHat EL4 - U3 (svn 8841 openib fixups patch). It also applies to svn 9006 and will likely work fine on the OFED 1.1 code: if you replace the current patch that removes the creation of the abi_version with this one, which creates it under /sys/class/infiniband_ucma, and modify the library to check either place (Ira's patch below), it should work.

woody

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ira Weiny
Sent: Thursday, August 24, 2006 9:58 AM
To: Sean Hefty
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; openib-general@openib.org
Subject: Re: [openfabrics-ewg] [openib-general] librdmacm ABI issues with OFED 1.1

On Thu, 24 Aug 2006 09:18:50 -0700 Sean Hefty [EMAIL PROTECTED] wrote:
> Michael S. Tsirkin wrote:
> > Maybe the librdmacm part should be merged to svn? So librdmacm could try to read from misc, then from /sys/class/infiniband/rdma_cm, and then assume latest. It's good to have userspace code portable across distros ...
>
> I can go with that.
>
> - Sean

Something like this?

Ira

Index: openib/src/userspace/librdmacm/src/cma.c
===================================================================
--- openib/src/userspace/librdmacm/src/cma.c (revision 213)
+++ openib/src/userspace/librdmacm/src/cma.c (revision 220)
@@ -141,9 +141,13 @@
 {
         char value[8];
 
-        if (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+        if ((ibv_read_sysfs_file(ibv_get_sysfs_path(),
                                 "class/misc/rdma_cm/abi_version",
-                                value, sizeof value) < 0) {
+                                value, sizeof value) < 0) &&
+            (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+                                "class/infiniband_ucma/abi_version",
+                                value, sizeof value) < 0)) {
                 /*
                  * Older version of Linux do not have class/misc. To support
                  * backports, assume the most recent version of the ABI. If
___
openfabrics-ewg mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openfabrics-ewg

ucma_abi_version_backport_to_2.6.9.patch
Description: ucma_abi_version_backport_to_2.6.9.patch
Re: [openib-general] drop mthca from svn? (was: Rollup patch for ipath and OFED)
On Thu, 2006-08-24 at 09:31 -0700, Roland Dreier wrote:
> Along those lines, how would people feel if I removed the mthca kernel code from svn, and just maintained mthca in kernel.org git trees?

+1 from me. We'll drop the ipath code, too.

b
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 01:57:39PM -0700, Sean Hefty wrote:
> > OK, great. I'm fine with people using things which are supported, but then we need the big, blinking Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics! sign.
>
> If an application were written to use Myrinet, would you consider it non-standard?

Er, this question is a bit existential for my taste. Myrinet has its own standards. We're trying to create *inter-operable* hardware and software in this community. So we follow the IB standard.

Myricom is doing their own thing, although of course they have software which obeys the Ethernet, VIA, DAPL, and other standards. And I expect that if they say they obey a standard, that they do. They're good people that way.

> It's up to the application to verify that the hardware that they're using provides the required features, or adjust accordingly, and publish those requirements to the end users.

If that was being done (and it isn't), it would still be bad for the ecosystem as a whole. But, basically, that's about the same as what I proposed, quoted above.

-- g
[openib-general] Utilities for sending traffic with different SL
Folks:

Is there a utility within the OFED1.0 package which can be used for generating traffic on different SL (akin to the Voltaire perf_main utility)?

Many thanks,
Suri
Re: [openib-general] basic IB doubt
On Wed, Aug 23, 2006 at 04:46:52PM -0700, Roland Dreier wrote:
> Yes, Mellanox documents that it is safe to rely on the last byte of an RDMA being written last.

OK, great. I'm fine with people using things which are supported, but then we need the big, blinking Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics! sign.

-- greg
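
The technique under discussion is a receiver that watches the final byte of a buffer and treats its arrival as proof that the whole RDMA write has landed. A minimal sketch of such a check follows; the function name and the marker-byte convention are hypothetical, and, as this thread points out, the check is only meaningful on hardware that documents in-order placement of RDMA data.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 once the final byte of the buffer holds the agreed marker.
 * This only implies "the whole payload has landed" on hardware that
 * writes RDMA data in order; that ordering is an assumption here and,
 * per this thread, is not something every device provides.
 */
int last_byte_arrived(const volatile unsigned char *buf, size_t len,
                      unsigned char marker)
{
    assert(buf != NULL && len > 0);
    return buf[len - 1] == marker;
}
```

The receiver would clear the last byte before posting the buffer and then spin on this check, and the sender must make sure the payload really ends in the marker value.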
Re: [openib-general] basic IB doubt
OK, great. I'm fine with people using things which are supported, but then we need the big, blinking Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics! sign. If an application were written to use Myrinet, would you consider it non-standard? The application is simply written to take advantage of specific hardware. It's up to the application to verify that the hardware that they're using provides the required features, or adjust accordingly, and publish those requirements to the end users. - Sean
Re: [openib-general] basic IB doubt
Greg wrote, OK, great. I'm fine with people using things which are supported, but then we need the big, blinking Warning! This program is non-standard, and won't work with many of the devices supported by Open Fabrics! sign. -- greg The other way to look at it is, the customer goes to the ISV and asks, what hardware should I buy, and the ISV says I support X version of MPI and vendor Y's hardware works with X version of MPI. The customer buys the software solution and will get the hardware that works with that software. So, if you want your hardware to work with basically almost any MPI today, since most of the MPIs assume this data placement ordering, then you will make your hardware so that it will guarantee this type of data delivery. If you would rather not, cause some spec says you don't have to, then you are just limiting the amount of hardware that you will sell. my 2 cents, woody
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote: The other way to look at it is, the customer goes to the ISV and asks, what hardware should I buy, and the ISV says I support X version of MPI and vendor Y's hardware works with X version of MPI. I thought the goal of InfiniBand was to create an ecosystem where you didn't have to do this. I guess I missed something somewhere. Adding undocumented requirements to a standard isn't the way to entice more people into implementing or using it. I would challenge you to find a single ISV that would prefer a situation where some infiniband middleware requires things which aren't in the standard. So, if you want your hardware to work with basically almost any MPI today, since most of the MPIs assume this data placement ordering, then you will make your hardware so that it will guarantee this type of data delivery. I think there's some confusion here between practicality and theory. There's no question that any IB vendor would make that kind of decision, although it will be expensive and annoying for iWarp vendors to do so. They'll feel forced to do so. But this is bad for the community. -- greg
Re: [openib-general] basic IB doubt
We're trying to create *inter-operable* hardware and software in this community. So we follow the IB standard. Atomic operations and RDD are optional, yet still part of the IB standard. An application that makes use of either of these isn't guaranteed to operate with all IB hardware. I'm not even sure that CAs are required to implement RDMA reads. It's up to the application to verify that the hardware that they're using provides the required features, or adjust accordingly, and publish those requirements to the end users. If that was being done (and it isn't), it would still be bad for the ecosystem as a whole. Applications should drive the requirements. Some poll on memory today. A lot of existing hardware provides support for this by guaranteeing that the last byte will always be written last. This doesn't mean that data cannot be placed out of order, only that the last byte is deferred. Again, if a vendor wants to work with applications written this way, then this is a feature that should be provided. If a vendor doesn't care about working with those applications, or wants to require that the apps be rewritten, then this feature isn't important. But I do not see an issue with a vendor adding value beyond what's defined in the spec. - Sean
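A minimal sketch of the "poll on memory" scheme described in the message above, assuming the adapter guarantees that the last byte of an RDMA write is placed last. The function names (arm_buffer, message_arrived, simulate_rdma_write) are hypothetical and not part of any verbs API; simulate_rdma_write only stands in for the adapter so the example runs without hardware. It also assumes the sender makes the final byte nonzero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Receiver: clear the flag (last) byte before exposing the buffer for RDMA. */
void arm_buffer(volatile uint8_t *buf, size_t len)
{
	assert(len > 0);
	buf[len - 1] = 0;
}

/* Receiver: one poll iteration. True once the final byte has landed.
 * Only valid if the hardware writes the last byte last (assumption). */
bool message_arrived(const volatile uint8_t *buf, size_t len)
{
	assert(len > 0);
	return buf[len - 1] != 0;
}

/* Stand-in for the adapter (hypothetical): place the payload in any order,
 * then place the final, nonzero byte last. */
void simulate_rdma_write(volatile uint8_t *dst, const uint8_t *src, size_t len)
{
	assert(len > 0 && src[len - 1] != 0);
	for (size_t i = 0; i + 1 < len; i++)
		dst[i] = src[i];
	dst[len - 1] = src[len - 1];	/* the deferred last byte */
}
```

A receiver would call arm_buffer() once, then spin on message_arrived() instead of polling the CQ. On hardware without the last-byte guarantee this check can report arrival before the rest of the payload is visible, which is the portability concern raised in this thread.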
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 02:58:18PM -0700, Sean Hefty wrote: We're trying to create *inter-operable* hardware and software in this community. So we follow the IB standard. Atomic operations and RDD are optional, yet still part of the IB standard. An application that makes use of either of these isn't guaranteed to operate with all IB hardware. But those are examples of things which are actually written down in the standard. The example we were talking about isn't. But I do not see an issue with a vendor adding value beyond what's defined in the spec. Neither do I. If you think so, you haven't understood my argument. -- greg
Re: [openib-general] basic IB doubt
Greg wrote, On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote: The other way to look at it is, the customer goes to the ISV and asks, what hardware should I buy, and the ISV says I support X version of MPI and vendor Y's hardware works with X version of MPI. I thought the goal of InfiniBand was to create an ecosystem where you didn't have to do this. I guess I missed something somewhere. The customers still buy the hardware cause it runs an application, they don't just go out and buy some cool InfiniBand hardware and then go see what they can use it for. Adding undocumented requirements to a standard isn't the way to entice more people into implementing or using it. Well, sometimes the applications drive the feature set, even though it is not part of the actual standard. Unfortunate, but a fact of life. I would challenge you to find a single ISV that would prefer a situation where some infiniband middleware requires things which aren't in the standard. If the feature gives them a huge advantage in performance (and it does) and all of the hardware vendors that they deal with already implement it, then yes, they will force, by de facto standard, that all other newcomers implement it or face the fact that no one will buy their hardware. It seems like that is what is happening in this case.
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:13:33PM -0700, Woodruff, Robert J wrote: If the feature gives them a huge advantage in performance (and it does) and all of the hardware vendors that they deal with already implement it, then yes, they will force, by de facto standard, that all other newcomers implement it or face the fact that no one will buy their hardware. It seems like that is what is happening in this case. In this case the feature reduces performance on one HCA and increases it on another. Which shows why it's a bad idea to pick features based on a single implementation. But you're still confusing practicality and theory. I can see why it makes practical sense for newcomers to implement this new, performance-reducing feature. But why is it theoretically good? And shouldn't it be added to the standard, before all the poor iWarp people discover the hard way that they need it? -- greg
Re: [openib-general] basic IB doubt
Greg wrote, reducing feature. But why is it theoretically good? And shouldn't it be added to the standard, before all the poor iWarp people discover the hard way that they need it? -- greg Yes, IMO, the iWarp folks and the IBTA should consider making this a requirement, but even if they do not the ISVs will still require it. That being said, if you can show the ISVs how they can implement their completion model faster using some other mechanism than what they do now, they would probably listen, as they are going to do what gives them the best performance, standard or not.
Re: [openib-general] basic IB doubt
On 8/24/06, Woodruff, Robert J [EMAIL PROTECTED] wrote: If the feature gives them a huge advantage in performance (and it does) and all of the hardware vendors that they deal with already implement it, then yes, they will force, by de facto standard, that all other newcomers implement it or face the fact that no one will buy their hardware. It seems like that is what is happening in this case. Actually, if a hardware implementation provided the same performance (in this case latency) by polling on a CQ as one where polling on memory was guaranteed to work, the customer may actually prefer the standard implementation. If the CQ entry could be updated faster than it is today, then polling the CQ would be a viable, not to mention IB- and iWARP-standard, solution. The problem is there's a considerable delay from when the last byte of data is written to when the CQE is written. My last measurement was 2us using a Mellanox PCI-X device. - Fab
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:28:44PM -0700, Woodruff, Robert J wrote: Yes, IMO, the iWarp folks and the IBTA should consider making this a requirement, but even if they do not the ISVs will still require it. Ahah. So with all this heat and light, we agree that it should be added to the standard. That being said, if you can show the ISVs how they can implement their completion model faster using some other mechanism than what they do now, they would probably listen, as they are going to do what gives them the best performance, standard or not. The other way to do it, faster on some HCAs, is to follow the standard. So, no showing needed. If MPI implementations implemented this in addition to the other, non-standard way, and automagically picked the right one, we wouldn't be having this discussion. So, I think we're ending up in agreement. Feel free to disagree ;-) -- greg
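One way to read "automagically picked the right one" is a per-device lookup that falls back to the standard CQ path unless the device is known to guarantee last-byte-last placement. The sketch below is only an illustration: the device names and the known_devices table are made up, and nothing here is an existing MPI or verbs interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum completion_mode {
	COMPLETE_VIA_CQ,	/* standard: wait for a completion queue entry */
	COMPLETE_VIA_LAST_BYTE	/* non-standard: poll the last byte of the buffer */
};

struct device_quirk {
	const char *name;
	bool last_byte_written_last;	/* vendor documents this guarantee */
};

/* Hypothetical table; a real one would be filled from vendor documentation. */
static const struct device_quirk known_devices[] = {
	{ "example_hca_a", true },
	{ "example_hca_b", false },
};

/* Use the memory-polling shortcut only where it is documented as safe;
 * unknown devices get the standard-conformant CQ path. */
enum completion_mode pick_completion_mode(const char *device_name)
{
	assert(device_name != NULL);
	for (size_t i = 0; i < sizeof known_devices / sizeof known_devices[0]; i++) {
		if (strcmp(known_devices[i].name, device_name) == 0 &&
		    known_devices[i].last_byte_written_last)
			return COMPLETE_VIA_LAST_BYTE;
	}
	return COMPLETE_VIA_CQ;
}
```

Defaulting to the CQ path keeps the software correct on hardware nobody has tested, at the cost of the extra latency discussed in this thread.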
Re: [openib-general] basic IB doubt
But you're still confusing practicality and theory. I can see why it makes practical sense for newcomers to implement this new, performance-reducing feature. But why is it theoretically good? I'm missing the standard you're using to judge what's theoretically good and bad. Applications are written this way today. A vendor can either: * Support those apps by providing the feature. * Require that the apps be rewritten to use their hardware. Whether apps should have been written this way seems irrelevant. They are, and we should make decisions based on that, including extending the spec and/or implementation if needed. - Sean
Re: [openib-general] basic IB doubt
Actually, if a hardware implementation provided the same performance (in this case latency) by polling on a CQ as one where polling on memory was guaranteed to work, the customer may actually prefer the standard implementation. Polling on a CQ involves a function call, synchronization to the CQ, and formatting a structure to return to the user. I don't see this ever being faster than polling memory. If the data is received in order, there's no additional delay between writing the data and polling on memory. A CQ entry requires a separate write, plus the overhead mentioned above. Even if the data arrived out of order, only the last byte needs to be deferred. Writing that byte shouldn't be any slower than writing a CQ entry. - Sean
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:37:21PM -0700, Sean Hefty wrote: I'm missing the standard you're using to judge what's theoretically good and bad. Having simpler programming to get good performance is a theoretical good. Extra hacks for specific hardware are theoretically bad, practically good only when they end up with much better performance. Silently non-standard software is bad by both accounts. Applications are written this way today. A vendor can either: * Support those apps by providing the feature. * Require that the apps be rewritten to use their hardware. That's what I was calling practical. It's clear what a hardware vendor will do in that case. Whether apps should have been written this way seems irrelevant. They are, and we should make decisions based on that, including extending the spec and/or implementation if needed. In this case we're talking about code which can easily be changed to follow the standard, in addition to having a hack mode that's faster on 1 particular hardware implementation. You seem to be implying that the applications are set in stone, and that their authors have no interest in making them standard-conformant. I don't think that's the case. If there is a standard extension which can provide better performance on 1 particular hardware implementation, let's add it to the standard. But let's also make the software standard-conformant on other hardware. -- greg
Re: [openib-general] basic IB doubt
[EMAIL PROTECTED] wrote: On 8/24/06, Woodruff, Robert J [EMAIL PROTECTED] wrote: If the feature gives them a huge advantage in performance (and it does) and all of the hardware vendors that they deal with already implement it, then yes, they will force, by de facto standard, that all other newcomers implement it or face the fact that no one will buy their hardware. It seems like that is what is happening in this case. Actually, if a hardware implementation provided the same performance (in this case latency) by polling on a CQ as one where polling on memory was guaranteed to work, the customer may actually prefer the standard implementation. Exactly. The correct solution is to rely on a CQE to signal a completion. That is why the standards are written the way they are. For iWARP there are network performance reasons why in-order memory writes will never be guaranteed. Now any time an *application* wants to write something that is vendor dependent then it is really up to that application developer to decide if the benefit is worthwhile. And in my opinion the in-order write solution is not a feature, it is a workaround for a slow CQ. So I certainly would not want to encourage that workaround, but applications are always free to do device- or vendor-specific workarounds in their application code. It's their code. But I would hope that we would agree that no code that is part of the project should rely on this feature.
Re: [openib-general] basic IB doubt
Greg wrote, In this case we're talking about code which can easily be changed to follow the standard, in addition to having a hack mode that's faster on 1 particular hardware implementation. You seem to be implying that the applications are set in stone, and that their authors have no interest in making them standard-conformant. I don't think that's the case. If there is a standard extension which can provide better performance on 1 particular hardware implementation, let's add it to the standard. But let's also make the software standard-conformant on other hardware. -- greg If the overhead in polling the CQ rather than memory was not so high, they would have used it, but they found that it added 2us to the latency and found they could get better performance if they polled memory, so that is what they did, and as long as the hardware (or at least the hardware they care about) supports it, they won't change it. If you can show them how they could get better (not just equal) performance using some other method, they would probably listen.
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:43:38PM -0700, Sean Hefty wrote: Actually, if a hardware implementation provided the same performance (in this case latency) by polling on a CQ as one where polling on memory was guaranteed to work, the customer may actually prefer the standard implementation. Polling on a CQ involves a function call, synchronization to the CQ, and formatting a structure to return to the user. I don't see this ever being faster than polling memory. Why don't you measure it, then? For example, an iWarp implementation is going to be slowed down if it has to reorder segments to deliver the last byte last. This expense might be more than the function call. You guess not, but... You're also assuming that programs are only checking the last byte of the buffer. For all you know, Mellanox is delivering the whole buffer in ascending order, and the user is checking bytes in the middle, too. Which is a hazard of not-yet-specified standards extensions. -- greg
Re: [openib-general] basic IB doubt
Catlin wrote, For iWARP there are network performance reasons why in-order memory writes will never be guaranteed. For iWarp, or any other RDMA over Ethernet protocol, the behavior is not to guarantee all packets are written in-order, just that the last byte of the last packet is written last. This can easily be implemented in an iWarp card or by the driver with minimal performance impact in most cases. So for example, if the last packet arrives before all the other packets have arrived, the iWarp card or driver does not place the data of the last packet until all the other packets have arrived. woody
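The deferral described above can be sketched as a small placement tracker: segments other than the last are copied to the target buffer as they arrive, in any order, while the final segment is held back until every other segment has been placed. This is only an illustration of the idea under simplifying assumptions (fixed segment size, small fixed limits, duplicates ignored); struct placement and its functions are hypothetical and do not describe any real iWARP card or driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SEGS      64
#define MAX_SEG_BYTES 256

struct placement {
	uint8_t *dst;			/* target buffer for the whole message */
	size_t seg_size;		/* bytes per segment (last may be shorter) */
	size_t nsegs;			/* total number of segments */
	size_t placed;			/* segments copied into dst so far */
	bool seen[MAX_SEGS];		/* segments already received */
	bool last_held;			/* final segment received but not yet placed */
	uint8_t held[MAX_SEG_BYTES];	/* staging copy of the final segment */
	size_t held_len;
};

void placement_init(struct placement *p, uint8_t *dst, size_t seg_size, size_t nsegs)
{
	assert(dst != NULL && nsegs > 0 && nsegs <= MAX_SEGS);
	assert(seg_size > 0 && seg_size <= MAX_SEG_BYTES);
	memset(p, 0, sizeof *p);
	p->dst = dst;
	p->seg_size = seg_size;
	p->nsegs = nsegs;
}

/* Handle one arriving segment. Returns true once the whole message,
 * including the deferred final segment, has been placed. */
bool placement_receive(struct placement *p, size_t idx, const uint8_t *data, size_t len)
{
	assert(idx < p->nsegs && len <= p->seg_size);
	if (p->seen[idx])			/* ignore duplicates */
		return p->placed == p->nsegs;
	p->seen[idx] = true;

	if (idx == p->nsegs - 1) {
		/* Final segment: stage it instead of placing it. */
		memcpy(p->held, data, len);
		p->held_len = len;
		p->last_held = true;
	} else {
		memcpy(p->dst + idx * p->seg_size, data, len);
		p->placed++;
	}

	/* Place the final segment only after all the others are in memory. */
	if (p->last_held && p->placed == p->nsegs - 1) {
		memcpy(p->dst + (p->nsegs - 1) * p->seg_size, p->held, p->held_len);
		p->placed++;
		p->last_held = false;
	}
	return p->placed == p->nsegs;
}
```

The cost of this scheme is the staging copy of one segment, and only when the final segment overtakes earlier ones. That matches the "minimal performance impact in most cases" claim, though whether it holds on real hardware is exactly what is disputed elsewhere in this thread.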
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 03:53:37PM -0700, Woodruff, Robert J wrote: If the overhead in polling the CQ rather than memory was not so high, they would have used it, but they found that it added 2us to the latency and found they could get better performance if they polled memory, I keep on mentioning that measuring one instance of one implementation isn't necessarily a good way to evaluate a standard. Or the implementation. We already have one example where Mellanox improved their implementation in the DDR generation such that a performance workaround -- choosing a 1k MTU for RC connections -- was no longer needed. I was pleased when we all agreed to remove defaulting to the workaround -- it was clearly the right thing to do. -- g
Re: [openib-general] basic IB doubt
Polling on a CQ involves a function call, synchronization to the CQ, and formatting a structure to return to the user. I don't see this ever being faster than polling memory. Why don't you measure it, then? Why? Reading a memory location directly will be faster than calling a function to read from a memory location. For example, an iWarp implementation is going to be slowed down if it has to reorder segments to deliver the last byte last. This expense might be more than the function call. It only needs to defer the last byte. The claim being made is that the last byte will be delivered last. Yes, I can implement something non-performant in this mode, but that doesn't invalidate polling memory as a general solution. You're also assuming that programs are only checking the last byte of the buffer. The applications I care about are polling on the last byte. - Sean
Re: [openib-general] basic IB doubt
On Thu, Aug 24, 2006 at 04:13:22PM -0700, Sean Hefty wrote: Why don't you measure it, then? Why? Reading a memory location directly will be faster than calling a function to read from a memory location. ... sigh. This is not true, there are quite obvious implementations where this is not true. We had one, actually, and changed it due to this issue. That was a practical choice. You're also assuming that programs are only checking the last byte of the buffer. The applications I care about are polling on the last byte. And tomorrow, the next app may depend on the behavior that all the bytes arrive in order. Slippery slope, you know. -- greg
Re: [openib-general] basic IB doubt
Woodruff, Robert J wrote: Catlin wrote, For iWARP there are network performance reasons why in-order memory writes will never be guaranteed. For iWarp, or any other RDMA over Ethernet protocol, the behavior is not to guarantee all packets are written in-order, just that the last byte of the last packet is written last. This can easily be implemented in an iWarp card or by the driver with minimal performance impact in most cases. So for example, if the last packet arrives before all the other packets have arrived, the iWarp card or driver does not place that data of the last packet until all the other packets have arrived. woody Fascinating argument. The correct forum for that is the IETF RDDP Workgroup, not a Linux open source project.
[openib-general] [PATCH v2] ib_usa: support userspace SA queries and multicast
Changes from v1: The ib_usa module exports two files: ib_usa_default and ib_usa_raw. Use of the ib_usa_default restricts the user to sending PathRecord, MultiPathRecord, MCMemberRecord, and ServiceRecord queries, and joining / leaving multicast groups. Use of ib_usa_raw allows any MADs to be sent to the SA. An administrator can set control on these files in any appropriate way.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]
---
Index: include/rdma/ib_usa.h
===================================================================
--- include/rdma/ib_usa.h	(revision 0)
+++ include/rdma/ib_usa.h	(revision 0)
@@ -0,0 +1,123 @@
+/*
+ * Copyright (c) 2006 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef IB_USA_H
+#define IB_USA_H
+
+#include <linux/types.h>
+#include <rdma/ib_sa.h>
+
+#define IB_USA_ABI_VERSION	1
+
+#define IB_USA_EVENT_DATA	256
+
+enum {
+	IB_USA_CMD_SEND_MAD,
+	IB_USA_CMD_GET_EVENT,
+	IB_USA_CMD_GET_DATA,
+	IB_USA_CMD_JOIN_MCAST,
+	IB_USA_CMD_FREE_ID,
+	IB_USA_CMD_GET_MCAST
+};
+
+enum {
+	IB_USA_EVENT_MAD,
+	IB_USA_EVENT_MCAST
+};
+
+struct ib_usa_cmd_hdr {
+	__u32 cmd;
+	__u16 in;
+	__u16 out;
+};
+
+struct ib_usa_send_mad {
+	__u64 response;		/* unused - reserved */
+	__u64 uid;
+	__u64 node_guid;
+	__u64 comp_mask;
+	__u64 attr;
+	__u8  port_num;
+	__u8  method;
+	__be16 attr_id;
+	__u32 timeout_ms;
+	__u32 retries;
+};
+
+struct ib_usa_join_mcast {
+	__u64 response;
+	__u64 uid;
+	__u64 node_guid;
+	__u64 comp_mask;
+	__u64 mcmember_rec;
+	__u8  port_num;
+};
+
+struct ib_usa_id_resp {
+	__u32 id;
+};
+
+struct ib_usa_free_resp {
+	__u32 events_reported;
+};
+
+struct ib_usa_free_id {
+	__u64 response;
+	__u32 id;
+};
+
+struct ib_usa_get_event {
+	__u64 response;
+};
+
+struct ib_usa_event_resp {
+	__u64 uid;
+	__u32 id;
+	__u32 event;
+	__u32 status;
+	__u32 data_len;
+	__u8  data[IB_USA_EVENT_DATA];
+};
+
+struct ib_usa_get_data {
+	__u64 response;
+	__u32 id;
+};
+
+struct ib_usa_get_mcast {
+	__u64 response;
+	__u64 node_guid;
+	__u8  mgid[16];
+	__u8  port_num;
+};
+
+#endif /* IB_USA_H */
Index: core/usa.c
===================================================================
--- core/usa.c	(revision 0)
+++ core/usa.c	(revision 0)
@@ -0,0 +1,846 @@
+/*
+ * Copyright (c) 2006 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY,
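The body of core/usa.c is cut off in this archive, so the way userspace submits a command cannot be confirmed from the patch text. As a rough sketch only, the code below assumes a convention in which a command is a struct ib_usa_cmd_hdr immediately followed by the command structure, with `in` set to the command size, `out` to the response size, and `response` holding a userspace pointer for the kernel to fill. The struct layouts mirror the header above using <stdint.h> types; usa_pack_free_id is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirrors of definitions from include/rdma/ib_usa.h in the patch. */
enum {
	IB_USA_CMD_SEND_MAD,
	IB_USA_CMD_GET_EVENT,
	IB_USA_CMD_GET_DATA,
	IB_USA_CMD_JOIN_MCAST,
	IB_USA_CMD_FREE_ID,
	IB_USA_CMD_GET_MCAST
};

struct ib_usa_cmd_hdr {
	uint32_t cmd;
	uint16_t in;
	uint16_t out;
};

struct ib_usa_free_resp {
	uint32_t events_reported;
};

struct ib_usa_free_id {
	uint64_t response;
	uint32_t id;
};

/* Hypothetical helper: build a FREE_ID command in buf.
 * ASSUMPTION: header followed directly by the command struct; the real
 * framing lives in the truncated core/usa.c and may differ.
 * Returns the number of bytes to write, or 0 if buf is too small. */
size_t usa_pack_free_id(uint8_t *buf, size_t buflen, uint32_t id,
			struct ib_usa_free_resp *resp)
{
	struct ib_usa_cmd_hdr hdr;
	struct ib_usa_free_id cmd;

	assert(buf != NULL && resp != NULL);
	if (buflen < sizeof hdr + sizeof cmd)
		return 0;

	memset(&cmd, 0, sizeof cmd);
	cmd.response = (uint64_t)(uintptr_t)resp;	/* where the reply should go */
	cmd.id = id;

	hdr.cmd = IB_USA_CMD_FREE_ID;
	hdr.in = (uint16_t)sizeof cmd;
	hdr.out = (uint16_t)sizeof *resp;

	memcpy(buf, &hdr, sizeof hdr);
	memcpy(buf + sizeof hdr, &cmd, sizeof cmd);
	return sizeof hdr + sizeof cmd;
}
```

Under the same assumption, the packed buffer would be handed to write() on an open descriptor for one of the two exported files (ib_usa_default or ib_usa_raw); the device paths are not given in the message, so none are shown here.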
Re: [openib-general] basic IB doubt
At 07:46 PM 8/23/2006, Roland Dreier wrote: Greg> Actually, that leads me to a question: does the vendor of that adaptor say that this is actually safe? Just because something behaves one way most of the time doesn't mean it does it all of the time. So is it really smart to write non-standard-conforming programs unless the vendor stands behind that behavior? Yes, Mellanox documents that it is safe to rely on the last byte of an RDMA being written last. How does an adapter guarantee that no bridges or other intervening devices reorder their writes, or for that matter flush them to memory at all!? Without signalling the host processor, that is. Isn't that what the dma_sync() API is all about? Tom.
[openib-general] about rdma and send/recv model
Although SDP is not working correctly now, it can easily support many existing applications based on the Sockets API. Patrick says RDMA is a connected model and has a serious scalability problem. But does the send/recv model have, or support, an SDP-like API? zhu