Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Michael S. Tsirkin
Quoting r. Sean Hefty [EMAIL PROTECTED]:
 If you look at Woody's backport patches, I believe that he moves the RDMA CM
 files to /sys/class/infiniband/rdma_cm and updates the librdmacm to read the
 abi_version from there.

Maybe the librdmacm part should be merged to svn?
So librdmacm could try to read from misc, then from
/sys/class/infiniband/rdma_cm, and then assume latest.
It's good to have userspace code portable across distros ...
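
A minimal sketch of that lookup order, written in plain C with standard I/O rather than the libibverbs sysfs helpers. The function name read_abi_version, the absolute paths in the usage comment, and the fallback value are assumptions for illustration only, not librdmacm code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Try each candidate file in order and return the first ABI version
 * that parses as a decimal integer. If no file can be read, return
 * `fallback` (the caller would pass the latest ABI it was built for).
 */
static int read_abi_version(const char *const paths[], size_t npaths,
                            int fallback)
{
    assert(paths != NULL);

    for (size_t i = 0; i < npaths; i++) {
        char buf[16];
        char *end;
        long v;
        FILE *f = fopen(paths[i], "r");

        if (!f)
            continue;               /* not present here, try the next location */
        if (!fgets(buf, sizeof buf, f)) {
            fclose(f);
            continue;               /* empty or unreadable file */
        }
        fclose(f);

        v = strtol(buf, &end, 10);
        if (end != buf)
            return (int)v;          /* at least one digit was parsed */
    }
    return fallback;
}

/*
 * Hypothetical usage, assuming sysfs is mounted at /sys:
 *
 *   const char *const paths[] = {
 *       "/sys/class/misc/rdma_cm/abi_version",
 *       "/sys/class/infiniband/rdma_cm/abi_version",
 *   };
 *   int abi = read_abi_version(paths, 2, LATEST_ABI);
 */
```

Real code would presumably go through ibv_get_sysfs_path() instead of hard-coding /sys, since the mount point can differ.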

-- 
MST

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



[openib-general] dropped packets

2006-08-24 Thread Robert Pearson

Roland,

I am trying to write a user level application that receives multicast UD packets at
user level. I am seeing about 1-2 % packet loss between the send side and the
receive side apparently independent of the packet rate for low rates. (Heavily
traced sends and receives with very low rates still drop packets even though
there are more packets posted on the receive side than are sent.) I have a
couple of questions:

1. Are there any race issues with ibv_get_cq_event? The
example code (ud_pingpong) seems to imply that the correct sequence is

Start:
 Call ibv_get_cq_event
 Call ibv_ack_cq_event - anywhere so long as it happens before destroy_cq
 Call ibv_req_notify_cq
 Call ibv_poll_cq - just once, not as usual until empty, according to the example
 Goto start

In the old days we called request notify and poll
until poll was empty on a notify thread in order to prevent a race.

2. When I post say 500 receive buffers and send say 200 send
buffers and tag the sends with a sequence number I often see one or two missing
sequence numbers at the receive side at the poll_cq interface having checked at
the post and poll interfaces of the send side to see that all the correct
sequence numbers went out. I am not sure how this can be possible regardless of
the notification scheme used.
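
One way to pin down exactly which sequence numbers go missing is to record every number seen at the receive-side poll and list the gaps afterwards. The following is only a sketch under stated assumptions: sequence numbers start at 0, stay below a fixed bound, and all names (seq_tracker, seq_record, seq_missing, SEQ_MAX) are hypothetical and not part of libibverbs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEQ_MAX 1024            /* assumed upper bound on sequence numbers per run */

struct seq_tracker {
    uint8_t  seen[SEQ_MAX];     /* 1 if that sequence number was received */
    uint32_t duplicates;        /* count of numbers received more than once */
};

static void seq_init(struct seq_tracker *t)
{
    memset(t, 0, sizeof *t);
}

/* Record one received sequence number; returns -1 if it is out of range. */
static int seq_record(struct seq_tracker *t, uint32_t seq)
{
    assert(t != NULL);
    if (seq >= SEQ_MAX)
        return -1;
    if (t->seen[seq])
        t->duplicates++;
    t->seen[seq] = 1;
    return 0;
}

/*
 * Store up to `max` missing sequence numbers from [0, sent) in `out`
 * and return the total number missing (which may exceed `max`).
 */
static uint32_t seq_missing(const struct seq_tracker *t, uint32_t sent,
                            uint32_t *out, uint32_t max)
{
    uint32_t n = 0;

    for (uint32_t s = 0; s < sent && s < SEQ_MAX; s++) {
        if (!t->seen[s]) {
            if (n < max)
                out[n] = s;
            n++;
        }
    }
    return n;
}
```

Calling seq_record() for every receive completion and seq_missing() at the end of a run separates "never arrived at poll_cq" from "arrived but was mishandled later", and the duplicates counter catches the opposite failure.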



I would love for this to be a programming error in my code but I can't
figure out how I can mess it up between post_send and poll_cq on the receive
side. I see the same behavior between systems and with a loopback between two
ports on the same HCA.

Please let me know if this rings any bells.

Bob Pearson


Re: [openib-general] [PATCH] mthca: various bug fixes for mthca_query_qp

2006-08-24 Thread Jack Morgenstein
On Wednesday 23 August 2006 23:25, Roland Dreier wrote:
   5. Return the send_cq, receive cq and srq handles.  ib_query_qp() needs them
      (required by IB Spec). ibv_query_qp() overwrites these values in user-space
      with appropriate user-space values.
 
   +  qp_init_attr->send_cq   = ibqp->send_cq;
   +  qp_init_attr->recv_cq   = ibqp->recv_cq;
   +  qp_init_attr->srq       = ibqp->srq;
 
 I really disagree with this change.  It's silly to do this copying
 since the consumer already has the ibqp pointer.  And it's especially
 silly to put this in a low-level driver, since there's nothing
 device-specific about it.
 
Note that the same thing is done in user-space (albeit in the verbs layer):

libibverbs/src/cmd.c, lines 678 ff. (procedure ibv_cmd_query_qp() ):

init_attr->send_cq  = qp->send_cq;
init_attr->recv_cq  = qp->recv_cq;
init_attr->srq      = qp->srq;

Either return these values in both kernel and user, or remove them from
the user-space verb.

Regarding putting this in the low level kernel driver, I agree.
A patch for core/verbs.c is given below.

(P.S. IMHO, the correct thing to do is to remove the above lines from cmd.c,
 and not return to the user stuff that s/he already has available).

- Jack

-
Index: ofed_1_1/drivers/infiniband/core/verbs.c
===
--- ofed_1_1.orig/drivers/infiniband/core/verbs.c   2006-08-03 14:30:21.0 +0300
+++ ofed_1_1/drivers/infiniband/core/verbs.c    2006-08-24 10:03:51.831333000 +0300
@@ -556,9 +556,17 @@ int ib_query_qp(struct ib_qp *qp,
                int qp_attr_mask,
                struct ib_qp_init_attr *qp_init_attr)
 {
-       return qp->device->query_qp ?
-               qp->device->query_qp(qp, qp_attr, qp_attr_mask, qp_init_attr) :
-               -ENOSYS;
+       int rc = -ENOSYS;
+       if (qp->device->query_qp) {
+               rc = qp->device->query_qp(qp, qp_attr,
+                                         qp_attr_mask, qp_init_attr);
+               if(!rc) {
+                       qp_init_attr->recv_cq = qp->recv_cq;
+                       qp_init_attr->send_cq = qp->send_cq;
+                       qp_init_attr->srq = qp->srq;
+               }
+       }
+       return rc;
 }
 EXPORT_SYMBOL(ib_query_qp);
 





Re: [openib-general] dropped packets

2006-08-24 Thread Michael S. Tsirkin
Quoting r. Robert Pearson [EMAIL PROTECTED]:
 Subject: dropped packets
 
 Roland,
 
  
 
 I am trying to write a user level application that receives multicast UD 
 packets at user level. I am seeing about 1-2 % packet loss between the send 
 side and the receive side apparently independent of the packet rate for low 
 rates. (Heavily traced sends
 and receives with very low rates still drop packets even though there are 
 more packets posted on the receive side than are sent.) I have a couple of 
 questions:

What kind of hardware do you have?

-- 
MST




[openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Yevgeny Kliteynik
Hi Hal

This patch implements first item of the OSM todo list.

OpenSM opens a thread that is listening for events on the SM's port.
The events that are being taken care of are IBV_EVENT_DEVICE_FATAL and
IBV_EVENT_PORT_ERROR.

In case of IBV_EVENT_DEVICE_FATAL, osm is forced to exit.
in case of IBV_EVENT_PORT_ERROR, osm initiates heavy sweep.
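
The policy described above can be sketched as a small mapping from the event mask to an action. The two OSM_EVENT_* values are the ones defined in the patch; the enum and the function name sm_action_for_events are hypothetical and only illustrate the decision, not the actual OpenSM state manager code:

```c
#include <assert.h>

#define OSM_EVENT_FATAL     1   /* value taken from the patch */
#define OSM_EVENT_PORT_ERR  2   /* value taken from the patch */

/* Hypothetical action codes, for illustration only. */
enum sm_action {
    SM_ACTION_NONE,
    SM_ACTION_HEAVY_SWEEP,
    SM_ACTION_EXIT
};

/*
 * Decide what the SM should do for a mask of received events.
 * A fatal device event wins over a port error when both bits are set,
 * since there is no point sweeping through a dead device.
 */
static enum sm_action sm_action_for_events(int events_mask)
{
    assert(events_mask >= 0);

    if (events_mask & OSM_EVENT_FATAL)
        return SM_ACTION_EXIT;
    if (events_mask & OSM_EVENT_PORT_ERR)
        return SM_ACTION_HEAVY_SWEEP;
    return SM_ACTION_NONE;
}
```

Treating the argument as a bit mask matches the events_mask parameter name in the callback prototype; whether the vendor layer ever reports both bits in one call is an assumption.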

Yevgeny

Signed-off-by:  Yevgeny Kliteynik [EMAIL PROTECTED]

Index: include/opensm/osm_sm_mad_ctrl.h
===
--- include/opensm/osm_sm_mad_ctrl.h(revision 8998)
+++ include/opensm/osm_sm_mad_ctrl.h(working copy)
@@ -109,6 +109,7 @@ typedef struct _osm_sm_mad_ctrl
osm_mad_pool_t  *p_mad_pool;
osm_vl15_t  *p_vl15;
osm_vendor_t*p_vendor;
+   struct _osm_state_mgr   *p_state_mgr; 
osm_bind_handle_t   h_bind;
cl_plock_t  *p_lock;
cl_dispatcher_t *p_disp;
@@ -130,6 +131,9 @@ typedef struct _osm_sm_mad_ctrl
 *  p_vendor
 *  Pointer to the vendor specific interfaces object.
 *
+*  p_state_mgr
+*  Pointer to the state manager object.
+*
 *  h_bind
 *  Bind handle returned by the transport layer.
 *
@@ -233,6 +237,7 @@ osm_sm_mad_ctrl_init(
IN osm_mad_pool_t* const p_mad_pool,
IN osm_vl15_t* const p_vl15,
IN osm_vendor_t* const p_vendor,
+   IN struct _osm_state_mgr* const p_state_mgr,
IN osm_log_t* const p_log,
IN osm_stats_t* const p_stats,
IN cl_plock_t* const p_lock,
@@ -251,6 +256,9 @@ osm_sm_mad_ctrl_init(
 *  p_vendor
 *  [in] Pointer to the vendor specific interfaces object.
 *
+*  p_state_mgr
+*  [in] Pointer to the state manager object.
+*
 *  p_log
 *  [in] Pointer to the log object.
 *
Index: include/vendor/osm_vendor_ibumad.h
===
--- include/vendor/osm_vendor_ibumad.h  (revision 8998)
+++ include/vendor/osm_vendor_ibumad.h  (working copy)
@@ -74,6 +74,8 @@ BEGIN_C_DECLS
 #define OSM_UMAD_MAX_CAS   32
 #define OSM_UMAD_MAX_PORTS_PER_CA  2
 
+#define OSM_VENDOR_SUPPORT_EVENTS
+
 /* OpenIB gen2 doesn't support RMPP yet */
 
 /s* OpenSM: Vendor UMAD/osm_ca_info_t
@@ -179,6 +181,10 @@ typedefstruct _osm_vendor
int umad_port_id;
void *receiver;
int issmfd;
+   cl_thread_t events_thread;  
+   void * events_callback;
+   void * sm_context;
+   struct ibv_context * ibv_context;
 } osm_vendor_t;
 
 #define OSM_BIND_INVALID_HANDLE 0
Index: include/vendor/osm_vendor_api.h
===
--- include/vendor/osm_vendor_api.h (revision 8998)
+++ include/vendor/osm_vendor_api.h (working copy)
@@ -526,6 +526,110 @@ osm_vendor_set_debug(
 * SEE ALSO
 */
 
+#ifdef OSM_VENDOR_SUPPORT_EVENTS
+
+#define OSM_EVENT_FATAL 1
+#define OSM_EVENT_PORT_ERR  2
+
+/s* OpenSM Vendor API/osm_vend_events_callback_t
+* NAME
+*  osm_vend_events_callback_t
+*
+* DESCRIPTION
+*  Function prototype for the vendor events callback.
+*  The vendor layer calls this function on driver events.
+*
+* SYNOPSIS
+*/
+typedef void 
+(*osm_vend_events_callback_t)(
+   IN int events_mask,
+   IN void * const context );
+/*
+* PARAMETERS
+*  events_mask
+* [in] The received event(s).
+*
+*  context
+* [in] Context supplied as the sm_context argument in
+*  the osm_vendor_reg_events_cb call
+*
+* RETURN VALUES
+*  None.
+*
+* NOTES
+*
+* SEE ALSO
+*  osm_vendor_reg_events_cb osm_vendor_unreg_events_cb
+*/
+
+/f* OpenSM Vendor API/osm_vendor_reg_events_cb
+* NAME
+*   osm_vendor_reg_events_cb
+*
+* DESCRIPTION
+*  Registers the events callback function and start the events
+*  thread
+*
+* SYNOPSIS
+*/
+int 
+osm_vendor_reg_events_cb(
+   IN osm_vendor_t * const p_vend,
+   IN void * const sm_callback,
+   IN void * const sm_context);
+/*
+* PARAMETERS
+*   p_vend
+* [in] vendor handle.
+*
+*   sm_callback
+* [in] Callback function that should be called when
+*  the event is received.
+*
+*  sm_context
+* [in] Context supplied as the context argument in
+*  the subsequent calls to the sm_callback function
+*
+* RETURN VALUE
+*  IB_SUCCESS if OK.
+*
+* NOTES
+*
+* SEE ALSO
+*  osm_vend_events_callback_t osm_vendor_unreg_events_cb
+*/
+
+/f* OpenSM Vendor API/osm_vendor_unreg_events_cb
+* NAME
+*   osm_vendor_unreg_events_cb
+*
+* DESCRIPTION
+*  Un-Registers the events callback function and stops the events
+*  thread
+*
+* SYNOPSIS
+*/
+void 
+osm_vendor_unreg_events_cb(
+   IN osm_vendor_t * const p_vend);
+/*
+* PARAMETERS
+*   p_vend
+* [in] vendor handle.
+*
+*
+* RETURN VALUE
+*  None.
+*
+* NOTES
+*

Re: [openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Eitan Zahavi
I did not see this on the reflector.
We did have some mailer problems. So I am resending to the list

One more thing to add:
The only other event we considered was PORT_ACTIVE.
But as it turns out, the event is only generated when the port moves into the
ACTIVE state, which means an SM already handled it...

EZ

Yevgeny Kliteynik wrote:
 Hi Hal
 
 This patch implements first item of the OSM todo list.
 
 OpenSM opens a thread that is listening for events on the SM's port.
 The events that are being taken care of are IBV_EVENT_DEVICE_FATAL and
 IBV_EVENT_PORT_ERROR.
 
 In case of IBV_EVENT_DEVICE_FATAL, osm is forced to exit.
 in case of IBV_EVENT_PORT_ERROR, osm initiates heavy sweep.
 
 Yevgeny
 
 Signed-off-by:  Yevgeny Kliteynik [EMAIL PROTECTED]
 
 Index: include/opensm/osm_sm_mad_ctrl.h
 ===
 --- include/opensm/osm_sm_mad_ctrl.h  (revision 8998)
 +++ include/opensm/osm_sm_mad_ctrl.h  (working copy)
 @@ -109,6 +109,7 @@ typedef struct _osm_sm_mad_ctrl
   osm_mad_pool_t  *p_mad_pool;
   osm_vl15_t  *p_vl15;
   osm_vendor_t*p_vendor;
 + struct _osm_state_mgr   *p_state_mgr; 
   osm_bind_handle_t   h_bind;
   cl_plock_t  *p_lock;
   cl_dispatcher_t *p_disp;
 @@ -130,6 +131,9 @@ typedef struct _osm_sm_mad_ctrl
  *p_vendor
  *Pointer to the vendor specific interfaces object.
  *
 +*p_state_mgr
 +*Pointer to the state manager object.
 +*
  *h_bind
  *Bind handle returned by the transport layer.
  *
 @@ -233,6 +237,7 @@ osm_sm_mad_ctrl_init(
   IN osm_mad_pool_t* const p_mad_pool,
   IN osm_vl15_t* const p_vl15,
   IN osm_vendor_t* const p_vendor,
 + IN struct _osm_state_mgr* const p_state_mgr,
   IN osm_log_t* const p_log,
   IN osm_stats_t* const p_stats,
   IN cl_plock_t* const p_lock,
 @@ -251,6 +256,9 @@ osm_sm_mad_ctrl_init(
  *p_vendor
  *[in] Pointer to the vendor specific interfaces object.
  *
 +*p_state_mgr
 +*[in] Pointer to the state manager object.
 +*
  *p_log
  *[in] Pointer to the log object.
  *
 Index: include/vendor/osm_vendor_ibumad.h
 ===
 --- include/vendor/osm_vendor_ibumad.h(revision 8998)
 +++ include/vendor/osm_vendor_ibumad.h(working copy)
 @@ -74,6 +74,8 @@ BEGIN_C_DECLS
  #define OSM_UMAD_MAX_CAS 32
  #define OSM_UMAD_MAX_PORTS_PER_CA2
  
 +#define OSM_VENDOR_SUPPORT_EVENTS
 +
  /* OpenIB gen2 doesn't support RMPP yet */
  
  /s* OpenSM: Vendor UMAD/osm_ca_info_t
 @@ -179,6 +181,10 @@ typedef  struct _osm_vendor
   int umad_port_id;
   void *receiver;
   int issmfd;
 + cl_thread_t events_thread;  
 + void * events_callback;
 + void * sm_context;
 + struct ibv_context * ibv_context;
  } osm_vendor_t;
  
  #define OSM_BIND_INVALID_HANDLE 0
 Index: include/vendor/osm_vendor_api.h
 ===
 --- include/vendor/osm_vendor_api.h   (revision 8998)
 +++ include/vendor/osm_vendor_api.h   (working copy)
 @@ -526,6 +526,110 @@ osm_vendor_set_debug(
  * SEE ALSO
  */
  
 +#ifdef OSM_VENDOR_SUPPORT_EVENTS
 +
 +#define OSM_EVENT_FATAL 1
 +#define OSM_EVENT_PORT_ERR  2
 +
 +/s* OpenSM Vendor API/osm_vend_events_callback_t
 +* NAME
 +*  osm_vend_events_callback_t
 +*
 +* DESCRIPTION
 +*  Function prototype for the vendor events callback.
 +*  The vendor layer calls this function on driver events.
 +*
 +* SYNOPSIS
 +*/
 +typedef void 
 +(*osm_vend_events_callback_t)(
 +   IN int events_mask,
 +   IN void * const context );
 +/*
 +* PARAMETERS
 +*  events_mask
 +* [in] The received event(s).
 +*
 +*  context
 +* [in] Context supplied as the sm_context argument in
 +*  the osm_vendor_reg_events_cb call
 +*
 +* RETURN VALUES
 +*  None.
 +*
 +* NOTES
 +*
 +* SEE ALSO
 +*  osm_vendor_reg_events_cb osm_vendor_unreg_events_cb
 +*/
 +
 +/f* OpenSM Vendor API/osm_vendor_reg_events_cb
 +* NAME
 +*   osm_vendor_reg_events_cb
 +*
 +* DESCRIPTION
 +*  Registers the events callback function and start the events
 +*  thread
 +*
 +* SYNOPSIS
 +*/
 +int 
 +osm_vendor_reg_events_cb(
 +   IN osm_vendor_t * const p_vend,
 +   IN void * const sm_callback,
 +   IN void * const sm_context);
 +/*
 +* PARAMETERS
 +*   p_vend
 +* [in] vendor handle.
 +*
 +*   sm_callback
 +* [in] Callback function that should be called when
 +*  the event is received.
 +*
 +*  sm_context
 +* [in] Context supplied as the context argument in
 +*  the subsequent calls to the sm_callback function
 +*
 +* RETURN VALUE
 +*  IB_SUCCESS if OK.
 +*
 +* NOTES
 +*
 +* SEE ALSO
 +*  osm_vend_events_callback_t 

Re: [openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)

2006-08-24 Thread Tziporet Koren
This is already done - you can use OFED 1.1-rc2 or 1.0.1

Tziporet

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Troy Telford
Sent: Thursday, August 17, 2006 8:57 PM
To: OPENIB
Subject: Re: [openib-general] OFED 1.1-rc2 is delayed for next Monday
(Aug-21)

Quick Question:   
https://docs.mellanox.com/dm/ibg2/OFED_release_notes_Mellanox.txt

states that SLES 9 SP3 is planned for OFED rev 1.1.

Is this still something I can look forward to, or has it been pushed
back?

Thanks,
-- 
Troy Telford




[openib-general] IPoIB

2006-08-24 Thread john t
Hi,

Does IPoIB work across IB subnets? For example, if there are 4 IB subnets and all the IB subnets are reachable (meaning there is one host that is in all the IB subnets), is it possible to ping/ssh to hosts in other IB subnets using IPoIB (assuming there is no ethernet)?


Regards,
John T.

Re: [openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Michael S. Tsirkin
Quoting r. Yevgeny Kliteynik [EMAIL PROTECTED]:
 Index: libvendor/osm_vendor_ibumad.c
 ===
 --- libvendor/osm_vendor_ibumad.c (revision 8998)
 +++ libvendor/osm_vendor_ibumad.c (working copy)
 @@ -72,6 +72,7 @@
  #include <opensm/osm_log.h>
  #include <opensm/osm_mad_pool.h>
  #include <vendor/osm_vendor_api.h>
 +#include <infiniband/verbs.h>
  
  /s* OpenSM: Vendor AL/osm_umad_bind_info_t
   * NAME

NAK.

This means that the SM becomes dependent on the uverbs module.  I don't think
this is a good idea.  Let's not go there - SM should depend just on the umad
module and libc.  In particular, SM should work even on embedded platforms where
uverbs do not necessarily work.

Further, hotplug events still do not seem to be handled, even with this patch.

For port events, it seems sane that umad module could provide a way
to listen for them.

A recent patch to mthca converts fatal events to hotplug events, so fatal events
can and should be handled as part of general hotplug support.

-- 
MST




[openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)

2006-08-24 Thread Christian Guggenberger

This is already done - you can use OFED 1.1-rc2 or 1.0.1

Tziporet

Hi,

again, regarding SLES9 SP3.

is it also planned to support later kernels than 2.6.5-7.244 ? I tried
to build against 2.6.5-7.276, but failed. I also filed a bug (#192)
some time ago, but never got a response, unfortunately.

cheers.

 - Christian



Re: [openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Eitan Zahavi
Hi MST,

Hotplug event of HCA FATAL is supported and this is the #1 issue we are
troubled with.

Regarding other OpenSM vendor implementations (like the switch stack or
gen1 stacks): this patch makes OpenSM agnostic to their support of events.
If they do support events, OpenSM will register and behave according to
the events; if they do not, then no harm is done.

Regarding the ibumad implementation of event propagation: once it is
supported I will vote for rewriting the events feature on top of that
interface. But it does not exist now and uverbs does exist.

If required we could make OSM_VENDOR_SUPPORT_EVENTS depend on the
availability of libibverbs.
But today it is common practice to have that lib compiled in all
distributions.

Eitan Zahavi
Senior Engineering Director, Software Architect
Mellanox Technologies LTD
Tel:+972-4-9097208
Fax:+972-4-9593245
P.O. Box 586 Yokneam 20692 ISRAEL


 -Original Message-
 From: Michael S. Tsirkin
 Sent: Thursday, August 24, 2006 4:29 PM
 To: Yevgeny Kliteynik
 Cc: OPENIB; Roland Dreier; Hal Rosenstock; Eitan Zahavi
 Subject: Re: [PATCH] osm: handle local events
 
 Quoting r. Yevgeny Kliteynik [EMAIL PROTECTED]:
  Index: libvendor/osm_vendor_ibumad.c
  ===
  --- libvendor/osm_vendor_ibumad.c   (revision 8998)
  +++ libvendor/osm_vendor_ibumad.c   (working copy)
  @@ -72,6 +72,7 @@
   #include <opensm/osm_log.h>
   #include <opensm/osm_mad_pool.h>
   #include <vendor/osm_vendor_api.h>
  +#include <infiniband/verbs.h>
 
   /s* OpenSM: Vendor AL/osm_umad_bind_info_t
    * NAME
 
 NAK.
 
 This means that the SM becomes dependent on the uverbs module.  I don't
 think this is a good idea.  Let's not go there - SM should depend just on the
 umad module and libc.  In particular, SM should work even on embedded
 platforms where uverbs do not necessarily work.
 
 Further, hotplug events still do not seem to be handled, even with this patch.
 
 For port events, it seems sane that umad module could provide a way to listen
 for them.
 
 A recent patch to mthca converts fatal events to hotplug events, so fatal
 events can and should be handled as part of general hotplug support.
 
 --
 MST




Re: [openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)

2006-08-24 Thread Tziporet Koren
Christian Guggenberger wrote:
 is it also planned to support later kernels than 2.6.5-7.244 ? I tried
 to build against 2.6.5-7.276, but failed. I also filed a bug (#192)
 some time ago, but never got a response, unfortunately.


   

To solve this problem in 1.1 you should apply the following change on 
both build_env.sh and the configure:
Replace the check for kernel 2.6.5-7.244* with 2.6.5-7.*

The configure script is located at: OFED-1.1/SOURCES/openib-1.1.tgz 
under directory openib-1.1/configure. (Need to open the tgz change it 
and tar again)

The build_env.sh is located on the OFED-1.1 directory.

Note that for OFED 1.1 we will replace the check (will be done in RC3) 
since we understand it's limiting us to be so specific.

Tziporet




Re: [openib-general] openib-general] OFED 1.1-rc2 is delayed for next Monday (Aug-21)

2006-08-24 Thread Christian Guggenberger
On Thu, Aug 24, 2006 at 06:08:07PM +0300, Tziporet Koren wrote:
 Christian Guggenberger wrote:
 is it also planned to support later kernels than 2.6.5-7.244 ? I tried
 to build against 2.6.5-7.276, but failed. I also filed a bug (#192)
 some time ago, but never got a response, unfortunately.
 
 
   
 
 To solve this problem in 1.1 you should apply the following change on 
 both build_env.sh and the configure:
 Replace the check for kernel 2.6.5-7.244* with 2.6.5-7.*
 
 The configure script is located at: OFED-1.1/SOURCES/openib-1.1.tgz 
 under directory openib-1.1/configure. (Need to open the tgz change it 
 and tar again)
 
I tried such things before already, but never got any step further.
Compilation still aborts, and the cause for this is that the
build system still does not know how to handle, e.g., 2.6.5-7.276-smp. 
(excerpt from the logs:

...
patching file drivers/infiniband/hw/mthca/mthca_provider.c
Hunk #1 succeeded at 1289 (offset 2 lines).
Hunk #2 succeeded at 1314 (offset 2 lines).

 No Patches found for 2.6.5-7.276-smp kernel
 /bin/rm -f /var/tmp/OFEDRPM/BUILD/openib-1.1/configure.cache
...
)

cheers.
 - Christian



Re: [openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Michael S. Tsirkin
Quoting r. Eitan Zahavi [EMAIL PROTECTED]:
 Subject: RE: [PATCH] osm: handle local events
 
 Hi MST,
 
 Hotplug event of HCA FATAL is supported and this is the #1 issue we are 
 troubled with.

No I mean real hotplug.

-- 
MST




Re: [openib-general] [PATCH] osm: handle local events

2006-08-24 Thread Eitan Zahavi
  Hotplug event of HCA FATAL is supported and this is the #1 issue we are
  troubled with.
 
 No I mean real hotplug.
[EZ] And umad will provide it?
 
 --
 MST




Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Sean Hefty
Michael S. Tsirkin wrote:
 Maybe the librdmacm part should be merged to svn?
 So librdmacm could try to read from misc, then from
 /sys/class/infiniband/rdma_cm, and then assume latest.
 It's good to have userspace code portable across distros ...

I can go with that.

- Sean




Re: [openib-general] [PATCH 0/4] Dispatch communication related events to the IB CM

2006-08-24 Thread Sean Hefty
Michael S. Tsirkin wrote:
And even with these proposed changes, there's a race condition where the CM
can timeout a connection after data is received over it, but before this event
can be processed.
 
 
 Hmm. And what happens then?

The connection is aborted by the CM.  The CM sends a REJ for the connection and 
bumps the QP into timewait.

- Sean




[openib-general] drop mthca from svn? (was: Rollup patch for ipath and OFED)

2006-08-24 Thread Roland Dreier
    Sean> Why not remove your code from SVN?

Along those lines, how would people feel if I removed the mthca kernel
code from svn, and just maintained mthca in kernel.org git trees?  I
am getting heartily sick of double checkins for every mthca change...

 - R.




Re: [openib-general] dropped packets

2006-08-24 Thread Roland Dreier
I have no idea what your bug is, but as for this part:

  1. Are there any race issues with ibv_get_cq_event? The example code
  (ud_pingpong) seems to imply that the correct sequence is
  
  Start:
    Call ibv_get_cq_event
    Call ibv_ack_cq_event - anywhere so long as it happens before destroy_cq
    Call ibv_req_notify_cq
    Call ibv_poll_cq - just once not as usual until empty according to the example
    Goto start

this pingpong example is somewhat of a special case, since we know in
advance that no more than 1 send and 1 receive completion can ever be
outstanding, so a single ibv_poll_cq suffices.
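
Outside that special case, the difference between polling once and polling until empty can be shown with a toy model. The sketch below is NOT the libibverbs API: struct toy_cq and every toy_* function are made up, and the model only assumes that a notification fires at most once per arm and that completions arriving while the CQ is not armed raise no event.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy completion queue; a stand-in for illustration, not struct ibv_cq. */
struct toy_cq {
    int  pending;   /* completions waiting to be polled */
    bool armed;     /* set by a notify request, cleared when an event fires */
    int  events;    /* events waiting to be read by the consumer */
};

/* Producer side: one completion arrives. */
static void toy_complete(struct toy_cq *cq)
{
    cq->pending++;
    if (cq->armed) {            /* one event per arm, then disarmed */
        cq->armed = false;
        cq->events++;
    }
}

/* Consumer side: take one completion; returns 1 if one was taken, else 0. */
static int toy_poll(struct toy_cq *cq)
{
    if (cq->pending == 0)
        return 0;
    cq->pending--;
    return 1;
}

/* Consumer side: get the event, re-arm, then poll once or until empty. */
static int toy_handle_event(struct toy_cq *cq, bool drain)
{
    int reaped = 0;

    assert(cq->events > 0);
    cq->events--;               /* models getting and acking the event */
    cq->armed = true;           /* models requesting the next notification */
    if (drain) {
        while (toy_poll(cq))
            reaped++;
    } else {
        reaped += toy_poll(cq);
    }
    return reaped;
}

/*
 * Scenario: the CQ is armed and two completions arrive before the
 * consumer runs. Returns how many completions are left in the CQ with
 * no event pending to wake the consumer.
 */
static int toy_stranded(bool drain)
{
    struct toy_cq cq = { 0, true, 0 };

    toy_complete(&cq);          /* fires the event and disarms the CQ */
    toy_complete(&cq);          /* no event: the CQ is not armed */
    while (cq.events > 0)
        toy_handle_event(&cq, drain);
    return cq.pending;
}
```

In this model toy_stranded(false) leaves one completion behind until some later completion happens to raise an event, while toy_stranded(true) leaves none. With at most one completion outstanding the two variants behave the same.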

(Although thinking about this now, there may be some theoretical races
that could cause problems)

 - R.




Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Ira Weiny
On Thu, 24 Aug 2006 09:18:50 -0700
Sean Hefty [EMAIL PROTECTED] wrote:

 Michael S. Tsirkin wrote:
  Maybe the librdmacm part should be merged to svn?
  So librdmacm could try to read from misc, then from
  /sys/class/infiniband/rdma_cm, and then assume latest.
  It's good to have userspace code portable across distros ...
 
 I can go with that.
 
 - Sean
 

Something like this?

Ira


Index: openib/src/userspace/librdmacm/src/cma.c
===
--- openib/src/userspace/librdmacm/src/cma.c(revision 213)
+++ openib/src/userspace/librdmacm/src/cma.c(revision 220)
@@ -141,9 +141,13 @@
 {
        char value[8];
 
-       if (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+       if ((ibv_read_sysfs_file(ibv_get_sysfs_path(),
                                 "class/misc/rdma_cm/abi_version",
-                                value, sizeof value) < 0) {
+                                value, sizeof value) < 0)
+               &&
+               (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+                               "class/infiniband_ucma/abi_version",
+                               value, sizeof value) < 0)) {
                /*
                 * Older version of Linux do not have class/misc.  To support
                 * backports, assume the most recent version of the ABI.  If





Re: [openib-general] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Sean Hefty
I committed this change to the librdmacm in svn 9105.  It still requires a 
backport patch for the kernel code.

- Sean




Re: [openib-general] drop mthca from svn?

2006-08-24 Thread Robert Walsh
Roland Dreier wrote:
 Sean Why not remove your code from SVN?
 
 Along those lines, how would people feel if I removed the mthca kernel
 code from svn, and just maintained mthca in kernel.org git trees?  I
 am getting heartily sick of double checkins for every mthca change...

I think this is a fine idea.  Both drivers should now use git as the 
golden source.




Re: [openib-general] [openfabrics-ewg] librdmacm ABI issues with OFED 1.1

2006-08-24 Thread Woodruff, Robert J

Here is a patch that I used in my backport to 2.6.9 for RedHat EL4 - U3 
svn 8841 openib fixups patch. It also applies to svn9006 and will
likely work fine on the OFED 1.1 code if you replace the current patch
that removes the creation of the abi_version with this one that 
creates it under /sys/class/infiniband_uma

and modify the library to check either place (Ira's patch below), it
should work.

woody
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Ira Weiny
Sent: Thursday, August 24, 2006 9:58 AM
To: Sean Hefty
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED];
openib-general@openib.org
Subject: Re: [openfabrics-ewg] [openib-general] librdmacm ABI issues
with OFED 1.1

On Thu, 24 Aug 2006 09:18:50 -0700
Sean Hefty [EMAIL PROTECTED] wrote:

 Michael S. Tsirkin wrote:
  Maybe the librdmacm part should be merged to svn?
  So librdmacm could try to read from misc, then from
  /sys/class/infiniband/rdma_cm, and then assume latest.
  It's good to have userspace code portable across distros ...
 
 I can go with that.
 
 - Sean
 

Something like this?

Ira


Index: openib/src/userspace/librdmacm/src/cma.c
===
--- openib/src/userspace/librdmacm/src/cma.c(revision 213)
+++ openib/src/userspace/librdmacm/src/cma.c(revision 220)
@@ -141,9 +141,13 @@
 {
        char value[8];
 
-       if (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+       if ((ibv_read_sysfs_file(ibv_get_sysfs_path(),
                                 "class/misc/rdma_cm/abi_version",
-                                value, sizeof value) < 0) {
+                                value, sizeof value) < 0)
+               &&
+               (ibv_read_sysfs_file(ibv_get_sysfs_path(),
+                               "class/infiniband_ucma/abi_version",
+                               value, sizeof value) < 0)) {
                /*
                 * Older version of Linux do not have class/misc.  To support
                 * backports, assume the most recent version of the ABI.  If

___
openfabrics-ewg mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openfabrics-ewg


ucma_abi_version_backport_to_2.6.9.patch
Description: ucma_abi_version_backport_to_2.6.9.patch

Re: [openib-general] drop mthca from svn? (was: Rollup patch for ipath and OFED)

2006-08-24 Thread Bryan O'Sullivan
On Thu, 2006-08-24 at 09:31 -0700, Roland Dreier wrote:

 Along those lines, how would people feel if I removed the mthca kernel
 code from svn, and just maintained mthca in kernel.org git trees?

+1 from me.  We'll drop the ipath code, too.

b





Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 01:57:39PM -0700, Sean Hefty wrote:
>> OK, great. I'm fine with people using things which are supported, but
>> then we need the big, blinking "Warning! This program is non-standard, and
>> won't work with many of the devices supported by Open Fabrics!" sign.
>
> If an application were written to use Myrinet, would you consider it
> non-standard?

Er, this question is a bit existential for my taste. Myrinet has its
own standards. We're trying to create *inter-operable* hardware and
software in this community. So we follow the IB standard. Myricom is
doing their own thing, although of course they have software which
obeys the Ethernet, VIA, DAPL, and other standards. And I expect that
if they say they obey a standard, that they do. They're good people
that way.

> It's up to the application to verify that the hardware that they're
> using provides the required features, or adjust accordingly, and
> publish those requirements to the end users.

If that was being done (and it isn't), it would still be bad for the
ecosystem as a whole. But, basically, that's about the same as what I
proposed, quoted above.

-- g




[openib-general] Utilities for sending traffic with different SL

2006-08-24 Thread Suresh Shelvapille
Folks:

Is there a utility within the OFED1.0 package which can be used for
generating traffic on different SL (akin to the Voltaire perf_main utility)?


Many thanks,
Suri
 






Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Wed, Aug 23, 2006 at 04:46:52PM -0700, Roland Dreier wrote:

> Yes, Mellanox documents that it is safe to rely on the last byte of an
> RDMA being written last.

OK, great. I'm fine with people using things which are supported, but
then we need the big, blinking "Warning! This program is non-standard, and
won't work with many of the devices supported by Open Fabrics!" sign.

-- greg





Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
> OK, great. I'm fine with people using things which are supported, but
> then we need the big, blinking "Warning! This program is non-standard, and
> won't work with many of the devices supported by Open Fabrics!" sign.

If an application were written to use Myrinet, would you consider it
non-standard?  The application is simply written to take advantage of specific
hardware.  It's up to the application to verify that the hardware that they're
using provides the required features, or adjust accordingly, and publish those
requirements to the end users.

- Sean




Re: [openib-general] basic IB doubt

2006-08-24 Thread Woodruff, Robert J
Greg wrote,

> OK, great. I'm fine with people using things which are supported, but
> then we need the big, blinking "Warning! This program is non-standard, and
> won't work with many of the devices supported by Open Fabrics!" sign.
>
> -- greg

The other way to look at it is, the customer goes to the ISV and asks,
what hardware should I buy, and the ISV says I support X version of MPI
and vendor Y's hardware works with X version of MPI. 
The customer buys the software solution and will get the hardware that 
works with that software. So, if you want your hardware to work with
basically almost any MPI today, since most of the MPIs assume this
data placement ordering, then you will make your hardware so that
it will guarantee this type of data delivery. 

If you would rather not, cause some spec says you don't have to,
then you are just limiting the amount of hardware that you will sell.

my 2 cents,

woody




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote:

> The other way to look at it is, the customer goes to the ISV and asks,
> what hardware should I buy, and the ISV says I support X version of MPI
> and vendor Y's hardware works with X version of MPI.

I thought the goal of InfiniBand was to create an ecosystem where you
didn't have to do this. I guess I missed something somewhere.

Adding undocumented requirements to a standard isn't the way to entice
more people into implementing or using it.

I would challenge you to find a single ISV that would prefer a
situation where some infiniband middleware requires things which
aren't in the standard.

> So, if you want your hardware to work with
> basically almost any MPI today, since most of the MPIs assume this
> data placement ordering, then you will make your hardware so that
> it will guarantee this type of data delivery.

I think there's some confusion here between practicality and theory.

There's no question that any IB vendor would make that kind of
decision, although it will be expensive and annoying for iWarp vendors
to do so. They'll feel forced to do so. But this is bad for the
community.

-- greg




Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
> We're trying to create *inter-operable* hardware and
> software in this community. So we follow the IB standard.

Atomic operations and RDD are optional, yet still part of the IB standard.  An
application that makes use of either of these isn't guaranteed to operate with
all IB hardware.  I'm not even sure that CAs are required to implement RDMA
reads.

>> It's up to the application to verify that the hardware that they're
>> using provides the required features, or adjust accordingly, and
>> publish those requirements to the end users.
>
> If that was being done (and it isn't), it would still be bad for the
> ecosystem as a whole.

Applications should drive the requirements.  Some poll on memory today.  A lot
of existing hardware provides support for this by guaranteeing that the last
byte will always be written last.  This doesn't mean that data cannot be placed
out of order, only that the last byte is deferred.

Again, if a vendor wants to work with applications written this way, then this
is a feature that should be provided.  If a vendor doesn't care about working
with those applications, or wants to require that the apps be rewritten, then
this feature isn't important.

But I do not see an issue with a vendor adding value beyond what's defined in
the spec.
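
The memory-polling technique this message refers to can be sketched in a few lines of C. This is a minimal sketch, assuming the adapter really does write the last byte of an RDMA write last (which, as the thread notes, the IB spec does not promise); the function name wait_for_last_byte and the marker convention are hypothetical, not taken from any MPI implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Spin until the last byte of buf equals marker, or until max_spins checks
 * have been made. Returns 1 if the marker was seen, 0 on timeout.
 *
 * Assumption: the sender puts marker in the final byte of every message and
 * the receiver resets that byte to a different value before each transfer.
 */
static int wait_for_last_byte(const volatile uint8_t *buf, size_t len,
			      uint8_t marker, unsigned long max_spins)
{
	unsigned long i;

	assert(buf != NULL && len > 0);

	for (i = 0; i < max_spins; i++) {
		if (buf[len - 1] == marker)
			return 1;	/* last byte landed; payload assumed complete */
	}
	return 0;
}
```

On real hardware the caller would probably also need a read barrier before touching the rest of the buffer; that detail is left out here.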

- Sean




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 02:58:18PM -0700, Sean Hefty wrote:
>> We're trying to create *inter-operable* hardware and
>> software in this community. So we follow the IB standard.
>
> Atomic operations and RDD are optional, yet still part of the IB standard.
> An application that makes use of either of these isn't guaranteed to operate
> with all IB hardware.

But those are examples of things which are actually written down in the standard.

The example we were talking about isn't.

> But I do not see an issue with a vendor adding value beyond what's defined in
> the spec.

Neither do I. If you think so, you haven't understood my argument.

-- greg




Re: [openib-general] basic IB doubt

2006-08-24 Thread Woodruff, Robert J
Greg wrote,
> On Thu, Aug 24, 2006 at 02:10:38PM -0700, Woodruff, Robert J wrote:
>
>> The other way to look at it is, the customer goes to the ISV and asks,
>> what hardware should I buy, and the ISV says I support X version of MPI
>> and vendor Y's hardware works with X version of MPI.
>
> I thought the goal of InfiniBand was to create an ecosystem where you
> didn't have to do this. I guess I missed something somewhere.

The customers still buy the hardware because it runs an application;
they don't just go out and buy some cool InfiniBand hardware and
then go see what they can use it for.

> Adding undocumented requirements to a standard isn't the way to entice
> more people into implementing or using it.

Well, sometimes the applications drive the feature set, even though it
is not part of the actual standard. Unfortunate, but a fact of life.

> I would challenge you to find a single ISV that would prefer a
> situation where some infiniband middleware requires things which
> aren't in the standard.

If the feature gives them a huge advantage in performance (and it does)
and all of the hardware vendors that they deal with already implement
it, then yes, they will force, by de facto standard, that all other
newcomers implement it or face the fact that no one will buy their
hardware. It seems like that is what is happening in this case.






Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 03:13:33PM -0700, Woodruff, Robert J wrote:

> If the feature gives them a huge advantage in performance (and it
> does) and all of the hardware vendors that they deal with already
> implement it, then yes, they will force, by defacto standard that
> all other newcomers implement it or face the fact that no one will
> buy their hardware. It seems like that is what is happening in this
> case.

In this case the feature reduces performance on one HCA and increases
it on another. Which shows why it's a bad idea to pick features based
on a single implementation.

But you're still confusing practicality and theory. I can see why it makes
practical sense for newcomers to implement this new, performance-reducing
feature. But why is it theoretically good? And shouldn't it
be added to the standard, before all the poor iWarp people discover
the hard way that they need it?

-- greg





Re: [openib-general] basic IB doubt

2006-08-24 Thread Woodruff, Robert J
Greg wrote,
> reducing feature. But why is it theoretically good? And shouldn't it
> be added to the standard, before all the poor iWarp people discover
> the hard way that they need it?
>
> -- greg

Yes, IMO, the iWarp folks and the IBTA should consider making this a 
requirement, but even if they do not the ISVs will still require it.

That being said, if you can show the ISVs how they can implement
their completion model faster using some other mechanism than
what they do now, they would probably listen, as they are going to do
what gives them the best performance, standard or not. 




Re: [openib-general] basic IB doubt

2006-08-24 Thread Fabian Tillier
On 8/24/06, Woodruff, Robert J [EMAIL PROTECTED] wrote:
> If the feature gives them a huge advantage in performance (and it does)
> and all of the hardware vendors that they deal with already implement
> it, then yes, they will force, by de facto standard, that all other
> newcomers implement it or face the fact that no one will buy their
> hardware. It seems like that is what is happening in this case.

Actually, if a hardware implementation provided the same performance
(in this case latency) by polling on a CQ as one where polling on
memory was guaranteed to work, the customer may actually prefer the
standard implementation.

If the CQ entry could be updated faster than it is today, then polling
the CQ would be a viable solution, not to mention one that follows the
IB and iWARP standards.  The problem is there's a considerable delay from when the
last byte of data is written to when the CQE is written.  My last
measurement was 2us using a Mellanox PCI-X device.

- Fab




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 03:28:44PM -0700, Woodruff, Robert J wrote:

> Yes, IMO, the iWarp folks and the IBTA should consider making this a
> requirement, but even if they do not the ISVs will still require it.

Ahah. So with all this heat and light, we agree that it should be
added to the standard.

> That being said, if you can show the ISVs how they can implement
> their completion model faster using some other mechanism than
> what they do now, they would probably listen, as they are going to do
> what gives them the best performance, standard or not.

The other way to do it, faster on some HCAs, is to follow the
standard.  So, no showing needed. If MPI implementations implemented
this in addition to the other, non-standard way, and automagically
picked the right one, we wouldn't be having this discussion.

So, I think we're ending up in agreement. Feel free to disagree ;-)

-- greg






Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
> But you're still confusing practicality and theory. I can see why it makes
> practical sense for newcomers to implement this new, performance-reducing
> feature. But why is it theoretically good?

I'm missing the standard you're using to judge what's theoretically good and
bad.

Applications are written this way today.  A vendor can either:

* Support those apps by providing the feature.
* Require that the apps be rewritten to use their hardware.

Whether apps should have been written this way seems irrelevant.  They are, and
we should make decisions based on that, including extending the spec and/or
implementation if needed.

- Sean




Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
> Actually, if a hardware implementation provided the same performance
> (in this case latency) by polling on a CQ as one where polling on
> memory was guaranteed to work, the customer may actually prefer the
> standard implementation.

Polling on a CQ involves a function call, synchronization to the CQ, and
formatting a structure to return to the user.  I don't see this ever being
faster than polling memory.

If the data is received in order, there's no additional delay between writing
the data and polling on memory.  A CQ entry requires a separate write, plus the
overhead mentioned above.  Even if the data arrived out of order, only the last
byte needs to be deferred.  Writing that byte shouldn't be any slower than
writing a CQ entry.

- Sean




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 03:37:21PM -0700, Sean Hefty wrote:

> I'm missing the standard you're using to judge what's theoretically good and
> bad.

Having simpler programming to get good performance is a theoretical
good. Extra hacks for specific hardware are theoretically bad, and
practically good only when they end up with much better performance.

Silently non-standard software is bad by both accounts.

> Applications are written this way today.  A vendor can either:
>
> * Support those apps by providing the feature.
> * Require that the apps be rewritten to use their hardware.

That's what I was calling practical. It's clear what a hardware
vendor will do in that case.

> Whether apps should have been written this way seems irrelevant.  They are,
> and we should make decisions based on that, including extending the spec
> and/or implementation if needed.

In this case we're talking about code which can easily be changed to
follow the standard, in addition to having a hack mode that's faster
on 1 particular hardware implementation.

You seem to be implying that the applications are set in stone, and
that their authors have no interest in making them
standard-conformant. I don't think that's the case. If there is a
standard extension which can provide better performance on 1 particular
hardware implementation, let's add it to the standard. But let's
also make the software standard-conformant on other hardware.

-- greg





Re: [openib-general] basic IB doubt

2006-08-24 Thread Caitlin Bestler
[EMAIL PROTECTED] wrote:
> On 8/24/06, Woodruff, Robert J [EMAIL PROTECTED] wrote:
>> If the feature gives them a huge advantage in performance (and it
>> does) and all of the hardware vendors that they deal with already
>> implement it, then yes, they will force, by de facto standard, that all
>> other newcomers implement it or face the fact that no one will buy
>> their hardware. It seems like that is what is happening in this case.
>
> Actually, if a hardware implementation provided the same
> performance (in this case latency) by polling on a CQ as one
> where polling on memory was guaranteed to work, the customer
> may actually prefer the standard implementation.

Exactly. The correct solution is to rely on a CQE to signal
a completion. That is why the standards are written the way
they are.

For iWARP there are network performance reasons why in-order
memory writes will never be guaranteed.

Now any time an *application* wants to write something that
is vendor dependent then it is really up to that application
developer to decide if the benefit is worthwhile. And in my
opinion the in-order write solution is not a feature, it
is a workaround for a slow CQ. So I certainly would not want
to encourage that workaround, but applications are always
free to do device- or vendor-specific workarounds in their
application code. It's their code.

But I would hope that we would agree that no code that is
part of the project should rely on this feature.







Re: [openib-general] basic IB doubt

2006-08-24 Thread Woodruff, Robert J
Greg wrote,
> In this case we're talking about code which can easily be changed to
> follow the standard, in addition to having a hack mode that's faster
> on 1 particular hardware implementation.
>
> You seem to be implying that the applications are set in stone, and
> that their authors have no interest in making them
> standard-conformant. I don't think that's the case. If there is a
> standard extension which can provide better performance on 1 particular
> hardware implementation, let's add it to the standard. But let's
> also make the software standard-conformant on other hardware.
>
> -- greg

If the overhead in polling the CQ rather than memory was not so high,
they would have used it, but they found that it added  2us to the
latency and found they could get better performance if they polled
memory, so that is what they did, and as long as the hardware (or at
least the hardware they care about) supports it, they won't change it.
If you can show them how they could get better (not just equal)
performance using some other method, they would probably listen.




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 03:43:38PM -0700, Sean Hefty wrote:
>> Actually, if a hardware implementation provided the same performance
>> (in this case latency) by polling on a CQ as one where polling on
>> memory was guaranteed to work, the customer may actually prefer the
>> standard implementation.
>
> Polling on a CQ involves a function call, synchronization to the CQ, and
> formatting a structure to return to the user.  I don't see this ever being
> faster than polling memory.

Why don't you measure it, then? For example, an iWarp implementation
is going to be slowed down if it has to reorder segments to deliver
the last byte last. This expense might be more than the function call.
You guess not, but...

You're also assuming that programs are only checking the last byte of
the buffer. For all you know, Mellanox is delivering the whole buffer
in ascending order, and the user is checking bytes in the middle, too.
Which is a hazard of not-yet-specified standards extensions.

-- greg





Re: [openib-general] basic IB doubt

2006-08-24 Thread Woodruff, Robert J
Caitlin wrote,

> For iWARP there are network performance reasons why in-order
> memory writes will never be guaranteed.

For iWarp, or any other RDMA over Ethernet protocol, the
behavior is not to guarantee all packets are written
in-order, just that the last byte of the last packet is
written last. This can easily be implemented in an iWarp 
card or by the driver with minimal performance impact in most cases.

So for example, if the last packet arrives before all the other packets
have arrived, the iWarp card or driver does not place that data
of the last packet until all the other packets have arrived. 
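
The deferral described above (hold the last packet back until every other packet has been placed) can be sketched as a small reassembly routine. This is a toy model under stated assumptions, not how any iWARP card or driver works: struct reassembly and place_segment are hypothetical names, segments are fixed-size except possibly the last, there are no duplicates, and placement is modeled with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct reassembly {
	uint8_t *dst;		/* destination buffer for the whole message */
	size_t seg_size;	/* bytes per segment (the last may be shorter) */
	unsigned nsegs;		/* total segments in the message */
	unsigned placed;	/* segments already copied into dst */
	int last_held;		/* 1 if the last segment is parked in held[] */
	size_t held_len;
	uint8_t held[2048];	/* holding area for an early last segment */
};

static void reassembly_init(struct reassembly *r, uint8_t *dst,
			    size_t seg_size, unsigned nsegs)
{
	memset(r, 0, sizeof *r);
	r->dst = dst;
	r->seg_size = seg_size;
	r->nsegs = nsegs;
}

/* Returns 1 once every segment, including the last, has been placed. */
static int place_segment(struct reassembly *r, unsigned idx,
			 const uint8_t *data, size_t len)
{
	assert(idx < r->nsegs && len <= r->seg_size && len <= sizeof r->held);

	if (idx == r->nsegs - 1 && r->placed < r->nsegs - 1) {
		/* Last segment arrived early: park it instead of placing it. */
		memcpy(r->held, data, len);
		r->held_len = len;
		r->last_held = 1;
		return 0;
	}

	memcpy(r->dst + (size_t)idx * r->seg_size, data, len);
	r->placed++;

	if (r->last_held && r->placed == r->nsegs - 1) {
		/* Everything else is in place, so the held segment goes last. */
		memcpy(r->dst + (size_t)(r->nsegs - 1) * r->seg_size,
		       r->held, r->held_len);
		r->last_held = 0;
		r->placed++;
	}
	return r->placed == r->nsegs;
}
```

Only the final segment ever takes the extra copy, and only when it overtakes an earlier one. The sketch says nothing about the cost of the holding buffer in hardware, which is the part the later replies question.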

woody




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 03:53:37PM -0700, Woodruff, Robert J wrote:

> If the overhead in polling the CQ rather than memory was not so
> high, they would have used it, but they found that it added  2us to
> the latency and found they could get better performance if they
> polled memory,

I keep on mentioning that measuring one instance of one implementation
isn't necessarily a good way to evaluate a standard. Or the
implementation.

We already have one example where Mellanox improved their
implementation in the DDR generation such that a performance
workaround -- choosing a 1k MTU for RC connections -- was no longer
needed. I was pleased when we all agreed to remove defaulting to the
workaround -- it was clearly the right thing to do.

-- g





Re: [openib-general] basic IB doubt

2006-08-24 Thread Sean Hefty
>> Polling on a CQ involves a function call, synchronization to the CQ, and
>> formatting a structure to return to the user.  I don't see this ever being
>> faster than polling memory.
>
> Why don't you measure it, then?

Why?  Reading a memory location directly will be faster than calling a function
to read from a memory location.

> For example, an iWarp implementation
> is going to be slowed down if it has to reorder segments to deliver
> the last byte last. This expense might be more than the function call.

It only needs to defer the last byte.  The claim being made is that the last
byte will be delivered last.

Yes, I can implement something non-performant in this mode, but that doesn't
invalidate polling memory as a general solution.

> You're also assuming that programs are only checking the last byte of
> the buffer.

The applications I care about are polling on the last byte.

- Sean




Re: [openib-general] basic IB doubt

2006-08-24 Thread Greg Lindahl
On Thu, Aug 24, 2006 at 04:13:22PM -0700, Sean Hefty wrote:

>> Why don't you measure it, then?
>
> Why?  Reading a memory location directly will be faster than calling a
> function to read from a memory location.

... sigh. This is not true, there are quite obvious implementations
where this is not true. We had one, actually, and changed it due to
this issue. That was a practical choice.

>> You're also assuming that programs are only checking the last byte of
>> the buffer.
>
> The applications I care about are polling on the last byte.

And tomorrow, the next app may depend on the behavior that all the
bytes arrive in order. Slippery slope, you know.

-- greg





Re: [openib-general] basic IB doubt

2006-08-24 Thread Caitlin Bestler
Woodruff, Robert J wrote:
> Caitlin wrote,
>
>> For iWARP there are network performance reasons why in-order memory
>> writes will never be guaranteed.
>
> For iWarp, or any other RDMA over Ethernet protocol, the
> behavior is not to guarantee all packets are written
> in-order, just that the last byte of the last packet is
> written last. This can easily be implemented in an iWarp card
> or by the driver with minimal performance impact in most cases.
>
> So for example, if the last packet arrives before all the
> other packets have arrived, the iWarp card or driver does not
> place that data of the last packet until all the other
> packets have arrived.
>
> woody

Fascinating argument. The correct forum for that is the IETF
RDDP Workgroup, not a Linux open source project.





[openib-general] [PATCH v2] ib_usa: support userspace SA queries and multicast

2006-08-24 Thread Sean Hefty
Changes from v1:

The ib_usa module exports two files: ib_usa_default and ib_usa_raw.

Use of the ib_usa_default restricts the user to sending PathRecord,
MultiPathRecord, MCMemberRecord, and ServiceRecord queries, and joining /
leaving multicast groups.

Use of ib_usa_raw allows any MADs to be sent to the SA.

An administrator can set control on these files in any appropriate way.

Signed-off-by: Sean Hefty [EMAIL PROTECTED]
---
Index: include/rdma/ib_usa.h
===
--- include/rdma/ib_usa.h   (revision 0)
+++ include/rdma/ib_usa.h   (revision 0)
@@ -0,0 +1,123 @@
+/*
+ * Copyright (c) 2006 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ *    copyright notice, this list of conditions and the following
+ *    disclaimer in the documentation and/or other materials
+ *    provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef IB_USA_H
+#define IB_USA_H
+
+#include <linux/types.h>
+#include <rdma/ib_sa.h>
+
+#define IB_USA_ABI_VERSION 1
+
+#define IB_USA_EVENT_DATA  256
+
+enum {
+   IB_USA_CMD_SEND_MAD,
+   IB_USA_CMD_GET_EVENT,
+   IB_USA_CMD_GET_DATA,
+   IB_USA_CMD_JOIN_MCAST,
+   IB_USA_CMD_FREE_ID,
+   IB_USA_CMD_GET_MCAST
+};
+
+enum {
+   IB_USA_EVENT_MAD,
+   IB_USA_EVENT_MCAST
+};
+
+struct ib_usa_cmd_hdr {
+   __u32 cmd;
+   __u16 in;
+   __u16 out;
+};
+
+struct ib_usa_send_mad {
+   __u64 response; /* unused - reserved */
+   __u64 uid;
+   __u64 node_guid;
+   __u64 comp_mask;
+   __u64 attr;
+   __u8  port_num;
+   __u8  method;
+   __be16 attr_id;
+   __u32 timeout_ms;
+   __u32 retries;
+};
+
+struct ib_usa_join_mcast {
+   __u64 response;
+   __u64 uid;
+   __u64 node_guid;
+   __u64 comp_mask;
+   __u64 mcmember_rec;
+   __u8  port_num;
+};
+
+struct ib_usa_id_resp {
+   __u32 id;
+};
+
+struct ib_usa_free_resp {
+   __u32 events_reported;
+};
+
+struct ib_usa_free_id {
+   __u64 response;
+   __u32 id;
+};
+
+struct ib_usa_get_event {
+   __u64 response;
+};
+
+struct ib_usa_event_resp {
+   __u64 uid;
+   __u32 id;
+   __u32 event;
+   __u32 status;
+   __u32 data_len;
+   __u8  data[IB_USA_EVENT_DATA];
+};
+
+struct ib_usa_get_data {
+   __u64 response;
+   __u32 id;
+};
+
+struct ib_usa_get_mcast {
+   __u64 response;
+   __u64 node_guid;
+   __u8  mgid[16];
+   __u8  port_num;
+};
+
+#endif /* IB_USA_H */
Index: core/usa.c
===
--- core/usa.c  (revision 0)
+++ core/usa.c  (revision 0)
@@ -0,0 +1,846 @@
+/*
+ * Copyright (c) 2006 Intel Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ *  - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ *  - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, 
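
(The patch is truncated here in the archive.) As a rough illustration of how userspace might frame a request with the structures from ib_usa.h, the sketch below packs an IB_USA_CMD_FREE_ID command into a buffer. Several things are assumptions rather than facts from the patch: that a command is an ib_usa_cmd_hdr followed immediately by its payload, that `in` and `out` carry the payload and response sizes, and that `response` holds a userspace pointer to the response structure. The struct layouts mirror the header above with stdint types; pack_free_id is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirrors of definitions from the posted ib_usa.h. */
enum {
	IB_USA_CMD_SEND_MAD,
	IB_USA_CMD_GET_EVENT,
	IB_USA_CMD_GET_DATA,
	IB_USA_CMD_JOIN_MCAST,
	IB_USA_CMD_FREE_ID,
	IB_USA_CMD_GET_MCAST
};

struct ib_usa_cmd_hdr {
	uint32_t cmd;
	uint16_t in;
	uint16_t out;
};

struct ib_usa_free_id {
	uint64_t response;
	uint32_t id;
};

struct ib_usa_free_resp {
	uint32_t events_reported;
};

/*
 * Pack a FREE_ID command as header followed by payload. Returns the number
 * of bytes used in buf, or 0 if buf is too small.
 * Assumption: header-then-payload framing; the truncated patch does not
 * show the kernel-side write handler.
 */
static size_t pack_free_id(void *buf, size_t buf_len, uint32_t id,
			   struct ib_usa_free_resp *resp)
{
	struct ib_usa_cmd_hdr hdr;
	struct ib_usa_free_id cmd;
	size_t total = sizeof hdr + sizeof cmd;

	assert(buf != NULL && resp != NULL);

	if (buf_len < total)
		return 0;

	memset(&cmd, 0, sizeof cmd);	/* also zeroes struct padding */
	cmd.response = (uint64_t)(uintptr_t)resp;
	cmd.id = id;

	hdr.cmd = IB_USA_CMD_FREE_ID;
	hdr.in = (uint16_t)sizeof cmd;
	hdr.out = (uint16_t)sizeof *resp;

	memcpy(buf, &hdr, sizeof hdr);
	memcpy((unsigned char *)buf + sizeof hdr, &cmd, sizeof cmd);
	return total;
}
```

The resulting buffer would then presumably be handed to write() on one of the two device files; that step is omitted because the file names and semantics are only partly visible in the truncated patch.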

Re: [openib-general] basic IB doubt

2006-08-24 Thread Talpey, Thomas
At 07:46 PM 8/23/2006, Roland Dreier wrote:
>     Greg> Actually, that leads me to a question: does the vendor of
>     Greg> that adaptor say that this is actually safe? Just because
>     Greg> something behaves one way most of the time doesn't mean it
>     Greg> does it all of the time. So is it really smart to write
>     Greg> non-standard-conforming programs unless the vendor stands
>     Greg> behind that behavior?
>
> Yes, Mellanox documents that it is safe to rely on the last byte of an
> RDMA being written last.

How does an adapter guarantee that no bridges or other intervening devices
reorder their writes, or for that matter flush them to memory at all!?

Without signalling the host processor, that is. Isn't that what the dma_sync()
API is all about?

Tom.





[openib-general] about rdma and send/recv model

2006-08-24 Thread zhu shi song
Although SDP is not working correctly now, it can
easily support many existing applications based on the
Sockets API.  Patrick says RDMA is a connected model
and has a serious scalability problem.  But does the
send/recv model have, or support, an SDP-like API?

zhu



