[openib-general] RE: [iSER]How to send the login request PDU?

2005-08-29 Thread Dan Bar Dov




  
  As is known to all, the iSCSI layer at the Initiator should send a 
  Login Request PDU to the Target after allocating the connection resources. My 
  question is:
  How to send this PDU? Could the Send_Control Primitive Operation be used? 
  But it is not in the iSER-assisted mode at present. Or else, 
  the dapl API should be used directly.
  Yes, you should use send_control to send the login PDU. With this iSER 
  implementation you can do it before allocating connection 
  resources.
  There is a similar problem at the Target when sending the Login Response 
  PDU.
  Same answer.
  Hope my description is clear. Any suggestion is 
  appreciated.
  Ian Jiang 
  [EMAIL PROTECTED] 
   
  Computer Architecture Laboratory 
  Institute of Computing Technology 
  Chinese Academy of Sciences 
  Beijing,P.R.China 
  Zip code: 100080 
  Tel: +86-10-62564394(office) 
  
  
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

RE: [openib-general] RDMA connection and address translation API

2005-08-29 Thread Guy German
Sean wrote:
 It looks like this would work.  If a client wanted to create multiple
 connections to the same remote service (for example, to separate control and
 data), then it seems more efficient to move the asynchronous at outside of
 the connect call.
 - Sean
 
 That's a good point. What I had in mind was mainly simplicity for the
 consumer - save him dealing with another upcall. 
 
 Maybe caching in the at module would make things better, but I agree 
 that for multiple connections to the same remote service, the
 asynchronous at approach seems more appropriate.

OTOH,
After thinking about it some more, there might be problems in letting
each and every consumer do his own caching. at.c has a (not yet
implemented) mechanism with invalidate for caching tables.

Do we really want to let the consumer handle all the cases of routing
tables changing on the fly etc., or centralize it in one place (i.e.
at.c)?

What do you think, Sean?

Guy




[openib-general] Payload Length in first RMPP sent segment

2005-08-29 Thread Hal Rosenstock
Hi,

On the RMPP send side, while the Payload Length field in the last
segment is clear that it indicates the number of valid bytes in
Transferred Data, there seems to be some ambiguity in the optional
Payload Length field in the first segment. I think it can work either
way but I also think the intent was to reflect the valid bytes. Maybe it
is this way to allow flexibility (choice in the implementation). What is
the correct interpretation ? Should I enter a comment on this ? Thanks.

-- Hal

IBA 1.2 p.775 line 37

In the first packet of an RMPP transfer (RMPPFlags.First=1),
PayloadLength may indicate the sum of the lengths, in bytes, of the
TransferredData fields in all packets of the entire multipacket
response; this is done by using a nonzero value for PayloadLength in the
first packet. 

IBA 1.2 p. 776 line 8

In the last packet of an RMPP transfer (RMPPFlags.Last=1), PayloadLength
indicates the number of valid bytes in the TransferredData field,
allowing data transfers that are not an integral multiple of the length
of the TransferredData field. A transfer terminates when either: (a) a
packet containing RMPPFlags.Last=1 is received; or (b) a nonzero
PayloadLength was given in the first packet of a transfer, and a packet
is received containing sufficient TransferredData bytes to equal or
exceed the PayloadLength originally provided. If case (b) occurs and
RMPPFlags.Last is not 1 for that packet, the Receiver sends an ABORT
packet with RMPPStatus of Inconsistent Last and PayloadLength and
terminates the transfer.
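
The termination rule quoted above can be sketched in C as follows. This is a minimal illustration, not the mad_rmpp.c receive path: the state structure, the function name and the `data_size` parameter (the per-segment TransferredData size, e.g. the 220 mentioned elsewhere in this thread) are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Outcome of processing one received RMPP data segment. */
enum rmpp_rx_result {
	RMPP_RX_CONTINUE,	/* more segments expected */
	RMPP_RX_DONE,		/* transfer complete */
	RMPP_RX_ABORT		/* inconsistent Last and PayloadLength */
};

/* Hypothetical receive state; not the layout used by mad_rmpp.c. */
struct rmpp_rx_state {
	uint32_t announced_len;	/* PayloadLength of first packet, 0 if absent */
	uint32_t received_len;	/* TransferredData bytes accumulated so far */
};

enum rmpp_rx_result rmpp_rx_segment(struct rmpp_rx_state *st,
				    bool first, bool last,
				    uint32_t payload_len, uint32_t data_size)
{
	uint32_t valid;

	if (first) {
		st->announced_len = payload_len;
		st->received_len = 0;
	}

	/* Only the last packet's PayloadLength counts valid bytes in it. */
	valid = last ? payload_len : data_size;
	if (valid > data_size)
		valid = data_size;
	st->received_len += valid;

	/* Case (a): a packet with RMPPFlags.Last=1 ends the transfer. */
	if (last)
		return RMPP_RX_DONE;

	/* Case (b) without Last: announced length reached too early. */
	if (st->announced_len != 0 &&
	    st->received_len >= st->announced_len)
		return RMPP_RX_ABORT;

	return RMPP_RX_CONTINUE;
}
```

Under this reading, a sender that announces only the valid bytes (total segments times data size, minus padding) never triggers case (b) before the last segment, because the padding is always smaller than one segment.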




Re: [openib-general] [PATCH][iWARP] IW CM Verbs

2005-08-29 Thread James Lentini


On Fri, 26 Aug 2005, Tom Tucker wrote:

 Index: ib_verbs.h
 ===================================================================
 --- ib_verbs.h	(revision 3120)
 +++ ib_verbs.h	(working copy)
 @@ -804,6 +806,7 @@
  	struct ib_gid_cache   **gid_cache;
  };
  
 +struct iw_cm;
  struct ib_device {
  	struct device                *dma_device;
  
 @@ -820,6 +823,8 @@
  
  	u32                           flags;
  
 +	struct iw_cm                 *iwcm;
 +
  	int (*query_device)(struct ib_device *device,
  			    struct ib_device_attr *device_attr);
  	int (*query_port)(struct ib_device *device,

Why does the ib_device need a cm structure for iWARP but not IB? If 
you used either Guy or Roland's generic RDMA connection API and did 
the iWARP implementation, would you still need to add the iw_cm 
structure?


[openib-general] Re: [PATCH] RMPP: Fix length in first segment of multipacket sends

2005-08-29 Thread Hal Rosenstock
Hi Michael,

On Mon, 2005-08-29 at 03:23, Michael S. Tsirkin wrote:
 Quoting Hal Rosenstock [EMAIL PROTECTED]:
  Index: mad_rmpp.c
  ===
  --- mad_rmpp.c  (revision 3197)
  +++ mad_rmpp.c  (working copy)
  @@ -593,7 +593,8 @@
 
 Hal, could you diff with -p in the future please?
 This makes the function name visible in the patch, making it
 possible to understand what is being changed without applying it.

I'll try harder to remember to do this.

  rmpp_mad->rmpp_hdr.paylen_newwin =
  cpu_to_be32(mad_send_wr->total_seg *
  (sizeof(struct ib_rmpp_mad) -
  -  offsetof(struct ib_rmpp_mad, data)));
  +  offsetof(struct ib_rmpp_mad, data)) -
  +   mad_send_wr->pad);
 
 BTW, I just noticed that whitespace was (and remains) broken in these lines:
 indentation is done by spaces.

The whitespace is preceded by tabs and is there to make the parameters
line up. I thought that was allowable coding style. It has been used in
many places in OpenIB code.

-- Hal





[openib-general] Re: [PATCH] uDAPL added ibv_query support

2005-08-29 Thread James Lentini


On Thu, 25 Aug 2005, Arlin Davis wrote:

arlin> James,
arlin> 
arlin> Support for ibv_query_port, device, and gid.
arlin> 
arlin> Thanks,
arlin> 
arlin> -arlin

Committed in revision 3227.


[openib-general] Re: Question on the best approach to debug an infiniband connection problem

2005-08-29 Thread Hal Rosenstock
Hi Michael,

On Sun, 2005-08-28 at 02:34, Michael S. Tsirkin wrote:
 Quoting r. Hal Rosenstock [EMAIL PROTECTED]:
  I think you are referring to a Set of PortInfo which causes an event to
  IPoIB.
 
 Yes.
 
   There seems to be some bug related to local MAD handling:

Can you elaborate more on this ?

-- Hal



[openib-general] [PATCH] RMPP: Fix payload length of middle RMPP sent segments

2005-08-29 Thread Hal Rosenstock
RMPP: Fix payload length of middle RMPP sent segments. Middle payload
lengths should be 0 on the send side.

(This is a compliance issue and should not be an interop issue, as
middle payload lengths are supposed to be ignored on receive).

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Note also that diff -p did not show the routine name for this 1 line
change.

Index: mad_rmpp.c
===
--- mad_rmpp.c  (revision 3197)
+++ mad_rmpp.c  (working copy)
@@ -602,6 +603,7 @@
 		mad_send_wr->sg_list[1].length = sizeof(struct ib_rmpp_mad) -
 						 mad_send_wr->data_offset;
 		mad_send_wr->sg_list[1].lkey = mad_send_wr->sg_list[0].lkey;
+		rmpp_mad->rmpp_hdr.paylen_newwin = 0;
 	}
 
 	if (mad_send_wr->seg_num == mad_send_wr->total_seg) {
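
Taken together, the two RMPP send-side patches in this thread give each segment position its own PayloadLength rule. The following is a rough sketch of that rule, assuming a hypothetical helper name and host byte order (the real code stores the value with cpu_to_be32); `data_size` stands for sizeof(struct ib_rmpp_mad) minus offsetof(struct ib_rmpp_mad, data).

```c
#include <stdint.h>

/*
 * Hypothetical helper: PayloadLength for segment seg_num (1-based)
 * of a transfer with total_seg segments, in host byte order.
 */
uint32_t rmpp_send_paylen(uint32_t seg_num, uint32_t total_seg,
			  uint32_t data_size, uint32_t pad)
{
	/* Last (or only) segment: valid bytes in this segment. */
	if (seg_num == total_seg)
		return data_size - pad;

	/* First segment: valid bytes in the whole transfer. */
	if (seg_num == 1)
		return total_seg * data_size - pad;

	/* Middle segments: zero on send, ignored on receive. */
	return 0;
}
```

For a single-segment transfer both formulas agree, so checking the last-segment case first is harmless.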







[openib-general] RDMA Generic Connection Management

2005-08-29 Thread Guy German
Hi,

After receiving feedback from people here - I want to see if we can
agree on a generic CM API, so we can start implementing it. 
I will try to summarize the 2 options, the way I understand them. 

If I am missing something or misrepresenting - please don't hesitate to
correct me.

Both suggestions include the following verbs (or semantically
equivalent): ib_cma_get_device, ib_cma_create_qp, ib_cma_connect,
ib_cma_disconnect, ib_cma_listen, ib_cma_destroy, ib_cma_accept,
ib_cma_reject, ib_cma_get_src_ip.

A connect flow will be something like:

- ib_cma_get_device (...) /* get device(1) or device+path(2) */
- pd = ib_alloc_pd(...) /* pd allocated in the given device */
- qp = ib_cma_create_qp(...) /* qp returned in init state */
- ib_post_recv(qp, ...);
- ib_cma_connect (qp, dst_addr(1)/path(2), ...);

Now, there are 2 suggestions for the device discovery:
1. get_device returns device and port, according to the local routing
tables, synchronously. ib_cma_connect calls the at module for address
resolution (cache handled) before calling the cm_connect.
2. get_device registers an upcall and returns the data path to the
consumer in the upcall. In this case caching is done by the consumer.

I prefer option 1, because it makes the consumer code simpler, without
having to handle upcalls for address translations (which are not
asynchronous in iWARP) or hold the transport's data information. Also it
saves the consumer the trouble of caching routes to destinations.
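
To make the caching argument concrete, here is a minimal sketch of a centralized route cache with an invalidate hook, the kind of mechanism described for at.c earlier in this thread. Every name below is hypothetical, the slow path is a stand-in for the real (asynchronous) resolution, and locking is omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define AT_CACHE_SLOTS 8

/* Hypothetical cached route: destination IP to an opaque path id. */
struct at_cache_entry {
	uint32_t dst_ip;
	int path_id;
	int valid;
};

static struct at_cache_entry at_cache[AT_CACHE_SLOTS];
static int at_slow_lookups;	/* counts simulated resolutions */

/* Stand-in for the real address resolution; purely illustrative. */
static int at_resolve_slow(uint32_t dst_ip)
{
	at_slow_lookups++;
	return (int)(dst_ip & 0xff);
}

/* Return the cached path for dst_ip, resolving it on a miss. */
int at_get_path(uint32_t dst_ip)
{
	struct at_cache_entry *e = &at_cache[dst_ip % AT_CACHE_SLOTS];

	if (e->valid && e->dst_ip == dst_ip)
		return e->path_id;

	e->dst_ip = dst_ip;
	e->path_id = at_resolve_slow(dst_ip);
	e->valid = 1;
	return e->path_id;
}

/* Called when routing tables change: drop every cached route. */
void at_invalidate_all(void)
{
	for (size_t i = 0; i < AT_CACHE_SLOTS; i++)
		at_cache[i].valid = 0;
}

int at_slow_lookup_count(void)
{
	return at_slow_lookups;
}
```

With the cache in one place, a consumer opening several connections to the same destination pays for one resolution, and a routing change is handled by a single invalidate rather than by every consumer separately.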

I would like to hear what other people in the list think of it ...

Thanks,
Guy




[openib-general] Re: [PATCH] RMPP: Fix length in first segment of multipacket sends

2005-08-29 Thread Michael S. Tsirkin
Hal, first, sorry about nitpicking.

Quoting Hal Rosenstock [EMAIL PROTECTED]:
   rmpp_mad->rmpp_hdr.paylen_newwin =
   cpu_to_be32(mad_send_wr->total_seg *
   (sizeof(struct ib_rmpp_mad) -
   -  offsetof(struct ib_rmpp_mad, data)));
   +  offsetof(struct ib_rmpp_mad, data)) -
   +   mad_send_wr->pad);
  
  BTW, I just noticed that whitespace was (and remains) broken in these lines:
  indentation is done by spaces.
 
 The whitespace is preceded by tabs and is to make the parameters line
 up. I thought that was allowable coding style. It has been used in many
 places in OpenIB code.

Sure, whitespace preceded by tabs is OK.
But, pls take a look at the original file, or the patch, I think
you'll see that it's not preceded by tabs in this instance.

Some more nitpicks:

In this specific instance, it's probably best to just add a temp variable and
write 

w = mad_send_wr->total_seg *
	(sizeof(struct ib_rmpp_mad) -
	 offsetof(struct ib_rmpp_mad, data)) -
	mad_send_wr->pad;

rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(w);

to avoid placing the descendant cpu_to_be32 left of the = sign.

Documentation/CodingStyle says about this:
 Descendants are always substantially shorter than the parent and are
 placed substantially to the right.


And by the way, wouldn't it be a good idea to replace hard-coded
220 and such in ib_mad.h by a symbolic constant, and then we'd just
have

w = mad_send_wr->total_seg * IB_RMMP_DATA_SIZE -
	mad_send_wr->pad;

rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(w);

What do you think?
Thanks,
MST

-- 
MST


[openib-general] Re: Question on the best approach to debug an infiniband connection problem

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Hal Rosenstock [EMAIL PROTECTED]:
There seems to be some bug related to local MAD handling:
 
 Can you elaborate more on this ?

I have a back to back setup.
I sometimes start with all modules unloaded, load them back
and bring up ipoib with a fixed ip address.

I then start opensm on one node, and try to ping one from another.
This does not work until I down and up ib0 on the node with
opensm running.

down and up on the other node does not help, which made me think
local mad handling is the culprit.


-- 
MST


Re: [openib-general] [PATCH] OSM vendor layer: In umad_receiver, handle allocating RMPP large MADs from OSM MAD pool

2005-08-29 Thread Hal Rosenstock
Hi Grant,

On Fri, 2005-08-26 at 13:23, Grant Grundler wrote:
 On Fri, Aug 26, 2005 at 12:34:58PM -0400, Hal Rosenstock wrote:
  - MAD_BLOCK_SIZE, osm_addr))) {
  +   length > MAD_BLOCK_SIZE ?
  +   length : MAD_BLOCK_SIZE,
  +   osm_addr))) {
 
 Can max(length, MAD_BLOCK_SIZE) be used instead?

Yes; I made that change. There is a MAX macro in complib/cl_math.h.
Thanks.
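
For illustration, the sizing rule discussed here (allocate at least one MAD block, more for a large RMPP MAD) can be written with a max macro as below. This is a sketch with local stand-ins: SKETCH_MAX is not the complib macro itself, and the 256-byte block size is an assumption based on the standard MAD size.

```c
#include <stddef.h>

/* Local stand-in for a MAX macro; note each argument is evaluated twice. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Assumed value: one standard 256-byte MAD. */
#define SKETCH_MAD_BLOCK_SIZE 256

/* Size to request from the MAD pool for a received MAD of 'length' bytes. */
size_t madw_alloc_size(size_t length)
{
	return SKETCH_MAX(length, (size_t)SKETCH_MAD_BLOCK_SIZE);
}
```

Because a macro of this form evaluates its arguments twice, it is only safe with side-effect-free expressions such as the plain variable used here.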

-- Hal



[openib-general] Re: [PATCH] uDAPL common code fix for default attribute settings

2005-08-29 Thread James Lentini


On Thu, 25 Aug 2005, Arlin Davis wrote:

arlin> James,
arlin> 
arlin> Please review this common code patch that fixes default 
arlin> settings so they don't exceed device maximums.
arlin> 
arlin> Thanks,
arlin> 
arlin> -arlin

I've moved the check into dapli_ep_default_attrs() so all future 
callers will also benefit from this.

Index: dapl/common/dapl_ep_util.c
===
--- dapl/common/dapl_ep_util.c  (revision 3231)
+++ dapl/common/dapl_ep_util.c  (working copy)
@@ -260,7 +260,9 @@ void
 dapli_ep_default_attrs (
     IN DAPL_EP              *ep_ptr )
 {
+    DAT_EP_ATTR             ep_attr_limit;
     DAT_EP_ATTR             *ep_attr;
+    DAT_RETURN              dat_status;
 
     ep_attr = &ep_ptr->param.ep_attr;
     /* Set up defaults */
@@ -295,7 +297,36 @@ dapli_ep_default_attrs (
      *    - provider_specific_params: 0
      */
 
-    return;
+    dat_status = dapls_ib_query_hca (ep_ptr->header.owner_ia->hca_ptr,
+                                     NULL, &ep_attr_limit, NULL);
+    /* check against HCA maximums */
+    if (dat_status == DAT_SUCCESS)
+    {
+        ep_ptr->param.ep_attr.max_mtu_size =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_mtu_size,
+                     ep_attr_limit.max_mtu_size);
+        ep_ptr->param.ep_attr.max_rdma_size =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_rdma_size,
+                     ep_attr_limit.max_rdma_size);
+        ep_ptr->param.ep_attr.max_recv_dtos =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_recv_dtos,
+                     ep_attr_limit.max_recv_dtos);
+        ep_ptr->param.ep_attr.max_request_dtos =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_request_dtos,
+                     ep_attr_limit.max_request_dtos);
+        ep_ptr->param.ep_attr.max_recv_iov =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_recv_iov,
+                     ep_attr_limit.max_recv_iov);
+        ep_ptr->param.ep_attr.max_request_iov =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_request_iov,
+                     ep_attr_limit.max_request_iov);
+        ep_ptr->param.ep_attr.max_rdma_read_in =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_rdma_read_in,
+                     ep_attr_limit.max_rdma_read_in);
+        ep_ptr->param.ep_attr.max_rdma_read_out =
+            DAPL_MIN(ep_ptr->param.ep_attr.max_rdma_read_out,
+                     ep_attr_limit.max_rdma_read_out);
+    }
 }
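
The clamping pattern in this patch can be tried outside uDAPL with a reduced, hypothetical attribute structure. The sketch below is only an illustration: the struct, the macro and the function name are made up, and the real DAT_EP_ATTR has more fields than shown.

```c
#include <stdint.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical subset of endpoint attributes. */
struct ep_attr_sketch {
	uint32_t max_recv_dtos;
	uint32_t max_request_dtos;
	uint32_t max_recv_iov;
	uint32_t max_request_iov;
};

/*
 * Clamp each default to the device limit. If the device query failed
 * (query_ok == 0), the defaults are left untouched, as in the patch.
 */
void ep_clamp_defaults(struct ep_attr_sketch *attr,
		       const struct ep_attr_sketch *limit, int query_ok)
{
	if (!query_ok)
		return;

	attr->max_recv_dtos =
		SKETCH_MIN(attr->max_recv_dtos, limit->max_recv_dtos);
	attr->max_request_dtos =
		SKETCH_MIN(attr->max_request_dtos, limit->max_request_dtos);
	attr->max_recv_iov =
		SKETCH_MIN(attr->max_recv_iov, limit->max_recv_iov);
	attr->max_request_iov =
		SKETCH_MIN(attr->max_request_iov, limit->max_request_iov);
}
```

Doing the clamp inside the function that sets the defaults means every caller gets values the device can actually honor, which is the reason given for moving the check.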
 
 


[openib-general] [PATCH] sdp: use linux/list.h in sdp_link.c

2005-08-29 Thread Michael S. Tsirkin
The following kills sdp_link.h and converts sdp_link.c to use linux/list.h
Locking is still missing here.

Signed-off-by: Michael S. Tsirkin [EMAIL PROTECTED]
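
For readers unfamiliar with the linux/list.h idiom this patch switches to, the following user-space sketch shows the idea: the list node is embedded in the element, and the containing structure is recovered with offsetof. The sk_ names are stand-ins chosen for this example, not the kernel API, and like the patch the sketch has no locking.

```c
#include <stddef.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct sk_list {
	struct sk_list *next, *prev;
};

/* Recover the containing structure from a pointer to its embedded node. */
#define sk_list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points at itself. */
void sk_list_init(struct sk_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head. */
void sk_list_add(struct sk_list *node, struct sk_list *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink node and leave it self-linked, so a second delete is harmless. */
void sk_list_del(struct sk_list *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node;
	node->prev = node;
}

/* Hypothetical wait entry with an embedded list node. */
struct wait_sketch {
	unsigned long long id;
	struct sk_list list;
};

/* Count the entries on a list by walking it once. */
size_t sk_list_count(const struct sk_list *head)
{
	size_t n = 0;

	for (const struct sk_list *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

Compared with hand-maintained next/pext pointers, the circular list needs no special case for an empty list or for the first element.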

Index: linux-2.6.12.2/drivers/infiniband/ulp/sdp/sdp_link.c
===
--- linux-2.6.12.2.orig/drivers/infiniband/ulp/sdp/sdp_link.c
+++ linux-2.6.12.2/drivers/infiniband/ulp/sdp/sdp_link.c
@@ -35,17 +35,105 @@
 
 #include "ipoib.h"
 #include "sdp_main.h"
-#include "sdp_link.h"
 
-static kmem_cache_t *wait_cache = NULL;
-static kmem_cache_t *info_cache = NULL;
+#define SDP_LINK_F_VALID 0x01 /* valid path info record. */
+#define SDP_LINK_F_ARP   0x02 /* arp request in progress. */
+#define SDP_LINK_F_PATH  0x04 /* arp request in progress. */
+/*
+ * wait for an ARP event to complete.
+ */
+struct sdp_path_info {
+   u32 src;/* source IP address. */
+   u32 dst;/* destination IP address */
+   int dif;/* bound device interface option */
+   u32 gw; /* gateway IP address */
+   int qid;/* path record query ID */
+   u8  port;   /* HCA port */
+   u32 flags;  /* record flags */
+   int sa_time;/* path_rec request timeout */
+   unsigned long arp_time; /* ARP request timeout */
+   unsigned long use;  /* last time accessed. */
+   struct ib_device  *ca;  /* HCA device. */
+   struct ib_sa_path_rec path; /* path record info */
+struct ib_sa_query *query;
 
-static struct sdp_path_info *info_list = NULL;
+   struct work_struct timer;   /* arp request timers. */
+
+   struct list_head info_list;
+
+   struct list_head wait_list;
+};
+
+struct sdp_path_wait {
+   u64 id;  /* request identifier */
+   void (*completion)(u64 id,
+  int status,
+  u32 dst_addr,
+  u32 src_addr,
+  u8  hw_port,
+  struct ib_device *ca,
+  struct ib_sa_path_rec *path,
+  void *arg);
+   void *arg;
+   int retry;
+   struct list_head list;
+};
+
+struct sdp_work {
+   struct work_struct work;
+   void *arg;
+};
+
+struct sdp_link_arp {
+   /*
+* generic arp header
+*/
+   u16 addr_type;/* format of hardware address   */
+   u16 proto_type;   /* format of protocol address   */
+   u8  addr_len; /* length of hardware address   */
+   u8  proto_len;/* length of protocol address   */
+   u16 op;   /* ARP opcode (command) */
+   /*
+* begin IB specific section
+*/
+   u32  src_qpn; /* MSB = reserved, low 3 bytes=QPN */
+   union ib_gid src_gid;
+   u32  src_ip;
+
+   u32  dst_qpn; /* MSB = reserved, low 3 bytes=QPN */
+   union ib_gid dst_gid;
+   u32  dst_ip;
+
+} __attribute__ ((packed)); /* sdp_link_arp */
+
+#define SDP_LINK_SWEEP_INTERVAL (10 * (HZ)) /* frequency of sweep function */
+#define SDP_LINK_INFO_TIMEOUT   (300UL * (HZ)) /* unused time */
+#define SDP_LINK_SA_RETRY   (3)  /* number of SA retry requests */
+#define SDP_LINK_ARP_RETRY  (3)  /* number of ARP retry requests */
+
+#define SDP_LINK_SA_TIME_MIN(500)   /* milliseconds. */
+#define SDP_LINK_SA_TIME_MAX(1) /* milliseconds. */
+#define SDP_LINK_ARP_TIME_MIN   (HZ)
+#define SDP_LINK_ARP_TIME_MAX   (32UL * (HZ))
+
+#if 0
+#define SDP_IPOIB_RETRY_VALUE3/* number of retries. */
+#define SDP_IPOIB_RETRY_INTERVAL (HZ * 1) /* retry frequency */
+
+#define SDP_DEV_PATH_WAIT   (5 * (HZ))
+#define SDP_PATH_TIMER_INTERVAL (15 * (HZ))  /* cache sweep frequency */
+#define SDP_PATH_REAPING_AGE(300 * (HZ)) /* idle time before reaping */
+#endif
+
+static kmem_cache_t *wait_cache;
+static kmem_cache_t *info_cache;
+
+static LIST_HEAD(info_list);
 
 static struct workqueue_struct *link_wq;
 static struct work_struct   link_timer;
 
-static u64 path_lookup_id = 0;
+static u64 path_lookup_id;
 
 #define _SDP_PATH_LOOKUP_ID() \
   ((++path_lookup_id) ? path_lookup_id : ++path_lookup_id)
@@ -95,42 +183,6 @@ static void sdp_link_path_complete(u64 i
 }
 
 /*
- * sdp_path_wait_add - add a wait entry into the wait list for a path
- */
-static void sdp_path_wait_add(struct sdp_path_info *info,
- struct sdp_path_wait *wait)
-{
-
-   wait->next  = info->wait_list;
-   info->wait_list = wait;
-   wait->pext  = &info->wait_list;
-
-   if (wait->next)
-   wait->next->pext = &wait->next;
-}
-
-/*
- * sdp_path_wait_destroy - destroy an entry for a wait element
- */
-static void sdp_path_wait_destroy(struct sdp_path_wait *wait)
-{
-   /*
-* if it's in the list, pext will not be null
-*/
-   if 

[openib-general] [PATCH] OpenSM vendor layer: Use MAX macro in umad_receiver

2005-08-29 Thread Hal Rosenstock
OpenSM vendor layer: Use MAX macro in umad_receiver rather than open
coding it. Pointed out by Grant Grundler [EMAIL PROTECTED]

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: osm_vendor_ibumad.c
===
--- osm_vendor_ibumad.c (revision 3200)
+++ osm_vendor_ibumad.c (working copy)
@@ -271,8 +271,7 @@ umad_receiver(void *p_ptr)
 
 		if (!(madw_p = osm_mad_pool_get(p_bind->p_mad_pool,
 						(osm_bind_handle_t)p_bind,
-						length > MAD_BLOCK_SIZE ? 
-						length : MAD_BLOCK_SIZE,
+						MAX(length, MAD_BLOCK_SIZE),
 						osm_addr))) {
 			osm_log( p_vend->p_log, OSM_LOG_ERROR, "umad_receiver: "
 				"request for a new madw failed -- dropping packet\n" );




[openib-general] OpenIB user space includes location

2005-08-29 Thread Hal Rosenstock
Hi,

Should OpenIB user space library includes be under
/usr/include/infiniband or /usr/include/rdma now (similar to the kernel
move of includes) ? The two that make me think this are user verbs and
perhaps CM. IB management (OpenSM and diags) as well as AT (address
translation) are IB specific.

-- Hal



[openib-general] Re: RDMA connection and address translation API

2005-08-29 Thread Roland Dreier
    Michael> What about using an Externally Administrated Service ID?
    Michael> Openib gets Service ID = 0x1H00 1405   where H is
    Michael> any digit.

That would work.  I think we've already converged on picking a service
ID range for our iWARP emulation spec.  The only question is whether
it should be in the IBTA or IETF service ID range, and I don't think
that really matters much.

 - R.


[openib-general] Re: [PATCH] RMPP: Fix length in first segment of multipacket sends

2005-08-29 Thread Hal Rosenstock
Hi Michael,

On Mon, 2005-08-29 at 10:47, Michael S. Tsirkin wrote:
 Hal, first, sorry about nitpicking.
 
 Quoting Hal Rosenstock [EMAIL PROTECTED]:
    rmpp_mad->rmpp_hdr.paylen_newwin =
    cpu_to_be32(mad_send_wr->total_seg *
    (sizeof(struct ib_rmpp_mad) -
    -  offsetof(struct ib_rmpp_mad, data)));
    +  offsetof(struct ib_rmpp_mad, data)) -
    +   mad_send_wr->pad);
   
   BTW, I just noticed that whitespace was (and remains) broken in these 
   lines:
   indentation is done by spaces.
  
  The whitespace is preceded by tabs and is to make the parameters line
  up. I thought that was allowable coding style. It has been used in many
  places in OpenIB code.
 
 Sure, whitespace preceded by tabs is OK.
 But, pls take a look at the original file, or the patch, I think
 you'll see that it's not preceded by tabs in this instance.

I think this is in the original file.

 Some more nitpicks:
 
 In this specific instance, it's probably best to just add a temp variable and
 write 
 
   w = mad_send_wr->total_seg *
   (sizeof(struct ib_rmpp_mad) -
   offsetof(struct ib_rmpp_mad, data)) -
   mad_send_wr->pad;
 
   rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(w);
 
 to avoid placing the descendant cpu_to_be32 left of the = sign.
 
 Documentation/CodingStyle says about this:
  Descendants are always substantially shorter than the parent and are
  placed substantially to the right.
 
 
 And by the way, wouldn't it be a good idea to replace hard-coded
 220 and such in ib_mad.h by a symbolic constant, and then we'd just
 have
 
   w = mad_send_wr->total_seg * IB_RMMP_DATA_SIZE -
   mad_send_wr->pad;
 
   rmpp_mad->rmpp_hdr.paylen_newwin = cpu_to_be32(w);
 
 What do you think?

All seem reasonable to me. Sean should comment and has the final say on
this.

-- Hal



RE: [openib-general] [PATCH][iWARP] IW CM Verbs

2005-08-29 Thread Steve Wise
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of James Lentini
 Sent: Monday, August 29, 2005 8:42 AM
 To: Tom Tucker
 Cc: openib-general@openib.org
 Subject: Re: [openib-general] [PATCH][iWARP] IW CM Verbs
 
 
 
 On Fri, 26 Aug 2005, Tom Tucker wrote:
 
  Index: ib_verbs.h
  ===================================================================
  --- ib_verbs.h  (revision 3120)
  +++ ib_verbs.h  (working copy)
  @@ -804,6 +806,7 @@
   	struct ib_gid_cache   **gid_cache;
   };
   
  +struct iw_cm;
   struct ib_device {
   	struct device                *dma_device;
   
  @@ -820,6 +823,8 @@
   
   	u32                           flags;
   
  +	struct iw_cm                 *iwcm;
  +
   	int (*query_device)(struct ib_device *device,
   			    struct ib_device_attr *device_attr);
   	int (*query_port)(struct ib_device *device,
 
 Why does the ib_device need a cm structure for iWARP but not IB?

Some RNIC devices fully establish the connection in hardware.  However,
_all_ openib IB devices export the MAD interface, which is used to send
low level connection setup primitives, thus allowing connection setup in
the host. 

 If 
 you used either Guy or Roland's generic RDMA connection API and did 
 the iWARP implementation, would you still need to add the iw_cm 
 structure?

Yes.

The struct iw_cm allows the RNIC device to export a set of high level
connection verbs, if that device supports it.
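
A rough sketch of how an optional per-device CM structure might be consulted: a device that offloads connection setup fills in the pointer, and everything else leaves it NULL so that connection setup stays in the host. All type and function names below are hypothetical and do not come from Tom's patch.

```c
#include <stddef.h>

/* Hypothetical table of high level connection verbs. */
struct iw_cm_sketch {
	int (*connect)(int qp_id);
};

/* Hypothetical device: iwcm is NULL when no CM verbs are exported. */
struct rdma_dev_sketch {
	const char *name;
	const struct iw_cm_sketch *iwcm;
};

enum cm_path {
	CM_PATH_HOST_MAD,	/* connection setup done in the host via MADs */
	CM_PATH_DEVICE_VERBS	/* connection setup done by the device */
};

/* Pick the connection setup path based on what the device exports. */
enum cm_path choose_cm_path(const struct rdma_dev_sketch *dev)
{
	if (dev->iwcm && dev->iwcm->connect)
		return CM_PATH_DEVICE_VERBS;
	return CM_PATH_HOST_MAD;
}

/* Trivial connect verb used only to populate the table in examples. */
int sketch_connect(int qp_id)
{
	return qp_id;
}
```

The NULL check is what lets one device structure serve both kinds of hardware without forcing IB devices to carry connection verbs they do not need.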

Hope this helps...

Steve.



RE: [openib-general] [PATCH][iWARP] Added provider CM verbs and query provider methods

2005-08-29 Thread Asgeir Eiriksson

Caitlin

For the openib folk: keep in mind in the following that iWARP runs on
top of TCP, and the TCP is offloaded so TOE enters the iWARP picture.

I second the requirement that an iWARP RNIC needs to integrate with host
configuration, reporting, and security mechanisms, and this is the
approach taken in the Chelsio TOE patches that we have submitted.
Standard tools such as netstat and ifconfig work with the Chelsio TOE
today, and there's nothing to prevent netfilter from working, etc. For
those that are interested there is more information available at
https://service.chelsio.com/open_toe and in particular in
Chelsio_toe_arch.pdf

I have to disagree that this means that the connection is necessarily
set up by the host stack. The Chelsio 10GE NIC/TOE/iSCSI/iWARP products
set up the connection on the card, but the setup includes an ASK host
phase initiated by the card where the host can filter the connect
request, and modify any of the initial TCP values chosen by the card,
etc., and then respond with accept or reject.

In general the iWARP connection manager needs to accommodate three
possible TCP connection setup models in use today in iWARP devices: a)
what you seem to be advocating, i.e. TCP connection setup on the host
stack, b) what I brought up, i.e. offloaded TCP connection setup with an
ASK phase, and c) what was brought up previously and that's full TCP
connection setup offload.

'Asgeir


From: Caitlin Bestler [EMAIL PROTECTED]
To: Roland Dreier [EMAIL PROTECTED],Tom Tucker [EMAIL PROTECTED]
CC: openib-general@openib.org
Subject: RE: [openib-general] [PATCH][iWARP] Added provider CM verbs
and query provider methods
Date: Thu, 25 Aug 2005 16:46:38 -0700

Device vendors would jump at the opportunity to have a stable
interface with the host stack. Things like routing, protection
from denial of service attacks, rules for logging and filtering
connection requests and more all *should* be handled by the host
stack.

That's where the end user wants to control them, it's where
the security code can be kept most current and most robust.
It is also largely on packets that do not require offload
optimization.

But we also need time to ensure that the community understands
this as giving the host stack control of an offload connection
during connection establishment -- rather than as the offload
device stealing the connection from the host stack.

Moving the entire TCP connection logic to the offload device
not only increases the work that the offload device must do,
it reduces the auditability of the system and the user's control
over their network activity.

So the intent is not to evade the stack, rather it is to allow
time for proper integration with host stack control. The tradeoffs
are complex, and neither side fully understands the other's
issues yet. We need to work together to determine how to provide
the acceleration that our users want without sacrificing the OS
provided security that they assume will not be sacrificed.


  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Roland
Dreier
  Sent: Thursday, August 25, 2005 4:21 PM
  To: Tom Tucker
  Cc: openib-general@openib.org
  Subject: Re: [openib-general] [PATCH][iWARP] Added provider
  CM verbs and query provider methods
 
  Tom> RNIC Verbs imply that the modify qp verb takes a handle to a
  Tom> connection -- presumably a socket. This CAN'T be done on
  Tom> Linux in any fashion that is acceptable to the netdev
  Tom> crowd. SOOO we modeled this after DAPL.  Trust me, I would
  Tom> LOVE to be able to establish the connection using bind,
  Tom> listen, etc..., query the Linux connection state and then
  Tom> pass this down to the qp modify verb...but I can't.
 
  Let's not be too quick to say that this is impossible.  I
  think we should work with the Linux networking community and
  come up with the right answer, and not accept a bad solution
  just because it lets us go around the networking people.
 
  Has there been any real discussion of this on netdev?
 
   - R.




[openib-general] Re: RDMA connection and address translation API

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: RDMA connection and address translation API
 
 Michael> What about using an Externally Administrated Service ID?
 Michael> Openib gets Service ID = 0x1H00 1405   where H is
 Michael> any digit.
 
 That would work.  I think we've already converged on picking a service
 ID range for our iWARP emulation spec.  The only question is whether
 it should be in the IBTA or IETF service ID range, and I don't think
 that really matters much.

Or neither :)
Are there disadvantages to Externally Administrated Service ID?
This avoids any need for approvals from either IETF or IBTA.

-- 
MST


Re: [openib-general] [PATCH][iWARP] Added provider CM verbs and query provider methods

2005-08-29 Thread Roland Dreier
    Asgeir> ...this is the approach taken in the Chelsio TOE patches
    Asgeir> that we have submitted.

What are your plans for these patches?  I am not subscribed to netdev,
but from reading the archives, it seems that your most recent
submission was rejected quite strongly.

 - R.


Re: [openib-general] [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [openib-general] [PATCH] ipoib: device removal races
 
 Thanks, I finally applied this and put it in my git queue for 2.6.14.
 
 I'm still thinking about the bigger patch that adds a second work
 queue.  Having one extra work queue because of the rtnl_lock issues is
 ugly enough, and I'd really like to find a way to avoid two queues.

I thought about this some more.
I think I see even more problems, like potential deadlocks where
ipoib needs to flush the link wq, link wq waits for sa query to
time out, and that needs the default wq to run.

It seems that most of the complexity comes from the way core
uses work queues to pass events to upper layers.
And I wonder: couldn't core be simplified by passing up events
directly in the interrupt context?

I think IPoIB could then use plain spinlocks for most synchronisations.

-- 
MST


[openib-general] Re: RDMA connection and address translation API

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: RDMA connection and address translation API
 
 Michael What about using an Externally Administrated Service ID?
 Michael Openib gets Service ID = 0x1H00 1405   where H is
 Michael any digit.
 
 That would work.

So I'm saying, we don't need reserved bits in cm req then, do we?

-- 
MST


[openib-general] Re: RDMA connection and address translation API

2005-08-29 Thread Roland Dreier
Michael So I'm saying, we don't need reserved bits in cm req then,
Michael do we?

Right.

 - R.


Re: [openib-general] [PATCH] ipoib: device removal races

2005-08-29 Thread Roland Dreier
Michael It seems that most of the complexity comes from the way
Michael core uses work queues to pass events to upper layers.
Michael And I wonder: couldn't core be simplified by passing up
Michael events directly in the interrupt context?

I'm confused -- which core and which events are you talking about?
The ib_core module just passes on asynchronous events directly from
the low-level driver, without using work queues.

 - R.


[openib-general] [PATCH] kdapl: Change for new include location

2005-08-29 Thread Hal Rosenstock
kdapl: Change for new include location (rdma rather than infiniband)

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: dapl_openib_cm.h
===
--- dapl_openib_cm.h(revision 3232)
+++ dapl_openib_cm.h(working copy)
@@ -34,9 +34,9 @@
 #ifndef DAPL_OPENIB_CM_H
 #define DAPL_OPENIB_CM_H
 
-#include <ib_cm.h>
-#include <ib_sa.h>
-#include <ib_at.h>
+#include <rdma/ib_cm.h>
+#include <rdma/ib_sa.h>
+#include <rdma/ib_at.h>
 
 struct dapl_cm_ctx {
struct ib_at_ib_route dapl_rt;
Index: dapl_openib_util.h
===
--- dapl_openib_util.h  (revision 3232)
+++ dapl_openib_util.h  (working copy)
@@ -33,8 +33,8 @@
 #define DAPL_OPENIB_UTIL_H
 
 #include "dapl.h"
-#include <ib_verbs.h>
-#include <ib_cm.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/ib_cm.h>
 
 enum dapl_async_handler_type {
DAPL_ASYNC_UNAFILIATED,





RE: [openib-general] [PATCH][iWARP] Added provider CM verbsandquery provider methods

2005-08-29 Thread Tom Tucker

From my reading of the thread, there is resistance to 
TOE in general. The patch is just the messenger. The principal 
opponent is Dave Miller who strongly believes that stateless
acceleration such as TSO (TCP Segmentation Offload) suffices for 
all needs. Ironically, this requires a much higher level of stack 
integration than TOE does.

TOE for the purposes of RDMA may have more legs within the 
community, however, this has yet to be tested.

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Roland Dreier
 Sent: Monday, August 29, 2005 11:24 AM
 To: Asgeir Eiriksson
 Cc: openib-general@openib.org
 Subject: Re: [openib-general] [PATCH][iWARP] Added provider 
 CM verbsandquery provider methods
 
 Asgeir ...this is the approach taken in the Chelsio TOE patches
 Asgeir that we have submitted.
 
 What are your plans for these patches?  I am not subscribed to netdev,
 but from reading the archives, it seems that your most recent
 submission was rejected quite strongly.
 
  - R.


RE: [openib-general][PATCH][kdapl]: FMR and EVD patch

2005-08-29 Thread James Lentini

Hi Guy,

I agree with you on the problems posed by the current interface. I 
hope we can find a solution that fixes the problem. Note that the same 
problem must be handled by a ULP using the native verbs.

I still think that there may be a race condition with this patch. 
Here's the scenario I'm concerned about:

 - Receive an evd upcall
 - Disable evd upcall policy
 - Wakeup polling thread
 - Dequeue all events
 - Enable evd upcall policy by:
   1. Call dapl_evd_modify_upcall() to enable the evd upcall
   2. Obtain the EVD spin lock via spin_lock_irqsave, thus 
  disabling local interrupts
   3. Check that the EVD's ring buffer is empty (there are no DAPL 
  software events)
   4. A DTO completion occurs on the EVD's CQ
   5. Enable the CQ's upcall via ib_req_notify_cq()

If I understand you correctly, you are asserting that event #4, the 
CQ's DTO completion, cannot occur because the local interrupts are 
disabled by spin_lock_irqsave(). Have I understood you correctly?

My belief is that the completion will occur on the card regardless of 
the interrupt state. Can you provide me with a reference that 
guarantees this will not happen?

james

On Thu, 18 Aug 2005, Guy German wrote:

 Hi James,
 
 I will try to explain the reason behind this patch:
 
 In IB, a “normal” working flow, for a consumer, is:
 - Receive a CQ notification callback
 - Wakeup polling thread
 - Poll for completion (empty the queue) 
 - Request completion notification
 
 There is no problem here.
 
 In kdapl, however, the consumer will keep getting upcalls, until he 
 sets the upcall policy to disable. So a working flow will be:
 - Receive an evd upcall
 - Disable evd upcall policy
 - Wakeup polling thread
 - Dequeue all evd events
 - Enable evd upcall policy
 
 There is a race here: A completion can come after the last dequeue 
 and before the enabling. The provider won’t call the consumer 
 (policy is disabled) and the consumer would not dequeue any more 
 because he “knows” the queue is empty.
 
 I think it is a very bad idea, to solve this race by adding another 
 evd_dequeue after you enable the upcall policy. If you do that you 
 would have a polling thread (because while you dequeue one 
 completion you can have many more following) and at the same time 
 you will receive upcalls from the dapl provider. Besides the fact that 
 this is an expensive and unnecessary context switch, you have an 
 upcall and a thread racing. You will have a situation that the 
 upcall has an event at hand and the thread has an event, both not 
 handled yet - you will have to queue them again internally or 
 something to keep the order. And I think that is only a partial list 
 of the problems in this case.
 
 SO…
 
 My suggestion is simple, it solves the race, it saves the 
 unnecessary context switch and it spares the complexity from the 
 consumer side. The solution is to notify the consumer when he tries 
 to enable upcall policy, that the queue is actually not empty, and 
 force him to continue polling (in the same thread context he is 
 now). dat_evd_modify_upcall is guarded by a spin_lock_irqsave, when 
 it checks the queue and so the race would not occur.
 
 BTW,
 I’m not sure if it is still the case, but I think that one of the 
 ulps in openib did not use a kernel thread for dequeuing. This is 
 a very bad design, as the upcall can be polling for *long* periods 
 of time, in a tasklet/interrupt context.
 
 That’s it…
 Sorry for the long mail – I hope it was not too blurry …
 
 Guy.
 
 
 
 
 
 -Original Message-
 From: James Lentini [mailto:[EMAIL PROTECTED]
 Sent: Thu 8/18/2005 10:28 PM
 To: Guy German
 Cc: Openib
 Subject: Re: [openib-general][PATCH][kdapl]: FMR and EVD patch
  
 
 
 Hi Guy,
 
 The one piece of this patch that remains unaccepted is:
 
 Index: ib/dapl_evd.c
 ===
 --- ib/dapl_evd.c (revision 3136)
 +++ ib/dapl_evd.c (working copy)
 @@ -1028,6 +1028,7 @@
  {
   struct dapl_evd *evd;
   int status = 0;
 + int pending_events;
  
   evd = (struct dapl_evd *)evd_handle;
   dapl_dbg_log (DAPL_DBG_TYPE_API, "%s: (evd=%p, upcall_policy=%d)\n",
 @@ -1035,14 +1036,25 @@
  
   spin_lock_irqsave(&evd->common.lock, evd->common.flags);
   if ((upcall_policy != DAT_UPCALL_TEARDOWN) &&
 - (upcall_policy != DAT_UPCALL_DISABLE) &&
 - (evd->evd_flags & DAT_EVD_DTO_FLAG)) {
 - status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
 - if (status) {
 - printk(KERN_ERR "%s: dapls_ib_completion_notify failed "
 -"(status=0x%x)\n",__func__, status);
 + (upcall_policy != DAT_UPCALL_DISABLE)) {
 + pending_events = dapl_rbuf_count(&evd->pending_event_queue);
 + if (pending_events) {
 + dapl_dbg_log(DAPL_DBG_TYPE_WARN,
 +  

[openib-general] Re: [PATCH] IBAT resolve_ats_route

2005-08-29 Thread Hal Rosenstock
On Fri, 2005-08-26 at 16:42, James Lentini wrote:
 I was reading through the IBAT sources when I noticed that in 
 resolve_ats_route() you set req->pend.sa_query to null on line 1127 
 and then check to see if it is null a few lines later. I don't think 
 you need to do that.

Yes, it looks like that code path could never be taken.

Thanks. Applied.

-- Hal



RE: [openib-general] [PATCH][iWARP] Added provider CM verbsandquery provider methods

2005-08-29 Thread Caitlin Bestler
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tom Tucker
 Sent: Monday, August 29, 2005 9:47 AM
 To: Roland Dreier; Asgeir Eiriksson
 Cc: openib-general@openib.org
 Subject: RE: [openib-general] [PATCH][iWARP] Added provider 
 CM verbsandquery provider methods
 
 
 From my reading of the thread, there is resistance to
 TOE in general. The patch is just the messenger. The 
 principal opponent is Dave Miller who strongly believes that 
 stateless acceleration such as TSO (TCP Segmentation Offload) 
 suffices for all needs. Ironically, this requires a much 
 higher level of stack integration than TOE does.
 
 TOE for the purposes of RDMA may have more legs within the 
 community, however, this has yet to be tested.
 

And even once we have consensus to do it, we then need to
reach consensus on issues such as connect-on-chip-with-host-approval
and/or connect-on-host-then-transfer. For example,
I think the host stack should support either, leaving the tradeoffs
between NIC and host processing to be resolved in the marketplace.
 




[openib-general] [PATCH] iser: Make iser Makefile like other OpenIB ULP makefiles

2005-08-29 Thread Hal Rosenstock
Make iser Makefile like other OpenIB ULP makefiles

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: Makefile
===
--- Makefile(revision 3232)
+++ Makefile(working copy)
@@ -1,16 +1,14 @@
-ISER_OBJ = iser_mod.o
-ISER_OBJ += iser_conn.o
-ISER_OBJ += iser_initiator.o
-ISER_OBJ += iser_memory.o
-ISER_OBJ += iser_task.o
-ISER_OBJ += iser_utils.o
-ISER_OBJ += iser_dto.o
-ISER_OBJ += iser_lkdapl.o
+EXTRA_CFLAGS += -Idrivers/infiniband/include -Idrivers/infiniband/ulp/kdapl \
+   -I$(src)/include -DLINUX_KDAT
 
-EXTRA_CFLAGS += -Idrivers/infiniband/include
-EXTRA_CFLAGS += -Idrivers/infiniband/ulp/kdapl
-EXTRA_CFLAGS += -I$(src)/include
-EXTRA_CFLAGS += -DLINUX_KDAT
+obj-$(CONFIG_INFINIBAND_ISER)  += ib_iser.o
 
-obj-$(CONFIG_INFINIBAND_ISER) += $(ISER_OBJ)
+ib_iser-y  := iser_mod.o \
+  iser_conn.o \
+  iser_initiator.o \
+  iser_memory.o \
+  iser_task.o \
+  iser_utils.o \
+  iser_dto.o \
+  iser_lkdapl.o
 





[openib-general] Re: [PATCH] kdapl: Change for new include location

2005-08-29 Thread James Lentini


On Mon, 29 Aug 2005, Hal Rosenstock wrote:

halr kdapl: Change for new include location (rdma rather than infiniband)

Committed in revision 3235.


Re: [openib-general] [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 I'm confused -- which core and which events are you talking about?

OK, I wasn't exactly clear.
I was talking about activating the sa queries: if starting and cancelling
sa queries could be done from under spinlock/interrupt context,
IPoIB could use straight spinlocks for synchronisation, and avoid
using workqueues altogether.

Is this feasible?

-- 
MST


Re: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Roland Dreier
James What happens if multiple devices can reach the destination
James address? How will they be enumerated to the consumer?
 
I guess we need to move towards the full horror of getaddrinfo().
Probably we need some unusable native API, and then library functions
layered on top for consumers that don't care.

Although maybe it's not necessary -- are there any consumers of this
API that really want to choose among different equal-metric routes?

 - R.


RE: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Caitlin Bestler

  a connect flow will be something like:
  
  - ib_cma_get_device (...) /* get device(1) or device+path(2) */
  - pd = ib_alloc_pd(...) /* pd allocated in the given device */
  - qp = ib_cma_create_qp(...) /* qp returned in init state */
  - ib_post_recv(qp, ...);
  - ib_cma_connect (qp, dst_addr(1)/path(2), ...);
  
  Now, there are 2 suggestions for the device discovery:
  1. get_device returns device and port, according to the local routing 
  tables, synchronously. ib_cma_connect calls the at module for address 
  resolving (cache handled) before calling the cm_connect.
  2. get_device registers an upcall and returns in the upcall the data 
  path to the consumer. In this case caching is done by the consumer.
 
 What happens if multiple devices can reach the destination address? 
 How will they be enumerated to the consumer?
  

At the DAT layer the assumption was that multiple paths would
be chosen based upon the Class of Service.

So either the CoS must be passed down, or get_device must
return an array of devices with the required info to allow
the DAT Provider to make the determination.

Passing it down sounds simpler to me.
 



RE: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Caitlin Bestler
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Roland Dreier
 Sent: Monday, August 29, 2005 12:41 PM
 To: James Lentini
 Cc: openib-general@openib.org
 Subject: Re: [openib-general] RDMA Generic Connection Management
 
 James What happens if multiple devices can reach the destination
 James address? How will they be enumerated to the consumer?
  
 I guess we need to move towards the full horror of getaddrinfo().
 Probably we need some unusable native API, and then library 
 functions layered on top for consumers that don't care.
 
 Although maybe it's not necessary -- are there any consumers 
 of this API that really want to choose among different 
 equal-metric routes?
 

The assumption implicit in the DAT connection APIs is that
there are none (i.e., if you can't distinguish based on
Class of Service then you don't care which actual path
you get).
 



[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH] ipoib: device removal races
 
 Michael OK, I wasn't exactly clear.  I was talking about
 Michael activating the sa queries: if starting and cancelling sa
 Michael queries could be done from under spinlock/interrupt
 Michael context, IPoIB could use straight spinlocks for
 Michael synchronisation, and avoid using workqueues altogether.
 
 I'd have to audit the code to make sure, but as far as I know it
 should be fine to call the SA query API with spinlocks held.

Okay, but it also seems that, at least to cancel a query, it's insufficient
to call ib_sa_cancel_query - you then have to wait until you get a
callback, which seems to be performed from a work queue.

Can sa query be changed to perform the callback directly, and so
guarantee that the query isn't used after ib_sa_cancel_query returns?

-- 
MST


Re: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Roland Dreier
Caitlin The assumption implicit in the DAT connection APIs is
Caitlin that there are none (i.e., if you can't distinguish based
Caitlin on Class of Service then you don't care which actual path
Caitlin you get).
 
Let's forget about what DAT specified and just try to come up with the
right answer.  In any case, DAT ignored routing completely, so I don't
think it's helpful to consider it.

 - R.


[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Roland Dreier
Michael Can sa query be changed to perform the callback directly,
Michael and so guarantee that query isn't used after
Michael ib_sa_cancel_query returns?

Hmm, that gets into the MAD layer design, but I think it gets very
tricky.  For example, how do we know that the query isn't already
completing on a different CPU as we enter the cancel call?

 - R.


RE: [openib-general][PATCH][kdapl]: FMR and EVD patch

2005-08-29 Thread Guy German
James Lentini wrote:
 I agree with you on the problems posed by the current interface. I 
 hope we can find a solution that fixes the problem. 
 Note that the same problem must be handled by a ULP using the native verbs.

I don't think we have the same problem in the verbs.
In the current Mellanox hw (which is AFAIK the only available hw in openib) 
there is no race at all (because of the proprietary, more “considerate”, 
completion notification implementation). 

 - Receive a CQ notification callback
 - Wakeup polling thread
 - Poll for completion (empty the queue) 
 - Request completion notification
[you will get a completion notification even for “old” completions on the queue]
- exit thread

In the case of other, stricter IB-compliant future hw implementations – 
a Request Completion Notification “extended verb” could encapsulate:
- request CQ notification
- if cq !empty request CQ notification _again_
(note that you are not *polling* the cq – just checking the queue.
This is different from draining the evd one more time)

And the race is solved.
Indeed, it is not as efficient as sparing the context switch 
(to interrupt and back to thread) altogether. 
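
A related way to close the window, in the spirit of the patch under discussion, is to arm the notification and then look at the queue once more, continuing to poll in the same thread if anything slipped in. The following is a minimal user-space sketch of that loop, not kernel or kDAPL code: mock_cq, mock_poll, mock_req_notify, drain_and_rearm and demo_drain are hypothetical names invented for illustration, and the mock assumes hardware that raises no notification for completions already sitting on the queue when it is armed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a completion queue; not a real verbs object. */
struct mock_cq {
	int pending;   /* completions waiting to be polled */
	bool armed;    /* true once notification has been requested */
	int late;      /* completions that arrive just before the next arm */
};

/* Poll one completion; returns 1 if one was consumed, 0 if the queue is empty. */
int mock_poll(struct mock_cq *cq)
{
	if (cq->pending == 0)
		return 0;
	cq->pending--;
	return 1;
}

/* Arm notification. Simulates the race: the "late" completions land on
 * the queue right before the arm takes effect, so no upcall would ever
 * be raised for them on this (assumed) hardware. */
void mock_req_notify(struct mock_cq *cq)
{
	cq->pending += cq->late;
	cq->late = 0;
	cq->armed = true;
}

/* Drain, arm, then check the queue again. If anything slipped in, keep
 * polling in the same thread. Returns the number of completions handled. */
int drain_and_rearm(struct mock_cq *cq)
{
	int handled = 0;

	for (;;) {
		while (mock_poll(cq))
			handled++;
		mock_req_notify(cq);
		if (cq->pending == 0)
			break;          /* empty and armed: safe to sleep */
		cq->armed = false;      /* missed events: go around again */
	}
	return handled;
}

/* Three completions are queued up front and two more race with the arm. */
int demo_drain(void)
{
	struct mock_cq cq = { .pending = 3, .armed = false, .late = 2 };
	int handled = drain_and_rearm(&cq);

	assert(cq.pending == 0 && cq.armed);
	return handled;
}
```

In this simulation demo_drain() handles all five completions without relying on a second upcall. The cost is one extra look at the queue per arm, which is the trade-off being debated in this thread.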

I still think that there may be a race condition with this patch. 
Here's the scenario I'm concerned about:
 - Receive an evd upcall
 - Disable evd upcall policy
 - Wakeup polling thread
 - Dequeue all events
 - Enable evd upcall policy by:
   1. Call dapl_evd_modify_upcall() to enable the evd upcall
   2. Obtain the EVD spin lock via spin_lock_irqsave, thus 
  disabling local interrupts
   3. Check that the EVD's ring buffer is empty (there are no DAPL 
  software events)
   4. A DTO completion occurs on the EVD's CQ
   5. Enable the CQ's upcall via ib_req_notify_cq()

If I understand you correctly, you are asserting that event #4, the 
CQ's DTO completion, cannot occur because the local interrupts are 
disabled by spin_lock_irqsave(). Have I understood you correctly?

Not quite. The *consumer’s upcall* would not be called, due to the irq disable. 
The race would not occur, OTOH, because the Mellanox hw will initiate a 
completion notification even if the completions in the cq arrived before 
the notification request.
If you want to be more ib compliant, for future possible implementations, 
you can apply the “extended-notify-routine” (as mentioned above).

 My belief is that the completion will occur on the card 
 regardless of the interrupt state. 

True, but the consumer will be notified only as soon as the irq 
is enabled again

 Can you provide me with a reference that guarantees this 
 will not happen?

I’m not saying that it won’t  ;) but I don't think there will be a race...

Guy


RE: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Guy German
James What happens if multiple devices can reach the destination
James address? How will they be enumerated to the consumer?

Roland I guess we need to move towards the full horror of getaddrinfo().
Roland Probably we need some unusable native API, and then library functions
Roland layered on top for consumers that don't care.

Roland Although maybe it's not necessary -- are there any consumers of this
Roland API that really want to choose among different equal-metric routes?

I don't think iSER does.

Anyway, I think we need to agree on the basic API in principle,
and if we want to extend it along the way during implementation
(for example, an array of suitable devices instead of a single chosen one)
we will be able to patch it.

Guy


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Guy German [EMAIL PROTECTED]:
 1. get_device returns device and port, according to the local routing
 tables, synchronously. ib_cma_connect calls the at module for address
 resolving (cache handled) before calling the cm_connect.

How does one cancel an address resolution request?

 2. get_device registers an upcall and returns in the upcall the data path
 to the consumer. In this case caching is done by the consumer.
 
 I prefer option 1, because it makes the consumer code simpler, without
 having to handle upcalls for address translations (which are not
 asynchronous in iWARP) or hold the transport's data information. Also it
 saves the consumer the trouble of caching routes to destinations.
 
 I would like to hear what other people in the list think of it ...

In the case of callback (option 2) I really hope functions will work
with some kind of object pointer, avoiding another layer of
hash lookups and stuff.

Something like

struct ib_cma_path {
  struct ib_device *device;
  struct list_head arp_list;
  struct ib_sa_query *query;
  int id;
  .
  void (*comp_handler)(struct ib_cma_path *, int status);
};

Users should simply pass this object back to the cancel request.

I am also in favor of making this structure public, making it possible
for users to add an arbitrary amount of private data by simply inheriting
the structure and using container_of in comp_handler, but this is a
separate issue.
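
The embedding idea can be sketched in user-space C as follows. This is only an illustration under stated assumptions: the ib_cma_path here is a cut-down, hypothetical version of the structure proposed above, struct my_conn, my_comp_handler and demo_container_of are invented names, and container_of is re-implemented with offsetof because the kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down, hypothetical version of the structure proposed above. */
struct ib_cma_path {
	int id;
	void (*comp_handler)(struct ib_cma_path *path, int status);
};

/* A consumer "inherits" by embedding the path in its own object. */
struct my_conn {
	int last_status;           /* private consumer data */
	struct ib_cma_path path;   /* embedded, deliberately not the first member */
};

/* Completion handler: recover the enclosing my_conn from the path pointer. */
void my_comp_handler(struct ib_cma_path *path, int status)
{
	struct my_conn *conn = container_of(path, struct my_conn, path);

	conn->last_status = status;
}

int demo_container_of(void)
{
	struct my_conn conn = { .last_status = 0 };

	conn.path.id = 1;
	conn.path.comp_handler = my_comp_handler;
	assert(container_of(&conn.path, struct my_conn, path) == &conn);

	/* The resolver would only ever see &conn.path. */
	conn.path.comp_handler(&conn.path, -5);
	return conn.last_status;
}
```

Because the handler receives the same pointer the consumer registered, no id-to-object lookup is needed, and the same pointer could be handed back to a cancel call.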

-- 
MST


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: RDMA Generic Connection Management
 
 James What happens if multiple devices can reach the destination
 James address? How will they be enumerated to the consumer?
  
 I guess we need to move towards the full horror of getaddrinfo().
 Probably we need some unusable native API, and then library functions
 layered on top for consumers that don't care.

I see a problem in that the number of paths may be very big.
It just does not make sense to me to pass them all up the layer
to let the ULP deal with selecting one.

 Although maybe it's not necessary -- are there any consumers of this
 API that really want to choose among different equal-metric routes?

I think so, yes:
APM might be one good reason to want to get more than one path, would it not?

-- 
MST


[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH] ipoib: device removal races
 
 Michael Can sa query be changed to perform the callback directly,
 Michael and so guarantee that query isnt used after
 Michael ib_sa_cancel_query returns?
 
 Hmm, that gets into the MAD layer design, but I think it gets very
 tricky.  For example, how do we know that the query isn't already
 completing on a different CPU as we enter the cancel call?

Something like:

Remove it from the idr before completing, under a spinlock.
Now if it's in the idr, it's not completing.

Could this work?

-- 
MST


RE: [openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Caitlin Bestler
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of 
 Michael S. Tsirkin
 Sent: Monday, August 29, 2005 1:19 PM
 To: Roland Dreier
 Cc: openib-general@openib.org
 Subject: [openib-general] Re: RDMA Generic Connection Management
 

 
 I think so, yes:
 APM might be one good reason to want to get more than one 
 path, would it not?
 

But you would have to define automatic path migration in
generic/transport neutral terms. I've actually come up with
some definitions that are inclusive of IB and SCTP, but a
definition of Automatic Path Migration that includes TCP
isn't going to be very meaningful since the migration of
a TCP connection occurs one layer lower than for IB.

If APM can only be defined for IB then it does not have
to be addressed in the generic interface.



RE: [openib-general] Re: RMPP Message Format Errors

2005-08-29 Thread Sean Hefty
 In my interpretation, partial data is indicated by the PayloadLength field in
 the last segment only.  It's quite possible that my interpretation is incorrect,
 in which case the calculation in the RMPP code is off.
I agree the text might be missing an example or two for clarification.
Anyway, we probably can use the IB Analyzer as the ultimate
interpretation test. Note that there are IB implementations that use
the first segment payload length as the source of packet length and
count on it to represent the correct DATA length.

We can take your interpretation to the IBTA MGTWG for
further discussion.
Is the effort for fixing it big?

It's not a big deal to change it.  If the common interpretation is to only
include the partial data size, I will change it.

- Sean



RE: [openib-general] [PATCH][iWARP] Added provider CM verbs andquery provider methods

2005-08-29 Thread Asgeir Eiriksson
Roland

We're planning to go back with a new submission which addresses the
concerns that were directly relevant to the patch itself. In the
process, we'll be porting the patch to 2.6.14.

A couple of comments:

If you look at the patch in its current form, you'd find that it
is already quite minimal compared to the changes needed for 10GE TCP/IP
alternatives.

The architecture that we propose also accommodates different TOE
approaches, e.g. different connection setup models, etc.

We currently have the proposed architecture running on Linux in
conjunction with a regular NIC, iWARP RNIC, and iSCSI HBA.

Finally, with the new submission, we're hoping to get a more
constructive dialogue going, which focuses on the patch itself, because
it is clear that there is user interest in the technology, and Linux
support would be beneficial to all parties.

'Asgeir
 

 -----Original Message-----
 From: Roland Dreier [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 29, 2005 9:24 AM
 To: Asgeir Eiriksson
 Cc: openib-general@openib.org
 Subject: Re: [openib-general] [PATCH][iWARP] Added provider CM verbs
 and query provider methods
 
 Asgeir> ...this is the approach taken in the Chelsio TOE patches
 Asgeir> that we have submitted.
 
 What are your plans for these patches?  I am not subscribed to netdev,
 but from reading the archives, it seems that your most recent
 submission was rejected quite strongly.
 
  - R.



[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Roland Dreier
Michael> Something like:

Michael> Remove it from the idr before completing, under a
Michael> spinlock.  Now if it's in the idr, it's not completing.

Michael> Could this work?

I think you have to hold the spinlock across the consumer callback to
avoid all races.  And that's kind of a bummer, because it means you
can't do anything that might sleep (like modify a QP) from the
callback.

 - R.


[openib-general] Re: Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting Caitlin Bestler [EMAIL PROTECTED]:
  APM might be one good reason to want to get more than one 
  path, would it not?

 But you would have to define automatic path migration in
 generic/transport neutral terms. I've actually come up with

I don't see a problem.

For the sake of this argument, let's assume APM can't be done with iWARP.
How is an iWARP card different from an HCA on a fabric where
there's only a single path to a specific node then?

I don't have the IB spec in front of me now - is APM support optional
or required in IB?

 If APM can only be defined for IB then it does not have
 to be addressed in the generic interface.

I don't see how you can layer this on top without
support for reporting multiple paths, so it
will need to be addressed as part of this module.

-- 
MST


Re: [openib-general] [PATCH][iWARP] IW CM Verbs

2005-08-29 Thread Sean Hefty

James Lentini wrote:
Why does the ib_device need a cm structure for iWARP but not IB? If 
you used either Guy or Roland's generic RDMA connection API and did 
the iWARP implementation, would you still need to add the iw_cm 
structure?


Their connection protocol is implemented in hardware.  Even with a 
generic CM API, I believe that they'll need these calls.


- Sean


Re: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Sean Hefty

Guy German wrote:

- ib_cma_get_device (...) /* get device(1) or device+path(2) */
- pd = ib_alloc_pd(...) /* pd allocated in the given device */
- qp = ib_cma_create_qp(...) /* qp returned in init state */
- ib_post_recv(qp, ...);
- ib_cma_connect (qp, dst_addr(1)/path(2), ...);


To focus on something a little different... do we want an API that 
returns a pointer to a device structure?  Specifically, how does this 
affect device removal?


- Sean


[openib-general] RE: Re: RDMA Generic Connection Management

2005-08-29 Thread Caitlin Bestler
 

 -Original Message-
 From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 29, 2005 1:46 PM
 To: Caitlin Bestler
 Cc: Roland Dreier; openib-general@openib.org
 Subject: Re: Re: RDMA Generic Connection Management
 
 Quoting Caitlin Bestler [EMAIL PROTECTED]:
   APM might be one good reason to want to get more than one path, 
   would it not?
 
  But you would have to define automatic path migration in 
  generic/transport neutral terms. I've actually come up with
 
 I don't see a problem.
 
 For the sake of this argument, let's assume APM can't be done 
 with iWARP.
 How is an iWARP card different from an HCA on a fabric where 
 there's only a single path to a specific node then?
 

There is a very important difference.

The iWARP card *can* support automatic path migration
that is not visible to the RDMA layer -- i.e., it can
move an L3 address to a new L2 address (port migration).

As such it is very different from an IB device where
all path migration that exists is visible.

Making it even more complex, iWARP/SCTP could support
both L4 and L3 migration.




Re: [openib-general] RDMA Generic Connection Management

2005-08-29 Thread Roland Dreier
Sean> To focus on something a little different... do we want an
Sean> API that returns a pointer to a device structure?
Sean> Specifically, how does this affect device removal?

Hey, that's a really good point.  We should make sure that our API
makes it easy to handle device hotplug.

One solution is to start reference counting device references, but
that inevitably leads to bugs in ULPs -- protocol authors won't get it
right unless we make it really easy.  And I don't see how to make the
reference counting trivial.
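
A minimal user-space sketch of the reference-counting approach mentioned above, assuming a single-threaded caller. The names demo_device, demo_device_get, demo_device_put and demo_device_unplug are hypothetical and are not part of the OpenIB API; kernel code would also need atomic counters or locking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device structure; not the real ib_device. */
struct demo_device {
    int  refcount;   /* references held by consumers, plus one for the core */
    bool removed;    /* set once hot unplug has started */
    bool released;   /* set when the last reference is dropped */
};

void demo_device_init(struct demo_device *dev)
{
    dev->refcount = 1;      /* the core's own reference */
    dev->removed = false;
    dev->released = false;
}

/* Take a reference; returns NULL once removal has begun. */
struct demo_device *demo_device_get(struct demo_device *dev)
{
    if (dev->removed)
        return NULL;
    dev->refcount++;
    return dev;
}

/* Drop a reference; the device is released when the count reaches zero. */
void demo_device_put(struct demo_device *dev)
{
    if (--dev->refcount == 0)
        dev->released = true;   /* real code would free resources here */
}

/* Hot unplug: refuse new references and drop the core's reference. */
void demo_device_unplug(struct demo_device *dev)
{
    dev->removed = true;
    demo_device_put(dev);
}
```

Every successful demo_device_get() needs a matching demo_device_put(); that bookkeeping is the part a ULP can easily get wrong.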

Anyone have a better idea?

 - R.


[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: [PATCH] ipoib: device removal races
 
 Michael> Something like:
 
 Michael> Remove it from the idr before completing, under a
 Michael> spinlock.  Now if it's in the idr, it's not completing.
 
 Michael> Could this work?
 
 I think you have to hold the spinlock across the consumer callback to
 avoid all races.

Hmm. I think I see what you mean. Would setting the completion callback to NULL
in the query structure under the idr spinlock work?
It now seems to me it will.

 And that's kind of a bummer, because it means you
 can't do anything that might sleep (like modify a QP) from the
 callback.

It's an SA query, so I'm not sure why you would want to modify a QP
there.
Further, please note that in the current API the callback is
always called even if the query is cancelled.

And clearly you can't allow cancel under a spinlock and 
at the same time ensure the callback is performed and is allowed to sleep.

I think it's not a big problem to let cancel return a code meaning
the completion was cancelled; perform the callback yourself if you want
to. I imagine ULPs may special-case cancellation anyway.

Would such an API change be OK?

-- 
MST


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: RDMA Generic Connection Management
 
 Sean> To focus on something a little different... do we want an
 Sean> API that returns a pointer to a device structure?
 Sean> Specifically, how does this affect device removal?
 
 Hey, that's a really good point.  We should make sure that our API
 makes it easy to handle device hotplug.
 
 One solution is to start reference counting device references, but
 that inevitably leads to bugs in ULPs -- protocol authors won't get it
 right unless we make it really easy.  And I don't see how to make the
 reference counting trivial.
 
 Anyone have a better idea?

Roland, could you please explain what the problem is?
If you have an outstanding request, and all devices went down,
can't it simply be completed with an error status?

-- 
MST


Re: [openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Sean Hefty

Michael S. Tsirkin wrote:

It's an SA query, so I'm not sure why you would want to modify a QP
there.
Further, please note that in the current API the callback is
always called even if the query is cancelled.

And clearly you can't allow cancel under a spinlock and 
at the same time ensure the callback is performed and is allowed to sleep.


I think it's not a big problem to let cancel return a code meaning
the completion was cancelled; perform the callback yourself if you want
to. I imagine ULPs may special-case cancellation anyway.

Would such an API change be OK?


This is similar to some of the discussions that went into cancel MADs. 
It should be possible for the SA to return a value from cancel that 
indicates that no callback will occur.  However, it's not possible for 
it to return a value that indicates that one will occur.  In the latter 
case, the callback could have already occurred or may be in progress. 
Which means that a user calling cancel has to be able to deal with a 
callback occurring anyway.
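
One way to picture the guarantee described above is a single atomic state change that either the completion path or the cancel path wins. This is only a sketch using C11 atomics; demo_query, demo_complete and demo_cancel are hypothetical names, not the SA query API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical query states; not taken from the SA query code. */
enum demo_state { DEMO_PENDING, DEMO_COMPLETING, DEMO_CANCELLED };

struct demo_query {
    atomic_int state;
    int callbacks_run;   /* counts consumer callbacks, for illustration */
};

void demo_query_init(struct demo_query *q)
{
    atomic_init(&q->state, DEMO_PENDING);
    q->callbacks_run = 0;
}

/* Completion path: runs the callback only if it wins the state change. */
bool demo_complete(struct demo_query *q)
{
    int expected = DEMO_PENDING;

    if (!atomic_compare_exchange_strong(&q->state, &expected, DEMO_COMPLETING))
        return false;       /* cancelled first: no callback */
    q->callbacks_run++;     /* stands in for the consumer callback */
    return true;
}

/*
 * Cancel path: returns true only when it can promise that no callback
 * will run. A false return means the callback has already run or may be
 * running right now, so the caller must still cope with it.
 */
bool demo_cancel(struct demo_query *q)
{
    int expected = DEMO_PENDING;

    return atomic_compare_exchange_strong(&q->state, &expected, DEMO_CANCELLED);
}
```

A false return from demo_cancel() deliberately does not distinguish "already ran" from "in progress", which matches the limitation described in the message.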


- Sean


[openib-general] Re: Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Michael S. Tsirkin
Quoting Sean Hefty [EMAIL PROTECTED]:
 In the latter 
 case, the callback could have already occurred or may be in progress. 
 Which means that a user calling cancel has to be able to deal with a 
 callback occurring anyway.

Won't a bit in the query structure suffice?

-- 
MST


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Roland Dreier
Michael> Roland, could you please explain what the problem is?  If
Michael> you have an outstanding request, and all devices went
Michael> down, can't it simply be completed with an error status?

Something like:

get_device_for_route(device);
/* hot unplug device */
ib_create_qp(device);  /* how do we handle this? */




[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Sean Hefty [EMAIL PROTECTED]:
 Subject: Re: RDMA Generic Connection Management
 
 Guy German wrote:
 - ib_cma_get_device (...) /* get device(1) or device+path(2) */
 - pd = ib_alloc_pd(...) /* pd allocated in the given device */
 - qp = ib_cma_create_qp(...) /* qp returned in init state */
 - ib_post_recv(qp, ...);
 - ib_cma_connect (qp, dst_addr(1)/path(2), ...);
 
 To focus on something a little different... do we want an API that 
 returns a pointer to a device structure?

Yes, I think it's much better than dealing with type-unsafe indexes,
wasting memory on tables and/or forcing table lookups on each call.

 Specifically, how does this 
 affect device removal?
 
 - Sean

How is this different from what we have with ib_verbs now?

I think that reasonable ULPs must register for hotplug events
in the ib layer, anyway.
So when they get a device removal callback, they close the qps etc.

Makes sense?
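
The registration idea can be sketched in user-space C as follows. The client table, demo_register_client, demo_remove_device and the demo_ulp structure are all hypothetical stand-ins, not the ib layer's real interfaces; the point is only that the ULP closes its QPs from the removal callback.

```c
#include <stddef.h>

#define DEMO_MAX_CLIENTS 8

/* Hypothetical client record: a removal callback plus private context. */
struct demo_client {
    void (*remove)(int device_id, void *ctx);
    void *ctx;
};

static struct demo_client *demo_clients[DEMO_MAX_CLIENTS];
static size_t demo_nclients;

/* Register for removal events; returns 0 on success, -1 if the table is full. */
int demo_register_client(struct demo_client *client)
{
    if (demo_nclients == DEMO_MAX_CLIENTS)
        return -1;
    demo_clients[demo_nclients++] = client;
    return 0;
}

/* Called when a device goes away: notify every registered client. */
void demo_remove_device(int device_id)
{
    for (size_t i = 0; i < demo_nclients; i++)
        demo_clients[i]->remove(device_id, demo_clients[i]->ctx);
}

/* Example ULP state: which device it uses and how many QPs it has open. */
struct demo_ulp {
    int device_id;
    int open_qps;
};

/* ULP removal callback: close all QPs if the removed device is ours. */
void demo_ulp_remove(int device_id, void *ctx)
{
    struct demo_ulp *ulp = ctx;

    if (ulp->device_id == device_id)
        ulp->open_qps = 0;
}
```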

-- 
MST


[openib-general] Re: [PATCH] ipoib: device removal races

2005-08-29 Thread Roland Dreier
Sean said it well, but to repeat: the problem you run into is what to
do when a consumer tries to cancel while the callback is running.  For
example, one CPU might be in the middle of jumping to the consumer's
callback when the other CPU enters the cancel function.

 - R.


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Michael S. Tsirkin
Quoting r. Roland Dreier [EMAIL PROTECTED]:
 Subject: Re: RDMA Generic Connection Management
 
 Michael> Roland, could you please explain what the problem is?  If
 Michael> you have an outstanding request, and all devices went
 Michael> down, can't it simply be completed with an error status?
 
 Something like:
 
 get_device_for_route(device);
 /* hot unplug device */
 ib_create_qp(device);  /* how do we handle this? */
 

Register with ib layer for hotplug events, flush the queue that
does this.

-- 
MST


[openib-general] Re: RDMA Generic Connection Management

2005-08-29 Thread Sean Hefty

Michael S. Tsirkin wrote:

How is this different from what we have with ib_verbs now?


With ib_verbs, users receive notification of device addition/removal. 
This interface doesn't require receiving that notification.



I think that reasonable ULPs must register for hotplug events
in the ib layer, anyway.
So when they get a device removal callback, they close the qps etc.

Makes sense?


This opens up the possibility for a user to receive a reference to a 
device that they may not have received previous notification for. 
Similarly, the device could have been removed before the call returned, 
making the pointer invalid.


- Sean




[openib-general] license mismatches

2005-08-29 Thread Kanevsky, Arkady
I have reviewed the licenses used by files in
https://openib.org/svn/gen2/trunk.
The following .c and .h files do not match the OpenIB licenses:
https://openib.org/svn/gen2/trunk/src/userspace/tvflash/src/tvflash.c
https://openib.org/svn/gen2/trunk/src/userspace/tvflash/src/firmware.h
https://openib.org/svn/gen2/trunk/src/userspace/examples/aio/ttcp.aio.c
https://openib.org/svn/gen2/trunk/src/userspace/management/osm/complib/Makefile.mlx
https://openib.org/svn/gen2/trunk/src/userspace/management/osm/opensm/osm_indent

all files in directories:
https://openib.org/svn/gen2/trunk/src/userspace/mstflint/
https://openib.org/svn/gen2/trunk/src/userspace/mpi/

files in directory
https://openib.org/svn/gen2/trunk/src/userspace/libsdp/src/
have the right licenses but the copyright message does not match the
OpenIB copyright.

Several files do not have any licenses, like Makefile, configure and map
files.
For example,
https://openib.org/svn/gen2/trunk/src/userspace/libibcm/src/libibcm.map
https://openib.org/svn/gen2/trunk/src/userspace/libibcm/Makefile.am
I think this is OK.

I suspect that all these are oversights and all the files should be
available under both BSD and GPL2 licenses.

Thanks,
Arkady

Arkady Kanevsky   email: [EMAIL PROTECTED]
Network Appliance phone: 781-768-5395
375 Totten Pond Rd.  Fax: 781-895-1195
Waltham, MA 02451-2010  central phone: 781-768-5300
 


Re: [openib-general] license mismatches

2005-08-29 Thread Roland Dreier
Arkady> I suspect that all these are oversights and all the files
Arkady> should be available under both BSD and GPL2 licenses.

tvflash at least is licensed correctly.  It links to the pciutils
library, which is licensed under the GPL.

 - R.


[openib-general] rc ping pong error

2005-08-29 Thread viswanath krishnamurthy
I have the latest openib code on 2.16 machine, when
I run the rc pingpong program I get the following
error (The first time it passed, but subsequent ones
got an error, I tried changing the iteration count to
a large number, 10 after the first time)

#dmesg

ib_mthca :05:00.0: Mapped page at 395aa000 to 8 for ICM.
ib_mthca :05:00.0: CQ overrun on CQN 5b0083 
=
ib_mthca :05:00.0: Unmapping 1 pages at 8 from ICM.

[EMAIL PROTECTED] ./ibv_rc_pingpong 192.169.8.117
  local address:  LID 0x0003, QPN 0x440405, PSN 0xd6ae4e
  remote address: LID 0x0001, QPN 0x3a0405, PSN 0x9317a4
  [ 0] 00440405
  [ 4] 
  [ 8] 
  [ c] 
  [10] 1581
  [14] 
  [18] 8002
  [1c] ff10
Failed status 12 for wr_id 2











[openib-general] Re: [PATCH] uat: make uat.c compile on 2.6.13-rc3

2005-08-29 Thread Hal Rosenstock
On Thu, 2005-07-28 at 14:50, Tom Duffy wrote: 
 This patch is similar to the one for ucm.  It updates the class code to
 work with 2.6.13-rc3.

Thanks. (Finally) applied now that 2.6.13 is out :-)

-- Hal



[openib-general] Re: [mgtwg] Payload Length in first RMPP sent segment

2005-08-29 Thread Greg Pfister

Hal,

My take is that there's no ambiguity. Then again, I wrote it, so I
would think that, right? :-)

The idea is that we're trying to allow *either* of the usual two
options for specifying a string of stuff: (a) Start out by giving the
length; or (b) go until you reach a special mark meaning the end.

The thing is it gets complicated when there is only one packet. So
take two cases: >1 packet, and ==1 packet.

length >1 packet:

-- PayloadLength > 0 on 1st packet means case (a). Just read until
you get that many bytes, which may use only part of the last packet.
If the last packet isn't also marked last, scream about inconsistency.

-- PayloadLength=0 on first packet -> case (b). Read until you get a
marked last packet. PayloadLength in that last packet tells you how
many are valid in that packet (zero in that case -- I'm not sure;
whole packet, I think).

length ==1 packet meaning RMPPFlags.Last=1 and RMPPFlags.First=1 in
the same packet.

-- Interpretation is the same as the last packet case above, i.e.,
RMPPFlags.Last=1 dominates the interpretation.

As far as I know, that's it. Any comments from others?

(This may not forward to openib-general, since I'm not on that list;
if it doesn't please forward.)

Greg Pfister
IBM Distinguished Engineer, Member IBM Academy of Technology
IBM Systems & Technology Group, Server Technology & Architecture
(512) 838-8338 | IBM tieline 678-8338 | FAX (512) 838-3418
Sic Crustulum Frangitur





Hal Rosenstock [EMAIL PROTECTED]
08/29/2005 08:14 AM

To: [EMAIL PROTECTED]
cc: openib-general@openib.org
Subject: [mgtwg] Payload Length in first RMPP sent segment







Hi,

On the RMPP send side, while the Payload Length field in the last
segment is clear that it indicates the number of valid bytes in
Transferred Data, there seems to be some ambiguity in the optional
Payload Length field in the first segment. I think it can work either
way but I also think the intent was to reflect the valid bytes. Maybe it
is this way to allow flexibility (choice in the implementation). What is
the correct interpretation ? Should I enter a comment on this ? Thanks.

-- Hal

IBA 1.2 p.775 line 37

In the first packet of an RMPP transfer (RMPPFlags.First=1),
PayloadLength may indicate the sum of the lengths, in bytes, of the
TransferredData fields in all packets of the entire multipacket
response; this is done by using a nonzero value for PayloadLength in the
first packet. 

IBA 1.2 p. 776 line 8

In the last packet of an RMPP transfer (RMPPFlags.Last=1), PayloadLength
indicates the number of valid bytes in the TransferredData field,
allowing data transfers that are not an integral multiple of the length
of the TransferredData field. A transfer terminates when either: (a) a
packet containing RMPPFlags.Last=1 is received; or (b) a nonzero
PayloadLength was given in the first packet of a transfer, and a packet
is received containing sufficient TransferredData bytes to equal or
exceed the PayloadLength originally provided. If case (b) occurs and
RMPPFlags.Last is not 1 for that packet, the Receiver sends an ABORT
packet with RMPPStatus of "Inconsistent Last and PayloadLength" and
terminates the transfer.
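
As a reading aid, the two termination conditions quoted above can be written as a small receiver-side check. This is a sketch of the quoted text only, not the OpenIB RMPP code; the demo_rmpp_* names are hypothetical, and the caller supplies the size of the TransferredData field. It does not check whether the first-packet and last-packet lengths agree, since the quoted text does not say what should happen in that case.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_rmpp_result { DEMO_RMPP_CONTINUE, DEMO_RMPP_DONE, DEMO_RMPP_ABORT };

/* Hypothetical receive-side bookkeeping for one transfer. */
struct demo_rmpp_rx {
    uint32_t announced;  /* nonzero first-packet PayloadLength, or 0 if none */
    uint32_t received;   /* data bytes counted so far */
};

/*
 * Feed one received segment into the state.
 * seg_data_len is the size of the TransferredData field in the packet.
 */
enum demo_rmpp_result demo_rmpp_rx_segment(struct demo_rmpp_rx *rx,
                                           bool first, bool last,
                                           uint32_t payload_length,
                                           uint32_t seg_data_len)
{
    if (first) {
        /* When First and Last are both set, Last governs the field. */
        rx->announced = last ? 0 : payload_length;
        rx->received = 0;
    }

    if (last) {
        /* Case (a): PayloadLength is the valid byte count of this packet. */
        rx->received += payload_length;
        return DEMO_RMPP_DONE;
    }

    rx->received += seg_data_len;
    if (rx->announced != 0 && rx->received >= rx->announced)
        return DEMO_RMPP_ABORT;   /* case (b) reached without Last set */
    return DEMO_RMPP_CONTINUE;
}
```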







[openib-general] Re: [PATCH] sdp: use linux/list.h in sdp_link.c

2005-08-29 Thread Tom Duffy


On Aug 29, 2005, at 8:00 AM, Michael S. Tsirkin wrote:


The following kills sdp_link.h and converts sdp_link.c to use linux/ 
list.h

Locking is still missing here.



Cool, cool.  I was going to get to this eventually.

I just got back from vacation and I am still waiting for a machine so  
I can set up a rudimentary IB network at home to test my code.  I have  
a patch that converts sdp_buff.[ch] to use linux/list.h (glad you  
didn't decide to work on that), but I want to test it before  
submitting to the list (it compiles!).


-tduffy



[openib-general] Re: [mgtwg] Payload Length in first RMPP sent segment

2005-08-29 Thread Hal Rosenstock
Hi Greg,

On Mon, 2005-08-29 at 18:56, Greg Pfister wrote:
 Hal,
 
 My take is that there's no ambiguity. Then again, I wrote it, so I
 would think that, right? :-)
 
 The idea is that we're trying to allow *either* of the usual two
 options for specifying a string of stuff: (a) Start out by giving the
 length; or (b) go until you reach a special mark meaning the end.

The latter being streaming mode.

 The thing is it gets complicated when there is only one packet. So
 take two cases: >1 packet, and ==1 packet.

It seems more complicated (perhaps 2 options when there is more than 1
packet).

 length >1 packet: 
 
 -- PayloadLength > 0 on 1st packet means case (a). Just read until
 you get that many bytes, which may use only part of the last packet.
 If the last packet isn't also marked last, scream about inconsistency.

So if one is using this option, does the payload length in the 1st
packet reduced by 220 * (number of packets - 1) need to match the
payload length in the last packet ? That's a slightly different
inconsistency from the packet not being marked last but the original
length not exhausted.
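
The arithmetic in this question can be made concrete. The sketch below assumes the reading in which a nonzero first-packet PayloadLength counts valid data bytes, and takes the 220-byte segment size from the message; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-segment data size used in the question above. */
#define DEMO_SEG_DATA 220u

/*
 * Last-segment PayloadLength implied by the first-segment value.
 * Precondition: nsegs >= 1 and DEMO_SEG_DATA * (nsegs - 1) < first_len.
 */
uint32_t demo_expected_last_len(uint32_t first_len, uint32_t nsegs)
{
    return first_len - DEMO_SEG_DATA * (nsegs - 1);
}

/* True when the first and last PayloadLength values agree with each other. */
bool demo_lengths_consistent(uint32_t first_len, uint32_t nsegs,
                             uint32_t last_len)
{
    uint64_t full;

    if (nsegs == 0)
        return false;
    full = (uint64_t)DEMO_SEG_DATA * (nsegs - 1);  /* bytes in full segments */
    if (full >= first_len)
        return false;   /* more segments than the announced length allows */
    return demo_expected_last_len(first_len, nsegs) == last_len;
}
```

For example, 500 bytes sent in 3 segments gives 500 - 220 * 2 = 60 valid bytes in the last segment.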

 -- PayloadLength=0 on first packet -> case (b). Read until you get a
 marked last packet. PayloadLength in that last packet tells you how
 many are valid in that packet (zero in that case -- I'm not sure;
 whole packet, I think).

For SA, wouldn't anything less than 20 be an error in the last
packet ? If it were 20, it would be legal but an inefficient
implementation (as really the previous packet was full and could have
terminated the RMPP send).

 length ==1 packet meaning RMPPFlags.Last=1 and RMPPFlags.First=1 in
 the same packet. 
 
 -- Interpretation is the same as the last packet case above, i.e.,
 RMPPFlags.Last=1 dominates the interpretation.
 
 As far as I know, that's it. Any comments from others?
 
 (This may not forward to openib-general, since I'm not on that list;
 if it doesn't please forward.)

It made it to openib. It's an open list as far as posting goes.

Thanks.

-- Hal

 Greg Pfister
 IBM Distinguished Engineer, Member IBM Academy of Technology
 IBM Systems & Technology Group, Server Technology & Architecture
 (512) 838-8338 | IBM tieline 678-8338 | FAX (512) 838-3418
 Sic Crustulum Frangitur
 
 Hal Rosenstock [EMAIL PROTECTED]
 08/29/2005 08:14 AM
 
 To: [EMAIL PROTECTED]
 cc: openib-general@openib.org
 Subject: [mgtwg] Payload Length in first RMPP sent segment
 
 
 
 Hi,
 
 On the RMPP send side, while the Payload Length field in the last
 segment is clear that it indicates the number of valid bytes in
 Transferred Data, there seems to be some ambiguity in the optional
 Payload Length field in the first segment. I think it can work either
 way but I also think the intent was to reflect the valid bytes. Maybe
 it is this way to allow flexibility (choice in the implementation).
 What is the correct interpretation ? Should I enter a comment on this ?
 Thanks.
 
 -- Hal
 
 IBA 1.2 p.775 line 37
 
 In the first packet of an RMPP transfer (RMPPFlags.First=1),
 PayloadLength may indicate the sum of the lengths, in bytes, of the
 TransferredData fields in all packets of the entire multipacket
 response; this is done by using a nonzero value for PayloadLength in
 the first packet. 
 
 IBA 1.2 p. 776 line 8
 
 In the last packet of an RMPP transfer (RMPPFlags.Last=1),
 PayloadLength indicates the number of valid bytes in the
 TransferredData field, allowing data transfers that are not an
 integral multiple of the length of the TransferredData field. A
 transfer terminates when either: (a) a packet containing
 RMPPFlags.Last=1 is received; or (b) a nonzero PayloadLength was given
 in the first packet of a transfer, and a packet is received containing
 sufficient TransferredData bytes to equal or exceed the PayloadLength
 originally provided. If case (b) occurs and RMPPFlags.Last is not 1
 for that packet, the Receiver sends an ABORT packet with RMPPStatus of
 "Inconsistent Last and PayloadLength" and terminates the transfer.
 
 
 
 



Re: [openib-general] rc ping pong error

2005-08-29 Thread Roland Dreier
viswanath> I have the latest openib code on 2.16 machine, when I
viswanath> run the rc pingpong program I get the following error
viswanath> (The first time it passed, but subsequent ones got an
viswanath> error, I tried changing the iteration count to a large
viswanath> number, 10 after the first time)

I left ibv_rc_pingpong -n 10 running in a loop between two of my
machines with no problems, so there's something specific to your setup.

When you say latest openib code, what does this mean?  Are you
running something from subversion or a standard Linux kernel?  Do you
have 1-port or 2-port HCAs?  What HCA firmware version are you
running?

 - R.