date:20050531

RE: [openib-general] cmpost: failure sending REQ: -22

2005-05-31 Thread Sean Hefty

> Has anyone seen ib_send_cm_req() return -22?

I believe that this is a timeout error, possibly indicating that the server
side of the connection wasn't running.  You may also want to verify the slid
and dlid are correct for your configuration.

- Sean


___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

RE: [openib-general] cmpost: failure sending REQ: -22

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 03:51, Sean Hefty wrote:
> > Has anyone seen ib_send_cm_req() return -22?
>
> I believe that this is a timeout error, possibly indicating that the server
> side of the connection wasn't running.  You may also want to verify the slid
> and dlid are correct for your configuration.

Don't you get a REJ now when there is no one listening on a service ID
requested ?

-22 is EINVAL. In terms of ib_send_cm_req, it is returned for a number
of cases:
1. peer to peer connection is requested
2. No primary path is supplied
3. QP is not RC or UC
4. private data is supplied and length > 92
5. alternate path supplied and PKEY or MTU does not match primary path
6. connection state is not IDLE
7. Primary or alternate path SGID or PKey does not match those of port
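
A few of the cases above can be modelled in user space to show how they map to -EINVAL (22 on Linux). This is only a sketch: `struct req_params` and `check_req_params()` are hypothetical names, not the kernel's `struct ib_cm_req_param` or the CM code, and only the cases that need no kernel state (1 through 4) are covered. The 92-byte limit is the figure quoted in the list.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the CM REQ parameters. */
enum qp_type { QP_RC, QP_UC, QP_UD };

struct req_params {
	int peer_to_peer;          /* nonzero if peer-to-peer was requested */
	const void *primary_path;  /* NULL if no primary path was supplied */
	enum qp_type qp_type;
	const void *private_data;
	size_t private_data_len;
};

#define REQ_PRIVATE_DATA_MAX 92  /* limit quoted in the list above */

/* Returns 0 if the modelled checks pass, -EINVAL otherwise. */
int check_req_params(const struct req_params *p)
{
	if (p->peer_to_peer)
		return -EINVAL;   /* case 1 */
	if (!p->primary_path)
		return -EINVAL;   /* case 2 */
	if (p->qp_type != QP_RC && p->qp_type != QP_UC)
		return -EINVAL;   /* case 3 */
	if (p->private_data && p->private_data_len > REQ_PRIVATE_DATA_MAX)
		return -EINVAL;   /* case 4 */
	return 0;
}
```

Running the same kind of checks on the caller's side before ib_send_cm_req() can narrow down which parameter is being rejected.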

-- Hal


RE: [openib-general] [PATCH] [ib_at]: Update async structure prior to returning requests to appropriate cache

2005-05-31 Thread Itamar Rabenstein

Thanks Hal,
this patch fixed the problem (oops in ib_at.c)

  Itamar
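
For readers skimming the quoted patch, the core of the fix can be sketched in user-space C. This is an illustration under stated assumptions, not the at.c code: `mock_cache_free()` stands in for `kmem_cache_free()`, the `struct async` here is a reduced hypothetical version, and the enum values are placeholders. The point is that the bookkeeping fields are reset while the caller still owns the object, not after it has been handed back to the cache.

```c
#include <stdlib.h>

enum { STATUS_PENDING = 1, STATUS_INVALID = 2 };  /* placeholder values */
enum { REQ_NONE = 0, REQ_PATH = 1 };              /* placeholder values */

/* Reduced, hypothetical version of the async bookkeeping structure. */
struct async {
	int status;
	int type;
	void *sa_query;
};

struct path_req {
	struct async pend;
};

/* Records whether the last object freed had been reset beforehand. */
static int last_free_was_clean;

/* Stand-in for kmem_cache_free(): after this call the object is gone. */
static void mock_cache_free(struct path_req *req)
{
	last_free_was_clean = (req->pend.status == STATUS_INVALID &&
			       req->pend.type == REQ_NONE &&
			       req->pend.sa_query == NULL);
	free(req);
}

/* Reset first, free second; touching req after the free would be a bug. */
void free_path_req_sketch(struct path_req *req)
{
	req->pend.status = STATUS_INVALID;
	req->pend.type = REQ_NONE;
	req->pend.sa_query = NULL;
	mock_cache_free(req);
}
```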

> -----Original Message-----
> From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> Sent: Monday, May 30, 2005 8:37 PM
> To: James Lentini
> Cc: openib-general@openib.org
> Subject: [openib-general] [PATCH] [ib_at]: Update async structure prior
> to returning requests to appropriate cache
>
>
> [ib_at]: Update async structure prior to returning requests to
> appropriate cache. This change affects req_free, free_route_req, and
> free_path_req.
>
> Also, some other minor changes to eliminate unneeded parameter passed to
> path_req_output and changes to some DEBUG messages.
>
> Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
>
> Index: at.c
> ===================================================================
> --- at.c	(revision 2507)
> +++ at.c	(working copy)
> @@ -155,7 +155,8 @@
>  
>  static void free_route_req(void *async);
>  static void free_path_req(void *async);
> -static void path_req_complete(int stat, struct ib_sa_path_rec *ret, void *ctx);
> +static void path_req_complete(int status, struct ib_sa_path_rec *resp,
> +			      void *context);
>  static int resolve_path(struct path_req *req);
>  
>  static int resolve_ip(struct ib_at_src *src, u32 dst_ip, u32 src_ip,
> @@ -274,7 +275,6 @@
>  	}
>  
>  	memset(dgid, 0, sizeof *dgid);
> -
>  	return 0;
>  }
>  
> @@ -319,11 +319,10 @@
>  		break;
>  	default:
>  		WARN("bad async req type %d", pend->type);
> +		pend->status = IB_AT_STATUS_INVALID;
> +		pend->type = IBAT_REQ_NONE;
> +		pend->sa_query = NULL;
>  	}
> -
> -	pend->status = IB_AT_STATUS_INVALID;
> -	pend->type = IBAT_REQ_NONE;
> -	pend->sa_query = NULL;
>  }
>  
>  static int req_start(struct async *q, struct async *pend,
> @@ -464,6 +463,11 @@
>  	struct route_req *req = container_of(async, struct route_req, pend);
>  
>  	DEBUG("free async %p req %p", async, req);
> +
> +	req->pend.status = IB_AT_STATUS_INVALID;
> +	req->pend.type = IBAT_REQ_NONE;
> +	req->pend.sa_query = NULL;
> +
>  	kmem_cache_free(route_req_cache, req);
>  }
>  
> @@ -472,6 +476,11 @@
>  	struct path_req *req = container_of(async, struct path_req, pend);
>  
>  	DEBUG("free async %p req %p", async, req);
> +
> +	req->pend.status = IB_AT_STATUS_INVALID;
> +	req->pend.type = IBAT_REQ_NONE;
> +	req->pend.sa_query = NULL;
> +
>  	kmem_cache_free(path_req_cache, req);
>  }
>  
> @@ -537,15 +546,14 @@
>  	return 1;	/* one entry is filled */
>  }
>  
> -static int path_req_output(struct path_req *req, struct ib_sa_path_rec *resp,
> -			   int npath, struct ib_sa_path_rec *out, int nelem)
> +static int path_req_output(struct ib_sa_path_rec *resp, int npath,
> +			   struct ib_sa_path_rec *out, int nelem)
>  {
>  	int n = min(npath, nelem);
>  
> -	DEBUG("parent %p output %d records", req, n);
> +	DEBUG("fill ib_sa_path_rec %p output %d records", out, n);
>  
>  	memcpy(out, resp, n * sizeof (struct ib_sa_path_rec));
> -
>  	return n;
>  }
>  
> @@ -579,7 +587,7 @@
>  	unsigned long flags;
>  	struct async *pend;
>  
> -	DEBUG("req %p", req);
> +	DEBUG("req %p status %d", req, status);
>  
>  	if (req->pend.parent) {
>  		WARN("for child req %p???", req);
> @@ -598,12 +606,12 @@
>  		return;
>  	}
>  
> -	req->pend.nelem = path_req_output(req, resp, 1,
> +	req->pend.nelem = path_req_output(resp, 1,
>  					req->pend.data, req->pend.nelem);
>  
>  	spin_lock_irqsave(&pending_reqs.lock, flags);
>  	for (pend = req->pend.waiting; pend; pend = pend->waiting)
> -		pend->nelem = path_req_output(req, resp, 1,
> +		pend->nelem = path_req_output(resp, 1,
>  					pend->data, pend->nelem);
>  
>  	req_end(&req->pend, req->pend.nelem, NULL);
> @@ -876,7 +884,7 @@
>  	if (in_cache) {
>  		DEBUG("!in_cache free req %p", preq);
>  		kmem_cache_free(path_req_cache, preq);
> -		return path_req_output(preq, cached_arr, n, path_arr, npath);
> +		return path_req_output(cached_arr, n, path_arr, npath);
>  	}
>  	*/
>  
> @@ -969,7 +977,6 @@
>  EXPORT_SYMBOL(ib_at_status);
>  
>  
> -
>  /*
>   * Internal init/cleanup functions:
>   */

[openib-general] Re: [PATCH][kdapl] replace spin_lock with spin_lock_irqsave in kdapltest

2005-05-31 Thread James Lentini


Itamar,

Why does this patch comment out uses of the g_PerfTestLock? 

james

On Sun, 29 May 2005, Itamar wrote:

itamar> With this patch I can run kdapltest -T T ... -t 4 -w 8 ...
itamar> I still see problems but in general this patch helps the stability a lot.
itamar> 
itamar> replace spin_lock with spin_lock_irqsave in kdapltest
itamar> Signed-off-by: Itamar Rabenstein [EMAIL PROTECTED]
itamar> 
itamar> Index: test/dapl_transaction_stats.c
itamar> ===================================================================
itamar> --- test/dapl_transaction_stats.c	(revision 2509)
itamar> +++ test/dapl_transaction_stats.c	(working copy)
itamar> @@ -45,12 +45,13 @@
itamar>  DT_transaction_stats_set_ready (DT_Tdep_Print_Head *phead,
itamar>                                  Transaction_Stats_t * transaction_stats)
itamar>  {
itamar> -    DT_Mdep_Lock (&transaction_stats->lock);
itamar> +    unsigned long flags;
itamar> +    spin_lock_irqsave (&transaction_stats->lock,flags);
itamar>      transaction_stats->wait_count--;
itamar>  
itamar>      DT_Tdep_PT_Debug (1,(phead,"Received Sync Message from server (%d left)\n",
itamar>                          transaction_stats->wait_count));
itamar> -    DT_Mdep_Unlock (&transaction_stats->lock);
itamar> +    spin_unlock_irqrestore (&transaction_stats->lock,flags);
itamar>  }
itamar>  
itamar>  boolean_t
itamar> @@ -86,7 +87,8 @@
itamar>                                    unsigned int bytes_rdma_read,
itamar>                                    unsigned int bytes_rdma_write)
itamar>  {
itamar> -    DT_Mdep_Lock (&transaction_stats->lock);
itamar> +    unsigned long flags;
itamar> +    spin_lock_irqsave (&transaction_stats->lock,flags);
itamar>  
itamar>      /* look for the longest time... */
itamar>      if (time_ms > transaction_stats->time_ms)
itamar> @@ -99,5 +101,5 @@
itamar>      transaction_stats->bytes_recv += bytes_recv;
itamar>      transaction_stats->bytes_rdma_read += bytes_rdma_read;
itamar>      transaction_stats->bytes_rdma_write += bytes_rdma_write;
itamar> -    DT_Mdep_Unlock (&transaction_stats->lock);
itamar> +    spin_unlock_irqrestore (&transaction_stats->lock,flags);
itamar>  }
itamar> Index: test/dapl_server.c
itamar> ===================================================================
itamar> --- test/dapl_server.c	(revision 2509)
itamar> +++ test/dapl_server.c	(working copy)
itamar> @@ -49,7 +49,7 @@
itamar>      unsigned char   *buffp  = NULL;
itamar>      unsigned char   *module = "DT_cs_Server";
itamar>      int             status  = 0;
itamar> -
itamar> +    unsigned long flags;
itamar>      DAT_DTO_COOKIE  dto_cookie;
itamar>      struct dat_dto_completion_event_data dto_stat;
itamar>      u32  ret;
itamar> @@ -616,9 +616,9 @@
itamar>  
itamar>  
itamar>         /* Count this new client and get ready for the next */
itamar> -       DT_Mdep_Lock (&ps_ptr->num_clients_lock);
itamar> +       spin_lock_irqsave (&ps_ptr->num_clients_lock,flags);
itamar>         ps_ptr->num_clients++;
itamar> -       DT_Mdep_Unlock (&ps_ptr->num_clients_lock);
itamar> +       spin_unlock_irqrestore (&ps_ptr->num_clients_lock,flags);
itamar>  
itamar>         /* we passed the pt_ptr to the thread and must now 'forget' it */
itamar>         pt_ptr = NULL;
itamar> Index: test/dapl_thread.c
itamar> ===================================================================
itamar> --- test/dapl_thread.c	(revision 2509)
itamar> +++ test/dapl_thread.c	(working copy)
itamar> @@ -83,6 +83,7 @@
itamar>                    unsigned int stacksize)
itamar>  {
itamar>      Thread *thread_ptr;
itamar> +    unsigned long flags;
itamar>      thread_ptr = (Thread *) DT_MemListAlloc (pt_ptr, "thread.c", THREAD, sizeof (Thread));
itamar>      if (thread_ptr == NULL)
itamar>      {
itamar> @@ -93,9 +94,9 @@
itamar>      thread_ptr->thread_handle = 0;
itamar>      thread_ptr->stacksize = stacksize;
itamar>  
itamar> -    DT_Mdep_Lock (&pt_ptr->Thread_counter_lock);
itamar> +    spin_lock_irqsave (&pt_ptr->Thread_counter_lock,flags);
itamar>      pt_ptr->Thread_counter++;
itamar> -    DT_Mdep_Unlock (&pt_ptr->Thread_counter_lock);
itamar> +    spin_unlock_irqrestore (&pt_ptr->Thread_counter_lock,flags);
itamar>  
itamar>      DT_Mdep_Thread_Init_Attributes (thread_ptr);
itamar>  
itamar> @@ -108,11 +109,12 @@
itamar>  void
itamar>  DT_Thread_Destroy (Thread * thread_ptr, Per_Test_Data_t * pt_ptr)
itamar>  {
itamar> +    unsigned long flags;
itamar>      if (thread_ptr)
itamar>      {
itamar> -       DT_Mdep_Lock (&pt_ptr->Thread_counter_lock);
itamar> +       spin_lock_irqsave (&pt_ptr->Thread_counter_lock,flags);
itamar>         pt_ptr->Thread_counter--;
itamar> -       DT_Mdep_Unlock (&pt_ptr->Thread_counter_lock);
itamar> +       spin_unlock_irqrestore (&pt_ptr->Thread_counter_lock,flags);
itamar>  
itamar>         DT_Mdep_Thread_Destroy_Attributes (thread_ptr);
itamar>         DT_MemListFree (pt_ptr, thread_ptr);
itamar> Index: test/dapl_test_data.c
itamar> ===================================================================
itamar> --- test/dapl_test_data.c	(revision 2509)
itamar> +++

Re: [openib-general] Problem compiling userspace driver.

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 10:59, Gleb Natapov wrote:
> Hello,
>
> I am trying to compile libmthca but I get following error:
> src/mthca.c:101: error: unknown field `query_gid' specified in initializer
> src/mthca.c:101: warning: initialization from incompatible pointer type
> src/mthca.c:102: error: unknown field `query_pkey' specified in initializer
> src/mthca.c:102: warning: initialization from incompatible pointer type

Also:
src/mthca.c:115: unknown field `attach_mcast' specified in initializer
src/mthca.c:115: warning: excess elements in struct initializer
src/mthca.c:115: warning: (near initialization for `mthca_ctx_ops')
src/mthca.c:116: unknown field `detach_mcast' specified in initializer
src/mthca.c:117: warning: excess elements in struct initializer

> Those fields indeed are missing in verbs.h.
>
> If I remove those two lines driver compiles but when I run ibv_devices I
> get:
> libibverbs: Warning: no userspace device-specific driver found for
> uverbs0
> driver search path: /home/glebn/OpenIB/install/lib/infiniband
>
> $ ls /home/glebn/OpenIB/install/lib/infiniband
> mthca.a  mthca.la  mthca.so

Did you modprobe ib_uverbs ?

-- Hal

> Any help?
>
> --
>   Gleb.

Re: [openib-general] Problem compiling userspace driver.

2005-05-31 Thread Gleb Natapov

On Tue, May 31, 2005 at 11:09:58AM -0400, Hal Rosenstock wrote:
> On Tue, 2005-05-31 at 10:59, Gleb Natapov wrote:
> > Hello,
> >
> > I am trying to compile libmthca but I get following error:
> > src/mthca.c:101: error: unknown field `query_gid' specified in initializer
> > src/mthca.c:101: warning: initialization from incompatible pointer type
> > src/mthca.c:102: error: unknown field `query_pkey' specified in initializer
> > src/mthca.c:102: warning: initialization from incompatible pointer type
>
> Also:
> src/mthca.c:115: unknown field `attach_mcast' specified in initializer
> src/mthca.c:115: warning: excess elements in struct initializer
> src/mthca.c:115: warning: (near initialization for `mthca_ctx_ops')
> src/mthca.c:116: unknown field `detach_mcast' specified in initializer
> src/mthca.c:117: warning: excess elements in struct initializer
>
Right, but those are only warnings.

> > Those fields indeed are missing in verbs.h.
> >
> > If I remove those two lines driver compiles but when I run ibv_devices I
> > get:
> > libibverbs: Warning: no userspace device-specific driver found for
> > uverbs0
> > driver search path: /home/glebn/OpenIB/install/lib/infiniband
> >
> > $ ls /home/glebn/OpenIB/install/lib/infiniband
> > mthca.a  mthca.la  mthca.so
>
> Did you modprobe ib_uverbs ?
Yes. I had another error before I did this.


--
Gleb.

[openib-general] Re: [PATCH] [ib_at]: Update async structure prior to returning requests to appropriate cache

2005-05-31 Thread James Lentini


Committed in revision 2513.

On Mon, 30 May 2005, Hal Rosenstock wrote:

halr> [ib_at]: Update async structure prior to returning requests to
halr> appropriate cache. This change affects req_free, free_route_req, and
halr> free_path_req.
halr> 
halr> Also, some other minor changes to eliminate unneeded parameter passed to
halr> path_req_output and changes to some DEBUG messages.
halr> 
halr> Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]
halr> 

[openib-general] Re: [PATCH][kdapl] fix fatal bug in triger the evd upcall

2005-05-31 Thread James Lentini


Committed in revision 2514.

On Sun, 29 May 2005, Itamar wrote:

itamar> Hi James,
itamar> 
itamar> This patch fixes a fatal bug that exists in the latest kdapl bits (svn rev 2507).
itamar> As you can see, we need to trigger the upcall when dapl_evd_dequeue returns with good status
itamar> and quit the method when dapl_evd_dequeue returns with non-zero status, which means the queue is empty.
itamar> With the current bits no kdapltest can run, not even the simple quit test.
itamar> 
itamar> Please, in the future, run a simple regression before you commit changes to the svn.
itamar> Anyway, with this patch the code is working again.
itamar> 
itamar> fix fatal bug in triger the evd upcall
itamar> Signed-off-by: Itamar Rabenstein [EMAIL PROTECTED]
itamar> 
itamar> Index: dapl_cno_util.c
itamar> ===================================================================
itamar> --- dapl_cno_util.c	(revision 2509)
itamar> +++ dapl_cno_util.c	(working copy)
itamar> @@ -115,12 +115,8 @@
itamar>  
itamar>  	for (;;) {
itamar>  		status = dapl_evd_dequeue((DAT_EVD_HANDLE)evd, event);
itamar> -		if (DAT_SUCCESS == status) {
itamar> -			dapl_dbg_log(DAPL_DBG_TYPE_ERR,
itamar> -				     "dapl_evd_dequeue failed: %x\n", status);
itamar> +		if (DAT_SUCCESS != status)
itamar>  			return;
itamar> -		}
itamar> -
itamar>  		cno->cno_upcall.upcall_func(cno->cno_upcall.instance_data,
itamar>  					    event, FALSE);
itamar>  	}
itamar> -- 
itamar> Itamar
itamar> 
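
The control flow that the patch restores can be sketched in self-contained C. The queue, `mock_dequeue()` and `count_upcall()` below are hypothetical stand-ins for the EVD, `dapl_evd_dequeue()` and the consumer's upcall, and the status constants are placeholders rather than the real DAT values. The loop delivers every queued event and returns as soon as the dequeue reports a non-success status, which here means the queue is empty.

```c
#include <stddef.h>

#define SKETCH_SUCCESS     0   /* placeholder for DAT_SUCCESS */
#define SKETCH_QUEUE_EMPTY 1   /* placeholder for any non-zero status */

/* Hypothetical event queue standing in for the EVD. */
struct mock_evd {
	const int *events;   /* pending event numbers */
	size_t count;        /* total number of events */
	size_t next;         /* index of the next event to hand out */
};

/* Hands out one event, or reports that the queue is empty. */
static int mock_dequeue(struct mock_evd *evd, int *event)
{
	if (evd->next >= evd->count)
		return SKETCH_QUEUE_EMPTY;
	*event = evd->events[evd->next++];
	return SKETCH_SUCCESS;
}

static int upcalls_delivered;

/* Stand-in for the consumer upcall; it only counts invocations. */
static void count_upcall(int event)
{
	(void)event;
	upcalls_delivered++;
}

/* Drain the queue: upcall on success, stop on the first failure. */
void trigger_sketch(struct mock_evd *evd)
{
	int event;

	for (;;) {
		if (mock_dequeue(evd, &event) != SKETCH_SUCCESS)
			return;
		count_upcall(event);
	}
}
```

With the test inverted, as in the removed lines, a successful dequeue would return before any upcall is made.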

[openib-general] RE: [PATCH][kdapl] replace spin_lock with spin_lock_irqsave in kdapltest

2005-05-31 Thread Itamar Rabenstein

It is only declared, not used, so we don't need it (;-)

  Itamar

> -----Original Message-----
> From: James Lentini [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 31, 2005 5:46 PM
> To: Itamar
> Cc: openib-general
> Subject: Re: [PATCH][kdapl] replace spin_lock with spin_lock_irqsave in
> kdapltest
>
> Itamar,
>
> Why does this patch comment out uses of the g_PerfTestLock?
>
> james


[openib-general] Re: [PATCH] kDAPL: remove typedef DAT_CONTEXT

2005-05-31 Thread James Lentini


Mostly committed in revision 2515. 

I didn't remove DAT_UPCALL_NULL and DAT_UPCALL_SAME. DAT_UPCALL_NULL 
is provided as a convenience to the consumer. I think it is useful, but 
I'm willing to hear other opinions. The provider's implementation of 
dat_evd_modify_upcall() should check for the DAT_UPCALL_SAME value. 
The fact that it doesn't is a bug.

james

On Fri, 27 May 2005, Tom Duffy wrote:

tduffy> Get rid of the typedef DAT_CONTEXT.
tduffy> 
tduffy> Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy> 
tduffy> Index: linux-kernel/test/dapltest/include/dapl_common.h
tduffy> ===================================================================
tduffy> --- linux-kernel/test/dapltest/include/dapl_common.h	(revision 2506)
tduffy> +++ linux-kernel/test/dapltest/include/dapl_common.h	(working copy)
tduffy> @@ -42,7 +42,7 @@ typedef enum
tduffy>  typedef struct
tduffy>  {
tduffy>      DAT_RMR_CONTEXT rmr_context;
tduffy> -    DAT_CONTEXT mem_address;
tduffy> +    union dat_context mem_address;
tduffy>  } RemoteMemoryInfo;
tduffy>  #pragma pack()
tduffy>  
tduffy> Index: linux-kernel/dat-provider/dapl_get_consumer_context.c
tduffy> ===================================================================
tduffy> --- linux-kernel/dat-provider/dapl_get_consumer_context.c	(revision 2506)
tduffy> +++ linux-kernel/dat-provider/dapl_get_consumer_context.c	(working copy)
tduffy> @@ -48,7 +48,7 @@
tduffy>   * DAT_SUCCESS
tduffy>   * DAT_INVALID_PARAMETER
tduffy>   */
tduffy> -u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, DAT_CONTEXT *context)
tduffy> +u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, union dat_context *context)
tduffy>  {
tduffy>  	u32 dat_status = DAT_SUCCESS;
tduffy>  	struct dapl_header *header;
tduffy> Index: linux-kernel/dat-provider/dapl_set_consumer_context.c
tduffy> ===================================================================
tduffy> --- linux-kernel/dat-provider/dapl_set_consumer_context.c	(revision 2506)
tduffy> +++ linux-kernel/dat-provider/dapl_set_consumer_context.c	(working copy)
tduffy> @@ -47,7 +47,7 @@
tduffy>   * DAT_SUCCESS
tduffy>   * DAT_INVALID_HANDLE
tduffy>   */
tduffy> -u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, DAT_CONTEXT context)
tduffy> +u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, union dat_context context)
tduffy>  {
tduffy>  	u32 dat_status = DAT_SUCCESS;
tduffy>  	struct dapl_header *header;
tduffy> Index: linux-kernel/dat-provider/dapl.h
tduffy> ===================================================================
tduffy> --- linux-kernel/dat-provider/dapl.h	(revision 2506)
tduffy> +++ linux-kernel/dat-provider/dapl.h	(working copy)
tduffy> @@ -177,7 +177,7 @@ struct dapl_header {
tduffy>  	enum dat_handle_type handle_type;
tduffy>  	struct dapl_ia *owner_ia;
tduffy>  	struct dapl_llist_entry ia_list_entry;
tduffy> -	DAT_CONTEXT user_context; /* user context - opaque to DAPL */
tduffy> +	union dat_context user_context;   /* user context - opaque to DAPL */
tduffy>  	spinlock_t lock;
tduffy>  	unsigned long flags;  /* saved lock flag values */
tduffy>  };
tduffy> @@ -423,9 +423,11 @@ extern u32 dapl_ia_query(DAT_IA_HANDLE,
tduffy>  
tduffy>  /* helper functions */
tduffy>  
tduffy> -extern u32 dapl_set_consumer_context(DAT_HANDLE handle, DAT_CONTEXT context);
tduffy> +extern u32 dapl_set_consumer_context(DAT_HANDLE handle,
tduffy> +				     union dat_context context);
tduffy>  
tduffy> -extern u32 dapl_get_consumer_context(DAT_HANDLE handle, DAT_CONTEXT *context);
tduffy> +extern u32 dapl_get_consumer_context(DAT_HANDLE handle,
tduffy> +				     union dat_context *context);
tduffy>  
tduffy>  extern u32 dapl_get_handle_type(DAT_HANDLE handle,
tduffy>  				enum dat_handle_type *type);
tduffy> Index: linux-kernel/dat/dat.h
tduffy> ===================================================================
tduffy> --- linux-kernel/dat/dat.h	(revision 2506)
tduffy> +++ linux-kernel/dat/dat.h	(working copy)
tduffy> @@ -361,14 +361,14 @@ typedef enum {
tduffy>  	TRUE = 1
tduffy>  } boolean_t;
tduffy>  
tduffy> -typedef union dat_context {
tduffy> +union dat_context {
tduffy>  	void *as_ptr;
tduffy>  	u64 as_64;
tduffy>  	unsigned long long as_index;
tduffy> -} DAT_CONTEXT;
tduffy> +};
tduffy>  
tduffy> -typedef DAT_CONTEXT DAT_DTO_COOKIE;
tduffy> -typedef DAT_CONTEXT DAT_RMR_COOKIE;
tduffy> +typedef union dat_context DAT_DTO_COOKIE;
tduffy> +typedef union dat_context DAT_RMR_COOKIE;
tduffy>  
tduffy>  enum dat_completion_flags {
tduffy>  	/* Completes with notification */
tduffy> @@ -920,13 +920,6 @@ struct dat_upcall_object {
tduffy>  	DAT_UPCALL_FUNC upcall_func;
tduffy>  };
tduffy>  
tduffy> -/* Define NULL upcall */
tduffy> -
tduffy> -#define DAT_UPCALL_NULL \
tduffy> -	((struct dat_upcall_object) {

[openib-general] RE: [PATCH][kdapl] replace spin_lock with spin_lock_irqsave in kdapltest

2005-05-31 Thread James Lentini


OK, then it should be removed completely, not commented out. I'll do
that and commit.

On Tue, 31 May 2005, Itamar Rabenstein wrote:

itamar> It is only declared, not used, so we don't need it (;-)
itamar> 
itamar>   Itamar
itamar> 

Re: [openib-general] Problem compiling userspace driver.

2005-05-31 Thread Roland Dreier

Sorry, I had some uncommitted changes left in my tree.  So of course I
didn't see any problems.

I just checked in the required libibverbs changes.

 - R.
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

RE: [openib-general] Re: [PATCH] kDAPL: remove typedef DAT_CONTEX T

2005-05-31 Thread Itamar Rabenstein

hi all,
I tried to use DAT_UPCALL_NULL and got a compile error.
I also don't think it is a good idea to use a comparison operator between
structs.
If we want to check for DAT_UPCALL_NULL, we need to check that the CB
function pointer is NULL.
I mean that if you want to use DAT_UPCALL_NULL, you need to have a
real dat_upcall struct and set its CB function to NULL,
instead of casting NULL to a struct.

Currently dat_evd_modify_upcall() is not implemented according to the spec.

 Itamar  

 -Original Message-
 From: James Lentini [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, May 31, 2005 6:39 PM
 To: Tom Duffy
 Cc: openib-general@openib.org
 Subject: [openib-general] Re: [PATCH] kDAPL: remove typedef 
 DAT_CONTEXT
 
 
 
 Mostly committed in revision 2515. 
 
 I didn't remove DAT_UPCALL_NULL and DAT_UPCALL_SAME. DAT_UPCALL_NULL 
 is provided as a convenience to the consumer. I think it is 
 useful, but 
 I'm willing to hear other opinions. The provider's implementation of 
 dat_evd_modify_upcall() should check for the DAT_UPCALL_SAME value. 
 The fact that it doesn't is a bug.
 
 james
 
 On Fri, 27 May 2005, Tom Duffy wrote:
 
 tduffy Get rid of the typedef DAT_CONTEXT.
 tduffy 
 tduffy Signed-off-by: Tom Duffy [EMAIL PROTECTED]
 tduffy 
 tduffy Index: linux-kernel/test/dapltest/include/dapl_common.h
 tduffy 
 ===
 tduffy --- linux-kernel/test/dapltest/include/dapl_common.h  
 (revision 2506)
 tduffy +++ linux-kernel/test/dapltest/include/dapl_common.h  
 (working copy)
 tduffy @@ -42,7 +42,7 @@ typedef enum
 tduffy  typedef struct
 tduffy  {
 tduffy  DAT_RMR_CONTEXT rmr_context;
 tduffy -DAT_CONTEXT mem_address;
 tduffy +union dat_context mem_address;
 tduffy  } RemoteMemoryInfo;
 tduffy  #pragma pack()
 tduffy  
 tduffy Index: linux-kernel/dat-provider/dapl_get_consumer_context.c
 tduffy 
 ===
 tduffy --- 
 linux-kernel/dat-provider/dapl_get_consumer_context.c (revision 2506)
 tduffy +++ 
 linux-kernel/dat-provider/dapl_get_consumer_context.c (working copy)
 tduffy @@ -48,7 +48,7 @@
 tduffy   *   DAT_SUCCESS
 tduffy   *   DAT_INVALID_PARAMETER
 tduffy   */
 tduffy -u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, 
 DAT_CONTEXT *context)
 tduffy +u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, 
 union dat_context *context)
 tduffy  {
 tduffy   u32 dat_status = DAT_SUCCESS;
 tduffy   struct dapl_header *header;
 tduffy Index: linux-kernel/dat-provider/dapl_set_consumer_context.c
 tduffy 
 ===
 tduffy --- 
 linux-kernel/dat-provider/dapl_set_consumer_context.c (revision 2506)
 tduffy +++ 
 linux-kernel/dat-provider/dapl_set_consumer_context.c (working copy)
 tduffy @@ -47,7 +47,7 @@
 tduffy   *   DAT_SUCCESS
 tduffy   *   DAT_INVALID_HANDLE
 tduffy   */
 tduffy -u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, 
 DAT_CONTEXT context)
 tduffy +u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, 
 union dat_context context)
 tduffy  {
 tduffy   u32 dat_status = DAT_SUCCESS;
 tduffy   struct dapl_header *header;
 tduffy Index: linux-kernel/dat-provider/dapl.h
 tduffy 
 ===
 tduffy --- linux-kernel/dat-provider/dapl.h  (revision 2506)
 tduffy +++ linux-kernel/dat-provider/dapl.h  (working copy)
 tduffy @@ -177,7 +177,7 @@ struct dapl_header {
 tduffy   enum dat_handle_type handle_type; 
 tduffy   struct dapl_ia *owner_ia;
 tduffy   struct dapl_llist_entry ia_list_entry;  
 tduffy - DAT_CONTEXT user_context; /* user 
 context - opaque to DAPL */
 tduffy + union dat_context user_context;   /* user 
 context - opaque to DAPL */
 tduffy   spinlock_t lock;
 tduffy   unsigned long flags;  /* saved lock 
 flag values */
 tduffy  };
 tduffy @@ -423,9 +423,11 @@ extern u32 dapl_ia_query(DAT_IA_HANDLE, 
 tduffy  
 tduffy  /* helper functions */
 tduffy  
 tduffy -extern u32 dapl_set_consumer_context(DAT_HANDLE 
 handle, DAT_CONTEXT context);
 tduffy +extern u32 dapl_set_consumer_context(DAT_HANDLE handle,
 tduffy +  union dat_context context);
 tduffy  
 tduffy -extern u32 dapl_get_consumer_context(DAT_HANDLE 
 handle, DAT_CONTEXT *context);
 tduffy +extern u32 dapl_get_consumer_context(DAT_HANDLE handle,
 tduffy +  union dat_context 
 *context);
 tduffy  
 tduffy  extern u32 dapl_get_handle_type(DAT_HANDLE handle,
 tduffy   enum dat_handle_type *type);
 tduffy Index: linux-kernel/dat/dat.h
 tduffy 
 ===
 tduffy --- linux-kernel/dat/dat.h(revision 2506)
 tduffy +++ linux-kernel/dat/dat.h(working copy)
 tduffy @@ -361,14 +361,14 @@ typedef enum {
 tduffy   TRUE = 1
 tduffy  }

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-05-31 Thread Grant Grundler

On Sat, May 28, 2005 at 04:26:43PM -0700, Caitlin Bestler wrote:
...
 if so what the best strategy for
 achieving it is (try to plan an IB/iWARP merge immediately
 or wait until there is an iWARP code base).

If there is no iWARP code base, I fail to see how one can merge.
Having a specification is one basis for communication.
Linux developers normally use existing code as the basis.
Committees submit CRs (Change Requests) to update specs.
The CRs get voted on by the committee.
Linux developers submit patches.
The Linux subsystems maintainer(s) decide if patches are ok or not.


 Claiming that an InfiniBand-specific  interface is somehow
 thinking long term is just plain ludicrous.

"It Works" is worth 10x more to *any* customer than a transport
neutral API that only exists as a spec.

The specs are guides to how something *should* work and
linux tries to comply with them (e.g. 802.3 or T10) where
HW implementations actually follow the spec. That doesn't
mean linux has to implement every brain damaged spec that
some committee comes up with. OTOH, rdmaconsortium.org
does have a fair shot given I2O made it into the kernel. :^/

(I'm willing to have a conversation about why I think I2O
is brain damaged if someone else is buying drinks. It's
not total crap, but it certainly has its downside.)

 Now it may be that the short term interest of the InfiniBand
 vendors is such that they cannot commit resources to
 helping build a transport neutral API. That is always a
 legitimate tradeoff, but it is short term corporate thinking.

Please, that horse is already dead.
They have offered to review patches to make the API transport neutral.
Test that offer. Submit patches and move the conversation
on to something that is more constructive.

 Last time I looked most of the commits being made to
 OpenIB (or sourceforge DAPL) were from people drawing
 paychecks from those evil corporations.

Yes, so?
The issue isn't the funding - it's the goals.

Compare the gen1 stack (I'm being careful to not pick on
any IB vendors) to the gen2 stack. The difference is between
corporate code and linux code - mostly funded by the
same corporation with several of the same programmers.
The gen1 stack came from something that attempted to build/run
a shared user/kernel space on every distro. The Makefiles
are just a mess - never mind the code.

grant

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-05-31 Thread Grant Grundler

On Sat, May 28, 2005 at 05:18:39PM -0700, Caitlin Bestler wrote:
 Versus..
 
 struct rdma_xyz {
  /* common fields */
 };
 
 struct rdma_xyz_ib {
 struct rdma_xyz common;
 /* ib fields */
 };
 
 struct rdma_xyz_iwarp {
 struct rdma_xyz common;
 /* iwarp fields */
 };
 
 
 The latter style is extensible, but makes it difficult to properly
 allocate a buffer that works for all variants.

The latter assumes the transport specific code owns responsibility
for allocating/deallocating those buffers.
It also forces the generic code to be completely ignorant of
the transport specific stuff. It doesn't allow the programmer
to hack around in the public unions.

 The union style is also already in use in both IT-API and RNIC-PI.
 
 I personally prefer sub-classing to unions, but I have found myself in the 
 minority on *most* projects where the issue has been discussed.
 One reason is that sub-classing provides very little type-safety.
 struct sockaddr is an example of this. It takes manual inspection
 to ensure that the variants are properly differentiated and it is
 still common for developers to pass in a plain struct sockaddr
 without realizing that it is not large enough for a struct sockaddr_in6.

IMHO, unions are a sort of cast on whole structures.
Neither method really offers an advantage in type checking.
Both require that one knows which type is the right one.
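
To make the sub-classing style quoted above concrete, here is a minimal sketch with a checked downcast. The transport tag, the lid and ipv4_addr fields, and the helper names are assumptions added for illustration; none of them come from IT-API, RNIC-PI, or the OpenIB tree.

```c
#include <stddef.h>

/* Hypothetical tag so generic code can tell the variants apart. */
enum rdma_transport { RDMA_TRANSPORT_IB, RDMA_TRANSPORT_IWARP };

struct rdma_xyz {
	enum rdma_transport transport;	/* common fields */
};

struct rdma_xyz_ib {
	struct rdma_xyz common;
	unsigned short lid;		/* illustrative ib field */
};

struct rdma_xyz_iwarp {
	struct rdma_xyz common;
	unsigned int ipv4_addr;		/* illustrative iwarp field */
};

/* Recover the enclosing struct from a pointer to its embedded member. */
#define rdma_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Checked downcast: returns NULL when the object is not the IB variant. */
static struct rdma_xyz_ib *rdma_xyz_to_ib(struct rdma_xyz *x)
{
	if (x->transport != RDMA_TRANSPORT_IB)
		return NULL;
	return rdma_container_of(x, struct rdma_xyz_ib, common);
}
```

The tag does not add compile-time type safety. It only turns a wrong guess into a NULL at run time, which is the same check a tagged union would need.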

grant

Re: [openib-general] OpenSM crash

2005-05-31 Thread Hal Rosenstock

On Fri, 2005-05-27 at 17:30, Tom Duffy wrote: 
  Also, did
  you pick up the user_mad.c fix on Tuesday AM ? If it was, any other
  changes are either not related or trivial.
  
  After you picked up these changes, did you regenerate the various OpenSM
  makefiles (a define for RMPP changed in them) or just rebuild ? [This
  would not explain the crash, but is different from how my OpenSM is
  built.]
 
 I just reran make from the toplevel (management) after updating.  I
 would think it would rebuild them if something changed, no?

There are certain changes where the makefiles need to be regenerated
(and this is not done automatically). Since there was an additional
compile flag added, they need to be regenerated or else it is being
built the old way (without the real RMPP support enabled).

-- Hal


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread James Lentini



On Fri, 27 May 2005, Tom Duffy wrote:


On Thu, 2005-05-26 at 22:25 -0700, Sean Hefty wrote:

So, here is the strategy I am taking.  Please let me know if it is
wrong.

When dapl_ep_connect() is called, I save off the timeout value into the
dapl_ep struct.  Then, when we get ready to call ib_send_cm_req(), I
stuff the timeout value (after munging it into IB's strange format) into
the conn params remote_cm_response_timeout.


From a CM perspective, this sounds fine.  Note that the CM timeout will not
occur until the number of retries has been met.  So I don't know if the
timeout passed to dapl_ep_connect() should convert directly into the
remote_cm_response_timeout, or needs to be divided by the number of retries.


So, are you saying that if you have a timeout of 4 seconds (you pass in
20) and you have retries set to 2, that it will fail after 8 seconds?

James, what does the timeout value passed into dapl_ep_connect mean, the
total timeout time?  Or how much for each retry?


It is the total timeout value.


Also, did you notice that dapl_ib_connect always sets the timeout to 20
(4 seconds) no matter what?  Should this be the case?


The timeout should not be constant as it is now. It was being 
unnecessarily emulated with the extra timeout thread.



If the connection fails to complete within the timeout,
dapl_cm_active_cb_handler() is called with IB_CM_REQ_ERROR which in turn
calls dapl_evd_connection_callback() which does the same thing that
dapl_ep_timeout() used to do -- tear down the connection.


I haven't looked at your changes, but note that calling ib_destroy_cm_id
from within the CM callback thread will hang.  The callback holds a
reference on the cm_id.  The good news is that there should be code in kDAPL
to catch this.


I will take a look and see if this could happen.


Tom, I don't believe that you've changed Hal and Sean's implementation 
of this.


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers insteadof own

2005-05-31 Thread James Lentini



Tom,

We should attempt to connect for no less than dat_ep_connect's timeout 
value. We don't need to guarantee that the connection attempts will 
last for exactly a specific time.


Sean,

Is there any way of requesting an infinite number of retries?

On Fri, 27 May 2005, Sean Hefty wrote:


From a CM perspective, this sounds fine.  Note that the CM timeout will

not

occur until the number of retries has been met.  So I don't know if the
timeout passed to dapl_ep_connect() should convert directly into the
remote_cm_response_timeout, or needs to be divided by the number of

retries.

So, are you saying that if you have a timeout of 4 seconds (you pass in
20) and you have retries set to 2, that it will fail after 8 seconds?

James, what does the timeout value passed into dapl_ep_connect mean, the
total timeout time?  Or how much for each retry?


If you pass in a timeout of 4 seconds with retries to 2, the call will
timeout in 12 seconds.  The request will be sent 3 times (2 retries).  I
should also note that the CM timeout includes the packet lifetime (round
trip time) in its timeout calculation, but this should be small.

- Sean
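
Sean's arithmetic can be sketched as follows, assuming the usual IB encoding in which a timeout value t stands for 4.096 usec * 2^t (so 20 is roughly 4.3 seconds, rounded to 4 in this thread). The function names are hypothetical, and packet lifetime is ignored.

```c
#include <stdint.h>

/* Duration encoded by an IB timeout exponent t: 4.096 usec * 2^t, in nsec. */
static uint64_t ib_timeout_ns(unsigned int t)
{
	return 4096ULL << t;
}

/*
 * Worst case before the CM reports a timeout: the REQ is sent once and
 * then resent `retries` times, and each send waits the full timeout.
 */
static uint64_t cm_total_timeout_ns(unsigned int t, unsigned int retries)
{
	return ((uint64_t)retries + 1) * ib_timeout_ns(t);
}
```

With t = 20 and 2 retries this gives three waits of about 4.3 seconds each, which is the "12 seconds" figure above before rounding.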



Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-05-31 Thread Venkata Jagana


Exactly, the code matters from the Linux community standpoint, and the
discussion around the convergence of a common PI is moot until we have
that header file definition, which will come out soon.

However, I am quite glad to see the OpenIB and OpenRDMA communities in
agreement on common ULPs and DAPL/IT-API (even though there are some
disagreements on these APIs).

Also, as you pointed out, I absolutely agree about the differences
between Gen1 and Gen2, which is exactly what I wanted to avoid with
OpenRDMA by starting from a clean slate right from the beginning in
open source fashion. Basically, I don't want the code to be dumped by
some corporate developers.

Thanks
Venkat



[EMAIL PROTECTED] wrote
on 05/31/2005 09:38:51 AM:

 On Sat, May 28, 2005 at 04:26:43PM -0700, Caitlin Bestler wrote:
 ...
  if so what the best strategy for
  achieving it is (try to plan an IB/iWARP merge immediately
  or wait until there is an iWARP code base).
 
 If there is no iWARP code base, I fail to see how one can merge.
 Having a specification is one basis for communication.
 Linux developers normally use existing code as the basis.
 Committees submit CRs (Change Requests) to update specs.
 The CRs get voted on by the committee.
 Linux developers submit patches.
 The Linux subsystems maintainer(s) decide if patches are ok or not.
 
 
  Claiming that an InfiniBand-specific interface is somehow
  thinking long term is just plain ludicrous.
 
 "It Works" is worth 10x more to *any* customer than a transport
 neutral API that only exists as a spec.
 
 The specs are guides to how something *should* work and
 linux tries to comply with them (e.g. 802.3 or T10) where
 HW implementations actually follow the spec. That doesn't
 mean linux has to implement every brain damaged spec that
 some committee comes up with. OTOH, rdmaconsortium.org
 does have a fair shot given I2O made it into the kernel. :^/
 
 (I'm willing to have a conversation about why I think I2O
 is brain damaged if someone else is buying drinks. It's
 not total crap, but it certainly has its downside.)
 
  Now it may be that the short term interest of the InfiniBand
  vendors is such that they cannot commit resources to
  helping build a transport neutral API. That is always a
  legitimate tradeoff, but it is short term corporate thinking.
 
 Please, that horse is already dead.
 They have offered to review patches to make the API transport neutral.
 Test that offer. Submit patches and move the conversation
 on to something that is more constructive.
 
  Last time I looked most of the commits being made to
  OpenIB (or sourceforge DAPL) were from people drawing
  paychecks from those evil corporations.
 
 Yes, so?
 The issue isn't the funding - it's the goals.
 
 Compare the gen1 stack (I'm being careful to not pick on
 any IB vendors) to the gen2 stack. The difference is between
 corporate code and linux code - mostly funded by the
 same corporation with several of the same programmers.
 The gen1 stack came from something that attempted to build/run
 a shared user/kernel space on every distro. The Makefiles
 are just a mess - never mind the code.
 
 grant
 
 

RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 13:27, James Lentini wrote:
  James, what does the timeout value passed into dapl_ep_connect mean, the
  total timeout time?  Or how much for each retry?
 
 It is the total timeout value.

Total meaning everything inclusive? If that is what it is supposed
to be, that is not what is implemented now:

DAPL_IB_CM_RESPONSE_TIMEOUT 20 /* 4 sec */
DAPL_IB_MAX_CM_RETRIES 4

There are also the timeout/retries of IBAT as well.
DAPL_IB_MAX_AT_RETRY 3
IB_AT_REQ_RETRY_MS  100

-- Hal


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread James Lentini



Here's the specification's exact description:

 timeout: Duration of time, in microseconds, that a consumer waits for
  connection establishment. The value of DAT_TIMEOUT_INFINITE
  represents no timeout, indefinite wait. Values must be
  positive.

My perspective is that we are not implementing this API for a real 
time operating system and therefore should take a fuzzy view of time.


My interpretation of the definition above is that a provider should 
attempt to establish a connection for at least [timeout] time. If a 
connection is not established after attempting for at least [timeout] 
time, the provider should give up and post a connection failure 
event. If there is some reasonable additional time needed for address 
resolution, etc., I think that is acceptable.
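
One way to turn that interpretation into a CM parameter, sketched under assumptions: the DAT timeout is a total in microseconds, the CM sends the REQ retries + 1 times, each wait lasts 4.096 usec * 2^t, and the exponent is a 5-bit field. The function name is hypothetical, and DAT_TIMEOUT_INFINITE is not handled here. Rounding up keeps the provider trying for at least the requested time.

```c
#include <stdint.h>

#define IB_CM_TIMEOUT_MAX_EXP 31	/* assumption: 5-bit timeout field */

/*
 * Pick the smallest exponent t such that (retries + 1) waits of
 * 4.096 usec * 2^t cover at least timeout_us in total.
 */
static unsigned int dat_timeout_to_cm_exp(uint64_t timeout_us,
					  unsigned int retries)
{
	uint64_t sends = (uint64_t)retries + 1;
	uint64_t total_ns = timeout_us * 1000;
	uint64_t per_send_ns = (total_ns + sends - 1) / sends;	/* round up */
	unsigned int t = 0;

	while (t < IB_CM_TIMEOUT_MAX_EXP && (4096ULL << t) < per_send_ns)
		t++;
	return t;
}
```

For a 4 second request with no retries this picks 20, matching the current constant. With 4 retries it picks 18, which is about 1.07 seconds per attempt and about 5.4 seconds in total.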


james

On Tue, 31 May 2005, Hal Rosenstock wrote:


On Tue, 2005-05-31 at 13:27, James Lentini wrote:

James, what does the timeout value passed into dapl_ep_connect mean, the
total timeout time?  Or how much for each retry?


It is the total timeout value.


Total meaning everything inclusive? If that is what it is supposed
to be, that is not what is implemented now:

DAPL_IB_CM_RESPONSE_TIMEOUT 20 /* 4 sec */
DAPL_IB_MAX_CM_RETRIES 4

There are also the timeout/retries of IBAT as well.
DAPL_IB_MAX_AT_RETRY 3
IB_AT_REQ_RETRY_MS  100

-- Hal



RE: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMAAPIs and ULPs for Linux

2005-05-31 Thread James Lentini




On Fri, 27 May 2005, Bob Woodruff wrote:


Caitlin wrote,

Both uDAPL and kDAPL were designed for *application* use.
Even kDAPL is more intended for use by a kernel daemon that
is loaded separately from the kernel than for use within
the kernel itself.


kDAPL is intended as a kernel-level API
for RDMA enabled fabrics. As it was initially written,
it does not meet the Linux coding style and that is why
it is being totally reworked as we speak to meet that goal.


An ideal API for use within the kernel would abstract as
much as possible (without requiring emulation), and then
have transport specific unions or enum values. It would
hide no control options, merely provide common controls
for common capabilities.


So for every new RDMA device type that comes along, you need to add a new
enum, and unions for device class specific stuff, etc.
Seems rather static and not easily extended. Not
to mention the testing nightmare when the thing has to support
20 different types of RDMA enabled devices.
I think code like that could get pretty ugly pretty fast.

I'd rather see a registration mechanism like what we already have
with DAPL that does not require any code changes to add a new RDMA
device/provider.  We have already proven that this works in DAPL
as I know of at least 3 providers, IB, Myrinet, and RNIC (Ammasso)
that were developed separately and were able to co-exist without
any changes (enums and device class unions) in the DAT mid-layer.
I assume that this can also be done with kDAPL in the kernel, but
I defer to the DAPL experts to answer that one.


Correct. The DAT API (kernel and user) is designed to support 
heterogeneous providers. The modifications we are making in


https://openib.org/svn/gen2/users/jlentini/

will not change that.

james

[openib-general] [PATCH] [kdapl CM] Add more debug on connection destruction

2005-05-31 Thread Hal Rosenstock

[kdapl CM] Add more debug on connection destruction
Also, make naming of retry defines consistent

Signed-off-by: Hal Rosenstock [EMAIL PROTECTED]

Index: dapl_openib_cm.c
===
--- dapl_openib_cm.c    (revision 2507)
+++ dapl_openib_cm.c    (working copy)
@@ -42,7 +42,7 @@
 #define DAPL_IB_RNR_RETRY_COUNT 6
 #define DAPL_IB_CM_RESPONSE_TIMEOUT 20  /* 4 sec */
 #define DAPL_IB_MAX_CM_RETRIES  4
-#define DAPL_IB_MAX_AT_RETRY    3
+#define DAPL_IB_MAX_AT_RETRIES  3
 
 /* Should these be queried ? */
 #define DAPL_IB_TARGET_MAX  4 /* responder resources (max_qp_ous_rd_atom) */
@@ -65,6 +65,9 @@
     spin_unlock_irqrestore(&conn->lock, flags);
 
     if (!in_callback) {
+        dapl_dbg_log(DAPL_DBG_TYPE_CM,
+                     "dapl_destroy_cm_id: conn %p CM ID %p\n",
+                     conn, conn->cm_id);
         ib_destroy_cm_id(conn->cm_id);
         if (conn->ep)
             conn->ep->cm_handle = NULL;
@@ -297,7 +300,7 @@
     if (rec_num <= 0) {
         printk(KERN_ERR "dapl_path_comp_handler: path resolution "
                "failed %d retry %d!!!\n", rec_num, conn->retries + 1);
-        if (++conn->retries > DAPL_IB_MAX_AT_RETRY) {
+        if (++conn->retries > DAPL_IB_MAX_AT_RETRIES) {
             printk(KERN_ERR "dapl_path_comp_handler: ep_ptr 0x%p\n",
                    conn->ep);
             event = DAT_CONNECTION_EVENT_UNREACHABLE;
@@ -346,7 +349,7 @@
     if (rec_num <= 0) {
         printk(KERN_ERR "dapl_rt_comp_handler: rec num %d retry %d\n",
                rec_num, conn->retries + 1);
-        if (++conn->retries > DAPL_IB_MAX_AT_RETRY) {
+        if (++conn->retries > DAPL_IB_MAX_AT_RETRIES) {
             event = DAT_CONNECTION_EVENT_UNREACHABLE;
             goto error;
         }
@@ -580,6 +583,9 @@
     struct dapl_ia *ia_ptr;
     int  ib_status;
 
+    dapl_dbg_log(DAPL_DBG_TYPE_CM,
+                 "dapl_ib_reinit_ep: EP %p\n", ep_ptr);
+
     ia_ptr = ep_ptr->header.owner_ia;
 
     /*
@@ -671,6 +677,10 @@
  */
 u32 dapl_ib_remove_conn_listener(struct dapl_ia *ia_ptr, struct dapl_sp *sp_ptr)
 {
+    dapl_dbg_log(DAPL_DBG_TYPE_CM,
+                 "dapl_ib_remove_conn_listener: SP %p conn %p\n",
+                 sp_ptr, sp_ptr->cm_srvc_handle);
+
     /*
      * This will hang if called from CM thread context...
      * Move back to using WQ...




RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread James Lentini




On Tue, 31 May 2005, Hal Rosenstock wrote:


On Tue, 2005-05-31 at 14:17, James Lentini wrote:

Here's the specification's exact description:

  timeout: Duration of time, in microseconds, that a consumer waits for
   connection establishment. The value of DAT_TIMEOUT_INFINITE
   represents no timeout, indefinite wait. Values must be
   positive.

My perspective is that we are not implementing this API for a real 
time operating system and therefore should take a fuzzy view of 
time.


Fuzzy in that we are certainly not concerned with the granularity of 
microseconds.


My interpretation of the definition above is that a provider should 
attempt to establish a connection for at least [timeout] time.



So any number of retries is allowed up to the time period specified 
(depending on the timeout used) ?


Correct, any number of retries (including 0) is allowed. Once the time 
period expires, the provider should post a result as quickly as 
possible.


 If a connection is not established after attempting for at least 
[timeout] time, the provider should give up and post a 
connection failure event. If there is some reasonable additional 
time needed for address resolution, etc., I think that is 
acceptable.


This all can be bundled in. One just needs to know what the 
requirement is.


If we included address resolution, how would we divide up the time 
between address resolution and cm protocol? Wouldn't we have to 
track how long address resolution took to complete?



-- Hal


james

On Tue, 31 May 2005, Hal Rosenstock wrote:


On Tue, 2005-05-31 at 13:27, James Lentini wrote:

James, what does the timeout value passed into dapl_ep_connect mean, the
total timeout time?  Or how much for each retry?


It is the total timeout value.


Total meaning everything inclusive? If that is what it is supposed
to be, that is not what is implemented now:

DAPL_IB_CM_RESPONSE_TIMEOUT 20 /* 4 sec */
DAPL_IB_MAX_CM_RETRIES 4

There are also the timeout/retries of IBAT as well.
DAPL_IB_MAX_AT_RETRY 3
IB_AT_REQ_RETRY_MS  100

-- Hal





RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 15:57, James Lentini wrote:
 If we included address resolution, how would we divide up the time 
 between address resolution and cm protocol? Wouldn't we have to 
 track how long address resolution took to complete?

Yes, to follow the requirement closely, one would need to time the
duration of the address translation but that is pretty straightforward
to do. IBAT already has to time out requests anyway. The worst case for
address resolution is currently 4 * 100 msec. Other alternatives are to
subtract the maximal address translation time off the time supplied and
use the rest for CM, or as you said ignore this time and use it all for
CM purposes (and just go over by whatever amount this is). Did other
implementations factor this in or did they ignore this ?
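
The first two options can be sketched like this. The helper names are hypothetical; the 3 retries and 100 msec are the AT defines quoted earlier in the thread.

```c
#include <stdint.h>

#define AT_MAX_RETRIES   3	/* DAPL_IB_MAX_AT_RETRY in the thread */
#define AT_REQ_RETRY_MS  100	/* IB_AT_REQ_RETRY_MS in the thread */

/* Worst-case address translation time: the first try plus all retries. */
static uint64_t at_worst_case_us(void)
{
	return (uint64_t)(AT_MAX_RETRIES + 1) * AT_REQ_RETRY_MS * 1000;
}

/*
 * Time left for the CM after spent_us went to address translation.
 * Pass the measured duration for exact accounting, or at_worst_case_us()
 * to reserve the maximum up front.
 */
static uint64_t cm_budget_us(uint64_t total_us, uint64_t spent_us)
{
	return spent_us >= total_us ? 0 : total_us - spent_us;
}
```

Reserving the worst case up front is simpler but can shortchange the CM by up to 400 msec when address translation succeeds on the first try.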

-- Hal



RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread James Lentini




On Tue, 31 May 2005, Hal Rosenstock wrote:


On Tue, 2005-05-31 at 15:57, James Lentini wrote:

If we included address resolution, how would we divide up the time
between address resolution and cm protocol? Wouldn't we have to
track how long address resolution took to complete?


Yes, to follow the requirement closely, one would need to time the
duration of the address translation but that is pretty straightforward
to do. IBAT already has to time out requests anyway. The worst case for
address resolution is currently 4 * 100 msec.


If we can account for all of the time properly, then we should 
implement it that way.


Other alternatives are to subtract the maximal address translation 
time off the time supplied and use the rest for CM, or as you said 
ignore this time and use it all for CM purposes (and just go over by 
whatever amount this is). Did other implementations factor this in 
or did they ignore this ?


Re: [openib-general] OpenSM crash

2005-05-31 Thread Tom Duffy

On Tue, 2005-05-31 at 13:09 -0400, Hal Rosenstock wrote:
 There are certain changes where the makefiles need to be regenerated
 (and this is not done automatically). Since there was an additional
 compile flag added, they need to be regenerated or else it is being
 built the old way (without the real RMPP support enabled).

$ make automake

at the toplevel should take care of this, no?

-tduffy



RE: [openib-general] Re: [PATCH] kDAPL: remove typedef DAT_CONTEX T

2005-05-31 Thread James Lentini




On Tue, 31 May 2005, Itamar Rabenstein wrote:


hi all,
I have tried to use DAT_UPCALL_NULL and I got compile error
and I don't think that it is good to try to make comparator ( = ) between
structs
if we want to check for DAT_UPCALL_NULL we need to check that the CB
function pointer is NULL.


Where is DAT_UPCALL_NULL used in a comparison?


I mean that if you want to use DAT_UPCALL_NULL you need to have
real dat_upcall struct and to set the CB function to NULL.
instead of casting NULL to be a struct.


DAT_UPCALL_NULL is not NULL cast to a struct. It is defined as:

#define DAT_UPCALL_NULL \
 ((struct dat_upcall_object) { (void *) NULL, (DAT_UPCALL_FUNC) NULL })
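
As a sketch of why the macro works for assignment but not in a comparison: it expands to a C99 compound literal, which is a struct value, and C has no comparison operator for structs. The types and field names below are stand-ins written for illustration, not copied from dat.h.

```c
#include <stddef.h>

/* Stand-in for DAT_UPCALL_FUNC; the real signature differs. */
typedef void (*upcall_func)(void *instance_data);

/* Stand-in for struct dat_upcall_object. */
struct upcall_object {
	void *instance_data;
	upcall_func func;
};

/* Same shape as DAT_UPCALL_NULL: a compound literal with both members NULL. */
#define UPCALL_NULL ((struct upcall_object) { NULL, NULL })

/* Structs cannot be compared directly, so test the function pointer. */
static int upcall_is_null(const struct upcall_object *u)
{
	return u->func == NULL;
}

/* Assigning the compound literal is fine: struct assignment is allowed. */
static void upcall_clear(struct upcall_object *u)
{
	*u = UPCALL_NULL;
}
```

A provider that wants to recognise the "no upcall" value would call something like upcall_is_null() rather than comparing against the macro.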



Currently dat_evd_modify_upcall() is not implemented according the spec.

Itamar


-Original Message-
From: James Lentini [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 31, 2005 6:39 PM
To: Tom Duffy
Cc: openib-general@openib.org
Subject: [openib-general] Re: [PATCH] kDAPL: remove typedef
DAT_CONTEXT



Mostly committed in revision 2515.

I didn't remove DAT_UPCALL_NULL and DAT_UPCALL_SAME. DAT_UPCALL_NULL
is provided as a convenience to the consumer. I think it is useful, but
I'm willing to hear other opinions. The provider's implementation of
dat_evd_modify_upcall() should check for the DAT_UPCALL_SAME value.
The fact that it doesn't is a bug.

james

On Fri, 27 May 2005, Tom Duffy wrote:

tduffy Get rid of the typedef DAT_CONTEXT.
tduffy
tduffy Signed-off-by: Tom Duffy [EMAIL PROTECTED]
tduffy
tduffy Index: linux-kernel/test/dapltest/include/dapl_common.h
tduffy ===
tduffy --- linux-kernel/test/dapltest/include/dapl_common.h (revision 2506)
tduffy +++ linux-kernel/test/dapltest/include/dapl_common.h (working copy)
tduffy @@ -42,7 +42,7 @@ typedef enum
tduffy  typedef struct
tduffy  {
tduffy  DAT_RMR_CONTEXT rmr_context;
tduffy -DAT_CONTEXT mem_address;
tduffy +union dat_context mem_address;
tduffy  } RemoteMemoryInfo;
tduffy  #pragma pack()
tduffy
tduffy Index: linux-kernel/dat-provider/dapl_get_consumer_context.c
tduffy ===
tduffy --- linux-kernel/dat-provider/dapl_get_consumer_context.c   (revision 2506)
tduffy +++ linux-kernel/dat-provider/dapl_get_consumer_context.c   (working copy)
tduffy @@ -48,7 +48,7 @@
tduffy   *  DAT_SUCCESS
tduffy   *  DAT_INVALID_PARAMETER
tduffy   */
tduffy -u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, DAT_CONTEXT *context)
tduffy +u32 dapl_get_consumer_context(DAT_HANDLE dat_handle, union dat_context *context)
tduffy  {
tduffy  u32 dat_status = DAT_SUCCESS;
tduffy  struct dapl_header *header;
tduffy Index: linux-kernel/dat-provider/dapl_set_consumer_context.c
tduffy ===
tduffy --- linux-kernel/dat-provider/dapl_set_consumer_context.c   (revision 2506)
tduffy +++ linux-kernel/dat-provider/dapl_set_consumer_context.c   (working copy)
tduffy @@ -47,7 +47,7 @@
tduffy   *  DAT_SUCCESS
tduffy   *  DAT_INVALID_HANDLE
tduffy   */
tduffy -u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, DAT_CONTEXT context)
tduffy +u32 dapl_set_consumer_context(DAT_HANDLE dat_handle, union dat_context context)
tduffy  {
tduffy  u32 dat_status = DAT_SUCCESS;
tduffy  struct dapl_header *header;
tduffy Index: linux-kernel/dat-provider/dapl.h
tduffy ===
tduffy --- linux-kernel/dat-provider/dapl.h (revision 2506)
tduffy +++ linux-kernel/dat-provider/dapl.h (working copy)
tduffy @@ -177,7 +177,7 @@ struct dapl_header {
tduffy  enum dat_handle_type handle_type;
tduffy  struct dapl_ia *owner_ia;
tduffy  struct dapl_llist_entry ia_list_entry;
tduffy -DAT_CONTEXT user_context; /* user context - opaque to DAPL */
tduffy +union dat_context user_context;   /* user context - opaque to DAPL */
tduffy  spinlock_t lock;
tduffy  unsigned long flags;  /* saved lock flag values */
tduffy  };
tduffy @@ -423,9 +423,11 @@ extern u32 dapl_ia_query(DAT_IA_HANDLE,
tduffy
tduffy  /* helper functions */
tduffy
tduffy -extern u32 dapl_set_consumer_context(DAT_HANDLE handle, DAT_CONTEXT context);
tduffy +extern u32 dapl_set_consumer_context(DAT_HANDLE handle,
tduffy + union dat_context context);
tduffy
tduffy -extern u32 dapl_get_consumer_context(DAT_HANDLE handle, DAT_CONTEXT *context);
tduffy +extern u32 dapl_get_consumer_context(DAT_HANDLE handle,
tduffy + union dat_context *context);
tduffy
tduffy  extern u32 dapl_get_handle_type(DAT_HANDLE handle,
tduffy  enum dat_handle_type *type);
tduffy Index: linux-kernel/dat/dat.h
tduffy ===
tduffy --- linux-kernel/dat/dat.h   (revision 2506)
tduffy +++ linux-kernel/dat/dat.h   (working

Re: [openib-general] OpenSM crash

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 16:43, Tom Duffy wrote:
 On Tue, 2005-05-31 at 13:09 -0400, Hal Rosenstock wrote:
  There are certain changes where the makefiles need to be regenerated
  (and this is not done automatically). Since there was an additional
  compile flag added, they need to be regenerated or else it is being
  built the old way (without the real RMPP support enabled).
 
 $ make automake
 
 at the toplevel should take care of this, no?

Yes.

-- Hal


Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Michael Krause



At 06:47 AM 5/28/2005, Christoph Hellwig wrote:
On Sat, May 28, 2005 at 05:17:54AM -0700, Sukanta ganguly wrote:
 That's a pretty bold statement. Linux grew up to be
 popular via mass acceptance. Seems like that charter
 has changed and a few have control over Linux and its
 future. The "My way or the highway" philosophy has
 gotten embedded in the Linux way of life.
 Life is getting tough.

You're totally missing the point. Linux is successful exactly
because it's looking for the right solution, not something the
business people need short-term.

Hence why some of us contend that the end-game, i.e. the right solution,
is not necessarily the short-term implementation that is present today
that just evolves creating that legacy inertia that I wrote about
earlier.

I think there is validity to having an implementation to
critique - accept, reject, modify. I think there is validity to
examining industry standards as the basis for new work /
implementation. If people are unwilling to discuss these standards
and only stay focused on their business people's short-term needs, then
some might contend as above that Linux is evolving to be much like the
dreaded Pacific NW company in the end. Not intending to offend
anyone but if there can be no debate without implementation on what is
the right solution, then people might as well just go off and implement
and propose their solution for incorporation into the Linux kernel.

It may be that OpenIB wins in the end or it may be that it
does not. Just having OpenIB subsume control of anything iWARP or
impose only DAPL for all RDMA infrastructure because it just happens to
be there today seems rather stifling. Just stating that some OpenIB
steering group is somehow empowered to decide this for Linux is also
rather strange. Open source is about being open and not under the
control of any one entity in the end. Perhaps that is no
longer the case.

Mike


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Sean Hefty

Sean,

Is there any way of requesting an infinite number of retries?

There is not, but nothing prevents a user from simply re-issuing a request
after it times out.

- Sean



RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Tom Duffy

On Tue, 2005-05-31 at 14:17 -0400, James Lentini wrote:
 Here's the specification's exact description:
 
   timeout: Duration of time, in microseconds, that a consumer waits for
connection establishment. The value of DAT_TIMEOUT_INFINITE
represents no timeout, indefinite wait. Values must be
positive.

Let me make sure I got this right: timeout is in µs (10^-6 seconds), not
ms (10^-3 seconds).  If so, I am off by 3 orders of magnitude in my
calculation.  Right?

 My perspective is that we are not implementing this API for a real 
 time operating system and therefore should take a fuzzy view of time.

Trust me, it is going to be fuzzy, what with the mechanism IB uses to
encode timeouts.
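
To make the fuzziness concrete: IB timeout fields of this kind hold a
5-bit exponent n that stands for 4.096 µs * 2^n, so a microsecond value
can only be rounded to a power-of-two step. A rough sketch, with
hypothetical helper names (not taken from the CM code), that rounds up
to the next representable value:

```c
#include <stdint.h>

/* Smallest 5-bit exponent n such that 4.096 us * 2^n >= timeout_us. */
static uint8_t ib_timeout_exponent(uint64_t timeout_us)
{
	uint64_t timeout_ns = timeout_us * 1000;
	uint8_t n = 0;

	/* 4.096 us is 4096 ns, so the encoded time is 4096 << n. */
	while (n < 31 && (UINT64_C(4096) << n) < timeout_ns)
		n++;
	return n;
}

/* Time actually represented by exponent n, in nanoseconds. */
static uint64_t ib_timeout_ns(uint8_t n)
{
	return UINT64_C(4096) << n;
}
```

Under these assumptions a requested 1 second becomes n = 18, which is
really about 1.07 seconds, and the next step up doubles that.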

BTW, what do you think would be a good test case to make sure the new
code is working as intended?

-tduffy



[openib-general] user CM uses devfs

2005-05-31 Thread William Jordan

Why does the userlevel CM use devfs to create device nodes? Userlevel
verbs and mad layers appear to rely on udev.

-- 
Bill Jordan
SilverStorm Technologies

RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Tom Duffy

On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:
 Sean,
 
 Is there any way of requesting an infinite number of retries?
 
 There is not, but nothing prevents a user from simply re-issuing a request
 after it times out.

Infinite retries inside the kernel does not sound like a good idea.  How
would you break it?  At least we should have some sort of exponential
backoff to prevent flooding the network.
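
A small sketch of what a capped exponential backoff could look like.
The function name and the millisecond units are assumptions made for
illustration, not an existing kernel interface:

```c
#include <stdint.h>

/* Delay before retry number `attempt` (0-based): base * 2^attempt, capped. */
static uint32_t backoff_delay_ms(uint32_t base_ms, uint32_t cap_ms,
				 unsigned int attempt)
{
	uint64_t delay = base_ms;

	/* Stop doubling once the cap is reached so the value cannot overflow. */
	while (attempt-- > 0 && delay < cap_ms)
		delay *= 2;
	return delay < cap_ms ? (uint32_t) delay : cap_ms;
}
```

The cap keeps the wait bounded; a caller would still need its own limit
on the number of attempts to avoid retrying forever.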

-tduffy



Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Grant Grundler

On Tue, May 31, 2005 at 02:31:19PM -0700, Michael Krause wrote:
...
 Not intending 
 to offend anyone but if there can be no debate without implementation on 
 what is the right solution, then people might as well just go off and 
 implement and propose their solution for incorporation into the Linux 
 kernel.

That is certainly one option.
I didn't see anyone in openib.org trying to take that choice away.

Is it easier to submit a new subsystem than fixup an existing one?

I honestly don't know the answer since both options could fail
depending on how people approach them.  But my gut feeling is
if rdmaconsortium can't play nicely with openib.org, they won't
be able to play nicely with kernel.org either.

I've been advocating rdmaconsortium folks submit patches
against openib.org for several reasons:
1) start with a code base that works
2) start with a code base that is already upstream
3) get advice/guidance from people who know how to collaborate
   in an open source environment.

I thought (2) was the most important...but now I have to wonder
if it's really (3). Several very good people are driving
openib.org development.


 Just having OpenIB subsume control of anything iWARP or impose only 
 DAPL for all RDMA infrastructure because it just happens to be there today 
 seems rather stifling.  Just stating that some OpenIB steering group is 
 somehow empowered to decide this for Linux is also rather strange.

"steering group" is Committee talk.
AFAICT the openib.org steering group doesn't control the content
of the svn.openib.org source tree. It manages things like web content,
overall charter, etc. People do NOT have to be members of the steering
committee or openib.org to become either maintainers or to submit code.


 Open source is about being open and not under the control of any one
 entity in the end.   Perhaps that is no longer the case.

No. SOME entity always controls what goes in (or not)
any given source tree.  That has nothing to do with open source.

Open source is about collaboration and being able to fork
if that collaboration ceases to be useful. One can substitute
"trust" for the word "collaboration" and it would be accurate too.
Figure out how to build trust (without contracts!) and then how
to get things done in open source becomes clear.

hth,
grant

Re: [openib-general] user CM uses devfs

2005-05-31 Thread Libor Michalek

On Tue, May 31, 2005 at 05:54:38PM -0400, William Jordan wrote:
 Why does the userlevel CM use devfs to create device nodes? Userlevel
 verbs and mad layers appear to rely on udev.

  No reason except that I went with what I thought was the simpler
model at the time. Unlike the verbs and mad layers which need a certain
number of device nodes for every physical device installed including
hot-plug support, the userlevel CM needs just one device node for
communicating with the kernel CM which should be present the entire
time the kernel CM is loaded. 


-Libor

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Bernard Metzler


I completely agree. I think this thread was not started to get one of
the projects out of the way of the other. I would think it was started
to coordinate the development cycles of two related projects, where
one project is admittedly much more advanced, simply because IB
technology has been available for years now.

I also think that it is not yet proven that good software must always
be created by writing something down and then reshaping it to meet
upcoming requirements. OpenRDMA has chosen to start by trying to
identify the main requirements and then agreeing upon an appropriate
architecture. We did not start by discussing the style of comment
lines, because it was assumed that the style of comment lines is
less important and even easier to fix.

Enabling iWARP under Linux is not an easy task and we are dependent
on the open source community's help and support to make this happen.
We are neither in a position nor willing to bypass this procedure, and
it's always good to have fruitful discussion.

Bernard.

[EMAIL PROTECTED] wrote on 31.05.2005 23:31:19:

 At 06:47 AM 5/28/2005, Christoph Hellwig wrote:
 On Sat, May 28, 2005 at 05:17:54AM -0700, Sukanta ganguly wrote:
  That's a pretty bold statement. Linux grew up to be
  popular via mass acceptance. Seems like that charter
  has changed and a few have control over Linux and its
  future. The "My way or the highway" philosophy has
  gotten embedded in the Linux way of life.
  Life is getting tough.
 
 You're totally missing the point. Linux is successful exactly
 because it's looking for the right solution, not something the
 business people need short-term.
 
 Hence why some of us contend that the end-game, i.e. the right
 solution, is not necessarily the short-term implementation that is
 present today that just evolves creating that legacy inertia that I
 wrote about earlier. I think there is validity to having an
 implementation to critique - accept, reject, modify. I think there
 is validity to examining industry standards as the basis for new
 work / implementation. If people are unwilling to discuss these
 standards and only stay focused on their business people's short-
 term needs, then some might contend as above that Linux is evolving
 to be much like the dreaded Pacific NW company in the end. Not
 intending to offend anyone but if there can be no debate without
 implementation on what is the right solution, then people might as
 well just go off and implement and propose their solution for
 incorporation into the Linux kernel. It may be that OpenIB wins
 in the end or it may be that it does not. Just having OpenIB
 subsume control of anything iWARP or impose only DAPL for all RDMA
 infrastructure because it just happens to be there today seems
 rather stifling. Just stating that some OpenIB steering group is
 somehow empowered to decide this for Linux is also rather strange.
 Open source is about being open and not under the control of any one
 entity in the end.  Perhaps that is no longer the case.
 
 Mike

Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Libor Michalek

On Tue, May 31, 2005 at 02:03:06PM -0700, Tom Duffy wrote:
 On Sat, 2005-05-28 at 09:13 +0200, Christoph Hellwig wrote:
  On Fri, May 27, 2005 at 03:56:58PM -0700, Bob Woodruff wrote:
   kDAPL is intended as a kernel-level API
   for RDMA enabled fabrics. As it was initially written,
   it does not meet the Linux coding style and that is why
   it is being totally reworked as we speak to meet that goal. 
  
  The codingstyle alone isn't the problem.  The whole design philosophy
  is rather odd.
 
 As one of the people trying to clean up kDAPL, I would like to know what
 you think, from a design philosophy, is wrong with it.  We *can* correct
 any daim bramaged parts.

  Well, from a kernel API design philosophy the evd is somewhat odd.
The whole idea behind the event model seems a bit convoluted. First
multiplex a wide variety of events from the provider into a single event
queue, and then have an API so the consumer can tell what type of event
they actually have and can still receive the event notification in the
provider's context.

  This seems to be a lot of work to first hide useful information, but
also not lose the information in case the consumer really does want it.
It appears to be a case of a decent userspace idea that doesn't make
much sense in the kernel. Why is it there? I imagine it's to abstract a
variety of OS kernels, which was one of the goals of the design.

  Also, I realize it's just an implementation detail, but I've got a 
number of issues with ATS.


-Libor

Re: [openib-general] user CM uses devfs

2005-05-31 Thread Roland Dreier

William Why does the userlevel CM use devfs to create device
William nodes? Userlevel verbs and mad layers appear to rely on
William udev.

Good point.  devfs is dying a richly deserved death in a month (cf
Documentation/feature-removal.txt) -- we should just use the standard
character device stuff and let udev handle things.

 - R.

[openib-general] RE: [ANNOUNCE][PATCH] New Linux 2.6.9 backport patches and corresponding userspace tar ball available

2005-05-31 Thread Woodruff, Robert J

Michael Wrote  
 
 Patches are located in the SVN tree under 
 gen2/trunk/src/linux-kernel/patches/backport-to-2.6.9/
 
 infiniband-backport-svn2425-to-2.6.9-kernel-fixups-01.diff   
 infiniband-backport-svn2425-to-2.6.9-openib-drivers-02.diff  
 infiniband-backport-svn2425-to-2.6.9-openib-fixups-03.diff   
 infiniband-backport-svn2425-userspace.tar.gz 
 
 woody
 

Woody, could you please move these patches to gen2/branches?

-- 
MST - Michael S. Tsirkin

What do you guys think, would these be better kept under gen2/branches
or where I have put them under linux-kernel/patches?

I can see arguments both ways. 

woody


RE: [openib-general] [PATCHv2][RFC] kDAPL: use cm timers instead of own

2005-05-31 Thread Sean Hefty

On Tue, 2005-05-31 at 14:34 -0700, Sean Hefty wrote:
 Sean,
 
 Is there any way of requesting an infinite number of retries?

 There is not, but nothing prevents a user from simply re-issuing a
 request after it times out.

Infinite retries inside the kernel does not sound like a good idea.  How
would you break it?  At least we should have some sort of exponential
backoff to prevent flooding the network.

To be a little more clear: the CM protocol uses a 4-bit field for its
number of retries, with a linear timeout.  What an app does above that
is undefined.
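
As a rough illustration of what a 4-bit retry count with a linear
timeout implies, here is a sketch with hypothetical helper names. It
assumes one initial send plus one full timeout per retry, which is a
simplification and not taken from the CM code:

```c
#include <stdint.h>

#define CM_MAX_RETRIES 15u	/* largest value a 4-bit field can hold */

/* Clamp a requested retry count to what fits in the 4-bit field. */
static uint8_t cm_clamp_retries(unsigned int requested)
{
	return requested > CM_MAX_RETRIES ? CM_MAX_RETRIES : (uint8_t) requested;
}

/* Worst-case wait with a linear timeout: the initial send plus each retry. */
static uint64_t cm_worst_case_wait_ms(unsigned int retries, uint32_t timeout_ms)
{
	return (uint64_t) (cm_clamp_retries(retries) + 1u) * timeout_ms;
}
```

Anything beyond 15 retries has to come from the application re-issuing
the request, which is where a backoff policy would live.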

- Sean


Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Venkata Jagana



 I've been advocating rdmaconsortium folks submit patches
 against openib.org for several reasons:

Probably, you meant openrdma.org opensource project but not
a standards setting body (i.e. RDMA consortium - 
http://www.rdmaconsortium.org/home) :)

 1) start with a code base that works
 2) start with a code base that is already upstream
 3) get advice/guidance from people who know how to collaborate
  in an open source environment.
 
 I thought (2) was the most important...but now I have to wonder
 if it's really (3). 

You are mistaken. I know people in the OpenRDMA community have 
worked with the opensource projects before and they 
know how to play and collaborate in an open source environment. 
The early part of the work in openrdma is in fact, a true example 
of that effort (which you may disagree with but having worked with
several other opensource projects and with OpenIB, we have
solved the issues which other projects including OpenIB have faced)
and the next phase of work which is of course the code development, 
a key aspect of broader community effort. 

I think we are diverging from the real issue - the fundamental differences 
in the views of each community in how we can solve this common problem of
supporting multiple RDMA fabrics, which is what we need to focus on.

 
  Just having OpenIB subsume control of anything iWARP or impose only 
  DAPL for all RDMA infrastructure because it just happens to be there today 
  seems rather stifling. Just stating that some OpenIB steering group is 
  somehow empowered to decide this for Linux is also rather strange.
 
 AFAICT the openib.org steering group doesn't control the content
 of the svn.openib.org source tree. It manages things like web content,
 overall charter, etc 

Don't agree. If you have read the email thread on this discussion, 
you would find that steering committee need to decide whether openIB 
should work on including the support for iWARP. Not that I am 
supporting this idea -:)

In the opensource world, developers should/will have the freedom to 
add what they want to do but of course, the acceptance of their contributions
into mainline is completely a different matter.

Thanks
Venkat

RE: [openib-general] cmpost: failure sending REQ: -22

2005-05-31 Thread Sean Hefty

 Has anyone seen ib_send_cm_req() return -22?

 I believe that this is a timeout error, possibly indicating that the
 server side of the connection wasn't running.  You may also want to
 verify the slid and dlid are correct for your configuration.

Don't you get a REJ now when there is no one listening on a service ID
requested ?

You do if the CM is loaded on the destination.

-22 is EINVAL. In terms of ib_send_cm_req, it is returned for a number
of cases:
1. peer to peer connection is requested
2. No primary path is supplied
3. QP is not RC or UC
4. private data is supplied and length > 92
5. alternate path supplied and PKEY or MTU does not match primary path
6. connection state is not IDLE
7. Primary or alternate path SGID or PKey does not match those of port

You're right.  I was thinking about the request failing asynchronously, not
synchronously when called.  The most likely cause is a bad slid/dlid.
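
The synchronous checks in the list above can be sketched roughly as
follows. All types and names here are simplified, hypothetical
stand-ins rather than the real ib_cm structures, and case 7 is left out
because it needs port data:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real ib_cm structures. */
enum sketch_qp_type { SKETCH_QP_RC, SKETCH_QP_UC, SKETCH_QP_UD };

struct sketch_path {
	unsigned int pkey;
	unsigned int mtu;
};

struct sketch_req_param {
	int peer_to_peer;
	const struct sketch_path *primary_path;
	const struct sketch_path *alternate_path;	/* may be NULL */
	enum sketch_qp_type qp_type;
	const void *private_data;			/* may be NULL */
	size_t private_data_len;
};

#define SKETCH_REQ_PRIVATE_DATA_MAX 92

/* Returns 0 if the request looks sendable, -EINVAL otherwise. */
static int sketch_check_req(const struct sketch_req_param *p, int state_is_idle)
{
	if (p->peer_to_peer)				/* case 1 */
		return -EINVAL;
	if (!p->primary_path)				/* case 2 */
		return -EINVAL;
	if (p->qp_type != SKETCH_QP_RC && p->qp_type != SKETCH_QP_UC)
		return -EINVAL;				/* case 3 */
	if (p->private_data &&
	    p->private_data_len > SKETCH_REQ_PRIVATE_DATA_MAX)
		return -EINVAL;				/* case 4 */
	if (p->alternate_path &&
	    (p->alternate_path->pkey != p->primary_path->pkey ||
	     p->alternate_path->mtu != p->primary_path->mtu))
		return -EINVAL;				/* case 5 */
	if (!state_is_idle)				/* case 6 */
		return -EINVAL;
	return 0;	/* case 7 (SGID/PKey vs. port) is omitted here */
}
```

Because every case returns the same -EINVAL, the caller cannot tell
them apart from the error code alone, so the inputs have to be checked
one by one.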

- Sean


Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Tom Duffy

On Tue, 2005-05-31 at 16:43 -0700, Venkata Jagana wrote:
  AFAICT the openib.org steering group doesn't control the content
  of the svn.openib.org source tree. It manages things like web content,
  overall charter, etc 
 
 Don't agree. If you have read the email thread on this discussion, 
 you would find that steering committee need to decide whether openIB 
 should work on including the support for iWARP. Not that I am 
 supporting this idea -:)

Please don't confuse the development effort going on this list
(openib-general) with the corporation that sponsors the development.
OpenIB as a (non-profit) corporation is setup with a charter and has
bylaws, etc.  Its goal may be IB specific at the moment.

But, the developers and the development on this list and in the
subversion repository don't answer to the OpenIB board of directors.
Developers are free to write whatever code they chose to.  There is
nothing stopping the maintainers from taking iWARP patches *today*.
OpenIB, as a corporation, may not put its name behind the work, but that
is another matter.

If the board of directors of OpenIB decide to cease sponsorship because
the developers don't jive with its corporate goals, then development can
continue elsewhere.

-tduffy



Re: [Rdma-developers] Re: [openib-general] OpenIB and OpenRDMA: Convergence on common RDMA APIs and ULPs for Linux

2005-05-31 Thread Grant Grundler

On Tue, May 31, 2005 at 04:43:58PM -0700, Venkata Jagana wrote:
  I've been advocating rdmaconsortium folks submit patches
  against openib.org for several reasons:
 
 Probably, you meant openrdma.org opensource project but not
 a standards setting body (i.e. RDMA consortium -
 http://www.rdmaconsortium.org/home) :)

Yes - sorry. My bad.

 You are mistaken. I know people in the OpenRDMA community have
 worked with the opensource projects before and they
 know how to play and collaborate in an open source environment.

I likely am.  But comments about requiring commitment and
business planning resources suggest otherwise.

 The early part of the work in openrdma is in fact, a true example
 of that effort (which you may disagree with but having worked with
 several other opensource projects and with OpenIB, we have
 solved the issues which other projects including OpenIB have faced)
 and the next phase of work which is of course the code development,
 a key aspect of broader community effort.

Well, a true example of that effort would have included code.

 I think we are diverging from the real issue - the fundamental differences
 in the views of each community in how we can solve this common problem of
 supporting multiple RDMA fabrics, which is what we need to focus on.

If there is a fundamental difference, it's something along
the lines of:
openrdma: Hey! We have this transport neutral RNIC PI spec that
          needs IB support!
openib:   Nice. Where is the code for iWarp?
openrdma: Uhm, well, we've only written the spec so far.
openib:   Ok. What do you want from us?
openrdma: Well, we want you to review this RNIC-PI spec and then
          write the code to support IB.
openib:   Are you crazy? We have a working implementation.
          And it's in kernel.org.
openrdma: We know. That's why we should collaborate.
          RNIC PI spec is transport neutral.
          Could you review it and then implement it in openib.org?
openib:   No. You can submit patches and we'll review those.
openrdma: Ok. But I'm not gonna write any code unless someone
          commits to accept it. We can't plan our business
          unless someone commits resources to work on
          accepting our patches.
openib:   No. You can submit patches and we'll review those.
...

I'm trying to NOT be sarcastic - just summarize what I've
understood so far. Please correct or post your own version
(sans rude talk by certain people).


...
 Don't agree. If you have read the email thread on this discussion,
 you would find that steering committee need to decide whether openIB
 should work on including the support for iWARP. Not that I am
 supporting this idea -:)

Tom answered this nicely already.

 In the opensource world, developers should/will have the freedom to
 add what they want to do

Open source developers have *some* allegiance to their funders.
HP pays me to look out for their interests  - but I don't do that
unconditionally. If an HP person is pushing for the wrong things
that I know won't fly, I have an obligation to push back.

 but of course, the acceptance of their
 contributions into mainline is completely a different matter.

We agree acceptance of contributions is conditional.
But earlier emails stated someone needed a firm commitment
that openrdma RNIC PI would get accepted into openib.org.
There's a disconnect there.

hth,
grant


[openib-general] RE: [ANNOUNCE][PATCH] New Linux 2.6.9 backport patches and corresponding userspace tar ball available

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 18:56, Woodruff, Robert J wrote:
 Michael Wrote  
  
  Patches are located in the SVN tree under 
  gen2/trunk/src/linux-kernel/patches/backport-to-2.6.9/
  
  infiniband-backport-svn2425-to-2.6.9-kernel-fixups-01.diff   
  infiniband-backport-svn2425-to-2.6.9-openib-drivers-02.diff  
  infiniband-backport-svn2425-to-2.6.9-openib-fixups-03.diff   
  infiniband-backport-svn2425-userspace.tar.gz 
  
  woody
  
 
 Woody, could you please move these patches to gen2/branches?
 
 -- 
 MST - Michael S. Tsirkin
 
 What do you guys think, would these be better kept under gen2/branches
 or
 where I have put them under linux-kernel/patches ?
 
 I can see arguments both ways. 

Me too (see arguments both ways). I'm not so much concerned with exactly
where they are as that they are available. If they are to move out of
trunk, another possibility is gen2/users.

-- Hal


RE: [openib-general] Send Side RMPP and OpenSM GetTableResp

2005-05-31 Thread Sean Hefty

 <--SA GetTableResp
 
 RMPP flags 0x05 (Data, Last)
 SegmentNumber 4
 PayloadLength 0x34
 TID 8
 SA GetTable -->
 RMPP flags 0x02 (ACK)
 SegmentNumber 1
 NewWindowLast 6
 TID 8

 This segment number is off - not sure why.

It is off in that the 3 segments just sent are not acknowledged but it
is legal to acknowledge what you have already received. This does not
violate anything.

The RMPP implementation sends an ACK under the following conditions:

* Upon completion of a received datagram.
* If a duplicate segment is received.
* After all segments of the current window are received
  (including the initial window)
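
The three conditions above can be sketched as a single predicate. The
state structure and field names are hypothetical and do not mirror the
actual RMPP code. Out-of-order segments simply get no ACK in this
sketch:

```c
#include <stdbool.h>

/* Hypothetical receive state; field names are assumptions. */
struct rmpp_recv_sketch {
	unsigned int last_in_order;	/* highest segment received in order */
	unsigned int window_last;	/* last segment of the current window */
};

/* Decide whether receiving segment seg_num should trigger an ACK. */
static bool rmpp_should_ack(const struct rmpp_recv_sketch *s,
			    unsigned int seg_num, bool last_flag)
{
	if (seg_num <= s->last_in_order)
		return true;		/* duplicate segment */
	if (seg_num != s->last_in_order + 1)
		return false;		/* out of order: no ACK in this sketch */
	if (last_flag)
		return true;		/* datagram complete */
	return seg_num == s->window_last;	/* whole window received */
}
```

Under this model an ACK naming segment 1 would be explained by a
duplicate of an early segment or by the end of the initial window.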

So, this ACK isn't violating the protocol, but I don't see which of these
cases the ACK matches up against in the implementation.

 It could indicate that segment 2 was lost,

That's one possibility but I doubt it is getting lost.

 or that its processing came after that of a later segment.

After re-examining the RMPP code, the implementation doesn't automatically
send an ACK just because a segment is processed out of order.  It tries to
be intelligent about it in case receive processing is occurring in multiple
threads.

The gap was 769.912 usec. Not sure whether this corresponds to any IBTA
timeout. Is this the hardcoded timeout that is used ?

RMPP allows 40 seconds to complete a receive.  The sender uses a 2 second
timer to wait for an ACK before resending segments.  (This was recently
reduced from 5 seconds.)  The receiver uses a 10 second timer to maintain
state after completing a receive in order to re-generate lost final ACKs.
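
As a rough sketch of how those three timers relate, assuming times are plain seconds as floats: the constant and function names below are hypothetical and are not identifiers from the RMPP code.

```python
# Timer values as described in this message (seconds).
RECEIVE_TIMEOUT_S = 40  # total time allowed to complete a receive
ACK_WAIT_S = 2          # sender waits this long for an ACK before resending
CLEANUP_TIMEOUT_S = 10  # receiver keeps state this long after completion


def sender_should_resend(last_send_time: float, now: float, acked: bool) -> bool:
    """True once the ACK wait has expired without an ACK arriving."""
    return (not acked) and (now - last_send_time) >= ACK_WAIT_S


def receive_timed_out(start_time: float, now: float) -> bool:
    """True if the receive has taken longer than the overall limit."""
    return (now - start_time) >= RECEIVE_TIMEOUT_S


def receiver_can_free_state(completion_time: float, now: float) -> bool:
    """True once the receiver no longer needs to re-generate a final ACK."""
    return (now - completion_time) >= CLEANUP_TIMEOUT_S
```

By this arithmetic a gap of 769.912 usec is far below the 2 second ACK wait, so sender_should_resend(0.0, 769.912e-6, False) is False.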

   Regardless of what went wrong on the SA
 side, the client needs to be able to deal with it.

This applies in both directions but in this case I think you mean the
other direction (whatever went wrong on the SA client side the SA needs
to be able to deal with it).

Obviously all bugs need to be fixed.  I was simply trying to state that the
receiving side must be able to handle a buggy transmitter without adversely
affecting the system.

 --SA GetTableResp
 RMPP flags 0x01 (Data)
 SegmentNumber 5
 PayloadLength 0x34
 TID 8

 This should not occur.  The maximum segment number sent should have stayed
 at 4.  I guess one area to check is to make sure that the PayloadLength in
 the original MAD is set correctly.  I do not know what would happen if it
 were set incorrectly.  There could also be an error in how RMPP calculates
 the number of segments that will be sent.

It does look like it is trying to resend the last segment (at least based on
the PayloadLength) ? I will find where to instrument this in the code.

The code on the send side calculates the total segment number using both the
PayloadLength and sge.length field.  If either is off, the sender side could
probably be thrown off in its calculations.  Even if this were the case, I
still can't see what would cause segment number 5 to be transmitted...
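
To illustrate how sensitive a segment count is to the length it is derived from, here is a minimal sketch using plain ceiling division. It is not the calculation in the RMPP code, which uses both PayloadLength and sge.length as noted above; the function name and the 200-byte data size per segment are assumptions made for the example.

```python
def segment_count(payload_length: int, data_per_segment: int) -> int:
    """Number of segments needed to carry payload_length bytes.

    data_per_segment is the number of payload bytes carried by one MAD;
    the value used in the examples below is an assumption.
    """
    if payload_length <= 0 or data_per_segment <= 0:
        raise ValueError("lengths must be positive")
    # Ceiling division without floating point.
    return -(-payload_length // data_per_segment)
```

With 200 bytes per segment, segment_count(800, 200) gives 4 but segment_count(801, 200) gives 5, so a length that is off by a single byte changes the count. Whether that has anything to do with the extra segment seen here is not established.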

 This segment should have been dropped by the client as an invalid segment
 number.

It's not invalid, is it ? Just a repeat. Should it reset one of the RMPP
timers too ?

If segment 4 had the last bit set, segment 5 is invalid.  The RMPP code
should drop this.

It also fills in 0 in RRespTime. Should it fill in something to
correspond to the hard coded time it uses ? Or perhaps 32 (0x1F) ?

I don't think the value of RRespTime matters at this point.

I will try to get back to gathering more info on this.

Having some more info would help, but I can also try modifying grmpp to see
if I can reproduce this.  My intention is to focus on finding a fix for the
MAD problems at the moment, however, so I'll queue this up to look at it
when I get back to RMPP.

- Sean


RE: [openib-general] Send Side RMPP and OpenSM GetTableResp

2005-05-31 Thread Hal Rosenstock

On Tue, 2005-05-31 at 22:05, Sean Hefty wrote: 
  --SA GetTableResp
  
RMPP flags 0x05 (Data, Last)
SegmentNumber 4
PayloadLength 0x34
TID 8
  SA GetTable --
  RMPP flags 0x02 (ACK)
  SegmentNumber 1
  NewWindowLast 6
  TID 8
 
  This segment number is off - not sure why.
 
 It is off in that the 3 segments just sent are not acknowledged but it
 is legal to acknowledge what you have already received. This does not
 violate anything.
 
 The RMPP implementation sends an ACK under the following conditions:
 
 * Upon completion of a received datagram.
 * If a duplicate segment is received.
 * After all segments of the current window are received
   (including the initial window)
 
 So, this ACK isn't violating the protocol, but I don't see which of these
 cases the ACK matches up against in the implementation.

That's (the ACK) not from OpenIB but from the Solaris 10 SA client.

 The code on the send side calculates the total segment number using both the
 PayloadLength and sge.length field.  If either is off, the sender side could
 probably be thrown off in its calculations.  Even if this were the case, I
 still can't see what would cause segment number 5 to be transmitted...

Perhaps there is something wrong with umad in terms of this but it's
hard to see what as it just posts the send MAD built with
ib_create_send_mad.

  This segment should have been dropped by the client as an invalid segment
  number.
 
 It's not invalid, is it ? Just a repeat. Should it reset one of the RMPP
 timers too ?

I was referring to the reACK from the client (not the retransmitted data
segment from the SA, which has the wrong segment number).

 If segment 4 had the last bit set, segment 5 is invalid.  The RMPP code
 should drop this.

Right. Is just dropping sufficient ? It looks to me that if a segment is
not the expected one, the receiver should also send an ACK for ES - 1 per
Figure 178. [There was more to the sequence which I omitted; I only
showed up to the point where things looked like they went wrong on the
SA side.]
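
The behaviour suggested above could look roughly like the sketch below. It is one reading of the suggestion, not what the RMPP code does today and not a checked restatement of Figure 178; on_data_segment and its parameters are hypothetical names.

```python
from typing import Optional, Tuple


def on_data_segment(expected_seg: int, seg_num: int,
                    complete: bool) -> Tuple[str, Optional[int]]:
    """Decide what to do with an incoming data segment.

    expected_seg: next segment number the receiver expects (ES)
    seg_num:      segment number of the segment that just arrived
    complete:     True if a segment with the Last flag was already received

    Returns (action, ack_seg). ack_seg is the segment number to ACK,
    or None if no ACK is generated by this rule.
    """
    if complete or seg_num != expected_seg:
        # Not an expected segment: drop it and ACK ES - 1.
        return ("drop", expected_seg - 1)
    return ("accept", None)
```

In the trace, segments 1 through 4 have arrived with Last set on segment 4, so on_data_segment(5, 5, True) returns ("drop", 4): the stray segment 5 is dropped and segment 4 is ACKed again.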

 I will try to get back to gathering more info on this.
 
 Having some more info would help, but I can also try modifying grmpp to see
 if I can reproduce this.  My intention is to focus on finding a fix for the
 MAD problems at the moment, however, so I'll queue this up to look at it
 when I get back to RMPP.

OK. I'll try to get more info so this can be more focused.

-- Hal
