[ANNOUNCE] OFED 1.5.1 rc4 release is available

2010-03-11 Thread Vladimir Sokolovsky


OFED 1.5.1-rc4 is available

Notes:

The tarball is available on:
http://www.openfabrics.org/downloads/OFED/ofed-1.5.1/OFED-1.5.1-rc4.tgz



To get BUILD_ID run ofed_info

Please report any issues in bugzilla https://bugs.openfabrics.org/ for OFED 1.5.1

Vladimir & Tziporet


Supported Platforms and Operating Systems
-----------------------------------------
 o   CPU architectures:
   - x86_64
   - x86
   - ppc64
   - ia64

 o   Linux Operating Systems:
   - RedHat EL4 up7        2.6.9-78.ELsmp
   - RedHat EL4 up8        2.6.9-89.ELsmp
   - RedHat EL5 up3        2.6.18-128.el5
   - RedHat EL5 up4        2.6.18-164.el5
   - SLES10 SP2            2.6.16.60-0.21-smp
   - SLES10 SP3            2.6.16.60-0.54-smp
   - SLES11                2.6.27.19-5-default
   - OEL 4 up7             2.6.9-78.ELsmp
   - OEL 4 up8             2.6.9-89.ELsmp
   - CentOS5.3             2.6.18-128.el5
   - CentOS5.4             2.6.18-164.el5
   - Fedora Core12         2.6.31.5-127.fc12 *
   - OpenSuSE 11.2         2.6.31.5-0.1-default *
   - kernel.org            2.6.29, 2.6.30, 2.6.31 and 2.6.32 *

 * Minimal QA for these versions

Main changes from 1.5.1-rc3:
============================
1. Updated packages:
   - ibutils: ibutils-1.5.4
   - libmlx4: libmlx4-1.0-0.6.g72e73dc
       Bug fix in mlx4_create_ah
   - install.pl:
       Add '--builddir' parameter
       NFSoRDMA will not support SLES10 SPx
   - NFSoRDMA is not supported under SLES10SPx

2. Bug fixes







commit 3e2e26b64187f3d5292653b4df761dfcd1e353ea
Merge: f52992f eada57c
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date:   Thu Mar 11 13:01:40 2010 +0200

Merge remote branch 'vu/ofed_kernel_1_5' into ofed_kernel_1_5

commit eada57cc6dfcd0f779c5254a1fc354702ac41247
Author: Vu Pham (Mellanox) v...@lists.openfabrics.org
Date:   Thu Mar 11 02:30:28 2010 -0800

srp: fixing panic bug happened during manual unload ib_srp module

Signed-off-by: Vu Pham v...@mellanox.com

commit f52992f15c05d70d17412ca92e14e6f1bf2c1ac7
Author: Yevgeny Petrilin yevge...@mellanox.co.il
Date:   Wed Mar 10 18:46:55 2010 +0200

mlx4_en: reconfigure mac address

When the other port removes a mac address that is the same that
the current port has, the table should be reconfigured.
fixes bugzilla #1965

Signed-off-by: Yevgeny Petrilin yevge...@mellanox.co.il

commit 7a1bbb340356e0489e50678f7e6d563f2f0be268
Author: Eli Cohen e...@mellanox.co.il
Date:   Thu Mar 11 10:33:52 2010 +0200

IPoIB: Fix multicast handling

After reverting c124815 it was necessary to modify ipoib_mcast_addr_is_valid()
so it will not filter out valid ipoib multicast addresses.

Signed-off-by: Eli Cohen e...@mellanox.co.il

commit 5daec886e9a7d7baea848180c3c8dbbc7b249e79
Author: Eli Cohen e...@mellanox.co.il
Date:   Thu Mar 11 09:17:21 2010 +0200

Revert "ipoib/mcast: Fix IPoIB multicast backport"

The reverted commit changes the multicast address that the kernel created,
causing resource leaks and other problems.

This reverts commit c12481586c4ba09cb88dc2090c67fdce7c856cde.

commit efe60c7da58f9bf235eef0381aa4a93c014805aa
Merge: 3df6ee7 0ff7a6e
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date:   Wed Mar 10 08:12:42 2010 +0200

Merge branch 'ofed_kernel_1_5' of git://git.openfabrics.org/~ralphc/linux-2.6/ into ofed_kernel_1_5

commit 3df6ee73e2364080d7ac179d1ecd8c4aaf9a9e43
Merge: 7a227ee 4c29a30
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date:   Wed Mar 10 08:11:37 2010 +0200

Merge branch 'ofed_kernel_1_5' of ssh://sofa.openfabrics.org/home/ctung/scm/ofed-1.5 into ofed_kernel_1_5

commit 7a227ee270783b7f1d773939e82343fdc3e69fb4
Merge: a679ae9 8721f26
Author: Vladimir Sokolovsky v...@mellanox.co.il
Date:   Wed Mar 10 08:06:29 2010 +0200

Merge branch 'ofed_1_5' of ssh://sofa.openfabrics.org/~swise/scm/ofed_kernel into ofed_kernel_1_5

commit 0ff7a6e94e7e98a8f10d1c01e70c6f26d776c4ee
Author: Ralph Campbell (QLogic) ral...@lists.openfabrics.org
Date:   Tue Mar 9 16:09:47 2010 -0800

IB/qib: clear symbol error counters on link UP

Clear symbol error counters on link UP.

Signed-off-by: Ralph Campbell ralph.campb...@qlogic.com

commit 4c29a3078cee40ee9800ade03f6f91c69202f368
Author: Chien Tung chien.tin.t...@intel.com
Date:   Tue Mar 9 15:59:02 2010 -0600

RDMA/nes: make nesadapter->phy_lock usage consistent

nes_{read,write}_1G_phy_reg() are using phy_lock while
nes_{read,write}_10G_phy_reg() leave that to the caller.

Remove phy_lock from 1G routines and leave the locking to the caller.
Add additional phy_lock calls around 1G read/write.

Signed-off-by: Chien Tung 

[PATCH] IB/cm: fix device_create() return value check

2010-03-11 Thread Jani Nikula
From: Jani Nikula ext-jani.1.nik...@nokia.com

Use IS_ERR() instead of comparing to NULL.

Signed-off-by: Jani Nikula ext-jani.1.nik...@nokia.com

---

NOTE: I'm afraid I'm unable to test this; please consider this more a
bug report than a complete patch.
---
 drivers/infiniband/core/cm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 764787e..c9730cb 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -3693,7 +3693,7 @@ static void cm_add_one(struct ib_device *ib_device)
 	cm_dev->device = device_create(cm_class, &ib_device->dev,
 				       MKDEV(0, 0), NULL,
 				       "%s", ib_device->name);
-	if (!cm_dev->device) {
+	if (IS_ERR(cm_dev->device)) {
 		kfree(cm_dev);
 		return;
 	}
-- 
1.6.5.2
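
Why the NULL test can miss the failure is easy to show outside the kernel. What follows is a minimal user-space sketch, not kernel code: err_ptr(), ptr_err(), is_err() and fake_device_create() are hypothetical stand-ins that imitate how ERR_PTR()/IS_ERR() encode a negative errno in the top of the pointer range. It assumes a platform where pointers and intptr_t have the same width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the kernel's MAX_ERRNO; the helpers below are illustrative only. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when ptr lies in the last MAX_ERRNO values of the address space. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical creator that, like device_create(), reports failure
 * with an encoded errno rather than NULL.
 */
void *fake_device_create(int fail)
{
	static int dev;

	return fail ? err_ptr(-12 /* ENOMEM */) : (void *)&dev;
}
```

With this model, a failed fake_device_create(1) returns a non-NULL pointer, so a check of the form `if (!p)` would treat it as success, while is_err(p) catches it and ptr_err(p) recovers the error code.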

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH] IB/cm: fix device_create() return value check

2010-03-11 Thread Sean Hefty
> From: Jani Nikula ext-jani.1.nik...@nokia.com
>
> Use IS_ERR() instead of comparing to NULL.
>
> Signed-off-by: Jani Nikula ext-jani.1.nik...@nokia.com
>
> ---
>
> NOTE: I'm afraid I'm unable to test this; please consider this more a
> bug report than a complete patch.

This looks correct to me.  Good catch.

> ---
>  drivers/infiniband/core/cm.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index 764787e..c9730cb 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -3693,7 +3693,7 @@ static void cm_add_one(struct ib_device *ib_device)
>  	cm_dev->device = device_create(cm_class, &ib_device->dev,
>  				       MKDEV(0, 0), NULL,
>  				       "%s", ib_device->name);
> -	if (!cm_dev->device) {
> +	if (IS_ERR(cm_dev->device)) {
>  		kfree(cm_dev);
>  		return;
>  	}
> --
> 1.6.5.2




Re: rnfs: rq_respages pointer is bad

2010-03-11 Thread Tom Tucker

David J. Wilder wrote:
> Tom
>
> I have been chasing an rnfs related Oops in svc_process().  I have found
> the source of the Oops but I am not sure of my fix.  I am seeing the
> problem on ppc64, kernel 2.6.32, I have not tried other arch yet.
>
> The source of the problem is in rdma_read_complete(), I am finding that
> rqstp->rq_respages is set to point past the end of the rqstp->rq_pages
> page list.  This results in a NULL reference in svc_process() when
> passing rq_respages[0] to page_address().
>
> In rdma_read_complete() we are using rqstp->rq_arg.pages as the base of
> the page list then indexing by page_no, however rq_arg.pages is not
> pointing to the start of the list so rq_respages ends up pointing to:
>
> rqstp->rq_pages[(head->count+1) + head->hdr_count]
>
> In my case, it ends up pointing one past the end of the list.
>
> Here is the change I made.
>
> static int rdma_read_complete(struct svc_rqst *rqstp,
> 			      struct svc_rdma_op_ctxt *head)
> {
> 	int page_no;
> 	int ret;
>
> 	BUG_ON(!head);
>
> 	/* Copy RPC pages */
> 	for (page_no = 0; page_no < head->count; page_no++) {
> 		put_page(rqstp->rq_pages[page_no]);
> 		rqstp->rq_pages[page_no] = head->pages[page_no];
> 	}
> 	/* Point rq_arg.pages past header */
> 	rqstp->rq_arg.pages = &rqstp->rq_pages[head->hdr_count];
> 	rqstp->rq_arg.page_len = head->arg.page_len;
> 	rqstp->rq_arg.page_base = head->arg.page_base;
>
> 	/* rq_respages starts after the last arg page */
> -	rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> +	rqstp->rq_respages = &rqstp->rq_pages[page_no];

This might be clearer as:

	rqstp->rq_respages = &rqstp->rq_pages[head->count];

> .
> .
> .
>
> The change works for me, but I am not sure it is safe to assume the
> rqstp->rq_pages[head->count] will always point to the last arg page.
>
> Dave.
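
The indexing problem can be reproduced with plain arrays. The sketch below is a simplified user-space model, assuming a made-up struct req and a made-up list size NPAGES; it only shows how basing the response pointer on an already-offset argument pointer shifts the result by hdr_count, and is not the svcrdma code itself.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 8	/* hypothetical size of the page list */

/* Simplified stand-in for the svc_rqst fields discussed above. */
struct req {
	void *pages[NPAGES];	/* plays the role of rq_pages */
	void **arg_pages;	/* plays the role of rq_arg.pages */
	void **respages;	/* plays the role of rq_respages */
};

/* Old behaviour: respages is indexed from the offset arg_pages base. */
size_t respages_index_old(struct req *r, size_t hdr_count, size_t count)
{
	assert(hdr_count + count <= NPAGES);	/* keep the pointer math legal */
	r->arg_pages = &r->pages[hdr_count];
	r->respages = &r->arg_pages[count];
	return (size_t)(r->respages - r->pages);
}

/* Changed behaviour: respages is indexed from the start of pages[]. */
size_t respages_index_new(struct req *r, size_t hdr_count, size_t count)
{
	assert(hdr_count <= NPAGES && count <= NPAGES);
	r->arg_pages = &r->pages[hdr_count];
	r->respages = &r->pages[count];
	return (size_t)(r->respages - r->pages);
}
```

With hdr_count = 1 and count = 7 in an 8-entry list, the old calculation lands on index 8, one past the last valid slot, while the changed calculation lands on index 7.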



Re: [PATCH V3 0/2] Add support for enhanced atomic operations

2010-03-11 Thread Håkon Bugge
Hi Vlad,

Did you consider my input in
http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg02803.html wrt.
these enhancements?

Thanks, Håkon

On Mar 10, 2010, at 16:57 , Vladimir Sokolovsky wrote:

> Hi Roland,
>
> This patchset adds support for the following enhanced atomic
> operations:
> - Masked atomic compare and swap
> - Masked atomic fetch and add
>
> These operations enable using a smaller amount of memory when using
> multiple locks by using portions of a 64 bit value in an atomic
> operation.
> For some applications the memory savings are very significant. One
> example is fine grain lock implementations for huge data sets. In
> other cases, the benefit is the ability to update multiple fields with
> a single io operation.
>
> Vladimir Sokolovsky(2):
>   IB/core: Add support for enhanced atomic operations
>   mlx4/IB: Add support for enhanced atomic operations
>
> changes from V2:
> - patch #1:
>     Updated description
>     Renamed:
>       IB_WR_ATOMIC_MASKED_CMP_AND_SWP -> IB_WR_MASKED_ATOMIC_CMP_AND_SWP
>       IB_WR_ATOMIC_MASKED_FETCH_AND_ADD -> IB_WR_MASKED_ATOMIC_FETCH_AND_ADD
>     In the ib_send_wr struct the new fields added before the rkey field
>
> - patch #2:
>     Set IB_DEVICE_MASKED_ATOMIC flag with other flags that get set for
>     all devices
>
> Regards,
> Vladimir

Håkon Bugge
haakon.bu...@sun.com
+47 924 84 514
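
To make the "portions of a 64 bit value" idea concrete, here is a single-threaded software model of a masked compare and swap. The semantics are an assumption, since this thread does not spell them out: only the bits selected by cmp_mask are compared, and on a match only the bits selected by swap_mask are replaced. The function names, the LOCK_MASK/LOCK_VAL macros, and the eight-locks-per-word layout are hypothetical; a real HCA would perform the update atomically, which this model does not.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a masked compare-and-swap on one 64-bit word.
 * Returns the value the word held before the operation.
 */
uint64_t masked_cmp_swap(uint64_t *va, uint64_t cmp, uint64_t cmp_mask,
			 uint64_t swap, uint64_t swap_mask)
{
	uint64_t old = *va;

	/* Compare only the masked bits; swap only the masked bits. */
	if (!((cmp ^ old) & cmp_mask))
		*va = (old & ~swap_mask) | (swap & swap_mask);
	return old;
}

/* Eight hypothetical one-byte locks packed into a single word. */
#define LOCK_MASK(i)	(UINT64_C(0xff) << (8 * (i)))
#define LOCK_VAL(i, v)	((uint64_t)(v) << (8 * (i)))

/* Try to take lock i (0 = free, 1 = held); returns 1 on success. */
int try_lock(uint64_t *word, unsigned int i)
{
	uint64_t old;

	assert(i < 8);
	old = masked_cmp_swap(word, LOCK_VAL(i, 0), LOCK_MASK(i),
			      LOCK_VAL(i, 1), LOCK_MASK(i));
	return (old & LOCK_MASK(i)) == 0;
}
```

Because each lock touches only its own byte, taking lock 3 leaves the other seven fields of the word untouched, which is where the memory saving over eight separate 64-bit lock words would come from.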





Re: [PATCH V3 0/2] Add support for enhanced atomic operations

2010-03-11 Thread Roland Dreier
> Did you consider my input in
> http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg02803.html wrt.
> these enhancements?

I think we can worry about that if/when an HCA comes along that supports
global atomics for ordinary atomics but not enhanced atomics.  Although
perhaps it would be cleaner to change the atomic_cap enum to:

/*
 * IB_ATOMIC_NONE:        no atomic capability
 * IB_ATOMIC_HCA:         all ops are atomic within HCA
 * IB_ATOMIC_GLOB:        standard ops atomic with respect to all
 *                        memory ops; masked ops atomic within HCA
 * IB_ATOMIC_GLOB_MASKED: all ops atomic with respect to all
 *                        memory ops
 */
enum ib_atomic_cap {
	IB_ATOMIC_NONE,
	IB_ATOMIC_HCA,
	IB_ATOMIC_GLOB,
	IB_ATOMIC_GLOB_MASKED
};

(with better wording for the comments)
Thoughts?

 - R.
-- 
Roland Dreier  rola...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html
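
One way to read the four-level enum proposed above is as a lookup from capability and operation type to the scope in which the operation is atomic. The sketch below is only an illustration of that reading: enum atomic_scope and atomic_scope_for() are hypothetical names, and the enum is copied from the proposal rather than from any merged header.

```c
#include <assert.h>

/* Capability levels as proposed in the message above. */
enum ib_atomic_cap {
	IB_ATOMIC_NONE,
	IB_ATOMIC_HCA,
	IB_ATOMIC_GLOB,
	IB_ATOMIC_GLOB_MASKED
};

/* Scope within which an operation is atomic (hypothetical helper type). */
enum atomic_scope { SCOPE_NONE, SCOPE_HCA, SCOPE_GLOBAL };

/* masked_op is non-zero for the masked (enhanced) atomic operations. */
enum atomic_scope atomic_scope_for(enum ib_atomic_cap cap, int masked_op)
{
	switch (cap) {
	case IB_ATOMIC_HCA:
		return SCOPE_HCA;
	case IB_ATOMIC_GLOB:
		/* Standard ops are global; masked ops only within the HCA. */
		return masked_op ? SCOPE_HCA : SCOPE_GLOBAL;
	case IB_ATOMIC_GLOB_MASKED:
		return SCOPE_GLOBAL;
	case IB_ATOMIC_NONE:
	default:
		return SCOPE_NONE;
	}
}
```

A consumer that needs masked operations to be atomic with respect to CPU accesses would, under this reading, require atomic_scope_for(cap, 1) == SCOPE_GLOBAL.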


Re: [PATCH v2 15/15] opensm: Cause status of unicast routing attempt to propogate to callers of osm_ucast_mgr_process().

2010-03-11 Thread Jim Schutt

On Wed, 2010-03-10 at 11:06 -0700, Jim Schutt wrote:
> If unicast routing fails, there is no point to continuing with fabric
> bring-up.
> Just restart a new heavy sweep instead.
>
> Signed-off-by: Jim Schutt jasc...@sandia.gov
> ---
>  opensm/opensm/osm_state_mgr.c |   12 +++++++++---
>  opensm/opensm/osm_ucast_mgr.c |   14 +++++++++-----
>  2 files changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/opensm/opensm/osm_state_mgr.c b/opensm/opensm/osm_state_mgr.c
> index 96ad348..e666034 100644
> --- a/opensm/opensm/osm_state_mgr.c
> +++ b/opensm/opensm/osm_state_mgr.c
> @@ -1140,7 +1140,11 @@ static void do_sweep(osm_sm_t * sm)
>  		/* Re-program the switches fully */
>  		sm->p_subn->ignore_existing_lfts = TRUE;
>
> -		osm_ucast_mgr_process(&sm->ucast_mgr);
> +		if (osm_ucast_mgr_process(&sm->ucast_mgr)) {
> +			OSM_LOG_MSG_BOX(sm->p_log, OSM_LOG_VERBOSE,
> +					"REROUTE FAILED");
> +			return;
> +		}
>  		osm_qos_setup(sm->p_subn->p_osm);
>
>  		/* Reset flag */
> @@ -1299,12 +1303,14 @@ repeat_discovery:
>  			"LID ASSIGNMENT COMPLETE - STARTING SWITCH TABLE CONFIG");
>
>  	/*
> -	 * Proceed with unicast forwarding table configuration.
> +	 * Proceed with unicast forwarding table configuration; repeat
> +	 * if unicast routing fails.
>  	 */
>
>  	if (!sm->ucast_mgr.cache_valid ||
>  	    osm_ucast_cache_process(&sm->ucast_mgr))
> -		osm_ucast_mgr_process(&sm->ucast_mgr);
> +		if (osm_ucast_mgr_process(&sm->ucast_mgr))
> +			goto repeat_discovery;
>
>  	osm_qos_setup(sm->p_subn->p_osm);
>

Sorry I missed this: do_sweep() should just return early on unicast route
failure.

If osm_ucast_mgr_process() fails, no configured routing engine was able
to route the fabric.  In that case, do_sweep() should just return,
and a new sweep will be triggered either on a trap due to a fabric
change, or by the configured sweep_interval.

I think this should just be:

@@ -1299,12 +1303,14 @@ repeat_discovery:
 			"LID ASSIGNMENT COMPLETE - STARTING SWITCH TABLE CONFIG");
 
 	/*
-	 * Proceed with unicast forwarding table configuration.
+	 * Proceed with unicast forwarding table configuration; if it fails
+	 * return early to wait for a trap or the next sweep interval.
 	 */
 
 	if (!sm->ucast_mgr.cache_valid ||
 	    osm_ucast_cache_process(&sm->ucast_mgr))
-		osm_ucast_mgr_process(&sm->ucast_mgr);
+		if (osm_ucast_mgr_process(&sm->ucast_mgr))
+			return;
 
 	osm_qos_setup(sm->p_subn->p_osm);
 


> diff --git a/opensm/opensm/osm_ucast_mgr.c b/opensm/opensm/osm_ucast_mgr.c
> index fbc9244..8ea2e52 100644
> --- a/opensm/opensm/osm_ucast_mgr.c
> +++ b/opensm/opensm/osm_ucast_mgr.c
> @@ -955,6 +955,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
>  	osm_opensm_t *p_osm;
>  	struct osm_routing_engine *p_routing_eng;
>  	cl_qmap_t *p_sw_guid_tbl;
> +	int failed = 0;
>
>  	OSM_LOG_ENTER(p_mgr->p_log);
>
> @@ -973,7 +974,8 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
>
>  	p_osm->routing_engine_used = NULL;
>  	while (p_routing_eng) {
> -		if (!ucast_mgr_route(p_routing_eng, p_osm))
> +		failed = ucast_mgr_route(p_routing_eng, p_osm);
> +		if (!failed)
>  			break;
>  		p_routing_eng = p_routing_eng->next;
>  	}
> @@ -984,9 +986,11 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
>  		struct osm_routing_engine *r = p_osm->default_routing_engine;
>
>  		r->build_lid_matrices(r->context);
> -		r->ucast_build_fwd_tables(r->context);
> -		p_osm->routing_engine_used = r;
> -		osm_ucast_mgr_set_fwd_tables(p_mgr);
> +		failed = r->ucast_build_fwd_tables(r->context);
> +		if (!failed) {
> +			p_osm->routing_engine_used = r;
> +			osm_ucast_mgr_set_fwd_tables(p_mgr);
> +		}
>  	}
>
>  	if (p_osm->routing_engine_used) {
> @@ -1006,7 +1010,7 @@ int osm_ucast_mgr_process(IN osm_ucast_mgr_t * p_mgr)
>  Exit:
>  	CL_PLOCK_RELEASE(p_mgr->p_lock);
>  	OSM_LOG_EXIT(p_mgr->p_log);
> -	return 0;
> +	return failed;
>  }
>
>  static int ucast_build_lid_matrices(void *context)
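
The control flow discussed above, trying each routing engine in turn, reporting the final status, and letting the caller abandon the sweep on failure, can be sketched in isolation. This is a simplified stand-alone model, not OpenSM code: struct engine, route_with_fallback(), sweep_once() and the two stub engines are hypothetical, the default-engine fallback is omitted, and an empty list is treated as failure, which is this sketch's choice rather than the patch's (the patch initializes failed to 0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified routing engine: route() returns 0 on success. */
struct engine {
	const char *name;
	int (*route)(void);
	struct engine *next;
};

/*
 * Try each engine in list order and stop at the first one that succeeds.
 * Returns 0 and sets *used on success; otherwise returns the status of
 * the last engine tried (-1 for an empty list) and leaves *used NULL.
 */
int route_with_fallback(struct engine *list, struct engine **used)
{
	int failed = -1;
	struct engine *e;

	*used = NULL;
	for (e = list; e; e = e->next) {
		failed = e->route();
		if (!failed) {
			*used = e;
			break;
		}
	}
	return failed;
}

/* Caller in the style of do_sweep(): returns 1 if bring-up may continue. */
int sweep_once(struct engine *list)
{
	struct engine *used;

	if (route_with_fallback(list, &used))
		return 0;	/* nothing routed; wait for the next sweep */
	return 1;		/* routed; continue with the rest of bring-up */
}

/* Stub engines for experimentation. */
int route_ok(void)  { return 0; }
int route_bad(void) { return -1; }
```

Returning the status instead of a constant 0 is what lets the caller distinguish "some engine routed the fabric" from "every engine failed", which is the distinction the early return in do_sweep() depends on.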




Re: rnfs: rq_respages pointer is bad

2010-03-11 Thread Roland Dreier
Someone please make sure that a final patch with a full description gets
sent to the NFS guys for merging.  Tom, are you going to handle this?


Re: rnfs: rq_respages pointer is bad

2010-03-11 Thread Tom Tucker

Roland Dreier wrote:

> Someone please make sure that a final patch with a full description gets
> sent to the NFS guys for merging.  Tom, are you going to handle this?

Yes, and I have several more in queue.

Tom



Re: [PATCH] ipoib: Fix lockup of the tx queue

2010-03-11 Thread Roland Dreier
good debugging, applied thanks.

I do worry (as Moni mentioned) that this doesn't explain why you would
get send failures in this case, but the patch itself is well-explained
and looks obviously correct so I think we should apply it.


Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue

2010-03-11 Thread Ralph Campbell
On Thu, 2010-03-11 at 13:38 -0800, Roland Dreier wrote:
> good debugging, applied thanks.
>
> I do worry (as Moni mentioned) that this doesn't explain why you would
> get send failures in this case, but the patch itself is well-explained
> and looks obviously correct so I think we should apply it.

Well, after more testing it seems there may still be a problem.
I haven't isolated it yet though. I could definitely use help
reviewing the code changes.



Re: [PATCH 2/2] ib/ipoib: include err code in trace message for ib_post_send() failures

2010-03-11 Thread Roland Dreier
thanks, applied this one for .34.  Will hold the TSO change for .35.


Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue

2010-03-11 Thread Ralph Campbell
Sorry, I was referring to my patch not Eli's.

On Thu, 2010-03-11 at 13:41 -0800, Ralph Campbell wrote:
> On Thu, 2010-03-11 at 13:38 -0800, Roland Dreier wrote:
> > good debugging, applied thanks.
> >
> > I do worry (as Moni mentioned) that this doesn't explain why you would
> > get send failures in this case, but the patch itself is well-explained
> > and looks obviously correct so I think we should apply it.
>
> Well, after more testing it seems there may still be a problem.
> I haven't isolated it yet though. I could definitely use help
> reviewing the code changes.
 


Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue

2010-03-11 Thread Roland Dreier
> Sorry, I was referring to my patch not Eli's.

Heh, I never would have said anything about your patch being obvious.
I skimmed yours once but I do want to read it more carefully.

Did you ever say what test case you are using to provoke the problem you're
fixing?


Re: [PATCH] RDMA/cxgb3: wait at least one schedule cycle during device removal.

2010-03-11 Thread Roland Dreier
thanks, applied.


Re: [ewg] [PATCH] ipoib: Fix lockup of the tx queue

2010-03-11 Thread Ralph Campbell
On Thu, 2010-03-11 at 13:52 -0800, Roland Dreier wrote:
> > Sorry, I was referring to my patch not Eli's.
>
> Heh, I never would have said anything about your patch being obvious.
> I skimmed yours once but I do want to read it more carefully.
>
> Did you ever say what test case you are using to provoke the problem you're
> fixing?

I think I did but it is just UDP stress tests in general.
Throwing in some link failures and switching between connected
and datagram modes helps too. netperf, qperf, etc. should work.
Anything which causes the connected mode QP to fail should
exercise the fix too.



RE: Simplified iWARP Consumer Library

2010-03-11 Thread Sean Hefty
> Which of the features would you want to see in librdmacm? I guess the
> integration of the cm_id and qp as well as the simpler (synchronous)
> connection establishment and teardown functionality would fit. However,
> all cm events will thereafter be handled within the library rather than
> being presented to the user. Does that make sense?

I have some ideas for simplifying connection establishment as part of my changes
to support native IB addresses through the rdma_cm interface.  This includes
adding new calls to the librdmacm: rdma_getaddrinfo and
rdma_create_id_and_optional_qp_and_bind_addr_and_set_route.
I'll pick a different name for the latter. :)  Combined, I think these calls
could allow a user to more easily connect, while still providing low level
control for anyone who wants it.

I haven't looked at the details for this, but we may be able to support
synchronous operation by allowing the user to provide a NULL rdma_event_channel
when creating the rdma_cm_id.  We can take advantage of rdma_migrate_id to
implement this, or even toggle between synchronous and asynchronous mode.  The
hope is that the librdmacm can remain threadless.

If you can give me a month or so to finish my current patches, I'll take a look
at this as part of my work.

- Sean



Re: [PATCH] RDMA/nes: ethtool to read hardware registers for Rx/Tx error stats

2010-03-11 Thread Roland Dreier
thanks, applied.


Re: [PATCH v2] RDMA/nes: set assume_alligned_header bit

2010-03-11 Thread Roland Dreier
thanks, applied.


Re: [PATCH] RDMA/nes: ethtool to read hardware registers for Rx/Tx error stats

2010-03-11 Thread Roland Dreier
er, no I didn't apply this one yet... will hold for the .35 merge window.


Re: [PATCH] RDMA/nes: fix CX4 link problem in back-to-back configuration

2010-03-11 Thread Roland Dreier
thanks, applied.


Re: [PATCH] RDMA/nes: clear stall bit before destroying nic qp

2010-03-11 Thread Roland Dreier
thanks, applied.


Re: Simplified iWARP Consumer Library

2010-03-11 Thread philip . frey
Sounds good to me. I'll hold off on integrating my library routines, as
they are designed around a CM thread.

Looking forward to your update.

 Phil
--Original Message--
From: Sean Hefty
Sender: linux-rdma-ow...@vger.kernel.org
To: Hefty, Sean
To: p...@frey.ws
Cc: rdre...@cisco.com
Cc: Bernard Metzler
Cc: linux-rdma@vger.kernel.org
Subject: RE: Simplified iWARP Consumer Library
Sent: Mar 11, 2010 23:49

> Which of the features would you want to see in librdmacm? I guess the
> integration of the cm_id and qp as well as the simpler (synchronous)
> connection establishment and teardown functionality would fit. However,
> all cm events will thereafter be handled within the library rather than
> being presented to the user. Does that make sense?

I have some ideas for simplifying connection establishment as part of my changes
to support native IB addresses through the rdma_cm interface.  This includes
adding new calls to the librdmacm: rdma_getaddrinfo and
rdma_create_id_and_optional_qp_and_bind_addr_and_set_route.
I'll pick a different name for the latter. :)  Combined, I think these calls
could allow a user to more easily connect, while still providing low level
control for anyone who wants it.

I haven't looked at the details for this, but we may be able to support
synchronous operation by allowing the user to provide a NULL rdma_event_channel
when creating the rdma_cm_id.  We can take advantage of rdma_migrate_id to
implement this, or even toggle between synchronous and asynchronous mode.  The
hope is that the librdmacm can remain threadless.

If you can give me a month or so to finish my current patches, I'll take a look
at this as part of my work.

- Sean
