Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms

2010-04-21 Thread Sasha Khapyorsky
On 20:32 Mon 19 Apr , Line Holen wrote:
  @@ -69,6 +70,9 @@
   #include <opensm/osm_prefix_route.h>
   #include <opensm/osm_ucast_lash.h>
   
  +
  +#define MAX_HOPS 128
  
  The IB spec defines the maximum number of hops in a fabric as 64. Would
  it be better to use this value here?
  
  Sasha
 
 The value of 128 was chosen as 2x max DR path allowing the SM to be in
 the middle of a fabric. But I have no problem lowering to 64.

The path in this calculation runs between ports; the SM is not part of
it.

To me 64 seems the better number. Hypothetically it could even be
unrelated to the LFT transition issue: when a path exceeds 64 hops,
the SA can just as well return NOT FOUND.
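
A minimal sketch of the idea discussed above, assuming a much simplified forwarding model rather than the actual opensm data structures. The names walk_path, next_hop and walk_status are hypothetical; only MAX_HOPS and its value of 64 come from the thread. The walk gives up and reports "not found" as soon as the hop count exceeds the limit, which also bounds a walk that has run into a forwarding loop.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOPS 64	/* value proposed in the thread */

enum walk_status { WALK_OK, WALK_NOT_FOUND };

/*
 * Hypothetical model: next_hop[i] is the node that node i forwards to
 * for one fixed destination, or -1 if node i has no route.
 */
static enum walk_status walk_path(const int *next_hop, size_t n_nodes,
				  int src, int dst, unsigned *hops_out)
{
	unsigned hops = 0;
	int cur = src;

	assert(next_hop != NULL);

	while (cur != dst) {
		if (cur < 0 || (size_t)cur >= n_nodes)
			return WALK_NOT_FOUND;	/* no route from here */
		cur = next_hop[cur];
		if (++hops > MAX_HOPS)
			return WALK_NOT_FOUND;	/* loop or over-long path */
	}
	if (hops_out)
		*hops_out = hops;
	return WALK_OK;
}
```

With a two-node table that forwards 0 to 1 and 1 back to 0, a walk towards any other destination stops after MAX_HOPS + 1 steps instead of spinning forever.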

Sasha
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] opensm/osm_sa_path_record.c: livelock in pr_rcv_get_path_parms

2010-04-21 Thread Sasha Khapyorsky
On 20:32 Mon 19 Apr , Line Holen wrote:
 
 The value of 128 was chosen as 2x max DR path allowing the SM to be in
 the middle of a fabric. But I have no problem lowering to 64.

Would you take care of the patch?

Sasha


[PATCH] opensm/osm_sa_path_record.c: Lower max number of hops allowed

2010-04-21 Thread Line Holen
Lower max number of hops allowed in a path from 128 to 64.

Signed-off-by: Line Holen line.ho...@sun.com

---

diff --git a/opensm/opensm/osm_sa_path_record.c b/opensm/opensm/osm_sa_path_record.c
index 62102f4..9f508db 100644
--- a/opensm/opensm/osm_sa_path_record.c
+++ b/opensm/opensm/osm_sa_path_record.c
@@ -70,7 +70,7 @@
 #include <opensm/osm_prefix_route.h>
 #include <opensm/osm_ucast_lash.h>
 
-#define MAX_HOPS 128
+#define MAX_HOPS 64
 
 typedef struct osm_pr_item {
 	cl_list_item_t list_item;


Re: Socket Direct Protocol: help (2)

2010-04-21 Thread Amir Vadai
Hi Andrea,

I am preparing the fix right now.

- Amir

On 04/20/2010 04:53 PM, Andrea Gozzelino wrote:
 Hi Amir,

 Do you have any news about bug 2027 ("SDP not respecting # SGEs as
 reported from HW") and bug 2028 ("SDP should support fastreg mrs")?

 Once those bugs are fixed, I will test the performance of the NE020
 cards with SDP and compare SDP against TCP.

 Keep in touch,

 Andrea Gozzelino

 INFN - Laboratori Nazionali di Legnaro(LNL)
 Viale dell'Universita' 2
 I-35020 - Legnaro (PD)- ITALIA
 Tel: +39 049 8068346
 Fax: +39 049 641925
 Mail: andrea.gozzel...@lnl.infn.it







 On Apr 15, 2010 10:38 AM, Amir Vadai am...@mellanox.co.il wrote:

   
 It should be a simple fix and I plan to do soon - just add yourself as
 CC in bugzilla  - that way I won't forget to notify you.

 - amir

 On 04/15/2010 10:07 AM, Andrea Gozzelino wrote:
 
 On Apr 15, 2010 08:24 AM, Amir Vadai am...@mellanox.co.il wrote:

   
   
 I hope to have a fix next week for the first one.

 Thanks,
 Amir

 On 04/14/2010 09:48 PM, Tung, Chien Tin wrote:

 Tung, Chien Tin wrote:

 One more thing - Please open a bug regarding the num_sge
 limitation at:
 https://bugs.openfabrics.org/

 Done, Bug 2027.

 Chien

 And 2028 opened to request fastreg support.

 I am open to test fixes for these two bugs.

 Chien

 Hi Amir, 
 Hi Chien,

 I understand that bug 2027 could be fixed next week, so that I can then
 test SDP performance on the NE020 cards.
 Is that correct?
 If so, could you point out the code modifications?

 Keep in touch and take care.
 Regards,
 Andrea


 Andrea Gozzelino

 INFN - Laboratori Nazionali di Legnaro  (LNL)
 Viale dell'Universita' 2
 I-35020 - Legnaro (PD)- ITALIA
 Tel: +39 049 8068346
 Fax: +39 049 641925
 Mail: andrea.gozzel...@lnl.infn.it  


   
   
 

   



Re: two questions about RDMA-WRITE

2010-04-21 Thread Ding Dinghua
2010/4/16 Sean Hefty sean.he...@intel.com:
static void jm_cq_comp_handler(struct ib_cq *cq, void *context) {
        struct jm_rdma_conn *conn = context;
        struct ib_wc wc;
        struct jm_send_ctx *send;

        /* No idea why it should be called twice. */
        printk("cq comp for id %p\n", conn->jc_id);
        ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);
        while (ib_poll_cq(cq, 1, &wc) == 1) {
                if (wc.opcode != IB_WC_RDMA_WRITE) {
                        printk("completed unknown opcode %d\n", wc.opcode);
                        /* continue; */
                }
                send = (struct jm_send_ctx *)wc.wr_id;
                printk("got send=%p\n", send);
                printk("completed RDMA_WRITE of IO(%Lu, %u)\n",
                       send->s_offset, send->s_size);
                send->s_done = wc.status == IB_WC_SUCCESS ? 1 : -EIO;
                wake_up_all(&send->s_wait);
        }
        ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);

 unrelated to your problem, but this second call to ib_req_notify_cq isn't
 necessary.
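
 To illustrate why one call is enough, here is a toy user-space model,
 not kernel code. All mock_* names are hypothetical stand-ins; the
 comments say which verbs call each one imitates. The assumption being
 modelled is that a notification request is one-shot: the next completion
 added to the queue fires a single event and disarms the queue. Re-arming
 once at the top of the handler and then polling until the queue is empty
 therefore leaves the queue armed for whatever arrives later.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a completion queue; every name here is hypothetical. */
struct mock_cq {
	int queued;	/* completions waiting to be polled */
	int armed;	/* 1 if a notification has been requested */
	int events;	/* how many times the handler would have been invoked */
};

/* Imitates ib_req_notify_cq(cq, IB_CQ_NEXT_COMP): request one notification. */
static void mock_req_notify(struct mock_cq *cq)
{
	cq->armed = 1;
}

/* Imitates the device adding a completion; fires at most one event per arm. */
static void mock_complete(struct mock_cq *cq)
{
	cq->queued++;
	if (cq->armed) {
		cq->armed = 0;
		cq->events++;
	}
}

/* Imitates ib_poll_cq(cq, 1, &wc): returns 1 if a completion was taken. */
static int mock_poll(struct mock_cq *cq)
{
	if (cq->queued == 0)
		return 0;
	cq->queued--;
	return 1;
}

/* Handler body: re-arm once, then drain. Returns the number drained. */
static int mock_handler(struct mock_cq *cq)
{
	int drained = 0;

	assert(cq != NULL);
	mock_req_notify(cq);
	while (mock_poll(cq) == 1)
		drained++;
	return drained;
}
```

 In this model the queue is still armed when mock_handler() returns, so a
 completion that arrives afterwards raises a fresh event without any
 second notification request.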

static int jm_rdma_cm_event_handler(struct rdma_cm_id *id,
                                    struct rdma_cm_event *event) {
 ..
        case RDMA_CM_EVENT_DISCONNECTED:
                connstate = -ECONNABORTED;
                goto connected;
 ..
connected:
                printk("%pI4:%u (event 0x%x)\n",
                       &conn->jc_remoteaddr.sin_addr.s_addr,
                       ntohs(conn->jc_remoteaddr.sin_port),
                       event->event  11);
                conn->jc_connstate = connstate;
                wake_up_all(&conn->jc_connect_wait);
                break;

 How quickly do you respond to the disconnect event?  The remote side will wait
 until it receives a response or times out, which may be several seconds or
 minutes.

Thanks a lot, I think the problem lies here.
 - Sean





-- 
Ding Dinghua


Re: opensm with multiple IB subnets

2010-04-21 Thread Yevgeny Kliteynik

Ken,

On 4/21/2010 3:07 AM, Ken Teague wrote:

On Tue, Apr 20, 2010 at 2:13 PM, Ken Teague <ktea...@pobox.com> wrote:

I have a 17-node cluster and each node has a single IB card that has
2x IB ports (ib0 and ib1).


After doing a little more research, I confirmed that my understanding
of the manual page is correct.  To run opensm for each GUID, I
modified my init script to run a for loop based on the information
returned from ibstat -p.


I added this near the beginning of the script where the other
environment variables are located:
snip
OFA_HOME=/usr/local/sbin
IBSTAT_BIN=${OFA_HOME}/ibstat
IBSTAT_ARG="-p"
OPENSM_BIN=${OFA_HOME}/opensm
OPENSM_ARG="-B -g"
snip


I replaced the single line which started opensm with this for loop:
for i in `${IBSTAT_BIN} ${IBSTAT_ARG}`
do
 ${OPENSM_BIN} ${OPENSM_ARG} ${i}
done
snip

If anyone has a more elegant way to handle this, I'm open to
suggestions.  Many thanks.


OpenSM dumps various files to /var/log and /var/cache/opensm folders.
When you have more than one OpenSM process, they will all dump the
same files, which is probably not a good idea.

To change the output directories, set the OSM_TMP_DIR and
OSM_CACHE_DIR env. variables to some other place.
In addition, you need to make sure that each SM instance
prints its log in a different place. You need to do
something like this:

for guid in `ibstat -p`
do
    dump_dir=/tmp/osm_dump_dir${guid}
    mkdir -p ${dump_dir}    # assumption: the directory has to exist first
    export OSM_TMP_DIR=${dump_dir}
    export OSM_CACHE_DIR=${dump_dir}
    opensm --log_file ${dump_dir}/osm.log -g ${guid} [your other options]
done

-- Yevgeny


Ken


Re: [infiniband-diags] support diffing lids and nodedesc on remoteports in ibnetdiscover

2010-04-21 Thread Al Chu
Hey Sasha,

A slight tweak to the patch.  Support diffing lids and node descriptions
on remote ports (previously it diffed only local lids and node
descriptions).  Also add appropriate manpage notes.

Al

On Tue, 2010-04-20 at 15:30 -0700, Al Chu wrote:
 Hey Sasha,
 
 This patch supports diffing node descriptions on remote ports
 (previously diffing of just the local node description was supported).
 
 Al
 
 email message attachment
   Forwarded Message 
  From: Albert Chu ch...@llnl.gov
  Subject: [PATCH] support diffing nodedesc on remoteports in
  ibnetdiscover
  Date: Tue, 20 Apr 2010 15:09:59 -0700
  
  Signed-off-by: Albert Chu ch...@llnl.gov
  ---
   infiniband-diags/src/ibnetdiscover.c |   11 +++++++++++
   1 files changed, 11 insertions(+), 0 deletions(-)
  
  diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c
  index 57f9625..eeb1b9f 100644
  --- a/infiniband-diags/src/ibnetdiscover.c
  +++ b/infiniband-diags/src/ibnetdiscover.c
  @@ -720,6 +720,17 @@ static void diff_ports(ibnd_node_t * fabric1_node, ibnd_node_t * fabric2_node,
   			fabric2_out++;
   		}
   
  +		if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION
  +		    && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION
  +		    && fabric1_port && fabric2_port
  +		    && fabric1_port->remoteport && fabric2_port->remoteport
  +		    && memcmp(fabric1_port->remoteport->node->nodedesc,
  +			      fabric2_port->remoteport->node->nodedesc,
  +			      IB_SMP_DATA_SIZE)) {
  +			fabric1_out++;
  +			fabric2_out++;
  +		}
  +
   		if (fabric1_out) {
   			diff_iter_out_header(fabric1_node, data,
   					     out_header_flag);
-- 
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
---BeginMessage---

Signed-off-by: Albert Chu ch...@llnl.gov
---
 infiniband-diags/man/ibnetdiscover.8 |    5 ++++-
 infiniband-diags/src/ibnetdiscover.c |   20 ++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/infiniband-diags/man/ibnetdiscover.8 b/infiniband-diags/man/ibnetdiscover.8
index 76cfbc8..3beb70b 100644
--- a/infiniband-diags/man/ibnetdiscover.8
+++ b/infiniband-diags/man/ibnetdiscover.8
@@ -71,7 +71,10 @@ are: \fIsw\fR = switches, \fIca\fR = channel adapters, \fIrouter\fR = routers,
 \fIport\fR = port connections, \fIlid\fR = lids, \fInodedesc\fR = node
 descriptions.  Note that \fIport\fR, \fIlid\fR, and \fInodedesc\fR are
 checked only for the node types that are specified (e.g. \fIsw\fR,
-\fIca\fR, \fIrouter\fR).
+\fIca\fR, \fIrouter\fR).  If \fIport\fR is specified alongside \fIlid\fR
+or \fInodedesc\fR, remote port lids and node descriptions will also be compared.
+
+
 .TP
 \fB\-p\fR, \fB\-\-ports\fR
 Obtain a ports report which is a
diff --git a/infiniband-diags/src/ibnetdiscover.c b/infiniband-diags/src/ibnetdiscover.c
index 57f9625..23e6dd4 100644
--- a/infiniband-diags/src/ibnetdiscover.c
+++ b/infiniband-diags/src/ibnetdiscover.c
@@ -720,6 +720,26 @@ static void diff_ports(ibnd_node_t * fabric1_node, ibnd_node_t * fabric2_node,
 			fabric2_out++;
 		}
 
+		if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION
+		    && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION
+		    && fabric1_port && fabric2_port
+		    && fabric1_port->remoteport && fabric2_port->remoteport
+		    && memcmp(fabric1_port->remoteport->node->nodedesc,
+			      fabric2_port->remoteport->node->nodedesc,
+			      IB_SMP_DATA_SIZE)) {
+			fabric1_out++;
+			fabric2_out++;
+		}
+
+		if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION
+		    && data->diff_flags & DIFF_FLAG_LID
+		    && fabric1_port && fabric2_port
+		    && fabric1_port->remoteport && fabric2_port->remoteport
+		    && fabric1_port->remoteport->base_lid != fabric2_port->remoteport->base_lid) {
+			fabric1_out++;
+			fabric2_out++;
+		}
+
 		if (fabric1_out) {
 			diff_iter_out_header(fabric1_node, data,
 					     out_header_flag);
-- 
1.5.4.5

---End Message---


Re: [PATCH v3 4/4] libibverbs: Undo changes in memory range tree when madvise() fails

2010-04-21 Thread Roland Dreier
Thanks, looks great, applied.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] RDMA/amso1100: use the dma state API instead of the pci equivalents

2010-04-21 Thread Roland Dreier
Thanks, applied all three of these conversion patches.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] RDMA/cxgb3: Don't free skbs on NET_XMIT_* indications from LLD.

2010-04-21 Thread Roland Dreier
thanks, applied
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH v3 05/10] iw_cxgb4: Add connection management functions.

2010-04-21 Thread Roland Dreier
Thanks, all this looks pretty clean and small so I added it (as one big
patch).  One tiny issue that we can fix with a follow-up patch:

  +int c4iw_ep_redirect(void *ctx, struct dst_entry *old, struct dst_entry *new,
  +		     struct l2t_entry *l2t)
  +{
  +	struct c4iw_ep *ep = ctx;
  +
  +	if (ep->dst != old)
  +		return 0;
  +
  +	PDBG("%s ep %p redirect to dst %p l2t %p\n", __func__, ep, new,
  +	     l2t);
  +	dst_hold(new);
  +	cxgb4_l2t_release(ep->l2t);
  +	ep->l2t = l2t;
  +	dst_release(old);
  +	ep->dst = new;
  +	return 1;
  +}

As far as I can see this function is not called or otherwise referenced
anywhere else (except for a declaration in a header).  Can we drop it?
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] RDMA/nes: make nesadapter->phy_lock usage consistent

2010-04-21 Thread Roland Dreier
thanks, applied.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] RDMA/nes: make nesadapter->phy_lock usage consistent

2010-04-21 Thread Roland Dreier
actually added a chunk to delete the (now-unused) nesadapter variable
from nes_write_1G_phy_reg to fix a compile warning... no problem tho.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] RDMA/nes: make nesadapter->phy_lock usage consistent

2010-04-21 Thread Roland Dreier
By the way, any problem with me merging the following trivial patch for
2.6.35?


RDMA/nes: Make unnecessarily global functions static

This allows the compiler to do a bit better; on my x86-64 build:

add/remove: 0/2 grow/shrink: 1/0 up/down: 2288/-2365 (-77)
function                                     old     new   delta
nes_init_phy                                 273    2561   +2288
nes_init_1g_phy                              469       -    -469
nes_init_2025_phy                           1896       -   -1896

Signed-off-by: Roland Dreier rola...@cisco.com
---
 drivers/infiniband/hw/nes/nes_hw.c    |    4 ++--
 drivers/infiniband/hw/nes/nes_verbs.c |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/nes/nes_hw.c b/drivers/infiniband/hw/nes/nes_hw.c
index 8b67207..86acb7d 100644
--- a/drivers/infiniband/hw/nes/nes_hw.c
+++ b/drivers/infiniband/hw/nes/nes_hw.c
@@ -1297,7 +1297,7 @@ int nes_destroy_cqp(struct nes_device *nesdev)
 /**
  * nes_init_1g_phy
  */
-int nes_init_1g_phy(struct nes_device *nesdev, u8 phy_type, u8 phy_index)
+static int nes_init_1g_phy(struct nes_device *nesdev, u8 phy_type, u8 phy_index)
 {
 	u32 counter = 0;
 	u16 phy_data;
@@ -1351,7 +1351,7 @@ int nes_init_1g_phy(struct nes_device *nesdev, u8 phy_type, u8 phy_index)
 /**
  * nes_init_2025_phy
  */
-int nes_init_2025_phy(struct nes_device *nesdev, u8 phy_type, u8 phy_index)
+static int nes_init_2025_phy(struct nes_device *nesdev, u8 phy_type, u8 phy_index)
 {
 	u32 temp_phy_data = 0;
 	u32 temp_phy_data2 = 0;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index e54f312..925e1f2 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -374,7 +374,7 @@ static int alloc_fast_reg_mr(struct nes_device *nesdev, struct nes_pd *nespd,
 /*
  * nes_alloc_fast_reg_mr
  */
-struct ib_mr *nes_alloc_fast_reg_mr(struct ib_pd *ibpd, int max_page_list_len)
+static struct ib_mr *nes_alloc_fast_reg_mr(struct ib_pd *ibpd, int max_page_list_len)
 {
 	struct nes_pd *nespd = to_nespd(ibpd);
 	struct nes_vnic *nesvnic = to_nesvnic(ibpd->device);
-- 
1.7.0.5


-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH] amso1100: Add missing memset

2010-04-21 Thread Roland Dreier
I think this patch is actually not needed.  c2_rnic_query() is only
called for the c2dev-props memory, and c2dev is allocated with
ib_alloc_device, which will always zero it out.
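
The reasoning can be shown with a small user-space analogy, which is only a sketch and not the amso1100 code. The toy_* names are hypothetical, and calloc() stands in for an allocator that hands back zeroed memory. A query routine that fills in only some fields still leaves the rest at zero, so an extra memset in the query path adds nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical property block embedded in a device structure. */
struct toy_props {
	int max_qp;
	int max_cq;
};

struct toy_dev {
	struct toy_props props;
};

/* Stand-in for an allocator that returns zeroed memory. */
static struct toy_dev *toy_alloc_dev(void)
{
	return calloc(1, sizeof(struct toy_dev));
}

/* Fills in only the fields it knows about; no memset of *props. */
static void toy_query(struct toy_props *props)
{
	assert(props != NULL);
	props->max_qp = 16;
}
```

The argument holds only as long as every caller passes memory that came from the zeroing allocator, which is the condition stated above for c2_rnic_query().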
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [PATCH V4 2/2] mlx4/IB: Add support for enhanced atomic operations

2010-04-21 Thread Roland Dreier
thanks, applied both these patches.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html