Re: NULL pointer dereference in rdma_ucm

2010-07-20 Thread Or Gerlitz

Josh England wrote:

It may be that the in-kernel field cm_id_priv has a NULL ->alt_av.port,
causing the Oops, but I don't know for sure.  Any ideas on how to debug this?

seems like this was reported in the past but remained unsolved,
http://lists.openfabrics.org/pipermail/general/2009-August/thread.html#61522

Or.

--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Or Gerlitz
Jason Gunthorpe wrote:
 It is a bit wider problem than just ND entries, changes in routing can
 also alter the L2 address, so that needs to be tracked as well. 

Sure. When we did the address change work (see commit dd5bdff, "RDMA/cma: Add
RDMA_CM_EVENT_ADDR_CHANGE event"), the problem I wanted to solve was related to
local bonding. During the review thread, remote address changes related to
bonding fail-over and routing changes were mentioned and left to future work.


 this is back to original criticisms from netdev of this whole separated 
 stack idea - it isn't integrated, so where do you draw the line? What gets 
 left out? 
 Today, it is pretty clear that only the CM portion integrates at all
 with netdev and after that things are separate.

The address change event was an attempt to make the CM part, which integrates
with netdev, go a step further and help the offloaded data path be more
consistent with netdev. This email is about going another step.

 So.. I think to tackle this you need to start looking at how the
 dst_entry structure works in netdev and apply the same idea to RDMA-CM
 and reflect the changes in AH back to the QP owner.

I can take a look (a pointer would be very much appreciated...). Still, the dst
entry is used for every netdev xmit, whereas here the xmit is offloaded, so I
don't see what could really be used from the dst code, but I might be wrong.
The rdma app uses the neighbour once, upon address resolving, and I was trying
to see if we can ref the neighbour so the neigh sub-system probes would keep
going even though the neighbour is not directly used.

 Is this an iwarp problem too? Not sure how L3-L2 translation works there.

I never managed to understand how address resolving really works with iwarp... 

Doing a bit of detective work... you can see that addr4_resolve says

 /* If the device does ARP internally, return 'done' */
 if (rt->idev->dev->flags & IFF_NOARP) {
         rdma_copy_addr(addr, rt->idev->dev, NULL);
         goto put;
 }

and later cma_connect_iw places into the iwarp cm the src/dst IP addresses

 sin = (struct sockaddr_in*) &id_priv->id.route.addr.src_addr;
 cm_id->local_addr = *sin;
 sin = (struct sockaddr_in*) &id_priv->id.route.addr.dst_addr;
 cm_id->remote_addr = *sin;

so all the iwarp providers do ARP resolving in their TOE stack?! Steve, can you
clarify that?

 
 Not sure what you do about UD.. Maybe RDMA-CM learns to do UC where
 the only action is to register notification monitors for L2 addressing
 changes in the kernel?

The problem exists for all IB transports (even for RD, if it had been
implemented...). The only difference between the U and R ones is that for the
R's, if the remote side vanishes, the IB HW will eventually let you know
in the form of a CQ error.

 Can this be hidden with Sean's recent work on simplified programming models?

not sure how Sean's work relates to this proposed change.

Or.


RE: some dapl assistance - [PATCH] dapl-2.0 improperly handles pkeycheck/query in host order

2010-07-20 Thread Itay Berman
Works!

[r...@dodly0 OMB-3.1.1]# mpiexec -ppn 1 -n 2 -env I_MPI_FABRICS dapl:dapl -env 
I_MPI_DEBUG 5 -env I_MPI_CHECK_DAPL_PROVIDER_MISMATCH none -env DAPL_DBG_TYPE 
0x -env DAPL_IB_PKEY 0x0280 -env DAPL_IB_SL 4 /tmp/osu_long
dodly0:5bc3: dapl_init: dbg_type=0x,dbg_dest=0x1
dodly0:5bc3:  open_hca: device mlx4_0 not found
dodly0:5bc3:  open_hca: device mlx4_0 not found
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 
ack_time 16 mr 4294967295
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 
ack_time 16 mr 4294967295
dodly0:5bc3:  query_hca: port.link_layer = 0x1
dodly0:5bc3:  query_hca: (a0.0) eps 64512, sz 16384 evds 65408, sz 131071 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly0:5bc3:  query_hca: msg 2147483648 rdma 2147483648 iov 27 lmr 131056 rmr 0 
ack_time 16 mr 4294967295
dodly0:5bc3:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly4:1e8d: dapl_init: dbg_type=0x,dbg_dest=0x1
[0] MPI startup(): DAPL provider ofa-v2-mthca0-1
[0] MPI startup(): dapl data transfer mode
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 
ack_time 16 mr 4294967295
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 
ack_time 16 mr 4294967295
dodly4:1e8d:  query_hca: port.link_layer = 0x1
dodly4:1e8d:  query_hca: (a0.0) eps 262076, sz 16351 evds 65408, sz 4194303 mtu 
2048 - pkey 640 p_idx 1 sl 4
dodly4:1e8d:  query_hca: msg 1073741824 rdma 1073741824 iov 32 lmr 524272 rmr 0 
ack_time 16 mr 4294967295
dodly4:1e8d:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
[1] MPI startup(): DAPL provider ofa-v2-mlx4_0-1
[1] MPI startup(): dapl data transfer mode
[0] MPI startup(): static connections storm algo
dodly0:5bc3:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=1, evnts=0x4
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=15 ret=1, evnts=0x1
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=17 ret=1, evnts=0x1
dodly0:5bc3:  dapl_poll: fd=17 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=14 ret=0, evnts=0x0
dodly0:5bc3:  dapl_poll: fd=19 ret=1, evnts=0x1
[0] MPI startup(): I_MPI_CHECK_DAPL_PROVIDER_MISMATCH=none
[0] MPI startup(): I_MPI_DEBUG=5
dodly4:1e8d:  dapl_poll: fd=15 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=13 ret=0, evnts=0x0
dodly4:1e8d:  dapl_poll: fd=17 ret=1, evnts=0x1
[0] MPI startup(): I_MPI_FABRICS=dapl:dapl
[0] MPI startup(): set domain to {0,1,2,3} on node dodly0
[1] MPI startup(): set domain to {0,1,2,3} on node dodly4
[0] RankPid  Node name  Pin cpu
[0] 0   23491dodly0 {0,1,2,3}
[0] 1   7821 dodly4 {0,1,2,3}
# OSU MPI Bandwidth Test v3.1.1
# SizeBandwidth (MB/s)
4194304 978.30
4194304 978.45
4194304 978.69
4194304 978.24
dodly0:5bc3: dapl async_event: DEV ERR 12
dodly4:1e8d: dapl async_event: DEV ERR 12
dodly4:1e8d:  DTO completion ERROR: 12: op 0xff
dodly4:1e8d: DTO completion ERR: status 12, op OP_RDMA_READ, vendor_err 0x81 - 
172.30.3.230
[1:dodly4][../../dapl_module_poll.c:3972] Intel MPI fatal error: 
ofa-v2-mlx4_0-1 DTO operation posted for [0:dodly0] completed with error. 
status=0x8. cookie=0x4
Assertion failed in file ../../dapl_module_poll.c at line 3973: 0
internal ABORT - process 1
rank 1 in job 41  dodly0_54941   caused collective abort of all ranks
  exit status of rank 1: killed by signal 9

dapl reports p_idx 1. this is an output of an osu test that I removed the 
configured pkey. At that time the mpi died. So it indeed ran over that pkey.
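
The byte-order issue named in the subject line can be illustrated with a minimal sketch. It assumes the pkey table entries are held in network byte order (as `ibv_query_pkey` reports them) while the configured value, such as DAPL_IB_PKEY 0x0280, is in host order. The helper name `find_pkey_index` is hypothetical and is not dapl code.

```c
#include <arpa/inet.h>  /* htons */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not dapl code: look up a host-order pkey in a table
 * whose entries are stored in network byte order.
 * Returns the matching index, or -1 if the pkey is not present.
 */
static int find_pkey_index(const uint16_t *table_be, size_t n, uint16_t pkey_host)
{
    assert(table_be != NULL);

    /* Convert once so that both sides of the comparison use the same order. */
    uint16_t want_be = htons(pkey_host);

    for (size_t i = 0; i < n; i++) {
        if (table_be[i] == want_be)
            return (int)i;
    }
    return -1;
}
```

Comparing the host-order value directly against the table would only match on big-endian hosts, which is the kind of mismatch the patch title describes.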

To test the sl I will have to change my 

[PATCH for-2.6.36] ib: fix some sparse warnings

2010-07-20 Thread Or Gerlitz
Fixed the following issues reported by sparse under drivers/infiniband:

  CHECK   drivers/infiniband/hw/cxgb3/iwch_cm.c
iwch_cm.c:140:5: warning: symbol 'iwch_l2t_send' was not declared. Should it be 
static?
  CHECK   drivers/infiniband/hw/nes/nes_verbs.c
nes_verbs.c:1944:45: warning: Using plain integer as NULL pointer
nes_verbs.c:1944:48: warning: Using plain integer as NULL pointer
  CHECK   drivers/infiniband/hw/nes/nes_cm.c
nes_cm.c:2645:43: warning: mixing different enum types
nes_cm.c:2645:43: int enum iw_cm_event_type  versus
nes_cm.c:2645:43: int enum iw_cm_event_status
  CHECK   drivers/infiniband/ulp/iser/iser_initiator.c
iser_initiator.c:173:5: warning: symbol 'iser_alloc_rx_descriptors' was not 
declared. Should it be static?

Signed-off-by: Or Gerlitz ogerl...@voltaire.com


I didn't address these two

  CHECK   drivers/infiniband/hw/cxgb3/iwch_cq.c
drivers/infiniband/hw/cxgb3/iwch_cq.c:192:9: warning: context imbalance in 
'iwch_poll_cq_one' - different lock contexts for basic block
  CHECK   drivers/infiniband/hw/cxgb3/iwch_qp.c
drivers/infiniband/hw/cxgb3/iwch_qp.c:805:13: warning: context imbalance in 
'__flush_qp' - unexpected unlock

diff --git a/drivers/infiniband/hw/cxgb3/iwch_cm.c b/drivers/infiniband/hw/cxgb3/iwch_cm.c
index ebfb117..3cdb535 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_cm.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_cm.c
@@ -137,7 +137,7 @@ static void stop_ep_timer(struct iwch_ep *ep)
 	put_ep(&ep->com);
 }
 
-int iwch_l2t_send(struct t3cdev *tdev, struct sk_buff *skb, struct l2t_entry *l2e)
+static int iwch_l2t_send(struct t3cdev *tdev, struct sk_buff *skb, struct l2t_entry *l2e)
 {
 	int error = 0;
 	struct cxio_rdev *rdev;
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index 986d6f3..98887af 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -2565,7 +2565,7 @@ static int nes_cm_disconn_true(struct nes_qp *nesqp)
 	u16 last_ae;
 	u8 original_hw_tcp_state;
 	u8 original_ibqp_state;
-	enum iw_cm_event_type disconn_status = IW_CM_EVENT_STATUS_OK;
+	enum iw_cm_event_status  disconn_status = IW_CM_EVENT_STATUS_OK;
 	int issue_disconn = 0;
 	int issue_close = 0;
 	int issue_flush = 0;
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 9bc2d74..0df51a4 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -1941,7 +1941,7 @@ static int nes_reg_mr(struct nes_device *nesdev, struct nes_pd *nespd,
 	u8  use_256_pbls = 0;
 	u8  use_4k_pbls = 0;
 	u16 use_two_level = (pbl_count_4k > 1) ? 1 : 0;
-	struct nes_root_vpbl new_root = {0, 0, 0};
+	struct nes_root_vpbl new_root = {0, NULL, NULL};
 	u32 opcode = 0;
 	u16 major_code;
 
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 0b9ef07..95a08a8 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -170,7 +170,7 @@ static void iser_create_send_desc(struct iser_conn *ib_conn,
 }
 
 
-int iser_alloc_rx_descriptors(struct iser_conn *ib_conn)
+static int iser_alloc_rx_descriptors(struct iser_conn *ib_conn)
 {
 	int i, j;
 	u64 dma_addr;
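
The three kinds of sparse warnings fixed by this patch can be reproduced in a small standalone sketch. All type and function names below are hypothetical and only mimic the shape of the driver code. The comments mark which warning each line avoids.

```c
#include <assert.h>
#include <stddef.h>

/* Two unrelated enums; mixing them is what sparse flags as
 * "mixing different enum types". */
enum event_type   { EVENT_CONNECT = 1, EVENT_CLOSE = 2 };
enum event_status { STATUS_OK = 0, STATUS_REJECTED = 1 };

struct root_table {
    int   count;
    void *entries;
    void *dma_cookie;
};

/* Used only in this file, so it is declared static
 * ("symbol ... was not declared. Should it be static?"). */
static int send_helper(const struct root_table *t)
{
    assert(t != NULL);
    return t->count;
}

static int demo(void)
{
    /* Pointer members are initialised with NULL rather than a plain 0
     * ("Using plain integer as NULL pointer"). */
    struct root_table root = { 0, NULL, NULL };

    /* The variable's enum type matches the enum of its initialiser. */
    enum event_status status = STATUS_OK;

    return send_helper(&root) + (int)status;
}
```

None of these changes alters the generated code. They only make the intent visible to the checker.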


RE: [PATCH for-2.6.36] ib: fix some sparse warnings

2010-07-20 Thread Tung, Chien Tin
 Subject: [PATCH for-2.6.36] ib: fix some sparse warnings
 
 fixed the following drivers/infiniband sparse pointed issues
 

 nes_verbs.c:1944:45: warning: Using plain integer as NULL pointer
 nes_verbs.c:1944:48: warning: Using plain integer as NULL pointer
   CHECK   drivers/infiniband/hw/nes/nes_cm.c
 nes_cm.c:2645:43: warning: mixing different enum types
 nes_cm.c:2645:43: int enum iw_cm_event_type  versus
 nes_cm.c:2645:43: int enum iw_cm_event_status


 
 Signed-off-by: Or Gerlitz ogerl...@voltaire.com


Thanks Or!

Acked-by: Chien Tung chien.tin.t...@intel.com





Re: [PATCH 09/36] drivers/infiniband: Remove unnecessary casts of private_data

2010-07-20 Thread Jiri Kosina
On Tue, 13 Jul 2010, Ralph Campbell wrote:

 Acked-by: Ralph Campbell ralph.campb...@qlogic.com
 
 On Mon, 2010-07-12 at 13:50 -0700, Joe Perches wrote:
  Signed-off-by: Joe Perches j...@perches.com
  ---
   drivers/infiniband/hw/ipath/ipath_file_ops.c |2 +-
   1 files changed, 1 insertions(+), 1 deletions(-)
  
  diff --git a/drivers/infiniband/hw/ipath/ipath_file_ops.c b/drivers/infiniband/hw/ipath/ipath_file_ops.c
  index 9c5c66d..65eb892 100644
  --- a/drivers/infiniband/hw/ipath/ipath_file_ops.c
  +++ b/drivers/infiniband/hw/ipath/ipath_file_ops.c
  @@ -2055,7 +2055,7 @@ static int ipath_close(struct inode *in, struct file *fp)
   
   	mutex_lock(&ipath_mutex);
   
  -	fd = (struct ipath_filedata *) fp->private_data;
  +	fd = fp->private_data;
   	fp->private_data = NULL;
   	pd = fd->pd;
   	if (!pd) {

As the patch is not present in linux-next as of today, I have applied it. 
Thanks,

-- 
Jiri Kosina
SUSE Labs, Novell Inc.
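
The cast removed by this patch is unnecessary because C converts `void *` to any object pointer type implicitly. A minimal sketch, using hypothetical struct names that only mimic the shape of the ipath code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-file data and struct file. */
struct file_data { int port; };
struct file_sketch { void *private_data; };

/* Detach the private data from the file and return its port, or -1 if
 * nothing was attached. */
static int close_file_sketch(struct file_sketch *fp)
{
    assert(fp != NULL);

    /* void * converts to struct file_data * without a cast in C. */
    struct file_data *fd = fp->private_data;

    fp->private_data = NULL;
    return fd ? fd->port : -1;
}
```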


Re: When IBoE will be merged to upstream?

2010-07-20 Thread Jason Gunthorpe
On Mon, Jul 19, 2010 at 10:28:24PM -0700, Paul Grun wrote:

 This is incorrect.  The intent of the text is that a verbs consumer deals
 ONLY in layer 3 addresses (GIDs).  It doesn't make any sense to restrict the
 interpretation to mean 'LIDs', since LIDs are a NOP for RoCE.  
 
 CA16-17: When accessing the services of a RoCE verbs provider, the
 source and destination identifiers contained in the address vector shall
 consist of GIDs; the address vector shall not contain layer 2 references
 (e.g. local addresses). Layer 2 references include source and destination
 local identifiers and LID Path Bits.
 
 layer 2 references refers to LIDs, MAC IDs or any other form of layer 2
 address.  If the wording of the text is insufficiently clear, please post a
 comment on comment tracker on the IBTA website.  Nevertheless, that is the
 intent.  

Since it never actually says MAC address it reads like it is just
excluding existing IB L2 addresses from use in ROCEE which makes a lot
of sense. If MAC address was meant, it should have been listed explicitly!

  Exactly how and where the MAC address comes about was never decided,
 
 Correct.  The IBTA IBXoE WG felt that defining the mapping from GID to MAC
 ID should be a function of the underlying fabric (Ethernet) and thus was out
 of scope for us to define the mapping mechanism.

IMHO, it is a bad design to create an architecture that requires the L2
information in the AH, forbid the L2 information from being passed
into the AH APIs, and then not specify how the L2 information is
created.

How is anyone supposed to implement this?

  BTW, I absolutely hate the mixing of 'Sometimes it is a IPv4,
  sometimes it is a GID, and sometimes it is an IPv6' in the same
  field. That is just so nasty. The GID is a GID, don't overload it in
  an ambiguous way to mean 2 other things!
 
 I am unaware of any overloading, at least in the RoCE spec.

This is a feature added as part of the proposed patch set that spurred
this whole discussion. It is this feature that made create_ah into a
blocking call ...

Jason


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Jason Gunthorpe
On Tue, Jul 20, 2010 at 10:25:52AM +0300, Or Gerlitz wrote:

  So.. I think to tackle this you need to start looking at how the
  dst_entry structure works in netdev and apply the same idea to RDMA-CM
  and reflect the changes in AH back to the QP owner.
 
 I can take a look (pointer would be very much appreciated...) still,
 the dst entry is used for every netdev xmit where here the xmit is
 offloaded, so I don't see what could be really used from the dst
 code, but I might be wrong. The rdma app uses the neighbour once,
 upon address resolving, and I was trying to see if we can ref the
 neighbour so the neigh sub-system probes would keep going even
 though the neighbour is not directly used.

It has been a while since I looked through this .. but, IIRC, the
general idea was that the socket held onto a cached dst and then at
each send it would use that dst to generate the L2 headers. Somehow
the dst would become invalidated when the routing cache was flushed
out.

So, basically, if you can add to RDMA-CM a way to get, hold and re-get
the dst you have solved the first problem: how do you know the current
routing information, hold onto it, keep it in caches, etc.

The second problem, is how do you get notified that the dst may have
been changed? sockets seem to basically just poll every packet, so you
might need to use some netdev notifications, maybe also a timer?

I'd see a flow like this:
 - in the current route lookup code stash the dst
 - add a function to freshen the dst
 - hook events that might indicate the dst is invalid
 - on event trigger freshen the dst, regenerate the L2 address info
   and compare it to what is already in use
 - If different, send an event to user space.

stashing the dst lets you get back to the L2 information by using the
routing cache, and by holding onto a neighbor reference (in the dst)
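
The last two steps of the flow above (regenerate the L2 address, compare, notify only on a difference) can be sketched in plain userspace C. The struct and function names are hypothetical and are not kernel APIs. The route refresh itself is assumed to have happened already and to have produced `fresh`.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define L2_ADDR_LEN 6

/* Hypothetical per-connection state; none of these names are kernel APIs. */
struct conn_l2_state {
    unsigned char in_use[L2_ADDR_LEN]; /* L2 address currently programmed */
    unsigned int change_events;        /* events reported to the owner */
};

/*
 * Run after an event hints that the cached route may be stale and the
 * route has been re-resolved to `fresh`. Reports a change only when the
 * freshly resolved L2 address differs from the one in use.
 */
static bool conn_l2_refresh(struct conn_l2_state *c,
                            const unsigned char fresh[L2_ADDR_LEN])
{
    assert(c != NULL && fresh != NULL);

    if (memcmp(c->in_use, fresh, L2_ADDR_LEN) == 0)
        return false;               /* nothing changed, stay quiet */

    memcpy(c->in_use, fresh, L2_ADDR_LEN);
    c->change_events++;             /* stands in for the user-space event */
    return true;
}
```

Comparing before notifying keeps spurious triggers, such as a cache flush that resolves to the same neighbour, from reaching the application.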

Also, while doing this you are going to need to do something to have
the kernel send ND probes to keep the ND entry fresh when the
connection is open. Not sure, but I think this also has something to
do with the DST.

Jason


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Steve Wise

Or Gerlitz wrote:

Jason Gunthorpe wrote:
  

It is a bit wider problem than just ND entries, changes in routing can
also alter the L2 address, so that needs to be tracked as well. 



sure, when we did the address change work, see commit dd5bdff RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event, the problem I wanted to solve was related to 
the local bonding. Over the review thread, remote address change related 
to bonding fail-over and routing changes were mentioned, and left to future work.



  
this is back to original criticisms from netdev of this whole separated 
stack idea - it isn't integrated, so where do you draw the line? What gets left out? 
Today, it is pretty clear that only the CM portion integrates at all

with netdev and after that things are separate.



the address change event was an attempt to make the CM part which integrates 
with netdev
go a step further and help the data path which is offloaded to be more 
consistent with netdev,
this email is about going another step.

  

So.. I think to tackle this you need to start looking at how the
dst_entry structure works in netdev and apply the same idea to RDMA-CM
and reflect the changes in AH back to the QP owner.



I can take a look (pointer would be very much appreciated...) still, the dst 
entry is used
for every netdev xmit where here the xmit is offloaded, so I don't see what 
could be really used from the dst code, but I might be wrong. The rdma app uses 
the neighbour once, upon address resolving, and I was trying to see if we can 
ref the neighbour so the neigh sub-system probes would keep going even though 
the neighbour is not directly used.

  

Is this an iwarp problem too? Not sure how L3-L2 translation works there.



I never managed to understand how address resolving really works with iwarp... 


Doing a bit of detective work... you can see that addr4_resolve says

  

/* If the device does ARP internally, return 'done' */
if (rt->idev->dev->flags & IFF_NOARP) {
        rdma_copy_addr(addr, rt->idev->dev, NULL);
        goto put;
}



and later cma_connect_iw places into the iwarp cm the src/dst IP addresses

  

sin = (struct sockaddr_in*) &id_priv->id.route.addr.src_addr;
cm_id->local_addr = *sin;
sin = (struct sockaddr_in*) &id_priv->id.route.addr.dst_addr;
cm_id->remote_addr = *sin;



so all the iwarp providers do ARP resolving in their TOE stack?! Steve, can you
clarify that?

  


The Ammasso driver uses the IFF_NOARP, and I think actually that is the 
only iwarp driver that uses it. 



The cxgb3/4 drivers do not set IFF_NOARP and rely on ND being done as 
part of connection setup.  The driver will initiate ND if there isn't a 
neigh entry available at the time the iwarp driver tries to send a SYN 
or SYN/ACK.  So even though the rdma_cm does ND initially, the cxgb* 
drivers don't assume that.  The code that handles all this is in 
cxgb3.ko.  See drivers/net/cxgb3/l2t.c.  The iwarp driver code that uses 
the L2T services is mainly in drivers/infiniband/hw/cxgb3/iwch_cm.c. 



The cxgb* drivers actually reference the neigh and dst structs until the 
offload connection is gone.  Also if the offloaded connection has 
problems transmitting (due to an L2 address change, for example), then 
the driver will initiate ND again by calling neigh_event_send().  See 
t4_l2t_send_event() in l2t.c which is called by the iwarp driver in 
peer_abort() from iwch_cm.c when the HW tells us it's retransmitting too 
much.



The cxgb* drivers also handle routing redirects, but I think that path 
has bugs.



What doesn't happen is active positive feedback during the connection to 
avoid NUD.  IE once the connection is setup, nobody calls 
dst_confirm().   It is only called during connection setup/teardown.




Steve.





 
  

Not sure what you do about UD.. Maybe RDMA-CM learns to do UC where
the only action is to register notification monitors for L2 addressing
changes in the kernel?



The problem exists for all IB transports (even for RD, if it would have been 
implemented...), the only difference between the U and R ones is that for the 
R's, if the remote side vanished, eventually the IB HW would let you know on 
that in the form of CQ error.

  

Can this be hidden with Sean's recent work on simplified programming models?



not sure how Sean's work relates to this proposed change.

Or.


Re: [PATCH for-2.6.36] ib: fix some sparse warnings

2010-07-20 Thread Steve Wise

Acked-by: Steve Wise sw...@opengridcomputing.com





Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Steve Wise

Jason Gunthorpe wrote:

On Tue, Jul 20, 2010 at 01:12:17PM -0500, Steve Wise wrote:

  
The cxgb* drivers actually reference the neigh and dst structs until the  
offload connection is gone.  Also if the offloaded connection has  
problems transmitting (due to an L2 address change, for example), then  
the driver will initiate ND again by calling neigh_event_send().  See  
t4_l2t_send_event() in l2t.c which is called by the iwarp driver in  
peer_abort() from iwch_cm.c when the HW tells us it's retransmitting too  
much.


That strikes me as mildly scary.. The cxgb can't possibly get the
right dst (ie, the same dst that the RDMA CM got) in all the corner
cases? Ie how can setting oif to 0 in iwch_cm.c:find_route be right??

  



I guess it should be using the oif from the cm_id?



So, looks like there is a larger cleanup here, if the RDMACM holds the
dst and has functions to freshen it/track it then the iwarp driver
should rely on the RDMACM to manage the dst..

In other words, moving the dst handling from iwch_cm into RDMACM would
also mostly satisfy what Or is trying to do.

Does that make sense to you Steve?
  


Yes, in principle. 

If you want to move all this into the RDMACM, then an interface must be 
devised so the drivers can tell the RDMACM that an offload connection is 
failing and probably needs ND/NUD done.  Or some such feedback 
interface.  And the RDMACM needs to call the devices if something 
changes like routing redirects I guess.   You might want the device to 
specify whether it wants the rdma-cm to handle all this or not.  Some 
devices might be better able to handle this stuff.




How does the cxgb3 driver know when to update the HW if the dst/nd
entries change?
  


It uses netevents.  See nb_callback() in drivers/net/cxgb3/cxgb3_offload.c.
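
The callback pattern referred to here can be sketched in userspace C as a dispatcher that reacts to neighbour updates and redirects. The event codes and struct below are hypothetical and only mirror the idea. They are not the kernel's netevent definitions or the cxgb3 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes; they only mirror the idea of netevents. */
enum net_event_sketch { EV_NEIGH_UPDATE, EV_REDIRECT, EV_OTHER };

/* Hypothetical driver-side view of the hardware L2 table. */
struct l2_table_sketch {
    unsigned int neigh_updates; /* entries rewritten after a neighbour change */
    unsigned int redirects;     /* entries moved after a routing redirect */
};

/*
 * Dispatch one event to the matching handler. Returns 1 if the event
 * touched the table and 0 if it was ignored.
 */
static int nb_callback_sketch(struct l2_table_sketch *t, enum net_event_sketch ev)
{
    assert(t != NULL);

    switch (ev) {
    case EV_NEIGH_UPDATE:
        t->neigh_updates++;
        return 1;
    case EV_REDIRECT:
        t->redirects++;
        return 1;
    default:
        return 0;
    }
}
```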




Jason
  




Re: NULL pointer dereference in rdma_ucm

2010-07-20 Thread Josh England
Do you think upgrading to OFED-1.5.1 would help at all?

-JE

On Mon, Jul 19, 2010 at 11:40 PM, Or Gerlitz ogerl...@voltaire.com wrote:
 Josh England wrote:

 It may be that the in-kernel field cm_id_priv has a NULL ->alt_av.port,
 causing the Oops, but I don't know for sure.  Any ideas on how to debug
 this?

 seems like this was reported in the past but remained unsolved,
 http://lists.openfabrics.org/pipermail/general/2009-August/thread.html#61522

 Or.




Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Jason Gunthorpe
On Tue, Jul 20, 2010 at 02:20:21PM -0500, Steve Wise wrote:

 I guess it should be using the oif from the cm_id?

Not sure exactly what is best here :|

 So, looks like there is a larger cleanup here, if the RDMACM holds the
 dst and has functions to freshen it/track it then the iwarp driver
 should rely on the RDMACM to manage the dst..

 In other words, moving the dst handling from iwch_cm into RDMACM would
 also mostly satisfy what Or is trying to do.

 Does that make sense to you Steve?
   

 Yes, in principle. 

 If you want to move all this into the RDMACM, then an interface must be  
 devised so the drivers can tell the RDMACM that an offload connection is  
 failing and probably needs ND/NUD done.  Or some such feedback  
 interface.  And the RDMACM needs to call the devices if something  
 changes like routing redirects I guess.   

I think if RDMACM manages the dst and lets the devices access it then
all the existing netdev infrastructure for poking at a dst should be
available to the device?

 You might want the device to specify whether it wants the rdma-cm to
 handle all this or not.  Some devices might be better able to handle
 this stuff.

?? either you integrate with netdev in this area or your device is
broken :( :( Ie doing ND under the covers is broken, it breaks corner
case netdev ND management stuff like static ND entries. Same for ICMP
redirects, same for route lookups and caching, same for route PMTU
.. :(

IMHO, going down the path of integration is all or nothing, you don't
get to support things like Ammasso doing separate ND while providing
much fuller integration for cxgb. That just creates a huge complex
mess for end users.

 How does the cxgb3 driver know when to update the HW if the dst/nd
 entries change?

 It uses netevents.  See nb_callback() in
 drivers/net/cxgb3/cxgb3_offload.c.

What about route table changes?

Jason


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Steve Wise

Jason Gunthorpe wrote:

> On Tue, Jul 20, 2010 at 02:20:21PM -0500, Steve Wise wrote:
>
>> I guess it should be using the oif from the cm_id?
>
> Not sure exactly what is best here :|
>
>>> So, looks like there is a larger cleanup here, if the RDMACM holds the
>>> dst and has functions to freshen it/track it then the iwarp driver
>>> should rely on the RDMACM to manage the dst..
>>>
>>> In other words, moving the dst handling from iwch_cm into RDMACM would
>>> also mostly satisfy what Or is trying to do.
>>>
>>> Does that make sense to you Steve?
>>
>> Yes, in principle.
>>
>> If you want to move all this into the RDMACM, then an interface must be
>> devised so the drivers can tell the RDMACM that an offload connection is
>> failing and probably needs ND/NUD done.  Or some such feedback
>> interface.  And the RDMACM needs to call the devices if something
>> changes like routing redirects I guess.
>
> I think if RDMACM manages the dst and lets the devices access it then
> all the existing netdev infrastructure for poking at a dst should be
> available to the device?

Yes. But I'm not sure exactly how the logic I described previously for
cxgb* would be handled in the design being ironed out here.

>> You might want the device to specify whether it wants the rdma-cm to
>> handle all this or not.  Some devices might be better able to handle
>> this stuff.
>
> ?? either you integrate with netdev in this area or your device is
> broken :( :( I.e. doing ND under the covers is broken, it breaks corner
> case netdev ND management stuff like static ND entries. Same for ICMP
> redirects, same for route lookups and caching, same for route PMTU
> .. :(
>
> IMHO, going down the path of integration is all or nothing, you don't
> get to support things like Ammasso doing separate ND while providing
> much fuller integration for cxgb. That just creates a huge complex
> mess for end users.

Guess you'd have to remove the Ammasso driver then. ;)

>>> How does the cxgb3 driver know when to update the HW if the dst/nd
>>> entries change?
>>
>> It uses netevents.  See nb_callback() in
>> drivers/net/cxgb3/cxgb3_offload.c.
>
> What about route table changes?

Currently route table changes don't have any effect on existing
connections. Only new connections would be affected.



Steve.


RE: When IBoE will be merged to upstream?

2010-07-20 Thread Liran Liss
>> Small correction needed regarding the multicast forwarding.
>> Since we are talking about IPv6 multicast groups, which
>> translate to 33:33:xx:xx:xx:xx MAC address, the router
>> listener notification protocol is going to be MLD and not
>> IGMP. Still there are switches which support MLD
>> forwarding to prevent the network flooding.
>
> Well as I said the mapping of IBoE MGID to Ethernet address
> is not specified.  However I agree that using the same
> mapping as IPv6 so we end up with 33:33:... addresses makes sense.

Agreed.
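
For concreteness, a minimal sketch of that mapping, assuming IBoE simply
reuses the IPv6 rule (33:33 followed by the low four octets of the 16-byte
group address). The MGID mapping is not actually specified, as noted above,
and mgid_to_mcast_mac() is a made-up name, not something from the patches.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumption: an IBoE MGID is mapped the same way as an IPv6 multicast
 * address, i.e. 33:33 followed by the last four octets of the group ID.
 */
static void mgid_to_mcast_mac(const uint8_t mgid[16], uint8_t mac[6])
{
	assert(mgid != NULL && mac != NULL);

	mac[0] = 0x33;
	mac[1] = 0x33;
	/* Octets 12..15 of the group ID become octets 2..5 of the MAC. */
	memcpy(&mac[2], &mgid[12], 4);
}
```

Two groups that differ only in their upper 96 bits would share a MAC under
this rule, so the receiver would still have to filter on the full MGID.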

 
> Yes, you are right that MLD snooping is the mechanism for
> switches to discover IPv6 multicast group membership.
> However for the IBoE case there is no requirement that IPv6
> multicast group membership corresponds in any way to the IBoE
> multicast group membership for the interface (and indeed as
> far as I can tell from the IBoE spec, there is no requirement
> that any IPv6 interface be configured on an IBoE port).
>
> Furthermore, even if an IBoE interface sends MLD messages for a given
> IPv6 group, there is no requirement that a switch use the
> membership information for that group to forward multicast
> packets with a non-IPv6 ethertype.

Right. Initially there can be flooding within the VLAN. In the future we can
evolve to use a group-membership protocol when customers that care about
the efficiency drive their switch vendors to support it.

Are there any other issues that you would like us to address before updating 
the patches?


Re: NULL pointer dereference in rdma_ucm

2010-07-20 Thread Roland Dreier
> I'm experimenting with an rdma_cm application to push data around
> between nodes on an ~1000 node cluster (CentOS-5.3 with 2.6.18-128.el5
> and OFED-1.4.2).  Under heavy load, I'm seeing several nodes per day
> kernel panic due to a NULL pointer dereference.  It may be that the
> in-kernel field cm_id_priv has a NULL ->alt_av.port, causing the
> Oops, but I don't know for sure.  Any ideas on how to debug this?

You have a pretty unsupportable combination of ancient kernel and old
OFED stack.  Is there any way you can test this with a recent mainline
kernel?

If I were debugging this I guess I would try to find out for sure where
the NULL dereference occurs -- I guess ib_cm_init_qp_attr() has all the
leaf functions (cm_init_qp_init_attr etc) inlined, so the first step
would be to figure out which one of those functions you're crashing in
(and also confirm that it's always the same one).  You could do that by
marking them noinline, or just put a WARN_ON(!ptr) before every
pointer dereference (does 2.6.18 even have WARN_ON?).
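
A user-space sketch of both ideas, for illustration only. The structs are
simplified stand-ins for the ib_cm ones and init_alt_path() is a
hypothetical leaf function; in the kernel the report would go through
printk() rather than fprintf().

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-ins for the ib_cm structures (assumption, not the real layout). */
struct cm_port { int port_num; };
struct cm_av { struct cm_port *port; };
struct cm_id_private { struct cm_av av; struct cm_av alt_av; };

/*
 * noinline (GCC/Clang attribute) keeps this leaf function as its own
 * frame, so a crash inside it is attributed to it and not to its caller.
 */
__attribute__((noinline))
static int init_alt_path(const struct cm_id_private *priv, int *alt_port_num)
{
	assert(alt_port_num != NULL);

	/* Check each pointer right before it is dereferenced. */
	if (!priv) {
		fprintf(stderr, "%s: priv is NULL\n", __func__);
		return -1;
	}
	if (!priv->alt_av.port) {
		fprintf(stderr, "%s: alt_av.port is NULL\n", __func__);
		return -1;
	}
	*alt_port_num = priv->alt_av.port->port_num;
	return 0;
}
```

With one such function per leaf, the function name in the report (or in the
backtrace) says which pointer was NULL and whether it is always the same one.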

 - R.
-- 
Roland Dreier rola...@cisco.com || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Jason Gunthorpe
On Tue, Jul 20, 2010 at 03:50:05PM -0500, Steve Wise wrote:

>> I think if RDMACM manages the dst and lets the devices access it then
>> all the existing netdev infrastructure for poking at a dst should be
>> available to the device?
>
> Yes. But I'm not sure exactly how the logic I described previously for
> cxgb* would be handled in the design being ironed out here.

I'm thinking something like this..

- The RDMA CM gets the dst from its route lookup, locks it and stores
  it.
- Instead of doing a route lookup, cxgb gets the dst from RDMA CM,
  locks it and stores it.
- RDMA CM traps all notifications/etc and generates a callback to cxgb
  to say the dst has changed.
- cxgb releases the old dst and grabs the new one, updates the HW,
  etc.

Basically the same as what you have now, but all the logic to find
and monitor the dst moves to RDMA CM..

redirects/etc are all handled by netdev/rdma cm and just generate the
same 'dst has changed' call back to cxgb..

Or's user space notification stuff hooks the same callback to generate
a notification to userspace about the new dst.

All the stuff you do now with the dst you can keep doing, you just
remove all the route lookup and netdev hooking to get the dst from
RDMA CM.
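
Roughly, in code. This is a user-space sketch of the flow above, not kernel
code: struct dst, the cm_* and drv_* names, and the plain integer refcount
are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted route entry. */
struct dst { int refcnt; int id; };

static void dst_ref_get(struct dst *d) { d->refcnt++; }
static void dst_ref_put(struct dst *d) { assert(d->refcnt > 0); d->refcnt--; }

#define MAX_LISTENERS 4

typedef void (*dst_changed_cb)(void *ctx, struct dst *new_dst);

struct cm_id {
	struct dst *dst;			/* reference owned by the CM */
	dst_changed_cb cb[MAX_LISTENERS];
	void *ctx[MAX_LISTENERS];
	int nr_cb;
};

/* The CM stores the dst from its route lookup and takes a reference. */
static void cm_set_dst(struct cm_id *id, struct dst *d)
{
	dst_ref_get(d);
	id->dst = d;
}

/* Drivers and a user-space notifier would subscribe to the same list. */
static int cm_register(struct cm_id *id, dst_changed_cb cb, void *ctx)
{
	if (id->nr_cb == MAX_LISTENERS)
		return -1;
	id->cb[id->nr_cb] = cb;
	id->ctx[id->nr_cb] = ctx;
	id->nr_cb++;
	return 0;
}

/* On a redirect/neighbour/route event the CM swaps its dst and notifies. */
static void cm_dst_changed(struct cm_id *id, struct dst *new_dst)
{
	int i;

	dst_ref_get(new_dst);
	dst_ref_put(id->dst);
	id->dst = new_dst;
	for (i = 0; i < id->nr_cb; i++)
		id->cb[i](id->ctx[i], new_dst);
}

/* Driver side: take the CM's dst instead of doing a route lookup. */
struct drv_conn { struct dst *dst; int hw_updates; };

static void drv_attach(struct drv_conn *c, struct cm_id *id)
{
	dst_ref_get(id->dst);
	c->dst = id->dst;
}

/* Driver callback: release the old dst, grab the new one, update the HW. */
static void drv_dst_changed(void *ctx, struct dst *new_dst)
{
	struct drv_conn *c = ctx;

	dst_ref_put(c->dst);
	dst_ref_get(new_dst);
	c->dst = new_dst;
	c->hw_updates++;	/* stands in for reprogramming the adapter */
}
```

The driver keeps its own reference, so nothing it holds goes away while the
CM is in the middle of swapping entries.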

Jason


Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Steve Wise

Jason Gunthorpe wrote:

> On Tue, Jul 20, 2010 at 03:50:05PM -0500, Steve Wise wrote:
>
>>> I think if RDMACM manages the dst and lets the devices access it then
>>> all the existing netdev infrastructure for poking at a dst should be
>>> available to the device?
>>
>> Yes. But I'm not sure exactly how the logic I described previously for
>> cxgb* would be handled in the design being ironed out here.
>
> I'm thinking something like this..
>
> - The RDMA CM gets the dst from its route lookup, locks it and stores
>   it.
> - Instead of doing a route lookup, cxgb gets the dst from RDMA CM,
>   locks it and stores it.
> - RDMA CM traps all notifications/etc and generates a callback to cxgb
>   to say the dst has changed.
> - cxgb releases the old dst and grabs the new one, updates the HW,
>   etc.
>
> Basically the same as what you have now, but all the logic to find
> and monitor the dst moves to RDMA CM..
>
> redirects/etc are all handled by netdev/rdma cm and just generate the
> same 'dst has changed' call back to cxgb..
>
> Or's user space notification stuff hooks the same callback to generate
> a notification to userspace about the new dst.
>
> All the stuff you do now with the dst you can keep doing, you just
> remove all the route lookup and netdev hooking to get the dst from
> RDMA CM.
>
> Jason

Sounds like this would work nicely.


Steve.




Re: sense remote hardware address change by rdma-cm applications

2010-07-20 Thread Steve Wise


Steve Wise wrote:

> Jason Gunthorpe wrote:
>
>> On Tue, Jul 20, 2010 at 03:50:05PM -0500, Steve Wise wrote:
>>
>>>> I think if RDMACM manages the dst and lets the devices access it then
>>>> all the existing netdev infrastructure for poking at a dst should be
>>>> available to the device?
>>>
>>> Yes. But I'm not sure exactly how the logic I described previously
>>> for cxgb* would be handled in the design being ironed out here.
>>
>> I'm thinking something like this..
>>
>> - The RDMA CM gets the dst from its route lookup, locks it and stores
>>   it.
>> - Instead of doing a route lookup, cxgb gets the dst from RDMA CM,
>>   locks it and stores it.
>> - RDMA CM traps all notifications/etc and generates a callback to cxgb
>>   to say the dst has changed.
>> - cxgb releases the old dst and grabs the new one, updates the HW,
>>   etc.
>>
>> Basically the same as what you have now, but all the logic to find
>> and monitor the dst moves to RDMA CM..
>>
>> redirects/etc are all handled by netdev/rdma cm and just generate the
>> same 'dst has changed' call back to cxgb..
>>
>> Or's user space notification stuff hooks the same callback to generate
>> a notification to userspace about the new dst.
>>
>> All the stuff you do now with the dst you can keep doing, you just
>> remove all the route lookup and netdev hooking to get the dst from
>> RDMA CM.
>>
>> Jason
>
> Sounds like this would work nicely.
>
> Steve.

Need to hear from Intel.  CCing Chien.




Re: NULL pointer dereference in rdma_ucm

2010-07-20 Thread Josh England
My kernel selections are limited to which versions I can get a panfs
module for.  The most recent kernel I see support for is a 2.6.31
variant from FC12.  I could ask them for a custom port for a newer
mainline kernel but the turn-around will likely be several weeks.
I'll go ahead and ask for one I guess.

It looks like 2.6.18 WARN_ON doesn't really do much:
#define WARN_ON(condition) do { if (condition) ; } while (0)
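
If it helps, a local macro along these lines would at least report where a
check fired. This is a user-space sketch with a hypothetical DBG_WARN_ON
name; a kernel version would presumably call printk() and dump_stack()
instead of fprintf().

```c
#include <assert.h>
#include <stdio.h>

static int dbg_warn_count;	/* number of checks that have fired */

/*
 * Hypothetical local replacement: unlike the empty stub, this always
 * reports the failed condition together with its file and line.
 */
#define DBG_WARN_ON(cond) do {						\
	if (cond) {							\
		dbg_warn_count++;					\
		fprintf(stderr, "DBG_WARN_ON(%s) at %s:%d\n",		\
			#cond, __FILE__, __LINE__);			\
	}								\
} while (0)
```

Placed as DBG_WARN_ON(!ptr) before each dereference, the line number in the
message identifies the pointer that was NULL.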

-JE

On Tue, Jul 20, 2010 at 1:52 PM, Roland Dreier rdre...@cisco.com wrote:
>> I'm experimenting with an rdma_cm application to push data around
>> between nodes on an ~1000 node cluster (CentOS-5.3 with 2.6.18-128.el5
>> and OFED-1.4.2).  Under heavy load, I'm seeing several nodes per day
>> kernel panic due to a NULL pointer dereference.  It may be that the
>> in-kernel field cm_id_priv has a NULL ->alt_av.port, causing the
>> Oops, but I don't know for sure.  Any ideas on how to debug this?
>
> You have a pretty unsupportable combination of ancient kernel and old
> OFED stack.  Is there any way you can test this with a recent mainline
> kernel?
>
> If I were debugging this I guess I would try to find out for sure where
> the NULL dereference occurs -- I guess ib_cm_init_qp_attr() has all the
> leaf functions (cm_init_qp_init_attr etc) inlined, so the first step
> would be to figure out which one of those functions you're crashing in
> (and also confirm that it's always the same one).  You could do that by
> marking them noinline, or just put a WARN_ON(!ptr) before every
> pointer dereference (does 2.6.18 even have WARN_ON?).
>
>  - R.
> --
> Roland Dreier rola...@cisco.com || For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/index.html



[patch 2/6] infiniband: remove dependency on __GFP_NOFAIL

2010-07-20 Thread David Rientjes
These alloc_skb() allocations are failable (every caller already handles
a NULL return), so remove __GFP_NOFAIL from their masks.

Cc: Roland Dreier <rola...@cisco.com>
Signed-off-by: David Rientjes <rient...@google.com>
---
 drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
 drivers/infiniband/hw/cxgb4/mem.c |    2 +-
 drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -43,7 +43,7 @@ static int destroy_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	int ret;
 
 	wr_len = sizeof *res_wr + sizeof *res;
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
@@ -118,7 +118,7 @@ static int create_cq(struct c4iw_rdev *rdev, struct t4_cq *cq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err4;
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -59,7 +59,7 @@ static int write_adapter_mem(struct c4iw_rdev *rdev, u32 addr, u32 len,
 		wr_len = roundup(sizeof *req + sizeof *sc +
 				 roundup(copy_len, T4_ULPTX_MIN_IO), 16);
 
-		skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+		skb = alloc_skb(wr_len, GFP_KERNEL);
 		if (!skb)
 			return -ENOMEM;
 		set_wr_txq(skb, CPL_PRIORITY_CONTROL, 0);
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -130,7 +130,7 @@ static int create_qp(struct c4iw_rdev *rdev, struct t4_wq *wq,
 	/* build fw_ri_res_wr */
 	wr_len = sizeof *res_wr + 2 * sizeof *res;
 
-	skb = alloc_skb(wr_len, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(wr_len, GFP_KERNEL);
 	if (!skb) {
 		ret = -ENOMEM;
 		goto err7;
@@ -961,7 +961,7 @@ static int rdma_fini(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);
@@ -1035,7 +1035,7 @@ static int rdma_init(struct c4iw_dev *rhp, struct c4iw_qp *qhp)
 	PDBG("%s qhp %p qid 0x%x tid %u\n", __func__, qhp, qhp->wq.sq.qid,
 	     qhp->ep->hwtid);
 
-	skb = alloc_skb(sizeof *wqe, GFP_KERNEL | __GFP_NOFAIL);
+	skb = alloc_skb(sizeof *wqe, GFP_KERNEL);
 	if (!skb)
 		return -ENOMEM;
 	set_wr_txq(skb, CPL_PRIORITY_DATA, qhp->ep->txq_idx);
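
For reference, the shape these call sites rely on once the allocation is
allowed to fail, as a user-space sketch. create_queue(), struct queue and
the alloc_limited() hook are hypothetical and only mirror the "check for
NULL, set -ENOMEM, unwind" pattern of the create_cq()/create_qp() hunks.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator hook so the failure path can be exercised. */
typedef void *(*alloc_fn)(size_t len);

struct queue { void *ring; void *msg; };

static int create_queue(struct queue *q, size_t ring_len, size_t msg_len,
			alloc_fn alloc)
{
	int ret;

	q->ring = alloc(ring_len);
	if (!q->ring) {
		ret = -ENOMEM;
		goto err1;
	}

	/* Like alloc_skb(wr_len, GFP_KERNEL): this may return NULL. */
	q->msg = alloc(msg_len);
	if (!q->msg) {
		ret = -ENOMEM;
		goto err2;
	}
	return 0;

err2:
	/* Undo the earlier step before reporting the failure. */
	free(q->ring);
	q->ring = NULL;
err1:
	return ret;
}

static int allocs_left;	/* how many allocations may still succeed */

/* Test allocator: succeeds allocs_left times, then returns NULL. */
static void *alloc_limited(size_t len)
{
	if (allocs_left <= 0)
		return NULL;
	allocs_left--;
	return malloc(len);
}
```

Setting allocs_left to 1 makes the second allocation fail, which exercises
the err2 unwind path.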


Re: [patch 2/6] infiniband: remove dependency on __GFP_NOFAIL

2010-07-20 Thread Steve Wise

Acked-by: Steve Wise <sw...@opengridcomputing.com>

David Rientjes wrote:

> These alloc_skb() allocations are failable (every caller already handles
> a NULL return), so remove __GFP_NOFAIL from their masks.
>
> Cc: Roland Dreier <rola...@cisco.com>
> Signed-off-by: David Rientjes <rient...@google.com>
> ---
>  drivers/infiniband/hw/cxgb4/cq.c  |    4 ++--
>  drivers/infiniband/hw/cxgb4/mem.c |    2 +-
>  drivers/infiniband/hw/cxgb4/qp.c  |    6 +++---
>  3 files changed, 6 insertions(+), 6 deletions(-)
  

