Re: Kernel oops/panic with NFS over RDMA mount after disrupted Infiniband connection

2014-03-28 Thread sagi grimberg

On 3/29/2014 3:05 AM, Chuck Lever wrote:

On Mar 28, 2014, at 4:06 PM, sagi grimberg  wrote:


On 3/29/2014 1:30 AM, Chuck Lever wrote:

On Mar 28, 2014, at 2:42 AM, Senn Klemens  wrote:


Hi Chuck,

On 03/27/2014 04:59 PM, Chuck Lever wrote:

Hi-


On Mar 27, 2014, at 12:53 AM, Reiter Rafael  wrote:


On 03/26/2014 07:15 PM, Chuck Lever wrote:

Hi Rafael-

I’ll take a look. Can you report your HCA and how you reproduce this issue?

The HCA is Mellanox Technologies MT26428.

Reproduction:
1) Mount a directory via NFS/RDMA
mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.2:/ /mnt/

An additional "ls /mnt" is needed here (between step 1 and 2)


2) Pull the Infiniband cable or use ibportstate to disrupt the Infiniband 
connection
3) ls /mnt
4) wait 5-30 seconds

Thanks for the information.

I have that HCA, but I won’t have access to my test systems for a week 
(traveling). So can you try this:

# rpcdebug -m rpc -s trans

then reproduce (starting with step 1 above). Some debugging output will appear 
at the tail of /var/log/messages. Copy it to this thread.


The output of /var/log/messages is:

[  143.233701] RPC:  1688 xprt_rdma_allocate: size 1112 too large for
buffer[1024]: prog 13 vers 4 proc 1
[  143.233708] RPC:  1688 xprt_rdma_allocate: size 1112, request
0x88105894c000
[  143.233715] RPC:  1688 rpcrdma_inline_pullup: pad 0 destp
0x88105894d7dc len 124 hdrlen 124
[  143.233718] RPC:   rpcrdma_register_frmr_external: Using frmr
88084e589260 to map 1 segments
[  143.233722] RPC:  1688 rpcrdma_create_chunks: reply chunk elem
652@0x105894d92c:0xced01 (last)
[  143.233725] RPC:  1688 rpcrdma_marshal_req: reply chunk: hdrlen 48
rpclen 124 padlen 0 headerp 0x88105894d100 base 0x88105894d760
lkey 0x8000
[  143.233785] RPC:   rpcrdma_event_process: event rep
88084e589260 status 0 opcode 8 length 0
[  177.272397] RPC:   rpcrdma_event_process: event rep
(null) status C opcode 8808 length 4294967295
[  177.272649] RPC:   rpcrdma_event_process: event rep
880848ed status 5 opcode 8808 length 4294936584

The mlx4 provider is returning a WC completion status of
IB_WC_WR_FLUSH_ERR.


[  177.272651] RPC:   rpcrdma_event_process: WC opcode -30712 status
5, connection lost

-30712 is a bogus WC opcode. So the mlx4 provider is not filling in the
WC opcode. rpcrdma_event_process() thus can’t depend on the contents of
the ib_wc.opcode field when the WC completion status != IB_WC_SUCCESS.

Hey Chuck,

That is correct, the opcode field in the wc is not reliable in FLUSH errors.


A copy of the opcode reachable from the incoming rpcrdma_rep could be
added, initialized in the forward paths. rpcrdma_event_process() could
use the copy in the error case.

How about suppressing completions altogether for fast_reg and local_inv work
requests? If these fail, you will get an error completion and the QP will
transition to the error state, generating FLUSH_ERR completions for all
pending WRs. In this case, you can just ignore flush errors for fast_reg and
local_inv.

see http://marc.info/?l=linux-rdma&m=139047309831997&w=2

While considering your suggestion, I see that my proposed fix doesn’t work. In 
the FAST_REG_MR and LOCAL_INV cases, wr_id points to a struct rpcrdma_mw, not a 
struct rpcrdma_rep. Putting a copy of the opcode in rpcrdma_rep would have no 
effect. Worse:


  158         if (IB_WC_SUCCESS != wc->status) {
  159                 dprintk("RPC:   %s: WC opcode %d status %X, connection lost\n",
  160                         __func__, wc->opcode, wc->status);
  161                 rep->rr_len = ~0U;

Suppose this is an IB_WC_FAST_REG_MR completion, so “rep” here is actually a 
struct rpcrdma_mw, not a struct rpcrdma_rep. Line 161 pokes 32 one-bits at the top 
of that struct rpcrdma_mw. If wc->opcode was always usable, we’d at least have 
to fix that.



So for error completions one needs to be careful when dereferencing wr_id, as
the opcode might not be reliable. It would be better to first verify that
wr_id is indeed a pointer to a rep.



  162                 if (wc->opcode != IB_WC_FAST_REG_MR && wc->opcode != IB_WC_LOCAL_INV)
  163                         rpcrdma_schedule_tasklet(rep);
  164                 return;
  165         }
  166
  167         switch (wc->opcode) {
  168         case IB_WC_FAST_REG_MR:
  169                 frmr = (struct rpcrdma_mw *)(unsigned long)wc->wr_id;
  170                 frmr->r.frmr.state = FRMR_IS_VALID;
  171                 break;


To make my initial solution work, you’d have to add a field to both struct 
rpcrdma_mw and struct rpcrdma_rep, and ensure they are at the same offset in 
both structures. Ewe.

Eliminating completions for FAST_REG_MR and LOCAL_INV might be a preferable way 
to address this.


Agreed. The same applies to MW.

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.

Re: Kernel oops/panic with NFS over RDMA mount after disrupted Infiniband connection

2014-03-28 Thread Chuck Lever

On Mar 28, 2014, at 4:06 PM, sagi grimberg  wrote:

> On 3/29/2014 1:30 AM, Chuck Lever wrote:
>> On Mar 28, 2014, at 2:42 AM, Senn Klemens  wrote:
>> 
>>> Hi Chuck,
>>> 
>>> On 03/27/2014 04:59 PM, Chuck Lever wrote:
>>>> Hi-
>>>> 
>>>> 
>>>> On Mar 27, 2014, at 12:53 AM, Reiter Rafael wrote:
>>>> 
>>>>> On 03/26/2014 07:15 PM, Chuck Lever wrote:
>>>>>> Hi Rafael-
>>>>>> 
>>>>>> I’ll take a look. Can you report your HCA and how you reproduce this
>>>>>> issue?
>>>>> The HCA is Mellanox Technologies MT26428.
>>>>> 
>>>>> Reproduction:
>>>>> 1) Mount a directory via NFS/RDMA
>>>>> mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.2:/ /mnt/
>>> An additional "ls /mnt" is needed here (between step 1 and 2)
>>> 
>>>>> 2) Pull the Infiniband cable or use ibportstate to disrupt the Infiniband
>>>>> connection
>>>>> 3) ls /mnt
>>>>> 4) wait 5-30 seconds
>>>> Thanks for the information.
>>>> 
>>>> I have that HCA, but I won’t have access to my test systems for a week
>>>> (traveling). So can you try this:
>>>> 
>>>> # rpcdebug -m rpc -s trans
>>>> 
>>>> then reproduce (starting with step 1 above). Some debugging output will
>>>> appear at the tail of /var/log/messages. Copy it to this thread.
>>>> 
>>> The output of /var/log/messages is:
>>> 
>>> [  143.233701] RPC:  1688 xprt_rdma_allocate: size 1112 too large for
>>> buffer[1024]: prog 13 vers 4 proc 1
>>> [  143.233708] RPC:  1688 xprt_rdma_allocate: size 1112, request
>>> 0x88105894c000
>>> [  143.233715] RPC:  1688 rpcrdma_inline_pullup: pad 0 destp
>>> 0x88105894d7dc len 124 hdrlen 124
>>> [  143.233718] RPC:   rpcrdma_register_frmr_external: Using frmr
>>> 88084e589260 to map 1 segments
>>> [  143.233722] RPC:  1688 rpcrdma_create_chunks: reply chunk elem
>>> 652@0x105894d92c:0xced01 (last)
>>> [  143.233725] RPC:  1688 rpcrdma_marshal_req: reply chunk: hdrlen 48
>>> rpclen 124 padlen 0 headerp 0x88105894d100 base 0x88105894d760
>>> lkey 0x8000
>>> [  143.233785] RPC:   rpcrdma_event_process: event rep
>>> 88084e589260 status 0 opcode 8 length 0
>>> [  177.272397] RPC:   rpcrdma_event_process: event rep
>>> (null) status C opcode 8808 length 4294967295
>>> [  177.272649] RPC:   rpcrdma_event_process: event rep
>>> 880848ed status 5 opcode 8808 length 4294936584
>> The mlx4 provider is returning a WC completion status of
>> IB_WC_WR_FLUSH_ERR.
>> 
>>> [  177.272651] RPC:   rpcrdma_event_process: WC opcode -30712 status
>>> 5, connection lost
>> -30712 is a bogus WC opcode. So the mlx4 provider is not filling in the
>> WC opcode. rpcrdma_event_process() thus can’t depend on the contents of
>> the ib_wc.opcode field when the WC completion status != IB_WC_SUCCESS.
> 
> Hey Chuck,
> 
> That is correct, the opcode field in the wc is not reliable in FLUSH errors.
> 
>> 
>> A copy of the opcode reachable from the incoming rpcrdma_rep could be
>> added, initialized in the forward paths. rpcrdma_event_process() could
>> use the copy in the error case.
> 
> How about suppressing completions altogether for fast_reg and local_inv work
> requests? If these fail, you will get an error completion and the QP will
> transition to the error state, generating FLUSH_ERR completions for all
> pending WRs. In this case, you can just ignore flush errors for fast_reg and
> local_inv.
> 
> see http://marc.info/?l=linux-rdma&m=139047309831997&w=2

While considering your suggestion, I see that my proposed fix doesn’t work. In 
the FAST_REG_MR and LOCAL_INV cases, wr_id points to a struct rpcrdma_mw, not a 
struct rpcrdma_rep. Putting a copy of the opcode in rpcrdma_rep would have no 
effect. Worse:

>  158         if (IB_WC_SUCCESS != wc->status) {
>  159                 dprintk("RPC:   %s: WC opcode %d status %X, connection lost\n",
>  160                         __func__, wc->opcode, wc->status);
>  161                 rep->rr_len = ~0U;

Suppose this is an IB_WC_FAST_REG_MR completion, so “rep” here is actually a 
struct rpcrdma_mw, not a struct rpcrdma_rep. Line 161 pokes 32 one-bits at the 
top of that struct rpcrdma_mw. If wc->opcode was always usable, we’d at least 
have to fix that.

>  162                 if (wc->opcode != IB_WC_FAST_REG_MR && wc->opcode != IB_WC_LOCAL_INV)
>  163                         rpcrdma_schedule_tasklet(rep);
>  164                 return;
>  165         }
>  166
>  167         switch (wc->opcode) {
>  168         case IB_WC_FAST_REG_MR:
>  169                 frmr = (struct rpcrdma_mw *)(unsigned long)wc->wr_id;
>  170                 frmr->r.frmr.state = FRMR_IS_VALID;
>  171                 break;


To make my initial solution work, you’d have to add a field to both struct 
rpcrdma_mw and struct rpcrdma_rep, and ensure they are at the same offset in 
both structures. Ewe.
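
For illustration only, here is a rough userspace sketch of what "a field at the same offset in both structures" could look like. It is not the xprtrdma code: struct mock_rep, struct mock_mw, struct ctx_hdr and the helper names are all made up, and it assumes every object handed to wr_id starts with the same small header. The point is that the error path can then decide whether wr_id refers to a rep without looking at wc->opcode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not the xprtrdma definitions. */
enum ctx_kind { CTX_REP = 1, CTX_MW = 2 };

/* Common header placed first in every object that wr_id may point to. */
struct ctx_hdr {
    enum ctx_kind kind;
};

struct mock_rep {
    struct ctx_hdr hdr;     /* must stay the first member */
    unsigned int rr_len;
};

struct mock_mw {
    struct ctx_hdr hdr;     /* must stay the first member */
    int state;
};

/* Compile-time check that the tag sits at the same offset in both structs. */
_Static_assert(offsetof(struct mock_rep, hdr) == 0, "hdr must be first");
_Static_assert(offsetof(struct mock_mw, hdr) == 0, "hdr must be first");

/* Recover the object kind from wr_id alone, without consulting wc->opcode. */
static enum ctx_kind ctx_kind_of(uint64_t wr_id)
{
    const struct ctx_hdr *hdr = (const struct ctx_hdr *)(uintptr_t)wr_id;

    return hdr->kind;
}

/* Error path: only touch rr_len when wr_id really refers to a rep.
 * Returns 1 if the rep was marked as failed, 0 if the completion was ignored. */
static int mark_rep_failed(uint64_t wr_id)
{
    if (wr_id == 0 || ctx_kind_of(wr_id) != CTX_REP)
        return 0;
    ((struct mock_rep *)(uintptr_t)wr_id)->rr_len = ~0U;
    return 1;
}
```

With this layout, a flushed completion whose wr_id points to a mock_mw is left untouched instead of having rr_len written over its first bytes. The cost is the one noted above: both structures have to carry the header and keep it in place.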

Eliminating completions for FAST_REG_MR and LOCAL_INV might be a preferable way 
to address this.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




Re: Kernel oops/panic with NFS over RDMA mount after disrupted Infiniband connection

2014-03-28 Thread sagi grimberg

On 3/29/2014 1:30 AM, Chuck Lever wrote:

On Mar 28, 2014, at 2:42 AM, Senn Klemens  wrote:


Hi Chuck,

On 03/27/2014 04:59 PM, Chuck Lever wrote:

Hi-


On Mar 27, 2014, at 12:53 AM, Reiter Rafael  wrote:


On 03/26/2014 07:15 PM, Chuck Lever wrote:

Hi Rafael-

I’ll take a look. Can you report your HCA and how you reproduce this issue?

The HCA is Mellanox Technologies MT26428.

Reproduction:
1) Mount a directory via NFS/RDMA
mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.2:/ /mnt/

An additional "ls /mnt" is needed here (between step 1 and 2)


2) Pull the Infiniband cable or use ibportstate to disrupt the Infiniband 
connection
3) ls /mnt
4) wait 5-30 seconds

Thanks for the information.

I have that HCA, but I won’t have access to my test systems for a week 
(traveling). So can you try this:

# rpcdebug -m rpc -s trans

then reproduce (starting with step 1 above). Some debugging output will appear 
at the tail of /var/log/messages. Copy it to this thread.


The output of /var/log/messages is:

[  143.233701] RPC:  1688 xprt_rdma_allocate: size 1112 too large for
buffer[1024]: prog 13 vers 4 proc 1
[  143.233708] RPC:  1688 xprt_rdma_allocate: size 1112, request
0x88105894c000
[  143.233715] RPC:  1688 rpcrdma_inline_pullup: pad 0 destp
0x88105894d7dc len 124 hdrlen 124
[  143.233718] RPC:   rpcrdma_register_frmr_external: Using frmr
88084e589260 to map 1 segments
[  143.233722] RPC:  1688 rpcrdma_create_chunks: reply chunk elem
652@0x105894d92c:0xced01 (last)
[  143.233725] RPC:  1688 rpcrdma_marshal_req: reply chunk: hdrlen 48
rpclen 124 padlen 0 headerp 0x88105894d100 base 0x88105894d760
lkey 0x8000
[  143.233785] RPC:   rpcrdma_event_process: event rep
88084e589260 status 0 opcode 8 length 0
[  177.272397] RPC:   rpcrdma_event_process: event rep
(null) status C opcode 8808 length 4294967295
[  177.272649] RPC:   rpcrdma_event_process: event rep
880848ed status 5 opcode 8808 length 4294936584

The mlx4 provider is returning a WC completion status of
IB_WC_WR_FLUSH_ERR.


[  177.272651] RPC:   rpcrdma_event_process: WC opcode -30712 status
5, connection lost

-30712 is a bogus WC opcode. So the mlx4 provider is not filling in the
WC opcode. rpcrdma_event_process() thus can’t depend on the contents of
the ib_wc.opcode field when the WC completion status != IB_WC_SUCCESS.


Hey Chuck,

That is correct, the opcode field in the wc is not reliable in FLUSH errors.



A copy of the opcode reachable from the incoming rpcrdma_rep could be
added, initialized in the forward paths. rpcrdma_event_process() could
use the copy in the error case.


How about suppressing completions altogether for fast_reg and local_inv
work requests? If these fail, you will get an error completion and the QP
will transition to the error state, generating FLUSH_ERR completions for all
pending WRs. In this case, you can just ignore flush errors for fast_reg and
local_inv.

see http://marc.info/?l=linux-rdma&m=139047309831997&w=2
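
A rough sketch of the handler side of this idea, as a plain userspace model rather than kernel code. The types and names (struct mock_wc, classify_wc, MOCK_WRID_UNSIGNALED) are hypothetical, and the sketch assumes that fast_reg and local_inv WRs are posted unsignaled with a reserved wr_id, so they only ever show up in the CQ as error or flush completions. Status value 5 mirrors the flush status seen in the log.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the verbs types; not the kernel definitions. */
enum mock_wc_status { MOCK_WC_SUCCESS = 0, MOCK_WC_WR_FLUSH_ERR = 5 };

struct mock_wc {
    uint64_t wr_id;
    enum mock_wc_status status;
};

/* Assumption: unsignaled fast_reg/local_inv WRs are posted with this wr_id. */
#define MOCK_WRID_UNSIGNALED 0ULL

enum wc_action { WC_IGNORE, WC_HANDLE_REPLY, WC_FAIL_REPLY };

/* Decide what to do with a completion using only status and wr_id,
 * never the opcode field. */
static enum wc_action classify_wc(const struct mock_wc *wc)
{
    if (wc->wr_id == MOCK_WRID_UNSIGNALED)
        return WC_IGNORE;       /* flushed fast_reg/local_inv: nothing to do */
    if (wc->status != MOCK_WC_SUCCESS)
        return WC_FAIL_REPLY;   /* wr_id is known to be a rep here */
    return WC_HANDLE_REPLY;
}
```

Since the opcode is never consulted, it does not matter whether the provider fills it in on flush errors.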

Sagi.


Re: Kernel oops/panic with NFS over RDMA mount after disrupted Infiniband connection

2014-03-28 Thread Chuck Lever

On Mar 28, 2014, at 2:42 AM, Senn Klemens  wrote:

> Hi Chuck,
> 
> On 03/27/2014 04:59 PM, Chuck Lever wrote:
>> Hi-
>> 
>> 
>> On Mar 27, 2014, at 12:53 AM, Reiter Rafael  wrote:
>> 
>>> On 03/26/2014 07:15 PM, Chuck Lever wrote:
 
>>>> Hi Rafael-
>>>> 
>>>> I’ll take a look. Can you report your HCA and how you reproduce this issue?
>>> 
>>> The HCA is Mellanox Technologies MT26428.
>>> 
>>> Reproduction:
>>> 1) Mount a directory via NFS/RDMA
>>> mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.2:/ /mnt/
> 
> An additional "ls /mnt" is needed here (between step 1 and 2)
> 
>>> 2) Pull the Infiniband cable or use ibportstate to disrupt the Infiniband 
>>> connection
>>> 3) ls /mnt
>>> 4) wait 5-30 seconds
>> 
>> Thanks for the information.
>> 
>> I have that HCA, but I won’t have access to my test systems for a week 
>> (traveling). So can you try this:
>> 
>> # rpcdebug -m rpc -s trans
>> 
>> then reproduce (starting with step 1 above). Some debugging output will 
>> appear at the tail of /var/log/messages. Copy it to this thread.
>> 
> 
> The output of /var/log/messages is:
> 
> [  143.233701] RPC:  1688 xprt_rdma_allocate: size 1112 too large for
> buffer[1024]: prog 13 vers 4 proc 1
> [  143.233708] RPC:  1688 xprt_rdma_allocate: size 1112, request
> 0x88105894c000
> [  143.233715] RPC:  1688 rpcrdma_inline_pullup: pad 0 destp
> 0x88105894d7dc len 124 hdrlen 124
> [  143.233718] RPC:   rpcrdma_register_frmr_external: Using frmr
> 88084e589260 to map 1 segments
> [  143.233722] RPC:  1688 rpcrdma_create_chunks: reply chunk elem
> 652@0x105894d92c:0xced01 (last)
> [  143.233725] RPC:  1688 rpcrdma_marshal_req: reply chunk: hdrlen 48
> rpclen 124 padlen 0 headerp 0x88105894d100 base 0x88105894d760
> lkey 0x8000
> [  143.233785] RPC:   rpcrdma_event_process: event rep
> 88084e589260 status 0 opcode 8 length 0
> [  177.272397] RPC:   rpcrdma_event_process: event rep
> (null) status C opcode 8808 length 4294967295
> [  177.272649] RPC:   rpcrdma_event_process: event rep
> 880848ed status 5 opcode 8808 length 4294936584

The mlx4 provider is returning a WC completion status of
IB_WC_WR_FLUSH_ERR.

> [  177.272651] RPC:   rpcrdma_event_process: WC opcode -30712 status
> 5, connection lost

-30712 is a bogus WC opcode. So the mlx4 provider is not filling in the
WC opcode. rpcrdma_event_process() thus can’t depend on the contents of
the ib_wc.opcode field when the WC completion status != IB_WC_SUCCESS.
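
As a quick sanity check on the numbers in the log (a sketch, not part of any patch): viewing the logged opcode -30712 as an unsigned 32-bit value gives 4294936584, which is the "length" printed in the preceding event line, and its low 16 bits are 0x8808, the "opcode" printed there in hex. The other length, 4294967295, is simply ~0U. That the field holds leftover data rather than an opcode is my reading of these values, not something the log proves. The helper names are made up.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: view a signed 32-bit value as its unsigned bit pattern.
 * Signed-to-unsigned conversion is well defined in C (modulo 2^32). */
static uint32_t as_u32(int32_t v)
{
    return (uint32_t)v;
}

/* Print the values from the log side by side. */
static void show_logged_values(void)
{
    uint32_t bits = as_u32(-30712);

    printf("opcode -30712 as u32: %" PRIu32 " (0x%08" PRIX32 ")\n", bits, bits);
    printf("low 16 bits:          0x%04" PRIX32 "\n", (uint32_t)(bits & 0xFFFFu));
    printf("~0U as u32:           %" PRIu32 "\n", (uint32_t)~0U);
}
```

Calling show_logged_values() from a small main() prints the three values for comparison with the log lines above.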

A copy of the opcode reachable from the incoming rpcrdma_rep could be
added, initialized in the forward paths. rpcrdma_event_process() could
use the copy in the error case.


> [  177.984996] RPC:  1689 xprt_rdma_allocate: size 436, request
> 0x880848f0
> [  182.290655] RPC:   xprt_rdma_connect_worker: reconnect
> [  182.290992] RPC:   rpcrdma_ep_disconnect: after wait, disconnected
> [  187.300726] RPC:   xprt_rdma_connect_worker: exit
> [  197.320527] RPC:   xprt_rdma_connect_worker: reconnect
> [  197.320795] RPC:   rpcrdma_ep_disconnect: after wait, disconnected
> [  202.330477] RPC:   xprt_rdma_connect_worker: exit
> [  222.354286] RPC:   xprt_rdma_connect_worker: reconnect
> [  222.354624] RPC:   rpcrdma_ep_disconnect: after wait, disconnected
> 
> 
> The output on the serial terminal is:
> 
> [  227.364376] kernel tried to execute NX-protected page - exploit
> attempt? (uid: 0)
> [  227.364517] RPC:  1689 rpcrdma_inline_pullup: pad 0 destp
> 0x880848f017c4 len 100 hdrlen 100
> [  227.364519] RPC:   rpcrdma_register_frmr_external: Using frmr
> 88084e588810 to map 1 segments
> [  227.364522] RPC:  1689 rpcrdma_create_chunks: reply chunk elem
> 152@0x848f0187c:0xcab01 (last)
> [  227.364523] RPC:  1689 rpcrdma_marshal_req: reply chunk: hdrlen 48
> rpclen 100 padlen 0 headerp 0x880848f01100 base 0x880848f01760
> lkey 0x8000
> [  227.411547] BUG: unable to handle kernel paging request at
> 880848ed1758
> [  227.418535] IP: [] 0x880848ed1757
> [  227.423781] PGD 1d7c067 PUD 85d52f063 PMD 848f12063 PTE 800848ed1163
> [  227.430544] Oops: 0011 [#1] SMP
> [  227.433802] Modules linked in: auth_rpcgss oid_registry nfsv4
> xprtrdma cpuid af_packet 8021q garp stp llc rdma_ucm ib_ucm rdma_cm
> iw_cm ib_addr ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa
> ib_mad ib_core joydev usbhid mlx4_core iTCO_wdt iTCO_vendor_support
> acpi_cpufreq coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel
> aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64
> microcode pcspkr sb_edac edac_core i2c_i801 isci sg libsas ehci_pci
> ehci_hcd scsi_transport_sas usbcore lpc_ich ioatdma mfd_core usb_common
> shpchp pci_hotplug wmi mperf processor thermal_sys button edd autofs4
> xfs libcrc32c nfsv3 nfs fscache lockd nfs_acl sunrpc igb dca
> i2c_algo_bit ptp pps_core
> [  227.496536] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
> 3.10.17-allpatches+ #1
> [  227.503583]

[PATCH -next] infiniband: fix iser_verbs.c format warning

2014-03-28 Thread Randy Dunlap
From: Randy Dunlap 

Fix pr_err (printk) format warning:

drivers/infiniband/ulp/iser/iser_verbs.c:1181:4: warning: format '%lx' expects 
argument of type 'long unsigned int', but argument 3 has type 'sector_t' 
[-Wformat]

Signed-off-by: Randy Dunlap 
Cc: Or Gerlitz 
Cc: Roi Dayan 
Cc: linux-rdma@vger.kernel.org
---
 drivers/infiniband/ulp/iser/iser_verbs.c |5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-next-20140328.orig/drivers/infiniband/ulp/iser/iser_verbs.c
+++ linux-next-20140328/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -1178,9 +1178,10 @@ u8 iser_check_task_pi_status(struct iscs
do_div(sector_off, sector_size + 8);
*sector = scsi_get_lba(iser_task->sc) + sector_off;
 
-   pr_err("PI error found type %d at sector %lx "
+   pr_err("PI error found type %d at sector %llx "
   "expected %x vs actual %x\n",
-  mr_status.sig_err.err_type, *sector,
+  mr_status.sig_err.err_type,
+  (unsigned long long)*sector,
   mr_status.sig_err.expected,
   mr_status.sig_err.actual);
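
The same idiom in a standalone form, as a minimal sketch: mock_sector_t is a hypothetical stand-in for sector_t, assuming its width can differ between configurations, so that neither %lx nor %llx matches the bare type everywhere. Widening the argument to unsigned long long and printing with %llx is correct for either width.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for sector_t; assumed to be 32 or 64 bits wide
 * depending on the build configuration. */
typedef unsigned long mock_sector_t;

/* Format a sector number portably by widening it to unsigned long long. */
static int format_sector(char *buf, size_t len, mock_sector_t sector)
{
    return snprintf(buf, len, "sector %llx", (unsigned long long)sector);
}
```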
 


Re: [Patch 0/3] Hangs with IPoIB when doing PCI error injection

2014-03-28 Thread Roland Dreier
On Fri, Mar 28, 2014 at 1:47 PM, David Miller  wrote:
> I'm assuming Roland will take this in via his tree.


Yes, hoping for some feedback from Mellanox people.

 - R.


Re: [Patch 0/3] Hangs with IPoIB when doing PCI error injection

2014-03-28 Thread David Miller
From: cls...@linux.vnet.ibm.com
Date: Thu, 27 Mar 2014 09:28:13 -0500

> This patch resolves some hangs we are seeing when doing PCI error injection
> on Mellanox Infiniband cards. With this patch, the mlx4 driver sends an
> IB_EVENT_DEVICE_FATAL event to its users, and we added this event to the
> event handlers to avoid these hangs. If IPoIB is in connected mode, we added
> an event handler to cm and tried to make sure that, once it sees the fatal
> event, it does not try to send any more packets, because it will not receive
> any more completions or interrupts.

I'm assuming Roland will take this in via his tree.

Thanks.


RE: [PATCH 16/16] ibacm: remove processing of IP's from ibacm_addr.cfg

2014-03-28 Thread Hefty, Sean
> Subject: [PATCH 16/16] ibacm: remove processing of IP's from ibacm_addr.cfg
> 
> From: Ira Weiny 
> 
> Flag an error and do not process IP's which may appear in this file.

It probably makes sense to make this configurable, with the default to no 
longer process IP addresses in the file.  The option to enable could be read 
from the configuration file.  This provides an easy way for a user to re-enable 
this feature, in the off chance that it is in use.

- Sean


RE: [PATCH v2 2/4] IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Marciniszyn, Mike
> And struct ipath_dma_mapping_ops was converted to C99 initializer.
> 

This is now mentioned in the ipath patch.


[PATCH v3] IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The lack of these methods is compensated for by code changes
to .map_sg to ensure that the vanilla sg_dma_address() and
sg_dma_len() will do the same thing as the equivalent
former ib_sg_dma_address() and ib_sg_dma_len() calls
into the drivers.

This patch also required converting struct
ipath_dma_mapping_ops to C99 designated initializers.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ipath/ipath_dma.c |   43 +++
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_dma.c 
b/drivers/infiniband/hw/ipath/ipath_dma.c
index 644c2c7..123a8c0 100644
--- a/drivers/infiniband/hw/ipath/ipath_dma.c
+++ b/drivers/infiniband/hw/ipath/ipath_dma.c
@@ -115,6 +115,10 @@ static int ipath_map_sg(struct ib_device *dev, struct 
scatterlist *sgl,
ret = 0;
break;
}
+   sg->dma_address = addr + sg->offset;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   sg->dma_length = sg->length;
+#endif
}
return ret;
 }
@@ -126,21 +130,6 @@ static void ipath_unmap_sg(struct ib_device *dev,
BUG_ON(!valid_dma_direction(direction));
 }
 
-static u64 ipath_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   u64 addr = (u64) page_address(sg_page(sg));
-
-   if (addr)
-   addr += sg->offset;
-   return addr;
-}
-
-static unsigned int ipath_sg_dma_len(struct ib_device *dev,
-struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ipath_sync_single_for_cpu(struct ib_device *dev,
  u64 addr,
  size_t size,
@@ -176,17 +165,15 @@ static void ipath_dma_free_coherent(struct ib_device 
*dev, size_t size,
 }
 
 struct ib_dma_mapping_ops ipath_dma_mapping_ops = {
-   ipath_mapping_error,
-   ipath_dma_map_single,
-   ipath_dma_unmap_single,
-   ipath_dma_map_page,
-   ipath_dma_unmap_page,
-   ipath_map_sg,
-   ipath_unmap_sg,
-   ipath_sg_dma_address,
-   ipath_sg_dma_len,
-   ipath_sync_single_for_cpu,
-   ipath_sync_single_for_device,
-   ipath_dma_alloc_coherent,
-   ipath_dma_free_coherent
+   .mapping_error = ipath_mapping_error,
+   .map_single = ipath_dma_map_single,
+   .unmap_single = ipath_dma_unmap_single,
+   .map_page = ipath_dma_map_page,
+   .unmap_page = ipath_dma_unmap_page,
+   .map_sg = ipath_map_sg,
+   .unmap_sg = ipath_unmap_sg,
+   .sync_single_for_cpu = ipath_sync_single_for_cpu,
+   .sync_single_for_device = ipath_sync_single_for_device,
+   .alloc_coherent = ipath_dma_alloc_coherent,
+   .free_coherent = ipath_dma_free_coherent
 };
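
A small standalone sketch of why the designated-initializer conversion goes hand in hand with removing members. struct mock_ops and its callbacks are hypothetical; the idea is that with positional initializers, dropping a member from the middle of the struct shifts every later entry, whereas designated initializers bind each callback by name.

```c
#include <assert.h>

/* Hypothetical ops table, loosely modeled on the idea in the patch. */
struct mock_ops {
    int (*map)(int);
    int (*unmap)(int);
    /* a member removed or inserted here would shift positional initializers */
    int (*sync)(int);
};

static int mock_map(int x)   { return x + 1; }
static int mock_unmap(int x) { return x - 1; }
static int mock_sync(int x)  { return x; }

/* Designated initializers bind by name, so member order and removals in
 * struct mock_ops cannot silently misassign callbacks. */
static const struct mock_ops ops = {
    .map   = mock_map,
    .unmap = mock_unmap,
    .sync  = mock_sync,
};
```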



Re: [PATCH] RDMA/cxgb4: set error code on kmalloc() failure

2014-03-28 Thread David Miller
From: "Steve Wise" 
Date: Wed, 26 Mar 2014 10:25:22 -0500

> Acked-by: Steve Wise 
> 
> Note: This fix applies only to net-next because the commit that introduced 
> this is still
> pending in net-next:
> 
> commit 05eb23893c2cf9502a9cec0c32e7f1d1ed2895c8
> Author: Steve Wise 
> Date:   Fri Mar 14 21:52:08 2014 +0530
> 
> cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes
> 
> 
> Dave, can you please merge this?

Done, thanks.


Re: [PATCH v2 0/4] Series short description

2014-03-28 Thread Yann Droneaud
On Friday, 28 March 2014 at 13:26 -0400, Mike Marciniszyn wrote:
> This patch series modifies the ib_sg_dma API to
> eliminate the .dma_len and .dma_address methods.
> 
> In all present cases that overload these methods
> (ipath, qib, ehca), the lack of these methods is
> compensated for by code changes to the driver .map_sg
> to ensure that the vanilla sg_dma_address() and
> sg_dma_len() will do the same thing as the equivalent
> former ib_sg_dma_address() and ib_sg_dma_len() calls
> into the drivers.
> 
> This patch series is a followup to this recent submission
> http://marc.info/?l=linux-rdma&m=139602422108727&w=2
> and Bart's similar comment in
> http://marc.info/?l=linux-netdev&m=135643746610259&w=2.
> 
> Roland, I'm not sure of the history of these methods and
> what rationale caused them to be added?
> 
> There is obviously an interim step that could be done
> here to preserve the overload by not changing verbs, yet
> change qib/ipath/ehca to not overload them.  This would
> allow the drivers to work even when the "correct" APIs
> had not been used by the ULP.
> 
> This version reorders the patch series based
> on comments from Yann Droneaud
> so that the overload removal is last.
> 
> The commit messages are enhanced a bit as well.
> 

Thanks, that makes more sense now.

Reviewed-by: Yann Droneaud 

Regards

-- 
Yann Droneaud
OPTEYA




Re: [PATCH v2 2/4] IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Yann Droneaud
On Friday, 28 March 2014 at 13:26 -0400, Mike Marciniszyn wrote:
> The lack of these methods is compensated for by code changes
> to .map_sg to ensure that the vanilla sg_dma_address() and
> sg_dma_len() will do the same thing as the equivalent
> former ib_sg_dma_address() and ib_sg_dma_len() calls
> into the drivers.
> 

And struct ipath_dma_mapping_ops was converted to C99 initializer.

> Suggested-by: Bart Van Assche 
> Cc: Bart Van Assche 
> Reviewed-by: Dennis Dalessandro 
> Signed-off-by: Mike Marciniszyn 
> ---
>  drivers/infiniband/hw/ipath/ipath_dma.c |   43 
> +++
>  1 file changed, 15 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/ipath/ipath_dma.c 
> b/drivers/infiniband/hw/ipath/ipath_dma.c
> index 644c2c7..123a8c0 100644
> --- a/drivers/infiniband/hw/ipath/ipath_dma.c
> +++ b/drivers/infiniband/hw/ipath/ipath_dma.c
> @@ -115,6 +115,10 @@ static int ipath_map_sg(struct ib_device *dev, struct 
> scatterlist *sgl,
>   ret = 0;
>   break;
>   }
> + sg->dma_address = addr + sg->offset;
> +#ifdef CONFIG_NEED_SG_DMA_LENGTH
> + sg->dma_length = sg->length;
> +#endif
>   }
>   return ret;
>  }
> @@ -126,21 +130,6 @@ static void ipath_unmap_sg(struct ib_device *dev,
>   BUG_ON(!valid_dma_direction(direction));
>  }
>  
> -static u64 ipath_sg_dma_address(struct ib_device *dev, struct scatterlist 
> *sg)
> -{
> - u64 addr = (u64) page_address(sg_page(sg));
> -
> - if (addr)
> - addr += sg->offset;
> - return addr;
> -}
> -
> -static unsigned int ipath_sg_dma_len(struct ib_device *dev,
> -  struct scatterlist *sg)
> -{
> - return sg->length;
> -}
> -
>  static void ipath_sync_single_for_cpu(struct ib_device *dev,
> u64 addr,
> size_t size,
> @@ -176,17 +165,15 @@ static void ipath_dma_free_coherent(struct ib_device 
> *dev, size_t size,
>  }
>  
>  struct ib_dma_mapping_ops ipath_dma_mapping_ops = {
> - ipath_mapping_error,
> - ipath_dma_map_single,
> - ipath_dma_unmap_single,
> - ipath_dma_map_page,
> - ipath_dma_unmap_page,
> - ipath_map_sg,
> - ipath_unmap_sg,
> - ipath_sg_dma_address,
> - ipath_sg_dma_len,
> - ipath_sync_single_for_cpu,
> - ipath_sync_single_for_device,
> - ipath_dma_alloc_coherent,
> - ipath_dma_free_coherent
> + .mapping_error = ipath_mapping_error,
> + .map_single = ipath_dma_map_single,
> + .unmap_single = ipath_dma_unmap_single,
> + .map_page = ipath_dma_map_page,
> + .unmap_page = ipath_dma_unmap_page,
> + .map_sg = ipath_map_sg,
> + .unmap_sg = ipath_unmap_sg,
> + .sync_single_for_cpu = ipath_sync_single_for_cpu,
> + .sync_single_for_device = ipath_sync_single_for_device,
> + .alloc_coherent = ipath_dma_alloc_coherent,
> + .free_coherent = ipath_dma_free_coherent
>  };
> 




RE: [PATCH 11/16] ibacm: Add thread to monitor IP address changes

2014-03-28 Thread Weiny, Ira
> On 03/28/14 06:50, sean.he...@intel.com wrote:
> > +   while ((len = recv(sock, buffer, NL_MSG_BUF_SIZE, 0)) > 0) {
> > +   nlh = (struct nlmsghdr *)buffer;
> > +   while ((NLMSG_OK(nlh, len)) && (nlh->nlmsg_type !=
> NLMSG_DONE)) {
> > +   struct ifaddrmsg *ifa = (struct ifaddrmsg *)
> NLMSG_DATA(nlh);
> > +   struct ifinfomsg *ifi = (struct ifinfomsg *)
> NLMSG_DATA(nlh);
> > +   struct rtattr *rth = IFA_RTA(ifa);
> > +   int rtl = IFA_PAYLOAD(nlh);
> > +
> > +   switch (nlh->nlmsg_type) {
> > +   [ ... ]
> > +   nlh = NLMSG_NEXT(nlh, len);
> > +   }
> > +   }
> 
> Is there any reason why this code doesn't handle netlink buffer overflows
> (ENOBUFS) ? From the netlink(7) man page:

No reason other than my inexperience with netlink.

> 
> Netlink is not a reliable protocol. [ ... ] The kernel can't send a netlink
> message if the socket buffer is full: the message will be dropped and the
> kernel and the user-space process will no longer have the same view of
> kernel state. It is up to the application to detect when this happens (via the
> ENOBUFS error returned by recvmsg(2)) and resynchronize.

Thanks, I'll rework the patch to account for this.
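
One possible shape for that rework, as a sketch only: the enum and helper names are hypothetical and not taken from the ibacm patch. The idea is to treat ENOBUFS from recv() as a signal to re-read the full address list instead of a fatal error, and to retry on EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical result codes for one receive attempt. */
enum nl_rx { NL_RX_DATA, NL_RX_RESYNC, NL_RX_RETRY, NL_RX_FATAL };

/* Map the errno from a failed recv() to an action. ENOBUFS means the kernel
 * dropped messages, so the caller must re-read the full address list. */
static enum nl_rx classify_recv_errno(int err)
{
    switch (err) {
    case ENOBUFS:
        return NL_RX_RESYNC;
    case EINTR:
        return NL_RX_RETRY;
    default:
        return NL_RX_FATAL;
    }
}

/* One receive attempt on an already-open netlink socket.
 * A return of 0 from recv() is treated as end of stream in this sketch. */
static enum nl_rx nl_recv_once(int sock, void *buf, size_t len, ssize_t *out)
{
    *out = recv(sock, buf, len, 0);
    if (*out > 0)
        return NL_RX_DATA;
    if (*out == 0)
        return NL_RX_FATAL;
    return classify_recv_errno(errno);
}
```

The monitoring loop would then call nl_recv_once() repeatedly, parse the buffer on NL_RX_DATA, and trigger a full rescan of the configured addresses on NL_RX_RESYNC.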

Maybe by that time I will have Intel's email figured out...

Thanks,
Ira

> 
> Bart.


[PATCH 3/4] IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The code is replaced by logic in the .map_sg overload.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ipath/ipath_dma.c |   43 +++
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_dma.c 
b/drivers/infiniband/hw/ipath/ipath_dma.c
index 644c2c7..123a8c0 100644
--- a/drivers/infiniband/hw/ipath/ipath_dma.c
+++ b/drivers/infiniband/hw/ipath/ipath_dma.c
@@ -115,6 +115,10 @@ static int ipath_map_sg(struct ib_device *dev, struct 
scatterlist *sgl,
ret = 0;
break;
}
+   sg->dma_address = addr + sg->offset;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   sg->dma_length = sg->length;
+#endif
}
return ret;
 }
@@ -126,21 +130,6 @@ static void ipath_unmap_sg(struct ib_device *dev,
BUG_ON(!valid_dma_direction(direction));
 }
 
-static u64 ipath_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   u64 addr = (u64) page_address(sg_page(sg));
-
-   if (addr)
-   addr += sg->offset;
-   return addr;
-}
-
-static unsigned int ipath_sg_dma_len(struct ib_device *dev,
-struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ipath_sync_single_for_cpu(struct ib_device *dev,
  u64 addr,
  size_t size,
@@ -176,17 +165,15 @@ static void ipath_dma_free_coherent(struct ib_device 
*dev, size_t size,
 }
 
 struct ib_dma_mapping_ops ipath_dma_mapping_ops = {
-   ipath_mapping_error,
-   ipath_dma_map_single,
-   ipath_dma_unmap_single,
-   ipath_dma_map_page,
-   ipath_dma_unmap_page,
-   ipath_map_sg,
-   ipath_unmap_sg,
-   ipath_sg_dma_address,
-   ipath_sg_dma_len,
-   ipath_sync_single_for_cpu,
-   ipath_sync_single_for_device,
-   ipath_dma_alloc_coherent,
-   ipath_dma_free_coherent
+   .mapping_error = ipath_mapping_error,
+   .map_single = ipath_dma_map_single,
+   .unmap_single = ipath_dma_unmap_single,
+   .map_page = ipath_dma_map_page,
+   .unmap_page = ipath_dma_unmap_page,
+   .map_sg = ipath_map_sg,
+   .unmap_sg = ipath_unmap_sg,
+   .sync_single_for_cpu = ipath_sync_single_for_cpu,
+   .sync_single_for_device = ipath_sync_single_for_device,
+   .alloc_coherent = ipath_dma_alloc_coherent,
+   .free_coherent = ipath_dma_free_coherent
 };



[PATCH v2 0/4] Series short description

2014-03-28 Thread Mike Marciniszyn
This patch series modifies the ib_sg_dma API to
eliminate the .dma_len and .dma_address methods.

In all present cases that overload these methods
(ipath, qib, ehca), the lack of these methods is
compensated for by code changes to the driver .map_sg
to ensure that the vanilla sg_dma_address() and
sg_dma_len() will do the same thing as the equivalent
former ib_sg_dma_address() and ib_sg_dma_len() calls
into the drivers.
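
The equivalence this relies on can be illustrated with a small user-space
mock (not kernel code). struct mock_sg and the mock_*/old_* helpers are
invented stand-ins for struct scatterlist, the removed driver overloads,
the patched .map_sg and the generic sg_dma_address()/sg_dma_len()
accessors; page_addr plays the role of page_address(sg_page(sg)).

```c
#include <stdint.h>

/* Simplified stand-in for struct scatterlist (assumption, user space only). */
struct mock_sg {
	uintptr_t page_addr;	/* models page_address(sg_page(sg)) */
	unsigned int offset;
	unsigned int length;
	uint64_t dma_address;	/* filled in by map_sg */
	unsigned int dma_length;	/* filled in by map_sg */
};

/* Models the removed per-driver .dma_address overload. */
static uint64_t old_sg_dma_address(const struct mock_sg *sg)
{
	uint64_t addr = sg->page_addr;

	if (addr)
		addr += sg->offset;
	return addr;
}

/* Models the removed per-driver .dma_len overload. */
static unsigned int old_sg_dma_len(const struct mock_sg *sg)
{
	return sg->length;
}

/* Models the patched .map_sg: store the values the overloads used to compute. */
static int mock_map_sg(struct mock_sg *sgl, int nents)
{
	int i;

	for (i = 0; i < nents; i++) {
		uint64_t addr = sgl[i].page_addr;

		if (!addr)
			return 0;	/* mapping failed, as in the driver */
		sgl[i].dma_address = addr + sgl[i].offset;
		sgl[i].dma_length = sgl[i].length;
	}
	return nents;
}

/* Model the generic accessors, which only read the stored fields. */
static uint64_t mock_sg_dma_address(const struct mock_sg *sg)
{
	return sg->dma_address;
}

static unsigned int mock_sg_dma_len(const struct mock_sg *sg)
{
	return sg->dma_length;
}
```

After a successful mock_map_sg(), the generic accessors return the same
values the old overloads computed on every call, which is why a ULP that
calls the plain sg_dma_*() helpers would see consistent results.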

This patch series is a followup to this recent submission
http://marc.info/?l=linux-rdma&m=139602422108727&w=2
and Bart's similar comment in
http://marc.info/?l=linux-netdev&m=135643746610259&w=2.

Roland, I'm not sure of the history of these methods and
what rationale caused them to be added?

There is obviously an interim step that could be done
here to preserve the overload by not changing verbs, yet
change qib/ipath/ehca to not overload them.  This would
allow the drivers to work even when the "correct" APIs
had not been used by the ULP.

This version reorders the patch series based
on comments from Yann Droneaud
so that the overload removal is last.

The commit messages are enhanced a bit as well.

---

Mike Marciniszyn (4):
  IB/qib: remove ib_sg_dma_address() and ib_sg_dma_len() overloads
  IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads
  IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads
  IB/core: Remove overload in ib_sg_dma*


 drivers/infiniband/hw/ehca/ehca_mrmw.c  |   12 -
 drivers/infiniband/hw/ipath/ipath_dma.c |   43 +++
 drivers/infiniband/hw/qib/qib_dma.c |   21 +++
 include/rdma/ib_verbs.h |   14 --
 4 files changed, 25 insertions(+), 65 deletions(-)

-- 
Mike


[PATCH v2 4/4] IB/core: Remove overload in ib_sg_dma*

2014-03-28 Thread Mike Marciniszyn
The code is replaced by driver specific
changes and avoids the pointer NULL test
for drivers that don't overload these
operations.

Suggested-by: 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Tested-by: Vinod Kumar 
Signed-off-by: Mike Marciniszyn 
---
 include/rdma/ib_verbs.h |   14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6793f32..516 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1266,10 +1266,6 @@ struct ib_dma_mapping_ops {
void(*unmap_sg)(struct ib_device *dev,
struct scatterlist *sg, int nents,
enum dma_data_direction direction);
-   u64 (*dma_address)(struct ib_device *dev,
-  struct scatterlist *sg);
-   unsigned int(*dma_len)(struct ib_device *dev,
-  struct scatterlist *sg);
void(*sync_single_for_cpu)(struct ib_device *dev,
   u64 dma_handle,
   size_t size,
@@ -2089,12 +2085,13 @@ static inline void ib_dma_unmap_sg_attrs(struct 
ib_device *dev,
  * ib_sg_dma_address - Return the DMA address from a scatter/gather entry
  * @dev: The device for which the DMA addresses were created
  * @sg: The scatter/gather entry
+ *
+ * Note: this function is obsolete. To do: change all occurrences of
+ * ib_sg_dma_address() into sg_dma_address().
  */
 static inline u64 ib_sg_dma_address(struct ib_device *dev,
struct scatterlist *sg)
 {
-   if (dev->dma_ops)
-   return dev->dma_ops->dma_address(dev, sg);
return sg_dma_address(sg);
 }
 
@@ -2102,12 +2099,13 @@ static inline u64 ib_sg_dma_address(struct ib_device 
*dev,
  * ib_sg_dma_len - Return the DMA length from a scatter/gather entry
  * @dev: The device for which the DMA addresses were created
  * @sg: The scatter/gather entry
+ *
+ * Note: this function is obsolete. To do: change all occurrences of
+ * ib_sg_dma_len() into sg_dma_len().
  */
 static inline unsigned int ib_sg_dma_len(struct ib_device *dev,
 struct scatterlist *sg)
 {
-   if (dev->dma_ops)
-   return dev->dma_ops->dma_len(dev, sg);
return sg_dma_len(sg);
 }
 



RE: [PATCH 0/4] ib_sg_dma changes

2014-03-28 Thread Marciniszyn, Mike
> It's just a matter of making patch "[PATCH 1/4] IB/core: Remove overload in
> ib_sg_dma*" the last one.
> 
> BTW, You might want to provide a better explanation in the drivers functions
> remove patches (just duplicate the explanation).

Take a look at the latest series.

Mike


[PATCH v2 1/4] IB/qib: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
Remove the overload for .dma_len and .dma_address.

The lack of these methods is compensated for by code changes
to .map_sg to ensure that the vanilla sg_dma_address() and
sg_dma_len() will do the same thing as the equivalent
former ib_sg_dma_address() and ib_sg_dma_len() calls
into the drivers.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Tested-by: Vinod Kumar 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/qib/qib_dma.c |   21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_dma.c 
b/drivers/infiniband/hw/qib/qib_dma.c
index 2920bb3..59fe092 100644
--- a/drivers/infiniband/hw/qib/qib_dma.c
+++ b/drivers/infiniband/hw/qib/qib_dma.c
@@ -108,6 +108,10 @@ static int qib_map_sg(struct ib_device *dev, struct 
scatterlist *sgl,
ret = 0;
break;
}
+   sg->dma_address = addr + sg->offset;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   sg->dma_length = sg->length;
+#endif
}
return ret;
 }
@@ -119,21 +123,6 @@ static void qib_unmap_sg(struct ib_device *dev,
BUG_ON(!valid_dma_direction(direction));
 }
 
-static u64 qib_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   u64 addr = (u64) page_address(sg_page(sg));
-
-   if (addr)
-   addr += sg->offset;
-   return addr;
-}
-
-static unsigned int qib_sg_dma_len(struct ib_device *dev,
-  struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void qib_sync_single_for_cpu(struct ib_device *dev, u64 addr,
size_t size, enum dma_data_direction dir)
 {
@@ -173,8 +162,6 @@ struct ib_dma_mapping_ops qib_dma_mapping_ops = {
.unmap_page = qib_dma_unmap_page,
.map_sg = qib_map_sg,
.unmap_sg = qib_unmap_sg,
-   .dma_address = qib_sg_dma_address,
-   .dma_len = qib_sg_dma_len,
.sync_single_for_cpu = qib_sync_single_for_cpu,
.sync_single_for_device = qib_sync_single_for_device,
.alloc_coherent = qib_dma_alloc_coherent,



[PATCH v2 3/4] IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
These methods appear to only mimic the sg_dma_address()
and sg_dma_len() behavior.

They can be safely removed.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Cc: Hoang-Nam Nguyen 
Cc: Christoph Raisch 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c 
b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index bcfb0c1..65873ee 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -2591,16 +2591,6 @@ static void ehca_dma_unmap_sg(struct ib_device *dev, 
struct scatterlist *sg,
/* This is only a stub; nothing to be done here */
 }
 
-static u64 ehca_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->dma_address;
-}
-
-static unsigned int ehca_dma_len(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ehca_dma_sync_single_for_cpu(struct ib_device *dev, u64 addr,
 size_t size,
 enum dma_data_direction dir)
@@ -2653,8 +2643,6 @@ struct ib_dma_mapping_ops ehca_dma_mapping_ops = {
.unmap_page = ehca_dma_unmap_page,
.map_sg = ehca_dma_map_sg,
.unmap_sg   = ehca_dma_unmap_sg,
-   .dma_address= ehca_dma_address,
-   .dma_len= ehca_dma_len,
.sync_single_for_cpu= ehca_dma_sync_single_for_cpu,
.sync_single_for_device = ehca_dma_sync_single_for_device,
.alloc_coherent = ehca_dma_alloc_coherent,



[PATCH 1/4] IB/core: Remove overload in ib_sg_dma*

2014-03-28 Thread Mike Marciniszyn
The code is replaced by driver specific
changes and avoids the pointer NULL test
for drivers that don't overload these
operations.

Suggested-by: 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Tested-by: Vinod Kumar 
Signed-off-by: Mike Marciniszyn 
---
 include/rdma/ib_verbs.h |   14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 6793f32..516 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1266,10 +1266,6 @@ struct ib_dma_mapping_ops {
void(*unmap_sg)(struct ib_device *dev,
struct scatterlist *sg, int nents,
enum dma_data_direction direction);
-   u64 (*dma_address)(struct ib_device *dev,
-  struct scatterlist *sg);
-   unsigned int(*dma_len)(struct ib_device *dev,
-  struct scatterlist *sg);
void(*sync_single_for_cpu)(struct ib_device *dev,
   u64 dma_handle,
   size_t size,
@@ -2089,12 +2085,13 @@ static inline void ib_dma_unmap_sg_attrs(struct 
ib_device *dev,
  * ib_sg_dma_address - Return the DMA address from a scatter/gather entry
  * @dev: The device for which the DMA addresses were created
  * @sg: The scatter/gather entry
+ *
+ * Note: this function is obsolete. To do: change all occurrences of
+ * ib_sg_dma_address() into sg_dma_address().
  */
 static inline u64 ib_sg_dma_address(struct ib_device *dev,
struct scatterlist *sg)
 {
-   if (dev->dma_ops)
-   return dev->dma_ops->dma_address(dev, sg);
return sg_dma_address(sg);
 }
 
@@ -2102,12 +2099,13 @@ static inline u64 ib_sg_dma_address(struct ib_device 
*dev,
  * ib_sg_dma_len - Return the DMA length from a scatter/gather entry
  * @dev: The device for which the DMA addresses were created
  * @sg: The scatter/gather entry
+ *
+ * Note: this function is obsolete. To do: change all occurrences of
+ * ib_sg_dma_len() into sg_dma_len().
  */
 static inline unsigned int ib_sg_dma_len(struct ib_device *dev,
 struct scatterlist *sg)
 {
-   if (dev->dma_ops)
-   return dev->dma_ops->dma_len(dev, sg);
return sg_dma_len(sg);
 }
 



[PATCH v2 2/4] IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The lack of these methods is compensated for by code changes
to .map_sg to ensure that the vanilla sg_dma_address() and
sg_dma_len() will do the same thing as the equivalent
former ib_sg_dma_address() and ib_sg_dma_len() calls
into the drivers.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ipath/ipath_dma.c |   43 +++
 1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/drivers/infiniband/hw/ipath/ipath_dma.c 
b/drivers/infiniband/hw/ipath/ipath_dma.c
index 644c2c7..123a8c0 100644
--- a/drivers/infiniband/hw/ipath/ipath_dma.c
+++ b/drivers/infiniband/hw/ipath/ipath_dma.c
@@ -115,6 +115,10 @@ static int ipath_map_sg(struct ib_device *dev, struct 
scatterlist *sgl,
ret = 0;
break;
}
+   sg->dma_address = addr + sg->offset;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   sg->dma_length = sg->length;
+#endif
}
return ret;
 }
@@ -126,21 +130,6 @@ static void ipath_unmap_sg(struct ib_device *dev,
BUG_ON(!valid_dma_direction(direction));
 }
 
-static u64 ipath_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   u64 addr = (u64) page_address(sg_page(sg));
-
-   if (addr)
-   addr += sg->offset;
-   return addr;
-}
-
-static unsigned int ipath_sg_dma_len(struct ib_device *dev,
-struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ipath_sync_single_for_cpu(struct ib_device *dev,
  u64 addr,
  size_t size,
@@ -176,17 +165,15 @@ static void ipath_dma_free_coherent(struct ib_device 
*dev, size_t size,
 }
 
 struct ib_dma_mapping_ops ipath_dma_mapping_ops = {
-   ipath_mapping_error,
-   ipath_dma_map_single,
-   ipath_dma_unmap_single,
-   ipath_dma_map_page,
-   ipath_dma_unmap_page,
-   ipath_map_sg,
-   ipath_unmap_sg,
-   ipath_sg_dma_address,
-   ipath_sg_dma_len,
-   ipath_sync_single_for_cpu,
-   ipath_sync_single_for_device,
-   ipath_dma_alloc_coherent,
-   ipath_dma_free_coherent
+   .mapping_error = ipath_mapping_error,
+   .map_single = ipath_dma_map_single,
+   .unmap_single = ipath_dma_unmap_single,
+   .map_page = ipath_dma_map_page,
+   .unmap_page = ipath_dma_unmap_page,
+   .map_sg = ipath_map_sg,
+   .unmap_sg = ipath_unmap_sg,
+   .sync_single_for_cpu = ipath_sync_single_for_cpu,
+   .sync_single_for_device = ipath_sync_single_for_device,
+   .alloc_coherent = ipath_dma_alloc_coherent,
+   .free_coherent = ipath_dma_free_coherent
 };



RE: [PATCH] ibacm: properly check return from ibv_open_device

2014-03-28 Thread Hefty, Sean
thanks - merged


Re: [PATCH 11/16] ibacm: Add thread to monitor IP address changes

2014-03-28 Thread Bart Van Assche
On 03/28/14 16:51, Weiny, Ira wrote:
>> On 03/28/14 06:50, sean.he...@intel.com wrote:
>>> +   while ((len = recv(sock, buffer, NL_MSG_BUF_SIZE, 0)) > 0) {
>>> +   nlh = (struct nlmsghdr *)buffer;
>>> +   while ((NLMSG_OK(nlh, len)) && (nlh->nlmsg_type !=
>> NLMSG_DONE)) {
>>> +   struct ifaddrmsg *ifa = (struct ifaddrmsg *)
>> NLMSG_DATA(nlh);
>>> +   struct ifinfomsg *ifi = (struct ifinfomsg *)
>> NLMSG_DATA(nlh);
>>> +   struct rtattr *rth = IFA_RTA(ifa);
>>> +   int rtl = IFA_PAYLOAD(nlh);
>>> +
>>> +   switch (nlh->nlmsg_type) {
>>> +   [ ... ]
>>> +   nlh = NLMSG_NEXT(nlh, len);
>>> +   }
>>> +   }
>>
>> Is there any reason why this code doesn't handle netlink buffer overflows
>> (ENOBUFS) ? From the netlink(7) man page:
> 
> No reason other than my inexperience with netlink.

In that case it's probably helpful to have a look at the libnl
documentation. I'm not saying that library should be used here but it's
accompanied by excellent documentation about the netlink protocol. See
also http://www.carisma.slowglass.com/~tgr/libnl/ and
http://www.carisma.slowglass.com/~tgr/libnl/doc/core.html.

Bart.



[PATCH 0/4] ib_sg_dma changes

2014-03-28 Thread Mike Marciniszyn
This patch series modifies the ib_sg_dma API to
eliminate the .dma_len and .dma_address methods.

In all present cases that overload these methods
(ipath, qib, ehca), the lack of these methods is
compensated for by code changes to the driver .map_sg
to ensure that the vanilla sg_dma_address() and
sg_dma_len() will do the same thing as the equivalent
former ib_sg_dma_address() and ib_sg_dma_len() calls
into the drivers.

This patch series is a followup to this recent submission
http://marc.info/?l=linux-rdma&m=139602422108727&w=2
and Bart's similar comment in
http://marc.info/?l=linux-netdev&m=135643746610259&w=2.

Roland, I'm not sure of the history of these methods and
what rationale caused them to be added?

There is obviously an interim step that could be done
here to preserve the overload by not changing verbs, yet
change qib/ipath/ehca to not overload them.  This would
allow the drivers to work even when the "correct" APIs
had not been used by the ULP.
---

Mike Marciniszyn (4):
  IB/core: Remove overload in ib_sg_dma*
  IB/qib: remove ib_sg_dma_address() and ib_sg_dma_len() overloads
  IB/ipath: remove ib_sg_dma_address() and ib_sg_dma_len() overloads
  IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads


 drivers/infiniband/hw/ehca/ehca_mrmw.c  |   12 -
 drivers/infiniband/hw/ipath/ipath_dma.c |   43 +++
 drivers/infiniband/hw/qib/qib_dma.c |   21 +++
 include/rdma/ib_verbs.h |   14 --
 4 files changed, 25 insertions(+), 65 deletions(-)

-- 
Mike


[PATCH 2/4] IB/qib: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The code is replaced by logic in the .map_sg overload.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Tested-by: Vinod Kumar 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/qib/qib_dma.c |   21 -
 1 file changed, 4 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/qib/qib_dma.c 
b/drivers/infiniband/hw/qib/qib_dma.c
index 2920bb3..59fe092 100644
--- a/drivers/infiniband/hw/qib/qib_dma.c
+++ b/drivers/infiniband/hw/qib/qib_dma.c
@@ -108,6 +108,10 @@ static int qib_map_sg(struct ib_device *dev, struct 
scatterlist *sgl,
ret = 0;
break;
}
+   sg->dma_address = addr + sg->offset;
+#ifdef CONFIG_NEED_SG_DMA_LENGTH
+   sg->dma_length = sg->length;
+#endif
}
return ret;
 }
@@ -119,21 +123,6 @@ static void qib_unmap_sg(struct ib_device *dev,
BUG_ON(!valid_dma_direction(direction));
 }
 
-static u64 qib_sg_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   u64 addr = (u64) page_address(sg_page(sg));
-
-   if (addr)
-   addr += sg->offset;
-   return addr;
-}
-
-static unsigned int qib_sg_dma_len(struct ib_device *dev,
-  struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void qib_sync_single_for_cpu(struct ib_device *dev, u64 addr,
size_t size, enum dma_data_direction dir)
 {
@@ -173,8 +162,6 @@ struct ib_dma_mapping_ops qib_dma_mapping_ops = {
.unmap_page = qib_dma_unmap_page,
.map_sg = qib_map_sg,
.unmap_sg = qib_unmap_sg,
-   .dma_address = qib_sg_dma_address,
-   .dma_len = qib_sg_dma_len,
.sync_single_for_cpu = qib_sync_single_for_cpu,
.sync_single_for_device = qib_sync_single_for_device,
.alloc_coherent = qib_dma_alloc_coherent,



RE: [PATCH v2] IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Marciniszyn, Mike
This patch just Cc's the ehca folks.

Sorry, I forgot to do that on the first one.

Mike

> Subject: [PATCH v2] IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len()



[PATCH 4/4] IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The method has been removed.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c 
b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index bcfb0c1..65873ee 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -2591,16 +2591,6 @@ static void ehca_dma_unmap_sg(struct ib_device *dev, 
struct scatterlist *sg,
/* This is only a stub; nothing to be done here */
 }
 
-static u64 ehca_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->dma_address;
-}
-
-static unsigned int ehca_dma_len(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ehca_dma_sync_single_for_cpu(struct ib_device *dev, u64 addr,
 size_t size,
 enum dma_data_direction dir)
@@ -2653,8 +2643,6 @@ struct ib_dma_mapping_ops ehca_dma_mapping_ops = {
.unmap_page = ehca_dma_unmap_page,
.map_sg = ehca_dma_map_sg,
.unmap_sg   = ehca_dma_unmap_sg,
-   .dma_address= ehca_dma_address,
-   .dma_len= ehca_dma_len,
.sync_single_for_cpu= ehca_dma_sync_single_for_cpu,
.sync_single_for_device = ehca_dma_sync_single_for_device,
.alloc_coherent = ehca_dma_alloc_coherent,



Re: [PATCH 0/4] ib_sg_dma changes

2014-03-28 Thread Yann Droneaud
Hi,

On Friday, 28 March 2014 at 12:35 -0400, Mike Marciniszyn wrote:
> This patch series modified the ib_sg_dma API to
> eliminate the .dma_len and .dma_address methods.
> 
> In all present cases that overload these methods
> (ipath, qib, ehca), the lack of these methods is
> compensated for by code changes to the driver .map_sg
> to ensure that the vanilla sg_dma_address() and
> sg_dma_len() will do the same thing as the equivalent
> former ib_sg_dma_address() and ib_sg_dma_len() calls
> into the drivers.
> 
> This patch series is a followup to this recent submission
> http://marc.info/?l=linux-rdma&m=139602422108727&w=2
> and Bart's similar comment in
> http://marc.info/?l=linux-netdev&m=135643746610259&w=2.
> 
> Roland, I'm not sure of the history of these methods and
> what rationale caused them to be added?
> 
> There is obviously an interim step that could be done
> here to preserve the overload by not changing verbs, yet
> change qib/ipath/ehca to not overload them.  This would
> allow the drivers to work even when the "correct" APIs
> had not been used by the ULP.


Hmm, you should reorder the patches the other way around:
first remove the functions in the drivers, then remove the calls to the
functions, and finally remove the function pointers in the structure.
Otherwise you might break the build for people doing a git bisect.

It's just a matter of making patch "[PATCH 1/4] IB/core: Remove overload
in ib_sg_dma*" the last one.

BTW, you might want to provide a better explanation in the patches that
remove the driver functions (just duplicate the explanation).

Regards.

-- 
Yann Droneaud
OPTEYA




[PATCH v2] IB/ehca: remove ib_sg_dma_address() and ib_sg_dma_len() overloads

2014-03-28 Thread Mike Marciniszyn
The method has been removed.

Suggested-by: Bart Van Assche 
Cc: Bart Van Assche 
Cc: Hoang-Nam Nguyen 
Cc: Christoph Raisch 
Reviewed-by: Dennis Dalessandro 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/hw/ehca/ehca_mrmw.c |   12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ehca_mrmw.c 
b/drivers/infiniband/hw/ehca/ehca_mrmw.c
index bcfb0c1..65873ee 100644
--- a/drivers/infiniband/hw/ehca/ehca_mrmw.c
+++ b/drivers/infiniband/hw/ehca/ehca_mrmw.c
@@ -2591,16 +2591,6 @@ static void ehca_dma_unmap_sg(struct ib_device *dev, 
struct scatterlist *sg,
/* This is only a stub; nothing to be done here */
 }
 
-static u64 ehca_dma_address(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->dma_address;
-}
-
-static unsigned int ehca_dma_len(struct ib_device *dev, struct scatterlist *sg)
-{
-   return sg->length;
-}
-
 static void ehca_dma_sync_single_for_cpu(struct ib_device *dev, u64 addr,
 size_t size,
 enum dma_data_direction dir)
@@ -2653,8 +2643,6 @@ struct ib_dma_mapping_ops ehca_dma_mapping_ops = {
.unmap_page = ehca_dma_unmap_page,
.map_sg = ehca_dma_map_sg,
.unmap_sg   = ehca_dma_unmap_sg,
-   .dma_address= ehca_dma_address,
-   .dma_len= ehca_dma_len,
.sync_single_for_cpu= ehca_dma_sync_single_for_cpu,
.sync_single_for_device = ehca_dma_sync_single_for_device,
.alloc_coherent = ehca_dma_alloc_coherent,



[PATCH] ib_srpt: Use correct ib_sg_dma primitives

2014-03-28 Thread Mike Marciniszyn
The code was incorrectly using sg_dma_address() and
sg_dma_len() instead of ib_sg_dma_address() and
ib_sg_dma_len().

This prevents srpt from functioning with the
Intel HCA and indeed will corrupt memory
badly.

Cc: 
Cc: Bart Van Assche 
Reviewed-by: Dennis Dalessandro 
Tested-by: Vinod Kumar 
Signed-off-by: Mike Marciniszyn 
---
 drivers/infiniband/ulp/srpt/ib_srpt.c |   16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c 
b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 520a7e5..b4884ae 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1078,6 +1078,7 @@ static void srpt_unmap_sg_to_ib_sge(struct srpt_rdma_ch 
*ch,
 static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
 struct srpt_send_ioctx *ioctx)
 {
+   struct ib_device *dev = ch->sport->sdev->device;
struct se_cmd *cmd;
struct scatterlist *sg, *sg_orig;
int sg_cnt;
@@ -1124,7 +1125,7 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
 
db = ioctx->rbufs;
tsize = cmd->data_length;
-   dma_len = sg_dma_len(&sg[0]);
+   dma_len = ib_sg_dma_len(dev, &sg[0]);
riu = ioctx->rdma_ius;
 
/*
@@ -1155,7 +1156,8 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
++j;
if (j < count) {
sg = sg_next(sg);
-   dma_len = sg_dma_len(sg);
+   dma_len = ib_sg_dma_len(
+   dev, sg);
}
}
} else {
@@ -1192,8 +1194,8 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
tsize = cmd->data_length;
riu = ioctx->rdma_ius;
sg = sg_orig;
-   dma_len = sg_dma_len(&sg[0]);
-   dma_addr = sg_dma_address(&sg[0]);
+   dma_len = ib_sg_dma_len(dev, &sg[0]);
+   dma_addr = ib_sg_dma_address(dev, &sg[0]);
 
/* this second loop is really mapped sg_addres to rdma_iu->ib_sge */
for (i = 0, j = 0;
@@ -1216,8 +1218,10 @@ static int srpt_map_sg_to_ib_sge(struct srpt_rdma_ch *ch,
++j;
if (j < count) {
sg = sg_next(sg);
-   dma_len = sg_dma_len(sg);
-   dma_addr = sg_dma_address(sg);
+   dma_len = ib_sg_dma_len(
+   dev, sg);
+   dma_addr = ib_sg_dma_address(
+   dev, sg);
}
}
} else {



RE: [patch] RDMA/cxgb4: info leak in c4iw_alloc_ucontext()

2014-03-28 Thread David Laight
From: Yann Droneaud
> Hi,
> 
> On Friday, 28 March 2014 at 11:24 +0300, Dan Carpenter wrote:
> > The c4iw_alloc_ucontext_resp struct has a 4 byte hole after the last
> > member and we should clear it before passing it to the user.
> >
> > Fixes: 05eb23893c2c ('cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes')
> > Signed-off-by: Dan Carpenter 
> >
> 
> It's not the proper fix for this issue: an explicit padding has to be
> added (and initialized), see "Re: [PATCH net-next 2/2] cxgb4/iw_cxgb4:
> Doorbell Drop Avoidance Bug Fixes"
> http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain
> 
> In its current form, the c4iw_alloc_ucontext_resp structure does not
> require padding on i386, so a 32bits userspace program using this
> structure against a x86_64 kernel will make the kernel do a buffer
> overflow in userspace, likely on stack, as answer of a GET_CONTEXT
> request:
...
> struct c4iw_alloc_ucontext_resp {
> struct ibv_get_context_resp ibv_resp;
> __u64 status_page_key;
> __u32 status_page_size;
> };

Or add __attribute__((aligned(4))) to the 64-bit fields.
And maybe a compile-time assert on the length of the structure.
Since it is part of an ABI, it must not change.
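
A sketch of the compile-time check, here combined with the explicit padding
Yann suggests rather than the aligned(4) variant. struct
fake_get_context_resp is only an assumed stand-in for struct
ibv_get_context_resp, and the names resp_explicit_pad, reserved and
resp_init() are illustrative, not the actual cxgb4 ABI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed stand-in for struct ibv_get_context_resp (two 32-bit words). */
struct fake_get_context_resp {
	uint32_t async_fd;
	uint32_t num_comp_vectors;
};

/* Response layout with the trailing hole spelled out as a real field. */
struct resp_explicit_pad {
	struct fake_get_context_resp ibv_resp;
	uint64_t status_page_key;
	uint32_t status_page_size;
	uint32_t reserved;	/* explicit padding, always written as zero */
};

/* Fail the build if the layout ever drifts from the agreed ABI. */
static_assert(sizeof(struct resp_explicit_pad) == 24,
	      "response size is part of the ABI");
static_assert(offsetof(struct resp_explicit_pad, status_page_key) == 8,
	      "status_page_key offset is part of the ABI");
static_assert(offsetof(struct resp_explicit_pad, reserved) == 20,
	      "reserved offset is part of the ABI");

/* Zero the whole response first so no stale bytes reach the other side. */
static void resp_init(struct resp_explicit_pad *resp, uint64_t key,
		      uint32_t size)
{
	memset(resp, 0, sizeof(*resp));
	resp->status_page_key = key;
	resp->status_page_size = size;
}
```

With the trailing field spelled out there is no implicit tail padding, so
the size is 24 bytes whether the 64-bit member is aligned to 4 bytes (i386)
or to 8 bytes (x86_64).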

David



Re: [patch] RDMA/cxgb4: info leak in c4iw_alloc_ucontext()

2014-03-28 Thread Yann Droneaud
Hi,

On Friday, 28 March 2014 at 11:24 +0300, Dan Carpenter wrote:
> The c4iw_alloc_ucontext_resp struct has a 4 byte hole after the last
> member and we should clear it before passing it to the user.
> 
> Fixes: 05eb23893c2c ('cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes')
> Signed-off-by: Dan Carpenter 
> 

This is not the proper fix for this issue: explicit padding has to be
added (and initialized), see "Re: [PATCH net-next 2/2] cxgb4/iw_cxgb4:
Doorbell Drop Avoidance Bug Fixes"
http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain

In its current form, the c4iw_alloc_ucontext_resp structure does not
require padding on i386, so a 32-bit userspace program using this
structure against an x86_64 kernel will make the kernel overflow a
buffer in userspace, likely on the stack, when it answers a
GET_CONTEXT request:

#include 
#include 

#define IBV_INIT_CMD_RESP(cmd, size, opcode, out, outsize) \
do {   \
(cmd)->command = IB_USER_VERBS_CMD_##opcode;   \
(cmd)->in_words  = (size) / 4; \
(cmd)->out_words = (outsize) / 4;  \
(cmd)->response  = (out);  \
} while (0)

struct c4iw_alloc_ucontext_resp {
struct ibv_get_context_resp ibv_resp;
__u64 status_page_key;
__u32 status_page_size;
};

struct ibv_context *alloc_context(struct ibv_device *ibdev,
  int cmd_fd)
{
struct ibv_get_context cmd;
struct c4iw_alloc_ucontext_resp resp;
ssize_t sret;
...

IBV_INIT_CMD_RESP(&cmd, sizeof(cmd),
  GET_CONTEXT,
  &resp, sizeof(resp));
...
sret = write(context->cmd_fd,
 &cmd,
 sizeof(cmd));

if (sret != sizeof(cmd)) {
int err = errno;
fprintf(stderr, "GET_CONTEXT failed: %d (%s)\n",
err, strerror(err));
...
}
...
}

Unfortunately, it's not the only structure which has this problem. I'm
currently preparing a report on this issue for this driver (cxgb4) and
another.

> diff --git a/drivers/infiniband/hw/cxgb4/provider.c 
> b/drivers/infiniband/hw/cxgb4/provider.c
> index e36d2a2..a72aaa7 100644
> --- a/drivers/infiniband/hw/cxgb4/provider.c
> +++ b/drivers/infiniband/hw/cxgb4/provider.c
> @@ -107,7 +107,7 @@ static struct ib_ucontext *c4iw_alloc_ucontext(struct 
> ib_device *ibdev,
>   struct c4iw_ucontext *context;
>   struct c4iw_dev *rhp = to_c4iw_dev(ibdev);
>   static int warned;
> - struct c4iw_alloc_ucontext_resp uresp;
> + struct c4iw_alloc_ucontext_resp uresp = {};
>   int ret = 0;
>   struct c4iw_mm_entry *mm = NULL;
>  

Regards.

-- 
Yann Droneaud
OPTEYA



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Kernel oops/panic with NFS over RDMA mount after disrupted Infiniband connection

2014-03-28 Thread Senn Klemens
Hi Chuck,

On 03/27/2014 04:59 PM, Chuck Lever wrote:
> Hi-
> 
> 
> On Mar 27, 2014, at 12:53 AM, Reiter Rafael  wrote:
> 
>> On 03/26/2014 07:15 PM, Chuck Lever wrote:
>>>
>>> Hi Rafael-
>>>
>>> I’ll take a look. Can you report your HCA and how you reproduce this issue?
>>
>> The HCA is Mellanox Technologies MT26428.
>>
>> Reproduction:
>> 1) Mount a directory via NFS/RDMA
>> mount -t nfs -o port=20049,rdma,vers=4.0,timeo=900 172.16.100.2:/ /mnt/

An additional "ls /mnt" is needed here (between step 1 and 2)

>> 2) Pull the Infiniband cable or use ibportstate to disrupt the Infiniband 
>> connection
>> 3) ls /mnt
>> 4) wait 5-30 seconds
> 
> Thanks for the information.
> 
> I have that HCA, but I won’t have access to my test systems for a week 
> (traveling). So can you try this:
> 
>  # rpcdebug -m rpc -s trans
> 
> then reproduce (starting with step 1 above). Some debugging output will 
> appear at the tail of /var/log/messages. Copy it to this thread.
> 

The output of /var/log/messages is:

[  143.233701] RPC:  1688 xprt_rdma_allocate: size 1112 too large for
buffer[1024]: prog 13 vers 4 proc 1
[  143.233708] RPC:  1688 xprt_rdma_allocate: size 1112, request
0x88105894c000
[  143.233715] RPC:  1688 rpcrdma_inline_pullup: pad 0 destp
0x88105894d7dc len 124 hdrlen 124
[  143.233718] RPC:   rpcrdma_register_frmr_external: Using frmr
88084e589260 to map 1 segments
[  143.233722] RPC:  1688 rpcrdma_create_chunks: reply chunk elem
652@0x105894d92c:0xced01 (last)
[  143.233725] RPC:  1688 rpcrdma_marshal_req: reply chunk: hdrlen 48
rpclen 124 padlen 0 headerp 0x88105894d100 base 0x88105894d760
lkey 0x8000
[  143.233785] RPC:   rpcrdma_event_process: event rep
88084e589260 status 0 opcode 8 length 0
[  177.272397] RPC:   rpcrdma_event_process: event rep
(null) status C opcode 8808 length 4294967295
[  177.272649] RPC:   rpcrdma_event_process: event rep
880848ed status 5 opcode 8808 length 4294936584
[  177.272651] RPC:   rpcrdma_event_process: WC opcode -30712 status
5, connection lost
[  177.984996] RPC:  1689 xprt_rdma_allocate: size 436, request
0x880848f0
[  182.290655] RPC:   xprt_rdma_connect_worker: reconnect
[  182.290992] RPC:   rpcrdma_ep_disconnect: after wait, disconnected
[  187.300726] RPC:   xprt_rdma_connect_worker: exit
[  197.320527] RPC:   xprt_rdma_connect_worker: reconnect
[  197.320795] RPC:   rpcrdma_ep_disconnect: after wait, disconnected
[  202.330477] RPC:   xprt_rdma_connect_worker: exit
[  222.354286] RPC:   xprt_rdma_connect_worker: reconnect
[  222.354624] RPC:   rpcrdma_ep_disconnect: after wait, disconnected


The output on the serial terminal is:

[  227.364376] kernel tried to execute NX-protected page - exploit
attempt? (uid: 0)
[  227.364517] RPC:  1689 rpcrdma_inline_pullup: pad 0 destp
0x880848f017c4 len 100 hdrlen 100
[  227.364519] RPC:   rpcrdma_register_frmr_external: Using frmr
88084e588810 to map 1 segments
[  227.364522] RPC:  1689 rpcrdma_create_chunks: reply chunk elem
152@0x848f0187c:0xcab01 (last)
[  227.364523] RPC:  1689 rpcrdma_marshal_req: reply chunk: hdrlen 48
rpclen 100 padlen 0 headerp 0x880848f01100 base 0x880848f01760
lkey 0x8000
[  227.411547] BUG: unable to handle kernel paging request at
880848ed1758
[  227.418535] IP: [] 0x880848ed1757
[  227.423781] PGD 1d7c067 PUD 85d52f063 PMD 848f12063 PTE 800848ed1163
[  227.430544] Oops: 0011 [#1] SMP
[  227.433802] Modules linked in: auth_rpcgss oid_registry nfsv4
xprtrdma cpuid af_packet 8021q garp stp llc rdma_ucm ib_ucm rdma_cm
iw_cm ib_addr ib_ipoib ib_cm ib_uverbs ib_umad mlx4_en mlx4_ib ib_sa
ib_mad ib_core joydev usbhid mlx4_core iTCO_wdt iTCO_vendor_support
acpi_cpufreq coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel
aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64
microcode pcspkr sb_edac edac_core i2c_i801 isci sg libsas ehci_pci
ehci_hcd scsi_transport_sas usbcore lpc_ich ioatdma mfd_core usb_common
shpchp pci_hotplug wmi mperf processor thermal_sys button edd autofs4
xfs libcrc32c nfsv3 nfs fscache lockd nfs_acl sunrpc igb dca
i2c_algo_bit ptp pps_core
[  227.496536] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
3.10.17-allpatches+ #1
[  227.503583] Hardware name: Supermicro B9DRG-E/B9DRG-E, BIOS 3.0
09/04/2013
[  227.510451] task: 81a11440 ti: 81a0 task.ti:
81a0
[  227.517924] RIP: 0010:[]  []
0x880848ed1757
[  227.525597] RSP: 0018:88087fc03e88  EFLAGS: 00010282
[  227.530903] RAX: 0286 RBX: 880848ed1758 RCX:
a0354360
[  227.538032] RDX: 88084e589280 RSI: 0286 RDI:
88084e589260
[  227.545157] RBP: 88087fc03ea0 R08: a0354360 R09:
05f0
[  227.552286] R10: 0003 R11: dead00100100 R12:
88084e589260
[  227.559412] R13: 0006 R14: 0006 R15:
81a5db90
[  227.566540] FS:  (000

[patch] RDMA/cxgb4: info leak in c4iw_alloc_ucontext()

2014-03-28 Thread Dan Carpenter
The c4iw_alloc_ucontext_resp struct has a 4 byte hole after the last
member and we should clear it before passing it to the user.

Fixes: 05eb23893c2c ('cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes')
Signed-off-by: Dan Carpenter 

diff --git a/drivers/infiniband/hw/cxgb4/provider.c 
b/drivers/infiniband/hw/cxgb4/provider.c
index e36d2a2..a72aaa7 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -107,7 +107,7 @@ static struct ib_ucontext *c4iw_alloc_ucontext(struct 
ib_device *ibdev,
struct c4iw_ucontext *context;
struct c4iw_dev *rhp = to_c4iw_dev(ibdev);
static int warned;
-   struct c4iw_alloc_ucontext_resp uresp;
+   struct c4iw_alloc_ucontext_resp uresp = {};
int ret = 0;
struct c4iw_mm_entry *mm = NULL;
 


[patch 2/2] net/mlx4: make buffer larger to avoid overflow warning

2014-03-28 Thread Dan Carpenter
My static checker complains that the sprintf() here can overflow.

drivers/infiniband/hw/mlx4/main.c:1836 mlx4_ib_alloc_eqs()
error: format string overflow. buf_size: 32 length: 69

This seems like a valid complaint.  The "dev->pdev->bus->name" string
can be 48 characters long.  I made the buffer 80 characters, rather than
the minimum of 69, and changed the sprintf() to snprintf().

Signed-off-by: Dan Carpenter 

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 8d9d6b8..1b6dbe15 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -1803,7 +1803,7 @@ static void init_pkeys(struct mlx4_ib_dev *ibdev)
 
 static void mlx4_ib_alloc_eqs(struct mlx4_dev *dev, struct mlx4_ib_dev *ibdev)
 {
-   char name[32];
+   char name[80];
int eq_per_port = 0;
int added_eqs = 0;
int total_eqs = 0;
@@ -1833,8 +1833,8 @@ static void mlx4_ib_alloc_eqs(struct mlx4_dev *dev, 
struct mlx4_ib_dev *ibdev)
eq = 0;
mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB) {
for (j = 0; j < eq_per_port; j++) {
-   sprintf(name, "mlx4-ib-%d-%d@%s",
-   i, j, dev->pdev->bus->name);
+   snprintf(name, sizeof(name), "mlx4-ib-%d-%d@%s",
+i, j, dev->pdev->bus->name);
/* Set IRQ for specific name (per ring) */
if (mlx4_assign_eq(dev, name, NULL,
   &ibdev->eq_table[eq])) {
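
For readers less familiar with the difference, here is a small userspace
sketch of why snprintf() is the safer call. The helper
demo_format_eq_name() is hypothetical and only reuses the format string
from the patch; it is not driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an EQ name into buf without ever writing more than len bytes.
 * Returns the length the full string would have had, so the caller can
 * detect truncation by comparing the result against len.
 */
int demo_format_eq_name(char *buf, size_t len, int port, int idx,
			const char *bus_name)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "mlx4-ib-%d-%d@%s", port, idx, bus_name);
}
```

With a 32-byte buffer and a 48-character bus name, the output is cut off
at 31 characters plus the terminating NUL instead of running past the end
of the array, and the return value reports the untruncated length.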


[patch 1/2] net/mlx4: fix some indenting in mlx4_ib_add()

2014-03-28 Thread Dan Carpenter
The code was indented too far and also kernel style says we should have
curly braces.

Signed-off-by: Dan Carpenter 

diff --git a/drivers/infiniband/hw/mlx4/main.c 
b/drivers/infiniband/hw/mlx4/main.c
index 6cb8546..8d9d6b8 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2048,8 +2048,9 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
err = mlx4_counter_alloc(ibdev->dev, 
&ibdev->counters[i]);
if (err)
ibdev->counters[i] = -1;
-   } else
-   ibdev->counters[i] = -1;
+   } else {
+   ibdev->counters[i] = -1;
+   }
}
 
mlx4_foreach_port(i, dev, MLX4_PORT_TYPE_IB)