Re: NFS over RDMA crashing

2013-02-06 Thread Steve Wise

On 2/6/2013 9:48 AM, Yan Burman wrote:
When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was 
no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2).




+tom tucker

I'd try going back a few kernels, like to 3.5.x and see if things are 
more stable.  If you find a point that works, then git bisect might help 
identify the regression.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: NFS over RDMA crashing

2013-02-06 Thread Jeff Becker
Hi. In case you're interested, I did the NFS/RDMA backports for OFED. I 
tested that NFS/RDMA in OFED 3.5 works on kernel 3.5, and also the RHEL 
6.3 kernel. However, I did not test it with SRIOV. If you test it 
(OFED-3.5-rc6 was released last week), I'd like to know how it goes. Thanks.


Jeff Becker

On 02/06/2013 07:58 AM, Steve Wise wrote:

On 2/6/2013 9:48 AM, Yan Burman wrote:

When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I was
no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2)


+tom tucker

I'd try going back a few kernels, like to 3.5.x and see if things are
more stable.  If you find a point that works, then git bisect might help
identify the regression.


Re: NFS over RDMA crashing

2013-02-06 Thread J. Bruce Fields
On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> When killing mount command that got stuck:
> ---
> 
> BUG: unable to handle kernel paging request at 880324dc7ff8
> IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
> Oops: 0003 [#1] PREEMPT SMP
> Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm iw_cm
> ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
> iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables x_tables
> nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
> target_core_file target_core_pscsi target_core_mod configfs 8021q
> bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
> macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
> kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
> ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad ib_core
> mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3 jbd
> sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
> CPU 6
> Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
> X8DTH-i/6/iF/6F/X8DTH
> RIP: 0010:[]  []
> rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> RSP: 0018:880324c3dbf8  EFLAGS: 00010297
> RAX: 880324dc8000 RBX: 0001 RCX: 880324dd8428
> RDX: 880324dc7ff8 RSI: 880324dd8428 RDI: 81149618
> RBP: 880324c3dd78 R08: 60f9c860 R09: 0001
> R10: 880324dd8000 R11: 0001 R12: 8806299dcb10
> R13: 0003 R14: 0001 R15: 0010
> FS:  () GS:88063fc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2: 880324dc7ff8 CR3: 01a0b000 CR4: 07e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process nfsd (pid: 4744, threadinfo 880324c3c000, task 88033055)
> Stack:
>  880324c3dc78 880324c3dcd8 0282 880631cec000
>  880324dd8000 88062ed33040 000124c3dc48 880324dd8000
>  88062ed33058 880630ce2b90 8806299e8000 0003
> Call Trace:
>  [] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
>  [] ? try_to_wake_up+0x2f0/0x2f0
>  [] svc_recv+0x3ef/0x4b0 [sunrpc]
>  [] ? nfsd_svc+0x740/0x740 [nfsd]
>  [] nfsd+0xad/0x130 [nfsd]
>  [] ? nfsd_svc+0x740/0x740 [nfsd]
>  [] kthread+0xd6/0xe0
>  [] ? __init_kthread_worker+0x70/0x70
>  [] ret_from_fork+0x7c/0xb0
>  [] ? __init_kthread_worker+0x70/0x70
> Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a 00
> 00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00
> <48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
> RIP  [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
>  RSP 
> CR2: 880324dc7ff8
> ---[ end trace 06d0384754e9609a ]---
> 
> 
> It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> is responsible for the crash (it seems to be crashing in
> net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
> 
> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> was no longer getting the server crashes,
> so the rest of my tests were done using that point (it is somewhere
> in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.

--b.
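
For readers decoding the report quoted above: the value after "Oops:" is the x86 page-fault error code, and the low bits of the PTE value carry the page permissions. The sketch below decodes both. The helper names (decode_fault, pte_is_read_only, check_reported_oops) are made up for illustration and are not kernel functions; only the bit positions come from the x86 architecture. The PTE argument is just the low flag bits (0x161) of the value printed in the oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural definitions). */
#define PF_PROT  (1u << 0)  /* set: protection violation; clear: page not present */
#define PF_WRITE (1u << 1)  /* set: write access; clear: read access */
#define PF_USER  (1u << 2)  /* set: fault taken in user mode; clear: kernel mode */

/* x86-64 page-table entry flag bits used here. */
#define PTE_PRESENT (UINT64_C(1) << 0)
#define PTE_RW      (UINT64_C(1) << 1)

/* Hypothetical helper type, for illustration only. */
struct fault_info {
    bool protection;  /* page was present but the access was not allowed */
    bool write;       /* faulting access was a write */
    bool user;        /* fault happened in user mode */
};

static struct fault_info decode_fault(unsigned int code)
{
    struct fault_info f;

    f.protection = (code & PF_PROT) != 0;
    f.write = (code & PF_WRITE) != 0;
    f.user = (code & PF_USER) != 0;
    return f;
}

/* True when the entry maps a page that is present but not writable. */
static bool pte_is_read_only(uint64_t pte)
{
    return (pte & PTE_PRESENT) && !(pte & PTE_RW);
}

/* Usage: the values reported in the oops quoted above. */
static void check_reported_oops(void)
{
    struct fault_info f = decode_fault(0x0003);

    /* A kernel-mode write to a present page that is not writable. */
    assert(f.protection && f.write && !f.user);
    assert(pte_is_read_only(0x161));
}
```

Read this way, the report describes a kernel-mode write to a page that is mapped read-only, rather than an access to an unmapped address.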


Re: NFS over RDMA crashing

2013-02-06 Thread Steve Wise

On 2/6/2013 4:24 PM, J. Bruce Fields wrote:

On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:

When killing mount command that got stuck:
---

[kernel oops trace snipped; it is identical to the first copy in this thread]


It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
"nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
is responsible for the crash (it seems to be crashing in
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.

When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
was no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.


Maybe Tom can shed some light?



RE: NFS over RDMA crashing

2013-02-07 Thread Yan Burman
> -Original Message-
> From: Jeff Becker [mailto:jeffrey.c.bec...@nasa.gov]
> Sent: Wednesday, February 06, 2013 19:07
> To: Steve Wise
> Cc: Yan Burman; bfie...@fieldses.org; linux-...@vger.kernel.org; linux-
> r...@vger.kernel.org; Or Gerlitz; Tom Tucker
> Subject: Re: NFS over RDMA crashing
> 
> Hi. In case you're interested, I did the NFS/RDMA backports for OFED. I
> tested that NFS/RDMA in OFED 3.5 works on kernel 3.5, and also the RHEL
> 6.3 kernel. However, I did not test it with SRIOV. If you test it
> (OFED-3.5-rc6 was released last week), I'd like to know how it goes. Thanks.
> 
> Jeff Becker
> 
> On 02/06/2013 07:58 AM, Steve Wise wrote:
> > On 2/6/2013 9:48 AM, Yan Burman wrote:
> >> When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> was
> >> no longer getting the server crashes, so the rest of my tests were
> >> done using that point (it is somewhere in the middle of 3.7.0-rc2)
> >>
> > +tom tucker
> >
> > I'd try going back a few kernels, like to 3.5.x and see if things are
> > more stable.  If you find a point that works, then git bisect might
> > help identify the regression.
> > --

Vanilla 3.5.7 seems to work OK out of the box.
I will try 3.6 next week, plus performance comparisons.

Yan


Re: NFS over RDMA crashing

2013-02-07 Thread J. Bruce Fields
On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > When killing mount command that got stuck:
> > ---
> > 
> > [kernel oops trace snipped; it is identical to the first copy in this thread]
> > 
> > 
> > It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> > "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> > is responsible for the crash (it seems to be crashing in
> > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> > It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> > CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
> > 
> > When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> > was no longer getting the server crashes,
> > so the rest of my tests were done using that point (it is somewhere
> > in the middle of 3.7.0-rc2).
> 
> OK, so this part's clearly my fault--I'll work on a patch, but the
> rdma's use of the ->rq_pages array is pretty confusing.

Does this help?

They must have added this for some reason, but I'm not seeing how it
could have ever done anything.

--b.

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 0ce7552..e8f25ec 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -520,13 +520,6 @@ next_sge:
 	for (ch_no = 0; &rqstp->rq_pages[ch_no] < rqstp->rq_respages; ch_no++)
 		rqstp->rq_pages[ch_no] = NULL;
 
-	/*
-	 * Detach res pages. If svc_release sees any it will attempt to
-	 * put them.
-	 */
-	while (rqstp->rq_next_page != rqstp->rq_respages)
-		*(--rqstp->rq_next_page) = NULL;
-
 	return err;
 }
 
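
The loop removed by the patch above walks a pointer downward until it equals rq_respages. The sketch below models one way such a loop can run off the front of an array. It assumes the walk starts below the stop position, which the thread does not confirm, and it uses array indices and a hypothetical helper (detach_res_pages) in place of the kernel's pointers so that it is safe to run in user space.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 8  /* illustrative array size, not the kernel's */

/*
 * Index-based model of:
 *     while (next_page != respages)
 *             *(--next_page) = NULL;
 *
 * Returns the number of slots cleared, or -1 if the walk would have to
 * step below slot 0, which is where a real pointer walk would write
 * outside the array.
 */
static int detach_res_pages(void *pages[NPAGES], size_t respages,
                            size_t next_page)
{
    int cleared = 0;

    assert(respages <= NPAGES && next_page <= NPAGES);

    while (next_page != respages) {
        if (next_page == 0)
            return -1;  /* the next step would write to pages[-1] */
        pages[--next_page] = NULL;
        cleared++;
    }
    return cleared;
}
```

With next_page at or above respages the loop stops as intended. With next_page below respages the inequality test never becomes false on the way down, so only the explicit bounds check in this model ends the walk.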


Re: NFS over RDMA crashing

2013-02-07 Thread Tom Tucker

On 2/6/13 3:28 PM, Steve Wise wrote:

On 2/6/2013 4:24 PM, J. Bruce Fields wrote:

On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:

When killing mount command that got stuck:
---

[kernel oops trace snipped; it is identical to the first copy in this thread]


It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
"nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
is responsible for the crash (it seems to be crashing in
net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.

When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
was no longer getting the server crashes,
so the rest of my tests were done using that point (it is somewhere
in the middle of 3.7.0-rc2).

OK, so this part's clearly my fault--I'll work on a patch, but the
rdma's use of the ->rq_pages array is pretty confusing.


Maybe Tom can shed some light?


Yes, the RDMA transport has two confusing tweaks on rq_pages. Most
transports (UDP/TCP) use the rq_pages allocated by SVC. For RDMA,
however, the RQ already contains pre-allocated memory that will contain
inbound NFS requests from the client. Instead of copying this data from
the pre-registered receive buffer into the buffer in rq_pages, I just
replace the page in rq_pages with the one that already contains the data.

The second somewhat strange thing is that the NFS request contains an
NFSRDMA header. This is just like TCP (i.e. the 4-byte length); the
difference, however, is that (unlike TCP) this header is needed for the
response, because it maps out where in the client the response data will
be written.


Tom
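
The first tweak described above, swapping in the page that already holds the data instead of copying it, can be sketched roughly as follows. This is a user-space illustration only: struct page_buf, receive_by_copy and receive_by_swap are hypothetical names, and handing the displaced page back to the caller is an assumption about what could be done with it, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 4096

/* Stand-in for a page of memory; not the kernel's struct page. */
struct page_buf {
    unsigned char data[PAGE_SZ];
};

/* Copying approach: the received bytes are duplicated into the page
 * that the request already owns. */
static void receive_by_copy(struct page_buf **rq_pages, size_t slot,
                            const struct page_buf *recv_buf)
{
    assert(rq_pages[slot] != NULL && recv_buf != NULL);
    memcpy(rq_pages[slot]->data, recv_buf->data, PAGE_SZ);
}

/* Swap approach: the request's slot is pointed at the page that already
 * contains the data. The displaced page is returned so the caller can
 * decide what to do with it (an assumption made for this sketch). */
static struct page_buf *receive_by_swap(struct page_buf **rq_pages,
                                        size_t slot,
                                        struct page_buf *recv_buf)
{
    struct page_buf *displaced;

    assert(recv_buf != NULL);
    displaced = rq_pages[slot];
    rq_pages[slot] = recv_buf;
    return displaced;
}
```

The swap moves one pointer regardless of how much data arrived, which is the point of the tweak. The cost is that page ownership in rq_pages no longer follows the pattern the other transports use.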



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html




RE: NFS over RDMA crashing

2013-02-11 Thread Yan Burman
> -Original Message-
> From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> Sent: Thursday, February 07, 2013 18:42
> To: Yan Burman
> Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> r...@vger.kernel.org; Or Gerlitz
> Subject: Re: NFS over RDMA crashing
> 
> On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > > When killing mount command that got stuck:
> > > ---
> > >
> > > > [kernel oops trace snipped; it is identical to the first copy in this thread]
> > >
> > >
> > > It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> > > "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> > > is responsible for the crash (it seems to be crashing in
> > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> > > It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> > > CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
> > >
> > > When I moved to commit 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> > > > was no longer getting the server crashes, so the rest of my tests
> > > were done using that point (it is somewhere in the middle of
> > > 3.7.0-rc2).
> >
> > OK, so this part's clearly my fault--I'll work on a patch, but the
> > rdma's use of the ->

Re: NFS over RDMA crashing

2013-02-11 Thread J. Bruce Fields
On Mon, Feb 11, 2013 at 03:19:42PM +, Yan Burman wrote:
> > -Original Message-
> > From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> > Sent: Thursday, February 07, 2013 18:42
> > To: Yan Burman
> > Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> > r...@vger.kernel.org; Or Gerlitz
> > Subject: Re: NFS over RDMA crashing
> > 
> > On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > > > When killing mount command that got stuck:
> > > > ---
> > > >
> > > > > [kernel oops trace snipped; it is identical to the first copy in this thread]
> > > >
> > > >
> > > > It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> > > > "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> > > > is responsible for the crash (it seems to be crashing in
> > > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> > > > It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> > > > CONFIG_DEBUG_RODATA enabled. I did

Re: NFS over RDMA crashing

2013-02-15 Thread J. Bruce Fields
On Mon, Feb 11, 2013 at 03:19:42PM +, Yan Burman wrote:
> > -Original Message-
> > From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> > Sent: Thursday, February 07, 2013 18:42
> > To: Yan Burman
> > Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> > r...@vger.kernel.org; Or Gerlitz
> > Subject: Re: NFS over RDMA crashing
> > 
> > On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> > > On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > > > When killing mount command that got stuck:
> > > > ---
> > > >
> > > > > [kernel oops trace snipped; it is identical to the first copy in this thread]
> > > >
> > > >
> > > > It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> > > > "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> > > > is responsible for the crash (it seems to be crashing in
> > > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> > > > It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> > > > CONFIG_DEBUG_RODATA enabled. I did

RE: NFS over RDMA crashing

2013-02-18 Thread Yan Burman

> -Original Message-
> From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> Sent: Friday, February 15, 2013 17:28
> To: Yan Burman
> Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> r...@vger.kernel.org; Or Gerlitz
> Subject: Re: NFS over RDMA crashing
> 
> On Mon, Feb 11, 2013 at 03:19:42PM +, Yan Burman wrote:
> > > -Original Message-
> > > From: J. Bruce Fields [mailto:bfie...@fieldses.org]
> > > Sent: Thursday, February 07, 2013 18:42
> > > To: Yan Burman
> > > Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> > > r...@vger.kernel.org; Or Gerlitz
> > > Subject: Re: NFS over RDMA crashing
> > >
> > > On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> > > > On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > > > > When killing mount command that got stuck:
> > > > > ---
> > > > >
> > > > > BUG: unable to handle kernel paging request at 880324dc7ff8
> > > > > IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma] PGD
> > > > > 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
> > > > > Oops: 0003 [#1] PREEMPT SMP
> > > > > Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm
> > > iw_cm
> > > > > ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
> > > > > iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables
> > > > > x_tables
> > > > > nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
> > > > > target_core_file target_core_pscsi target_core_mod configfs
> > > > > 8021q bridge stp llc ipv6 dm_mirror dm_region_hash dm_log
> > > > > vhost_net macvtap macvlan tun uinput iTCO_wdt
> > > > > iTCO_vendor_support kvm_intel kvm crc32c_intel microcode pcspkr
> > > > > joydev i2c_i801 lpc_ich mfd_core ehci_pci ehci_hcd sg ioatdma
> > > > > ixgbe mdio mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core igb
> > > > > hwmon dca ptp pps_core button dm_mod ext3
> > > jbd
> > > > > sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod CPU 6
> > > > > Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
> > > > > X8DTH-i/6/iF/6F/X8DTH
> > > > > RIP: 0010:[]  []
> > > > > rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> > > > > RSP: 0018:880324c3dbf8  EFLAGS: 00010297
> > > > > RAX: 880324dc8000 RBX: 0001 RCX:
> > > > > 880324dd8428
> > > > > RDX: 880324dc7ff8 RSI: 880324dd8428 RDI:
> > > > > 81149618
> > > > > RBP: 880324c3dd78 R08: 60f9c860 R09:
> > > > > 0001
> > > > > R10: 880324dd8000 R11: 0001 R12:
> > > > > 8806299dcb10
> > > > > R13: 0003 R14: 0001 R15:
> > > > > 0010
> > > > > FS:  () GS:88063fc0()
> > > > > knlGS:
> > > > > CS:  0010 DS:  ES:  CR0: 8005003b
> > > > > CR2: 880324dc7ff8 CR3: 01a0b000 CR4:
> > > > > 07e0
> > > > > DR0:  DR1:  DR2:
> > > > > 
> > > > > DR3:  DR6: 0ff0 DR7:
> > > > > 0400 Process nfsd (pid: 4744, threadinfo
> > > > > 880324c3c000, task
> > > > > 88033055)
> > > > > Stack:
> > > > >  880324c3dc78 880324c3dcd8 0282
> > > > > 880631cec000
> > > > >  880324dd8000 88062ed33040 000124c3dc48
> > > > > 880324dd8000
> > > > >  88062ed33058 880630ce2b90 8806299e8000
> > > > > 0003 Call Trace:
> > > > >  [] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
> > > > > [] ? try_to_wake_up+0x2f0/0x2f0
> > > > > [] svc_recv+0x3ef/0x4b0 [sunrpc]
> > > > > [] ? nfsd_svc+0x740/0x740 [nfsd]
> > > > > [] nfsd+0xad/0x130 [nfsd]  [] ?
> > > > > nfsd_svc+0x740/0x740 [nfsd]  []
> > > > > kthread+0xd6/0xe0 [] ?
> > > > > __init_kthread_worker+0x70/0x70 []
> ret_from_fork+0x7c/0xb0  [] ?
> > > > > __init_kthread_worker+0x70/0x70
>

RE: NFS over RDMA crashing

2014-03-07 Thread Steve Wise
Resurrecting an old issue :)

More inline below...

> -Original Message-
> From: linux-nfs-ow...@vger.kernel.org [mailto:linux-nfs-
> ow...@vger.kernel.org] On Behalf Of J. Bruce Fields
> Sent: Thursday, February 07, 2013 10:42 AM
> To: Yan Burman
> Cc: linux-...@vger.kernel.org; sw...@opengridcomputing.com; linux-
> r...@vger.kernel.org; Or Gerlitz
> Subject: Re: NFS over RDMA crashing
> 
> On Wed, Feb 06, 2013 at 05:24:35PM -0500, J. Bruce Fields wrote:
> > On Wed, Feb 06, 2013 at 05:48:15PM +0200, Yan Burman wrote:
> > > When killing mount command that got stuck:
> > > ---
> > >
> > > BUG: unable to handle kernel paging request at 880324dc7ff8
> > > IP: [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> > > PGD 1a0c063 PUD 32f82e063 PMD 32f2fd063 PTE 800324dc7161
> > > Oops: 0003 [#1] PREEMPT SMP
> > > Modules linked in: md5 ib_ipoib xprtrdma svcrdma rdma_cm ib_cm
> iw_cm
> > > ib_addr nfsd exportfs netconsole ip6table_filter ip6_tables
> > > iptable_filter ip_tables ebtable_nat nfsv3 nfs_acl ebtables
x_tables
> > > nfsv4 auth_rpcgss nfs lockd autofs4 sunrpc target_core_iblock
> > > target_core_file target_core_pscsi target_core_mod configfs 8021q
> > > bridge stp llc ipv6 dm_mirror dm_region_hash dm_log vhost_net
> > > macvtap macvlan tun uinput iTCO_wdt iTCO_vendor_support kvm_intel
> > > kvm crc32c_intel microcode pcspkr joydev i2c_i801 lpc_ich mfd_core
> > > ehci_pci ehci_hcd sg ioatdma ixgbe mdio mlx4_ib ib_sa ib_mad
> ib_core
> > > mlx4_en mlx4_core igb hwmon dca ptp pps_core button dm_mod ext3
> jbd
> > > sd_mod ata_piix libata uhci_hcd megaraid_sas scsi_mod
> > > CPU 6
> > > Pid: 4744, comm: nfsd Not tainted 3.8.0-rc5+ #4 Supermicro
> > > X8DTH-i/6/iF/6F/X8DTH
> > > RIP: 0010:[]  []
> > > rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> > > RSP: 0018:880324c3dbf8  EFLAGS: 00010297
> > > RAX: 880324dc8000 RBX: 0001 RCX:
> 880324dd8428
> > > RDX: 880324dc7ff8 RSI: 880324dd8428 RDI: 81149618
> > > RBP: 880324c3dd78 R08: 60f9c860 R09:
> 0001
> > > R10: 880324dd8000 R11: 0001 R12: 8806299dcb10
> > > R13: 0003 R14: 0001 R15:
> 0010
> > > FS:  () GS:88063fc0()
> knlGS:
> > > CS:  0010 DS:  ES:  CR0: 8005003b
> > > CR2: 880324dc7ff8 CR3: 01a0b000 CR4:
> 07e0
> > > DR0:  DR1:  DR2:
> 
> > > DR3:  DR6: 0ff0 DR7:
> 0400
> > > Process nfsd (pid: 4744, threadinfo 880324c3c000, task
> 88033055)
> > > Stack:
> > >  880324c3dc78 880324c3dcd8 0282
> 880631cec000
> > >  880324dd8000 88062ed33040 000124c3dc48
> 880324dd8000
> > >  88062ed33058 880630ce2b90 8806299e8000
> 0003
> > > Call Trace:
> > >  [] svc_rdma_recvfrom+0x3ee/0xd80 [svcrdma]
> > >  [] ? try_to_wake_up+0x2f0/0x2f0
> > >  [] svc_recv+0x3ef/0x4b0 [sunrpc]
> > >  [] ? nfsd_svc+0x740/0x740 [nfsd]
> > >  [] nfsd+0xad/0x130 [nfsd]
> > >  [] ? nfsd_svc+0x740/0x740 [nfsd]
> > >  [] kthread+0xd6/0xe0
> > >  [] ? __init_kthread_worker+0x70/0x70
> > >  [] ret_from_fork+0x7c/0xb0
> > >  [] ? __init_kthread_worker+0x70/0x70
> > > Code: 63 c2 49 8d 8c c2 18 02 00 00 48 39 ce 77 e1 49 8b 82 40 0a
00
> > > 00 48 39 c6 0f 84 92 f7 ff ff 90 48 8d 50 f8 49 89 92 40 0a 00 00
> > > <48> c7 40 f8 00 00 00 00 49 8b 82 40 0a 00 00 49 3b 82 30 0a 00
> > > RIP  [] rdma_read_xdr+0x8bb/0xd40 [svcrdma]
> > >  RSP 
> > > CR2: 880324dc7ff8
> > > ---[ end trace 06d0384754e9609a ]---
> > >
> > >
> > > It seems that commit afc59400d6c65bad66d4ad0b2daf879cbff8e23e
> > > "nfsd4: cleanup: replace rq_resused count by rq_next_page pointer"
> > > is responsible for the crash (it seems to be crashing in
> > > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c:527)
> > > It may be because I have CONFIG_DEBUG_SET_MODULE_RONX and
> > > CONFIG_DEBUG_RODATA enabled. I did not try to disable them yet.
> > >
> > > When I moved to commit
> 79f77bf9a4e3dd5ead006b8f17e7c4ff07d8374e I
> > > was no longer getting the server crashes,
> > > so the rest of my tests were done using that point

RE: NFS over RDMA crashing

2014-03-07 Thread Steve Wise
> >
> > Does this help?
> >
> > They must have added this for some reason, but I'm not seeing how it
> > could have ever done anything
> >
> > --b.
> >
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > index 0ce7552..e8f25ec 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > @@ -520,13 +520,6 @@ next_sge:
> > for (ch_no = 0; &rqstp->rq_pages[ch_no] < rqstp->rq_respages;
> > ch_no++)
> > rqstp->rq_pages[ch_no] = NULL;
> >
> > -   /*
> > -* Detach res pages. If svc_release sees any it will attempt to
> > -* put them.
> > -*/
> > -   while (rqstp->rq_next_page != rqstp->rq_respages)
> > -   *(--rqstp->rq_next_page) = NULL;
> > -
> > return err;
> >  }
> >
> 
> I can reproduce this server crash readily on a recent net-next tree.
I
> added the above change, and see a different crash:
> 
> [  192.764773] BUG: unable to handle kernel paging request at
> 1000
> [  192.765688] IP: [] put_page+0x9/0x50
> [  192.765688] PGD 0
> [  192.765688] Oops:  [#1] SMP DEBUG_PAGEALLOC
> [  192.765688] Modules linked in: nfsd lockd nfs_acl exportfs
> auth_rpcgss oid_registry svcrdma tg3 ip6table_filter ip6_tables
> ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
> nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle iptable_filter
> ip_tables bridge stp llc autofs4 sunrpc rdma_ucm rdma_cm iw_cm
ib_ipoib
> ib_cm ib_uverbs ib_umad iw_nes libcrc32c iw_cxgb4 iw_cxgb3 cxgb3 mdio
> ib_qib dca mlx4_en ib_mthca vhost_net macvtap macvlan vhost tun
> kvm_intel kvm uinput ipmi_si ipmi_msghandler iTCO_wdt
> iTCO_vendor_support dcdbas sg microcode pcspkr mlx4_ib ib_sa serio_raw
> ib_mad ib_core ib_addr ipv6 ptp pps_core lpc_ich mfd_core i5100_edac
> edac_core mlx4_core cxgb4 ext4 jbd2 mbcache sd_mod crc_t10dif
> crct10dif_common sr_mod cdrom pata_acpi ata_generic ata_piix radeon
> ttm
> drm_kms_helper drm i2c_algo_bit
> [  192.765688]  i2c_core dm_mirror dm_region_hash dm_log dm_mod
> [last
> unloaded: tg3]
> [  192.765688] CPU: 1 PID: 6590 Comm: nfsd Not tainted
> 3.14.0-rc3-pending+ #5
> [  192.765688] Hardware name: Dell Inc. PowerEdge R300/0TY179, BIOS
> 1.3.0 08/15/2008
> [  192.765688] task: 8800b75c62c0 ti: 8801faa4a000 task.ti:
> 8801faa4a000
> [  192.765688] RIP: 0010:[]  []
> put_page+0x9/0x50
> [  192.765688] RSP: 0018:8801faa4be28  EFLAGS: 00010206
> [  192.765688] RAX: 8801fa9542a8 RBX: 8801fa954000 RCX:
> 0001
> [  192.765688] RDX: 8801fa953e10 RSI: 0200 RDI:
> 1000
> [  192.765688] RBP: 8801faa4be28 R08: 9b8d39b9 R09:
> 0017
> [  192.765688] R10:  R11:  R12:
> 8800cb2e7c00
> [  192.765688] R13: 8801fa954210 R14:  R15:
> 
> [  192.765688] FS:  () GS:88022ec8()
> knlGS:
> [  192.765688] CS:  0010 DS:  ES:  CR0: 8005003b
> [  192.765688] CR2: 1000 CR3: b9a5a000 CR4:
> 07e0
> [  192.765688] Stack:
> [  192.765688]  8801faa4be58 a0881f4e 880204dd0e00
> 8801fa954000
> [  192.765688]  880204dd0e00 8800cb2e7c00 8801faa4be88
> a08825f5
> [  192.765688]  8801fa954000 8800b75c62c0 81ae5ac0
> a08cf930
> [  192.765688] Call Trace:
> [  192.765688]  [] svc_xprt_release+0x6e/0xf0
[sunrpc]
> [  192.765688]  [] svc_recv+0x165/0x190 [sunrpc]
> [  192.765688]  [] ?
nfsd_pool_stats_release+0x60/0x60
> [nfsd]
> [  192.765688]  [] nfsd+0xb5/0x160 [nfsd]
> [  192.765688]  [] ?
nfsd_pool_stats_release+0x60/0x60
> [nfsd]
> [  192.765688]  [] kthread+0xce/0xf0
> [  192.765688]  [] ?
> kthread_freezable_should_stop+0x70/0x70
> [  192.765688]  [] ret_from_fork+0x7c/0xb0
> [  192.765688]  [] ?
> kthread_freezable_should_stop+0x70/0x70
> [  192.765688] Code: 8d 7b 10 e8 ea fa ff ff 48 c7 03 00 00 00 00 48
83
> c4 08 5b c9 c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 66 66
66
> 66 90 <66> f7 07 00 c0 75 32 8b 47 1c 48 8d 57 1c 85 c0 74 1c f0 ff 0a
> [  192.765688] RIP  [] put_page+0x9/0x50
> [  192.765688]  RSP 
> [  192.765688] CR2: 1000
> crash>

This new crash is here calling put_page() on garbage I guess:

static inline void svc_free_res_pages(struct svc_rqst *rqstp)
{
while (rqstp->rq_next_page != rqstp->rq_respages) {
struct page **pp = --rqstp->rq_next_page;
if (*pp) {
put_page(*pp);
*pp = NULL;
}
}
}
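
To make the role of the removed "Detach res pages" loop concrete, here is a
minimal userspace sketch. It is not the kernel code: the fake_* names and
count_puts() are hypothetical stand-ins, and only the two loop bodies follow
the shape of the diff and of svc_free_res_pages() above. It shows that every
slot left non-NULL between rq_respages and rq_next_page gets a put on
release, which would be consistent with put_page() being handed a stale
pointer once the detach loop is gone.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

/* Userspace stand-in for struct page; 'refs' models the page refcount. */
struct fake_page { int refs; };

/* Stand-in for the svc_rqst fields discussed in this thread. */
struct fake_rqst {
    struct fake_page *pages[NSLOTS];
    struct fake_page **respages;   /* first response slot */
    struct fake_page **next_page;  /* one past the last response slot in use */
};

static int puts_done;

static void fake_put_page(struct fake_page *p)
{
    p->refs--;
    puts_done++;
}

/* Same shape as svc_free_res_pages() quoted above. */
static void fake_free_res_pages(struct fake_rqst *rq)
{
    while (rq->next_page != rq->respages) {
        struct fake_page **pp = --rq->next_page;
        if (*pp) {
            fake_put_page(*pp);
            *pp = NULL;
        }
    }
}

/* Same shape as the "Detach res pages" hunk that the patch removes. */
static void fake_detach_res_pages(struct fake_rqst *rq)
{
    while (rq->next_page != rq->respages)
        *(--rq->next_page) = NULL;
}

/* Returns how many puts the release path performs on two leftover slots. */
int count_puts(int detach_first)
{
    struct fake_page a = { 1 }, b = { 1 };
    struct fake_rqst rq = { { NULL }, NULL, NULL };

    rq.respages = &rq.pages[2];
    rq.pages[2] = &a;
    rq.pages[3] = &b;
    rq.next_page = &rq.pages[4];

    /* The loops above only terminate if this ordering holds. */
    assert(rq.next_page >= rq.respages);

    puts_done = 0;
    if (detach_first)
        fake_detach_res_pages(&rq);
    fake_free_res_pages(&rq);
    return puts_done;
}
```

In this model count_puts(1) performs no puts because the detach loop has
already cleared the slots, while count_puts(0) puts both leftover entries.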
 



Re: NFS over RDMA crashing

2014-03-08 Thread Steve Wise


I removed your change and started debugging the original crash that happens
on top-o-tree.   Seems like rq_next_page is screwed up.  It should
always be >= rq_respages, yes?  I added a BUG_ON() to assert this in
rdma_read_xdr(), and we hit the BUG_ON().  Look:


crash> svc_rqst.rq_next_page 0x8800b84e6000
  rq_next_page = 0x8800b84e6228
crash> svc_rqst.rq_respages 0x8800b84e6000
  rq_respages = 0x8800b84e62a8

Any ideas Bruce/Tom?
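
As a quick sanity check on those two values, the sketch below computes how
far rq_next_page sits below rq_respages. It assumes 8-byte pointers (x86_64,
as in the oops) and takes the addresses exactly as the archive prints them;
slots_behind() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the crash session above, as printed in the archive. */
#define RQ_NEXT_PAGE 0x8800b84e6228ULL
#define RQ_RESPAGES  0x8800b84e62a8ULL

/* Assumed size of one struct page * slot on x86_64. */
#define SLOT_SIZE 8ULL

/*
 * Number of pointer-sized slots by which next_page trails respages.
 * Returns 0 when the expected ordering next_page >= respages holds.
 */
uint64_t slots_behind(uint64_t next_page, uint64_t respages)
{
    if (next_page >= respages)
        return 0;
    return (respages - next_page) / SLOT_SIZE;
}
```

For the values above this gives (0x62a8 - 0x6228) / 8 = 16 slots. A loop of
the form "while (next != res) --next" that starts below its target only
moves further away from it, which would fit the original oops walking off
the front of the page array.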

Here are the BUG_ON()s I added:

diff --

Re: NFS over RDMA crashing

2014-03-08 Thread Steve Wise


> I removed your change and started debugging original crash that
> happens on top-o-tree.   Seems like rq_next_pages is screwed up.  It
> should always be >= rq_respages, yes?  I added a BUG_ON() to assert
> this in rdma_read_xdr() we hit the BUG_ON(). Look
>
> crash> svc_rqst.rq_next_page 0x8800b84e6000
>   rq_next_page = 0x8800b84e6228
> crash> svc_rqst.rq_respages 0x8800b84e6000
>   rq_respages = 0x8800b84e62a8
>
> Any ideas Bruce/Tom?



Guys, the patch below seems to fix the problem.  Dunno if it is correct 
though.  What do you think?


diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index 0ce7552..6d62411 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
sge_no++;
}
rqstp->rq_respages = &rqstp->rq_pages[sge_no];
+   rqstp->rq_next_page = rqstp->rq_respages;

/* We should never run out of SGE because the limit is defined to
 * support the max allowed RPC data length
@@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,

/* rq_respages points one past arg pages */
rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
+   rqstp->rq_next_page = rqstp->rq_respages;

/* Create the reply and chunk maps */
offset = 0;




Re: NFS over RDMA crashing

2014-03-08 Thread Steve Wise

On 3/8/2014 1:20 PM, Steve Wise wrote:
>
>> I removed your change and started debugging original crash that
>> happens on top-o-tree.   Seems like rq_next_pages is screwed up.  It
>> should always be >= rq_respages, yes?  I added a BUG_ON() to assert
>> this in rdma_read_xdr() we hit the BUG_ON(). Look
>>
>> crash> svc_rqst.rq_next_page 0x8800b84e6000
>>   rq_next_page = 0x8800b84e6228
>> crash> svc_rqst.rq_respages 0x8800b84e6000
>>   rq_respages = 0x8800b84e62a8
>>
>> Any ideas Bruce/Tom?
>>
>
> Guys, the patch below seems to fix the problem.  Dunno if it is
> correct though.  What do you think?
>
> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> index 0ce7552..6d62411 100644
> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
> sge_no++;
> }
> rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> +   rqstp->rq_next_page = rqstp->rq_respages;
>
> /* We should never run out of SGE because the limit is defined to
>  * support the max allowed RPC data length
> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct
> svcxprt_rdma *xprt,
>
> /* rq_respages points one past arg pages */
> rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> +   rqstp->rq_next_page = rqstp->rq_respages;
>
> /* Create the reply and chunk maps */
> offset = 0;
>




While this patch avoids the crashing, it apparently isn't correct...I'm 
getting IO errors reading files over the mount. :)




Re: NFS over RDMA crashing

2014-03-12 Thread Jeff Layton
On Sat, 08 Mar 2014 14:13:44 -0600
Steve Wise  wrote:

> On 3/8/2014 1:20 PM, Steve Wise wrote:
> >
> >> I removed your change and started debugging original crash that 
> >> happens on top-o-tree.   Seems like rq_next_pages is screwed up.  It 
> >> should always be >= rq_respages, yes?  I added a BUG_ON() to assert 
> >> this in rdma_read_xdr() we hit the BUG_ON(). Look
> >>
> >> crash> svc_rqst.rq_next_page 0x8800b84e6000
> >>   rq_next_page = 0x8800b84e6228
> >> crash> svc_rqst.rq_respages 0x8800b84e6000
> >>   rq_respages = 0x8800b84e62a8
> >>
> >> Any ideas Bruce/Tom?
> >>
> >
> > Guys, the patch below seems to fix the problem.  Dunno if it is 
> > correct though.  What do you think?
> >
> > diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
> > b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > index 0ce7552..6d62411 100644
> > --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> > @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
> > sge_no++;
> > }
> > rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> > +   rqstp->rq_next_page = rqstp->rq_respages;
> >
> > /* We should never run out of SGE because the limit is defined to
> >  * support the max allowed RPC data length
> > @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct 
> > svcxprt_rdma *xprt,
> >
> > /* rq_respages points one past arg pages */
> > rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> > +   rqstp->rq_next_page = rqstp->rq_respages;
> >
> > /* Create the reply and chunk maps */
> > offset = 0;
> >
> >
> 
> While this patch avoids the crashing, it apparently isn't correct...I'm 
> getting IO errors reading files over the mount. :)
> 

I hit the same oops and tested your patch and it seems to have fixed
that particular panic, but I still see a bunch of other mem corruption
oopses even with it. I'll look more closely at that when I get some
time.

FWIW, I can easily reproduce that by simply doing something like:

$ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
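
For anyone who would rather drive the reproducer from a test program, here
is a rough C equivalent of that dd invocation: one read of the requested
size from /dev/urandom and one write to the target file. This is only a
sketch; write_random_block() is a hypothetical helper, and the path passed
to it is whatever file lives on your NFS/RDMA mount.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Copy 'len' bytes from /dev/urandom to 'path' using a single read() and a
 * single write(), creating or truncating 'path' as dd does by default.
 * Returns 0 on success and -1 on any failure.
 */
int write_random_block(const char *path, size_t len)
{
    int ret = -1;
    char *buf = malloc(len);
    int in = open("/dev/urandom", O_RDONLY);
    int out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);

    if (buf && in >= 0 && out >= 0 &&
        read(in, buf, len) == (ssize_t)len &&
        write(out, buf, len) == (ssize_t)len)
        ret = 0;

    /* Release whatever was acquired, regardless of outcome. */
    if (in >= 0)
        close(in);
    if (out >= 0)
        close(out);
    free(buf);
    return ret;
}
```

Calling write_random_block("/file/on/nfsordma/mount", 4096) corresponds to
bs=4k count=1 in the command above.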

I'm not sure why you're not seeing any panics with your patch in place.
Perhaps it's due to hw differences between our test rigs.

The EIO problem that you're seeing is likely the same client bug that
Chuck recently fixed in this patch:

[PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA

AIUI, Trond is merging that set for 3.15, so I'd make sure your client
has those patches when testing.

Finally, I also have a forthcoming patch to fix non-page aligned NFS
READs as well. I'm hesitant to send that out though until I can at
least run the connectathon testsuite against this server. The WRITE
oopses sort of prevent that for now...

-- 
Jeff Layton 


Re: NFS over RDMA crashing

2014-03-12 Thread Trond Myklebust

On Mar 12, 2014, at 9:33, Jeff Layton  wrote:

> On Sat, 08 Mar 2014 14:13:44 -0600
> Steve Wise  wrote:
> 
>> On 3/8/2014 1:20 PM, Steve Wise wrote:
>>> 
>>>> I removed your change and started debugging original crash that
>>>> happens on top-o-tree.   Seems like rq_next_pages is screwed up.  It
>>>> should always be >= rq_respages, yes?  I added a BUG_ON() to assert
>>>> this in rdma_read_xdr() we hit the BUG_ON(). Look
>>>>
>>>> crash> svc_rqst.rq_next_page 0x8800b84e6000
>>>> rq_next_page = 0x8800b84e6228
>>>> crash> svc_rqst.rq_respages 0x8800b84e6000
>>>> rq_respages = 0x8800b84e62a8
>>>>
>>>> Any ideas Bruce/Tom?
>>>>
>>> 
>>> Guys, the patch below seems to fix the problem.  Dunno if it is 
>>> correct though.  What do you think?
>>> 
>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
>>> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>> index 0ce7552..6d62411 100644
>>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
>>>   sge_no++;
>>>   }
>>>   rqstp->rq_respages = &rqstp->rq_pages[sge_no];
>>> +   rqstp->rq_next_page = rqstp->rq_respages;
>>> 
>>>   /* We should never run out of SGE because the limit is defined to
>>>* support the max allowed RPC data length
>>> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct 
>>> svcxprt_rdma *xprt,
>>> 
>>>   /* rq_respages points one past arg pages */
>>>   rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
>>> +   rqstp->rq_next_page = rqstp->rq_respages;
>>> 
>>>   /* Create the reply and chunk maps */
>>>   offset = 0;
>>> 
>>> 
>> 
>> While this patch avoids the crashing, it apparently isn't correct...I'm 
>> getting IO errors reading files over the mount. :)
>> 
> 
> I hit the same oops and tested your patch and it seems to have fixed
> that particular panic, but I still see a bunch of other mem corruption
> oopses even with it. I'll look more closely at that when I get some
> time.
> 
> FWIW, I can easily reproduce that by simply doing something like:
> 
>   $ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
> 
> I'm not sure why you're not seeing any panics with your patch in place.
> Perhaps it's due to hw differences between our test rigs.
> 
> The EIO problem that you're seeing is likely the same client bug that
> Chuck recently fixed in this patch:
> 
>   [PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA
> 
> AIUI, Trond is merging that set for 3.15, so I'd make sure your client
> has those patches when testing.
> 

Nothing is in my queue yet.

_
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com



Re: NFS over RDMA crashing

2014-03-12 Thread Tom Tucker

Hi Trond,

I think this patch is still 'off-by-one'. We'll take a look at this today.

Thanks,
Tom



Re: NFS over RDMA crashing

2014-03-12 Thread Jeffrey Layton
On Wed, 12 Mar 2014 10:05:24 -0400
Trond Myklebust  wrote:

> 
> On Mar 12, 2014, at 9:33, Jeff Layton  wrote:
> 
> > On Sat, 08 Mar 2014 14:13:44 -0600
> > Steve Wise  wrote:
> > 
> >> On 3/8/2014 1:20 PM, Steve Wise wrote:
> >>> 
> >>>> I removed your change and started debugging original crash that
> >>>> happens on top-o-tree.   Seems like rq_next_pages is screwed
> >>>> up.  It should always be >= rq_respages, yes?  I added a
> >>>> BUG_ON() to assert this in rdma_read_xdr() we hit the BUG_ON().
> >>>> Look
> >>>>
> >>>> crash> svc_rqst.rq_next_page 0x8800b84e6000
> >>>> rq_next_page = 0x8800b84e6228
> >>>> crash> svc_rqst.rq_respages 0x8800b84e6000
> >>>> rq_respages = 0x8800b84e62a8
> >>>>
> >>>> Any ideas Bruce/Tom?
> >>>>
> >>> 
> >>> Guys, the patch below seems to fix the problem.  Dunno if it is 
> >>> correct though.  What do you think?
> >>> 
> >>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c 
> >>> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> index 0ce7552..6d62411 100644
> >>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst
> >>> *rqstp, sge_no++;
> >>>   }
> >>>   rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> >>> +   rqstp->rq_next_page = rqstp->rq_respages;
> >>> 
> >>>   /* We should never run out of SGE because the limit is
> >>> defined to
> >>>* support the max allowed RPC data length
> >>> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct 
> >>> svcxprt_rdma *xprt,
> >>> 
> >>>   /* rq_respages points one past arg pages */
> >>>   rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> >>> +   rqstp->rq_next_page = rqstp->rq_respages;
> >>> 
> >>>   /* Create the reply and chunk maps */
> >>>   offset = 0;
> >>> 
> >>> 
> >> 
> >> While this patch avoids the crashing, it apparently isn't
> >> correct...I'm getting IO errors reading files over the mount. :)
> >> 
> > 
> > I hit the same oops and tested your patch and it seems to have fixed
> > that particular panic, but I still see a bunch of other mem
> > corruption oopses even with it. I'll look more closely at that when
> > I get some time.
> > 
> > FWIW, I can easily reproduce that by simply doing something like:
> > 
> >   $ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
> > 
> > I'm not sure why you're not seeing any panics with your patch in
> > place. Perhaps it's due to hw differences between our test rigs.
> > 
> > The EIO problem that you're seeing is likely the same client bug
> > that Chuck recently fixed in this patch:
> > 
> >   [PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA
> > 
> > AIUI, Trond is merging that set for 3.15, so I'd make sure your
> > client has those patches when testing.
> > 
> 
> Nothing is in my queue yet.
> 

Doh! Any reason not to merge that set from Chuck? They do fix a couple
of nasty client bugs...

-- 
Jeff Layton 


Re: NFS over RDMA crashing

2014-03-12 Thread Trond Myklebust

On Mar 12, 2014, at 10:28, Jeffrey Layton  wrote:

> On Wed, 12 Mar 2014 10:05:24 -0400
> Trond Myklebust  wrote:
> 
>> 
>> On Mar 12, 2014, at 9:33, Jeff Layton  wrote:
>> 
>>> On Sat, 08 Mar 2014 14:13:44 -0600
>>> Steve Wise  wrote:
>>> 
>>>> On 3/8/2014 1:20 PM, Steve Wise wrote:
>>>>>
>>>>>> I removed your change and started debugging original crash that
>>>>>> happens on top-o-tree.   Seems like rq_next_pages is screwed
>>>>>> up.  It should always be >= rq_respages, yes?  I added a
>>>>>> BUG_ON() to assert this in rdma_read_xdr() we hit the BUG_ON().
>>>>>> Look
>>>>>>
>>>>>> crash> svc_rqst.rq_next_page 0x8800b84e6000
>>>>>> rq_next_page = 0x8800b84e6228
>>>>>> crash> svc_rqst.rq_respages 0x8800b84e6000
>>>>>> rq_respages = 0x8800b84e62a8
>>>>>>
>>>>>> Any ideas Bruce/Tom?
>>>>>>
>>>>>
>>>>> Guys, the patch below seems to fix the problem.  Dunno if it is
>>>>> correct though.  What do you think?
>>>>>
>>>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>>>> b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>>>> index 0ce7552..6d62411 100644
>>>>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
>>>>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst
>>>>> *rqstp, sge_no++;
>>>>>  }
>>>>>  rqstp->rq_respages = &rqstp->rq_pages[sge_no];
>>>>> +   rqstp->rq_next_page = rqstp->rq_respages;
>>>>>
>>>>>  /* We should never run out of SGE because the limit is
>>>>> defined to
>>>>>   * support the max allowed RPC data length
>>>>> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct
>>>>> svcxprt_rdma *xprt,
>>>>>
>>>>>  /* rq_respages points one past arg pages */
>>>>>  rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
>>>>> +   rqstp->rq_next_page = rqstp->rq_respages;
>>>>>
>>>>>  /* Create the reply and chunk maps */
>>>>>  offset = 0;
>>>>>
>>>>>
>>>>
>>>> While this patch avoids the crashing, it apparently isn't
>>>> correct...I'm getting IO errors reading files over the mount. :)
>>>>
>>> 
>>> I hit the same oops and tested your patch and it seems to have fixed
>>> that particular panic, but I still see a bunch of other mem
>>> corruption oopses even with it. I'll look more closely at that when
>>> I get some time.
>>> 
>>> FWIW, I can easily reproduce that by simply doing something like:
>>> 
>>>  $ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
>>> 
>>> I'm not sure why you're not seeing any panics with your patch in
>>> place. Perhaps it's due to hw differences between our test rigs.
>>> 
>>> The EIO problem that you're seeing is likely the same client bug
>>> that Chuck recently fixed in this patch:
>>> 
>>>  [PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA
>>> 
>>> AIUI, Trond is merging that set for 3.15, so I'd make sure your
>>> client has those patches when testing.
>>> 
>> 
>> Nothing is in my queue yet.
>> 
> 
> Doh! Any reason not to merge that set from Chuck? They do fix a couple
> of nasty client bugs…
> 

Most of them are one-line debugging dprintks which I do not intend to apply.

One of them confuses a readdir optimisation with a bugfix; at the very least 
the patch comments need changing.
That leaves 2 that can go in; however, as they are clearly insufficient to make 
RDMA safe for general use, they certainly do not warrant a stable@ label. The 
workaround for the Oopses is simple: use TCP.

_
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com



Re: NFS over RDMA crashing

2014-03-12 Thread Jeffrey Layton
On Wed, 12 Mar 2014 11:03:52 -0400
Trond Myklebust  wrote:

> 
> On Mar 12, 2014, at 10:28, Jeffrey Layton  wrote:
> 
> > On Wed, 12 Mar 2014 10:05:24 -0400
> > Trond Myklebust  wrote:
> > 
> >> 
> >> On Mar 12, 2014, at 9:33, Jeff Layton  wrote:
> >> 
> >>> On Sat, 08 Mar 2014 14:13:44 -0600
> >>> Steve Wise  wrote:
> >>> 
> >>>> On 3/8/2014 1:20 PM, Steve Wise wrote:
> >>>>> 
> >>>>>> I removed your change and started debugging original crash
> >>>>>> that happens on top-o-tree.   Seems like rq_next_page is
> >>>>>> screwed up.  It should always be >= rq_respages, yes?  I added
> >>>>>> a BUG_ON() to assert this in rdma_read_xdr() and we hit the
> >>>>>> BUG_ON(). Look:
> >>>>>> 
> >>>>>> crash> svc_rqst.rq_next_page 0x8800b84e6000
> >>>>>> rq_next_page = 0x8800b84e6228
> >>>>>> crash> svc_rqst.rq_respages 0x8800b84e6000
> >>>>>> rq_respages = 0x8800b84e62a8
> >>>>>> 
> >>>>>> Any ideas Bruce/Tom?
> >>>>>> 
> >>>>> 
> >>>>> Guys, the patch below seems to fix the problem.  Dunno if it is
> >>>>> correct though.  What do you think?
> >>>>> 
> >>>>> diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>>>> index 0ce7552..6d62411 100644
> >>>>> --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>>>> +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
> >>>>> @@ -90,6 +90,7 @@ static void rdma_build_arg_xdr(struct svc_rqst *rqstp,
> >>>>>  sge_no++;
> >>>>>  }
> >>>>>  rqstp->rq_respages = &rqstp->rq_pages[sge_no];
> >>>>> +   rqstp->rq_next_page = rqstp->rq_respages;
> >>>>> 
> >>>>>  /* We should never run out of SGE because the limit is defined to
> >>>>>   * support the max allowed RPC data length
> >>>>> @@ -276,6 +277,7 @@ static int fast_reg_read_chunks(struct svcxprt_rdma *xprt,
> >>>>> 
> >>>>>  /* rq_respages points one past arg pages */
> >>>>>  rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
> >>>>> +   rqstp->rq_next_page = rqstp->rq_respages;
> >>>>> 
> >>>>>  /* Create the reply and chunk maps */
> >>>>>  offset = 0;
> >>>>> 
> >>>>> 
> >>>> 
> >>>> While this patch avoids the crashing, it apparently isn't
> >>>> correct...I'm getting IO errors reading files over the mount. :)
> >>>> 
> >>> 
> >>> I hit the same oops and tested your patch and it seems to have
> >>> fixed that particular panic, but I still see a bunch of other mem
> >>> corruption oopses even with it. I'll look more closely at that
> >>> when I get some time.
> >>> 
> >>> FWIW, I can easily reproduce that by simply doing something like:
> >>> 
> >>>  $ dd if=/dev/urandom of=/file/on/nfsordma/mount bs=4k count=1
> >>> 
> >>> I'm not sure why you're not seeing any panics with your patch in
> >>> place. Perhaps it's due to hw differences between our test rigs.
> >>> 
> >>> The EIO problem that you're seeing is likely the same client bug
> >>> that Chuck recently fixed in this patch:
> >>> 
> >>>  [PATCH 2/8] SUNRPC: Fix large reads on NFS/RDMA
> >>> 
> >>> AIUI, Trond is merging that set for 3.15, so I'd make sure your
> >>> client has those patches when testing.
> >>> 
> >> 
> >> Nothing is in my queue yet.
> >> 
> > 
> > Doh! Any reason not to merge that set from Chuck? They do fix a
> > couple of nasty client bugs…
> > 
> 
> Most of them are one-line debugging dprintks which I do not intend to
> apply.
> 

Fair enough. Those are certainly not necessary, but some of them clean
up existing printks and probably do need to go in. That said, debugging
this stuff is *really* difficult so having extra debug printks in place
seems like a good thing (unless you're arguing for moving wholesale to
tracepoints instead).

> One of them confuses a readdir optimisation with a bugfix; at the
> very least the patch comments need changing.

I'll leave that to Chuck to comment on. I had the impression that it
was a bugfix, but maybe there's some better way to handle that bug.

>  That leaves 2 that can
> go in, however as they are clearly insufficient to make RDMA safe for
> general use, they certainly do not warrant a stable@ label. The
> workaround for the Oopses is simple: use TCP.
> 

Yeah, it's definitely rickety, but it's in and we do need to get fixes
merged to this code. I'm ok with dropping the stable labels on those
patches, but if we're going to declare this stuff "not stable enough
for general use" then I think that we should take an aggressive approach
on merging fixes to it.

FWIW, I also notice that Kconfig doesn't show the option to actually
enable/disable RDMA transports. I'll post a patch to fix that soon.
Since this stuff is not very safe to use, we should make it
reasonably simple to disable it.

-- 
Jeff Layton 