Re: [PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception

2017-08-24 Thread J. Bruce Fields
On Wed, Aug 23, 2017 at 06:33:42AM -0400, Jeff Layton wrote:
> I think this one looks fine. Bruce, do you mind picking this one up? It
> might also be reasonable for stable...

Yep, thanks to you both, I'll plan to send a pull request tonight or
tomorrow.

--b.


Re: [PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception

2017-08-23 Thread Jeff Layton
On Wed, 2017-08-23 at 06:24 -0400, Vadim Lomovtsev wrote:
> Hi all,
> 
> Any comments on this?
> 
> WBR,
> Vadim
> 
> On Mon, Aug 21, 2017 at 07:23:07AM -0400, Vadim Lomovtsev wrote:
> > While running the nfs/connectathon tests, a kernel NULL-pointer
> > dereference was observed due to a race in svcsock.c.
> > 
> > The race appears when the kernel accepts a connection via kernel_accept
> > (which creates a new socket) and starts queuing ingress packets
> > to the new socket. This happens in softirq context, which can run
> > concurrently on a different core while setup of the new socket is not
> > yet complete.
> > 
> > The fix is to re-order the socket user data init sequence and add
> > write/read barrier calls to ensure that the callback pointers hold
> > valid values before they are actually called.
> > 
> > Test results: nfs/connectathon reports 0 failed tests over 200+
> > iterations.
> > 
> > Crash log:
> > ---<-snip->---
> > [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual 
> > address 
> > [ 6708.647093] pgd = 094e
> > [ 6708.650497] [] *pgd=01090003, *pud=01090003, 
> > *pmd=01080003, *pte=
> > [ 6708.660761] Internal error: Oops: 8605 [#1] SMP
> > [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log 
> > nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay 
> > xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 
> > xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag 
> > udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm 
> > libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp 
> > scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core 
> > nls_koi8_u nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack 
> > vfat fat ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg 
> > thunderx_edac i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd 
> > grace sunrpc xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper 
> > syscopyarea sysfillrect sysimgblt fb_sys_fops
> > [ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder 
> > mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
> > stap_3c300909c5b3f46dcacd49aab3334af_87021]
> > [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW  OE   
> > 4.11.0-4.el7.aarch64 #1
> > [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 
> > 2017
> > [ 6708.767910] task: 810006842e80 task.stack: 81000689c000
> > [ 6708.773822] PC is at 0x0
> > [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
> > [ 6708.781866] pc : [<>] lr : [] pstate: 
> > 6145
> > [ 6708.789248] sp : 810ffbad3900
> > [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58
> > [ 6708.797853] x27:  x26: 81000bbe1e00
> > [ 6708.803156] x25: 0020 x24: 800f7410bf28
> > [ 6708.808458] x23: 08c63000 x22: 08c63000
> > [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00
> > [ 6708.819063] x19: 810012412400 x18: d82a9df2
> > [ 6708.824365] x17:  x16: 
> > [ 6708.829667] x15:  x14: 0001
> > [ 6708.834969] x13:  x12: 722e736f622e676e
> > [ 6708.840271] x11: f814dd99 x10: 
> > [ 6708.845573] x9 : 737468722500 x8 : 
> > [ 6708.850875] x7 :  x6 : 
> > [ 6708.856177] x5 : 0028 x4 : 
> > [ 6708.861479] x3 :  x2 : e500
> > [ 6708.866781] x1 :  x0 : 81000bbe1e00
> > [ 6708.872084]
> > [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000)
> > [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a)
> > [ 6708.886075] Call trace:
> > [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840)
> > [ 6708.894942] 3700:   810012412400 
> > 0001
> > [ 6708.902759] 3720: 810ffbad3900  6145 
> > 800f7930
> > [ 6708.910577] 3740: 09274d00 03ea 0015 
> > 08c63000
> > [ 6708.918395] 3760: 810ffbad3830 800f7930 004d 
> > 
> > [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 
> > 004d
> > [ 6708.934030] 37a0: 800f7930093c 08c63000  
> > 0140
> > [ 6708.941848] 37c0: 08c2c000 00040b00 81000bbe1e00 
> > 
> > [ 6708.949665] 37e0: e500   
> > 0028
> > [ 6708.957483] 3800:    
> > 737468722500
> > [ 6708.965300] 3820:  f814dd99 722e736f622e676e 
> > 
> > [ 
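
The ordering described in the quoted patch (save the callback pointers first, publish the socket user data last, and pair a write barrier with a read barrier) can be sketched in user space roughly as follows. This is only an illustration, not the patch itself: the struct and function names (demo_sock, demo_svc_sock, demo_setup_socket, demo_data_ready) are hypothetical, and C11 release/acquire atomics stand in for the kernel's write/read barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct demo_sock;
typedef void (*data_ready_fn)(struct demo_sock *sk);

struct demo_svc_sock {
    data_ready_fn saved_data_ready;   /* original callback, saved during setup */
};

struct demo_sock {
    _Atomic(struct demo_svc_sock *) user_data;  /* NULL until setup is done */
    data_ready_fn data_ready;                   /* callback the socket came with */
    int ready_calls;                            /* counts calls, for the demo only */
};

/* Plays the role of the socket's original data-ready callback. */
void original_data_ready(struct demo_sock *sk)
{
    sk->ready_calls++;
}

/*
 * Setup side: fill in the saved callback first, then publish the pointer.
 * The release store plays the role of the write barrier: a reader that
 * observes user_data != NULL also observes saved_data_ready.
 */
void demo_setup_socket(struct demo_sock *sk, struct demo_svc_sock *svsk)
{
    svsk->saved_data_ready = sk->data_ready;
    atomic_store_explicit(&sk->user_data, svsk, memory_order_release);
}

/*
 * Receive side: may run on another core while setup is still in progress.
 * The acquire load plays the role of the read barrier.
 * Returns 1 if the saved callback was invoked, 0 if setup is not yet visible.
 */
int demo_data_ready(struct demo_sock *sk)
{
    struct demo_svc_sock *svsk =
        atomic_load_explicit(&sk->user_data, memory_order_acquire);

    if (svsk == NULL)
        return 0;                       /* socket not fully set up yet */

    assert(svsk->saved_data_ready != NULL);
    svsk->saved_data_ready(sk);
    return 1;
}
```

Calling demo_data_ready() on a socket before demo_setup_socket() has run simply returns 0. Once setup has published user_data, the saved callback is guaranteed to be visible to the reader, so it is never called through a NULL pointer.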

Re: [PATCH v4] net: sunrpc: svcsock: fix NULL-pointer exception

2017-08-23 Thread Vadim Lomovtsev

Hi all,

Any comments on this?

WBR,
Vadim

On Mon, Aug 21, 2017 at 07:23:07AM -0400, Vadim Lomovtsev wrote:
> While running the nfs/connectathon tests, a kernel NULL-pointer
> dereference was observed due to a race in svcsock.c.
> 
> The race appears when the kernel accepts a connection via kernel_accept
> (which creates a new socket) and starts queuing ingress packets
> to the new socket. This happens in softirq context, which can run
> concurrently on a different core while setup of the new socket is not
> yet complete.
> 
> The fix is to re-order the socket user data init sequence and add
> write/read barrier calls to ensure that the callback pointers hold
> valid values before they are actually called.
> 
> Test results: nfs/connectathon reports 0 failed tests over 200+
> iterations.
> 
> Crash log:
> ---<-snip->---
> [ 6708.638984] Unable to handle kernel NULL pointer dereference at virtual 
> address 
> [ 6708.647093] pgd = 094e
> [ 6708.650497] [] *pgd=01090003, *pud=01090003, 
> *pmd=01080003, *pte=
> [ 6708.660761] Internal error: Oops: 8605 [#1] SMP
> [ 6708.665630] Modules linked in: nfsv3 nfnetlink_queue nfnetlink_log 
> nfnetlink rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache overlay 
> xt_CONNSECMARK xt_SECMARK xt_conntrack iptable_security ip_tables ah4 
> xfrm4_mode_transport sctp tun binfmt_misc ext4 jbd2 mbcache loop tcp_diag 
> udp_diag inet_diag rpcrdma ib_isert iscsi_target_mod ib_iser rdma_cm iw_cm 
> libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp 
> scsi_transport_srp ib_ipoib ib_ucm ib_uverbs ib_umad ib_cm ib_core nls_koi8_u 
> nls_cp932 ts_kmp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack vfat fat 
> ghash_ce sha2_ce sha1_ce cavium_rng_vf i2c_thunderx sg thunderx_edac 
> i2c_smbus edac_core cavium_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc 
> xfs libcrc32c nicvf nicpf ast i2c_algo_bit drm_kms_helper syscopyarea 
> sysfillrect sysimgblt fb_sys_fops
> [ 6708.736446]  ttm drm i2c_core thunder_bgx thunder_xcv mdio_thunder 
> mdio_cavium dm_mirror dm_region_hash dm_log dm_mod [last unloaded: 
> stap_3c300909c5b3f46dcacd49aab3334af_87021]
> [ 6708.752275] CPU: 84 PID: 0 Comm: swapper/84 Tainted: GW  OE   
> 4.11.0-4.el7.aarch64 #1
> [ 6708.760787] Hardware name: www.cavium.com CRB-2S/CRB-2S, BIOS 0.3 Mar 13 
> 2017
> [ 6708.767910] task: 810006842e80 task.stack: 81000689c000
> [ 6708.773822] PC is at 0x0
> [ 6708.776739] LR is at svc_data_ready+0x38/0x88 [sunrpc]
> [ 6708.781866] pc : [<>] lr : [] pstate: 
> 6145
> [ 6708.789248] sp : 810ffbad3900
> [ 6708.792551] x29: 810ffbad3900 x28: 08c73d58
> [ 6708.797853] x27:  x26: 81000bbe1e00
> [ 6708.803156] x25: 0020 x24: 800f7410bf28
> [ 6708.808458] x23: 08c63000 x22: 08c63000
> [ 6708.813760] x21: 800f7410bf28 x20: 81000bbe1e00
> [ 6708.819063] x19: 810012412400 x18: d82a9df2
> [ 6708.824365] x17:  x16: 
> [ 6708.829667] x15:  x14: 0001
> [ 6708.834969] x13:  x12: 722e736f622e676e
> [ 6708.840271] x11: f814dd99 x10: 
> [ 6708.845573] x9 : 737468722500 x8 : 
> [ 6708.850875] x7 :  x6 : 
> [ 6708.856177] x5 : 0028 x4 : 
> [ 6708.861479] x3 :  x2 : e500
> [ 6708.866781] x1 :  x0 : 81000bbe1e00
> [ 6708.872084]
> [ 6708.873565] Process swapper/84 (pid: 0, stack limit = 0x81000689c000)
> [ 6708.880341] Stack: (0x810ffbad3900 to 0x8100068a)
> [ 6708.886075] Call trace:
> [ 6708.888513] Exception stack(0x810ffbad3710 to 0x810ffbad3840)
> [ 6708.894942] 3700:   810012412400 
> 0001
> [ 6708.902759] 3720: 810ffbad3900  6145 
> 800f7930
> [ 6708.910577] 3740: 09274d00 03ea 0015 
> 08c63000
> [ 6708.918395] 3760: 810ffbad3830 800f7930 004d 
> 
> [ 6708.926212] 3780: 810ffbad3890 080f88dc 800f7930 
> 004d
> [ 6708.934030] 37a0: 800f7930093c 08c63000  
> 0140
> [ 6708.941848] 37c0: 08c2c000 00040b00 81000bbe1e00 
> 
> [ 6708.949665] 37e0: e500   
> 0028
> [ 6708.957483] 3800:    
> 737468722500
> [ 6708.965300] 3820:  f814dd99 722e736f622e676e 
> 
> [ 6708.973117] [<  (null)>]   (null)
> [ 6708.977824] [] tcp_data_queue+0x754/0xc5c
> [ 6708.983386] [] tcp_rcv_established+0x1a0/0x67c
> [ 6708.989384] [] tcp_v4_do_rcv+0x15c/0x22c
> [ 6708.994858] [] tcp_v4_rcv+0xaf0/0xb58
> [