Re: [PATCH] net: ieee802154: fix net_device reference release too early

2017-05-22 Thread zhanglin496
Hello.

Sorry for the late reply.
>
> Hello.
>
> On Thu, 2017-05-18 at 15:14, Stefan Schmidt wrote:
> > Hello.
> >
> > On Thu, 2017-05-18 at 15:50, linzhang wrote:
> > > This patch fixes a kernel oops caused by releasing the net_device
> > > reference too early. In raw_sendmsg() (I think dgram_sendmsg() has the
> > > same problem), there is a race between dev_put() and dev_queue_xmit()
> > > when the device is being removed, which may lead dev_queue_xmit() to
> > > see an invalid net_device pointer.
> > >
> >
> > You have a test case to reproduce this oops? I fear I have not seen
> > one.
>
> If you have a test case handy, adding it to the commit would be helpful. If
> you do not have one around we can do without.
>

My test kernel is 3.13.0-32.
Because I do not have a real 802.15.4 device, I changed the lowpan_newlink
function to this:

/* find and hold real wpan device */
real_dev = dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
if (!real_dev)
	return -ENODEV;
/* type check disabled so that a non-802.15.4 device (eth0) is accepted */
//  if (real_dev->type != ARPHRD_IEEE802154) {
//  	dev_put(real_dev);
//  	return -EINVAL;
//  }
lowpan_dev_info(dev)->real_dev = real_dev;
lowpan_dev_info(dev)->fragment_tag = 0;
mutex_init(&lowpan_dev_info(dev)->dev_list_mtx);

Also, in order to simulate preemption, I changed the raw_sendmsg function to this:

skb->dev = dev;
skb->sk  = sk;
skb->protocol = htons(ETH_P_IEEE802154);
dev_put(dev);
/* simulate preemption: sleep 30 seconds between dev_put and dev_queue_xmit */
schedule_timeout_uninterruptible(30 * HZ);
err = dev_queue_xmit(skb);
if (err > 0)
	err = net_xmit_errno(err);

And this is my userspace test program, named test_send_data:

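/* headers required by the calls below */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

int main(int argc, char **argv)
{
	char buf[127];
	int sockfd;

	sockfd = socket(AF_IEEE802154, SOCK_RAW, 0);
	if (sockfd < 0) {
		printf("create sockfd error: %s\n", strerror(errno));
		return -1;
	}
	/* the payload content does not matter for this test */
	memset(buf, 0, sizeof(buf));
	send(sockfd, buf, sizeof(buf), 0);
	return 0;
}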

This is my test case:
root@zhanglin-x-computer:~/develop/802154# uname -a
Linux zhanglin-x-computer 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
root@zhanglin-x-computer:~/develop/802154# ip link add link eth0 name lowpan0 type lowpan
root@zhanglin-x-computer:~/develop/802154#
//keep the lowpan0 device down
root@zhanglin-x-computer:~/develop/802154# ./test_send_data &
//wait a while
root@zhanglin-x-computer:~/develop/802154# ip link del link dev lowpan0
//the device is gone
//oops
[381.303307] general protection fault:  [#1]SMP
[381.303407] Modules linked in: af_802154 6lowpan bnep rfcomm
bluetooth nls_iso8859_1 snd_hda_codec_hdmi snd_hda_codec_realtek
rts5139(C) snd_hda_intel
snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi
snd_seq_midi_event snd_rawmidi snd_seq intel_rapl snd_seq_device
coretemp i915 kvm_intel
kvm snd_timer snd crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
cryptd drm_kms_helper drm i2c_algo_bit soundcore video mac_hid
parport_pc ppdev lp parport hid_generic
usbhid hid ahci r8169 mii libahci
[381.304286] CPU: 1 PID: 2524 Comm: 1 Tainted: G C 0 3.13.0-32-generic
#57-Ubuntu
[381.304409] Hardware name: Haier Haier DT Computer/Haier DT Computer,
BIOS FIBT19H02_X64 06/09/2014
[381.304546] tasks: 96965fc0 ti: B0013779c000 task.ti:
B8013779c000
[381.304659] RIP: 0010:[] []
__dev_queue_xmit+0x61/0x500
[381.304798] RSP: 0018:B8013779dca0 EFLAGS: 00010202
[381.304880] RAX: 272b031d57565351 RBX:  RCX: 8800968f1a00
[381.304987] RDX:  RSI:  RDI: 8800968f1a00
[381.305095] RBP: 8e013773dce0 R08: 0266 R09: 0004
[381.305202] R10: 0004 R11: 0005 R12: 88013902e000
[381.305310] R13: 007f R14: 007f R15: 8800968f1a00
[381.305418] FS:  7fc57f50f740() GS: 88013fc8()
knlGS: 
[381.305540] CS:  0010 DS:  ES:  CR0: 8005003b
[381.305627] CR2: 7fad0841c000 CR3: 0001368dd000 CR4: 001007e0
[361.905734] Stack:
[381.305768]  002052d0 3facb30a 88013779dcc0
880137764000
[381.305898]  88013779de70 007f 007f
88013902e000
[381.306026]  88013779dcf0 81622490 88013779dd39
a03af9f1
[381.306155] Call Trace:
[381.306202]  [] dev_queue_xmit+0x10/0x20
[381.306294]  [] raw_sendmsg+0x1b1/0x270 [af_802154]
[381.306396]  [] ieee802154_sock_sendmsg+0x14/0x20 [af_802154]
[381.306512]  [] sock_sendmsg+0x8b/0xc0
[381.306600]  [] ? __d_alloc+0x25/0x180
[381.306687]  [] ? kmem_cache_alloc_trace+0x1c6/0x1f0
[381.306791]  [] SYSC_sendto+0x121/0x1c0
[381.306878]  [] ? vtime_account_user+0x54/0x60
[381.306975]  [] ? syscall_trace_enter+0x145/0x250
[381.307073]  [] SyS_sendto+0xe/0x10
[381.307156]  [] tracesys

Re: [PATCH] net: ieee802154: fix net_device reference release too early

2017-05-18 Thread Stefan Schmidt
Hello.

On Thu, 2017-05-18 at 15:14, Stefan Schmidt wrote:
> Hello.
> 
> On Thu, 2017-05-18 at 15:50, linzhang wrote:
> > This patch fixes a kernel oops caused by releasing the net_device
> > reference too early. In raw_sendmsg() (I think dgram_sendmsg() has the
> > same problem), there is a race between dev_put() and dev_queue_xmit()
> > when the device is being removed, which may lead dev_queue_xmit() to
> > see an invalid net_device pointer.
> > 
> 
> You have a test case to reproduce this oops? I fear I have not seen
> one.

If you have a test case handy, adding it to the commit would be helpful. If you
do not have one around we can do without.

> > So I think dev_put() should be called after dev_queue_xmit().
> > 
> > Also, explicitly setting skb->sk is unnecessary; sock_alloc_send_skb()
> > already sets it.
> 
> You could have put this fixup in a different patch.

I actually would request you to split this into two patches. One for the
removal of the sk setting and one for the race condition fix.

> > Signed-off-by: linzhang 
> 
> This looks more like a username than a real name. If Lin Zhang is your
> real name in English, that would be better here. :)

This would also be appreciated.

> > ---
> >  net/ieee802154/socket.c | 10 ++++------
> >  1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
> > index eedba76..a60658c 100644
> > --- a/net/ieee802154/socket.c
> > +++ b/net/ieee802154/socket.c
> > @@ -301,15 +301,14 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> >  		goto out_skb;
> >  
> >  	skb->dev = dev;
> > -	skb->sk  = sk;
> >  	skb->protocol = htons(ETH_P_IEEE802154);
> >  
> > -	dev_put(dev);
> > -
> >  	err = dev_queue_xmit(skb);
> >  	if (err > 0)
> >  		err = net_xmit_errno(err);
> >  
> > +	dev_put(dev);
> > +
> >  	return err ?: size;
> >  
> >  out_skb:
> > @@ -690,15 +689,14 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> >  		goto out_skb;
> >  
> >  	skb->dev = dev;
> > -	skb->sk  = sk;
> >  	skb->protocol = htons(ETH_P_IEEE802154);
> >  
> > -	dev_put(dev);
> > -
> >  	err = dev_queue_xmit(skb);
> >  	if (err > 0)
> >  		err = net_xmit_errno(err);
> >  
> > +	dev_put(dev);
> > +
> >  	return err ?: size;
> 
> Going to give this a test ride here now.

I gave it a ride in my testbed and encountered no problems. While I have never
seen the race and oops myself, doing the dev_put before the xmit can surely lead
to such a race, and the fix is valid.
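
For illustration only (this is not part of the patch), the ordering problem can
be sketched as a small userspace model. Everything here is hypothetical:
struct fake_dev, fake_dev_hold(), fake_dev_put() and fake_xmit() merely stand in
for net_device, dev_hold(), dev_put() and dev_queue_xmit(), and the model
assumes the device disappears as soon as its last reference is dropped, which
simplifies what the kernel does on unregister.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device: a reference count plus a
 * flag that records when the last reference has been dropped. */
struct fake_dev {
	int refcnt;
	bool freed;
};

static void fake_dev_hold(struct fake_dev *dev)
{
	dev->refcnt++;
}

/* Drop one reference; mark the device freed when none remain. */
static void fake_dev_put(struct fake_dev *dev)
{
	assert(dev->refcnt > 0);
	if (--dev->refcnt == 0)
		dev->freed = true;
}

/* Stand-in for dev_queue_xmit(): returns -1 if the device is already gone. */
static int fake_xmit(const struct fake_dev *dev)
{
	return dev->freed ? -1 : 0;
}

/* Old ordering: the sender's reference is dropped before transmitting. */
static int send_put_before_xmit(struct fake_dev *dev)
{
	fake_dev_hold(dev);
	fake_dev_put(dev);	/* sender's reference released too early */
	fake_dev_put(dev);	/* device removal drops the registration reference */
	return fake_xmit(dev);	/* device is already gone */
}

/* Patched ordering: the reference is held across the transmit. */
static int send_put_after_xmit(struct fake_dev *dev)
{
	int err;

	fake_dev_hold(dev);
	fake_dev_put(dev);	/* device removal drops the registration reference */
	err = fake_xmit(dev);	/* sender still holds a reference */
	fake_dev_put(dev);	/* last reference dropped after the transmit */
	return err;
}
```

Starting from a device with refcnt = 1 (the registration reference),
send_put_before_xmit() transmits on a device that is already marked freed,
while send_put_after_xmit() keeps it alive until the transmit has returned.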

Once you have made the changes requested above and resubmitted your two patches,
you can add my

Acked-by: Stefan Schmidt 

to both of them.

regards
Stefan Schmidt


Re: [PATCH] net: ieee802154: fix net_device reference release too early

2017-05-18 Thread Stefan Schmidt
Hello.

On Thu, 2017-05-18 at 15:50, linzhang wrote:
> This patch fixes a kernel oops caused by releasing the net_device
> reference too early. In raw_sendmsg() (I think dgram_sendmsg() has the
> same problem), there is a race between dev_put() and dev_queue_xmit()
> when the device is being removed, which may lead dev_queue_xmit() to
> see an invalid net_device pointer.
> 

You have a test case to reproduce this oops? I fear I have not seen
one.

> So I think dev_put() should be called after dev_queue_xmit().
> 
> Also, explicitly setting skb->sk is unnecessary; sock_alloc_send_skb()
> already sets it.

You could have put this fixup in a different patch.

> Signed-off-by: linzhang 

This looks more like a username than a real name. If Lin Zhang is your
real name in English, that would be better here. :)

> ---
>  net/ieee802154/socket.c | 10 ++++------
>  1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
> index eedba76..a60658c 100644
> --- a/net/ieee802154/socket.c
> +++ b/net/ieee802154/socket.c
> @@ -301,15 +301,14 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  		goto out_skb;
>  
>  	skb->dev = dev;
> -	skb->sk  = sk;
>  	skb->protocol = htons(ETH_P_IEEE802154);
>  
> -	dev_put(dev);
> -
>  	err = dev_queue_xmit(skb);
>  	if (err > 0)
>  		err = net_xmit_errno(err);
>  
> +	dev_put(dev);
> +
>  	return err ?: size;
>  
>  out_skb:
> @@ -690,15 +689,14 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>  		goto out_skb;
>  
>  	skb->dev = dev;
> -	skb->sk  = sk;
>  	skb->protocol = htons(ETH_P_IEEE802154);
>  
> -	dev_put(dev);
> -
>  	err = dev_queue_xmit(skb);
>  	if (err > 0)
>  		err = net_xmit_errno(err);
>  
> +	dev_put(dev);
> +
>  	return err ?: size;

Going to give this a test ride here now.

regards
Stefan Schmidt