On Mon, Nov 14, 2016 at 10:23 PM, Andrei Vagin <ava...@gmail.com> wrote:
> Hi Nicolas,
>
> cleanup_net() calls idr_destroy(net->netns_ids) for network namespaces
> and then it calls unregister_netdevice_many() which calls
> idr_alloc(net->netns_ids). It looks wrong, doesn't it?

Here is a report from the kmemleak detector:

unreferenced object 0xffff91badb543950 (size 2096):
  comm "kworker/u4:0", pid 6, jiffies 4295152553 (age 28.418s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 cb 5f df ba 91 ff ff  .........._.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffb1865bea>] kmemleak_alloc+0x4a/0xa0
    [<ffffffffb1243b38>] kmem_cache_alloc+0x128/0x280
    [<ffffffffb142f5ab>] idr_layer_alloc+0x2b/0x90
    [<ffffffffb142f9cd>] idr_get_empty_slot+0x34d/0x370
    [<ffffffffb142fa4e>] idr_alloc+0x5e/0x110
    [<ffffffffb170ac3d>] __peernet2id_alloc+0x6d/0x90
    [<ffffffffb170bda5>] peernet2id_alloc+0x55/0xb0
    [<ffffffffb1731216>] rtnl_fill_ifinfo+0xaa6/0x10a0
    [<ffffffffb1733073>] rtmsg_ifinfo_build_skb+0x73/0xd0
    [<ffffffffb17125d5>] rollback_registered_many+0x295/0x390
    [<ffffffffb1712765>] unregister_netdevice_many+0x25/0x80
    [<ffffffffb17138a5>] default_device_exit_batch+0x145/0x170
    [<ffffffffb170ae52>] ops_exit_list.isra.4+0x52/0x60
    [<ffffffffb170c17f>] cleanup_net+0x1bf/0x2a0
    [<ffffffffb10b616f>] process_one_work+0x1ff/0x660
    [<ffffffffb10b661e>] worker_thread+0x4e/0x480
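The ordering this backtrace suggests (an idr_alloc() on an IDR that has
already been destroyed) can be modelled in userspace. This is only a rough
sketch, not the kernel's lib/idr.c: toy_idr, toy_idr_alloc(),
toy_idr_destroy(), live_layers and the two teardown_*() helpers are
hypothetical names made up for illustration. The one assumption it relies on
is that a layer is allocated lazily on the first insert, so an insert after
destroy creates a fresh layer that nothing frees afterwards.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_IDR_SLOTS 16

/* Hypothetical stand-in for struct idr: a single lazily allocated layer. */
struct toy_idr {
	void **layer;	/* allocated on the first toy_idr_alloc() */
	int next_id;
};

int live_layers;	/* layers currently allocated and not yet freed */

/* Store ptr and return its ID, allocating the layer if it is missing. */
int toy_idr_alloc(struct toy_idr *idr, void *ptr)
{
	if (!idr->layer) {
		idr->layer = calloc(TOY_IDR_SLOTS, sizeof(*idr->layer));
		if (!idr->layer)
			return -1;
		live_layers++;
	}
	if (idr->next_id >= TOY_IDR_SLOTS)
		return -1;
	idr->layer[idr->next_id] = ptr;
	return idr->next_id++;
}

/* Free the layer, if any, and reset the IDR to its empty state. */
void toy_idr_destroy(struct toy_idr *idr)
{
	if (idr->layer) {
		free(idr->layer);
		live_layers--;
	}
	idr->layer = NULL;
	idr->next_id = 0;
}

/* Destroy first, then allocate again: returns the number of leaked layers. */
int teardown_destroy_first(void)
{
	struct toy_idr ids = { 0 };
	int peer = 0, before = live_layers, id, leaked;

	id = toy_idr_alloc(&ids, &peer);
	assert(id >= 0);
	toy_idr_destroy(&ids);			/* teardown believes it is done */
	id = toy_idr_alloc(&ids, &peer);	/* late allocation, as in the trace */
	assert(id >= 0);
	(void)id;				/* keeps NDEBUG builds warning-free */
	leaked = live_layers - before;		/* layer nobody will free */
	toy_idr_destroy(&ids);			/* so the demo itself does not leak */
	return leaked;
}

/* Do all allocations first, destroy last: returns the number of leaked layers. */
int teardown_destroy_last(void)
{
	struct toy_idr ids = { 0 };
	int peer = 0, before = live_layers;

	toy_idr_alloc(&ids, &peer);
	toy_idr_alloc(&ids, &peer);
	toy_idr_destroy(&ids);
	return live_layers - before;
}
```

In this model teardown_destroy_first() reports one layer left behind and
teardown_destroy_last() reports none, which matches the shape of the kmemleak
report above. It says nothing about where the real fix should go.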


>
> I compiled the kernel with the following patch:
> diff --git a/lib/idr.c b/lib/idr.c
> index 6098336..c0a3a32 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -636,6 +636,8 @@ void idr_destroy(struct idr *idp)
>                 struct idr_layer *p = get_from_free_list(idp);
>                 kmem_cache_free(idr_layer_cache, p);
>         }
> +
> +       idp->top = 0xdeaddead;
>  }
>  EXPORT_SYMBOL(idr_destroy);
>
> and it crashed as expected:
>
> [  306.974024] BUG: unable to handle kernel paging request at 00000000deade6bd
> [  306.977724] IP: [<ffffffff8b445085>] _find_next_bit.part.0+0x15/0x70
> [  306.978490] PGD 20dfa067 [  306.978781] PUD 0
> [  306.979043]
> [  306.979230] Oops: 0000 [#1] SMP
> [  306.979607] Modules linked in: macvlan tun bridge stp llc
> nf_conntrack_netlink udp_diag tcp_diag inet_diag netlink_diag
> af_packet_diag unix_diag binfmt_misc veth nf_conntrack_ipv4
> nf_defrag_ipv4 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack
> nf_conntrack nfnetlink ip6table_filter ip6_tables sunrpc ppdev
> crc32c_intel joydev virtio_balloon virtio_net i2c_piix4 parport_pc
> parport acpi_cpufreq tpm_tis tpm_tis_core tpm virtio_blk serio_raw
> virtio_pci ata_generic virtio_ring virtio pata_acpi
> [  306.985236] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 4.9.0-rc5+ #91
> [  306.986005] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.9.1-1.fc24 04/01/2014
> [  306.987024] Workqueue: netns cleanup_net
> [  306.987511] task: ffff8ca63cb5a540 task.stack: ffff9e3240340000
> [  306.988207] RIP: 0010:[<ffffffff8b445085>]  [<ffffffff8b445085>]
> _find_next_bit.part.0+0x15/0x70
> [  306.989246] RSP: 0018:ffff9e3240343970  EFLAGS: 00010046
> [  306.989871] RAX: ffffffffffffffff RBX: 0000000000000000 RCX: 
> 0000000000000000
> [  306.990713] RDX: 0000000000000000 RSI: 0000000000000100 RDI: 
> 00000000deade6bd
> [  306.991548] RBP: ffff9e3240343980 R08: ffffffffffffffff R09: 
> ffffffffffffffff
> [  306.992383] R10: 00000000f314d32d R11: 0000000000000000 R12: 
> 00000000ffffffff
> [  306.993277] R13: 00000000fffffff8 R14: 00000000deaddead R15: 
> 0000000000000000
> [  306.994117] FS:  0000000000000000(0000) GS:ffff8ca63fd00000(0000)
> knlGS:0000000000000000
> [  306.995068] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  306.995744] CR2: 00000000deade6bd CR3: 0000000059aec000 CR4: 
> 00000000000006e0
> [  306.996586] DR0: 00000000000100a0 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  306.997423] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000600
> [  306.998258] Stack:
> [  306.998503]  ffff9e3240343980 ffffffff8b44511d ffff9e32403439e0
> ffffffff8b42f819
> [  306.999434]  0000000000000000 0208002000000007 ffff8ca6289d80c0
> ffff9e32403439f8
> [  307.000365]  0000000000000000 000000007fffffff ffff8ca61f09b200
> ffff8ca6289d80c0
> [  307.001296] Call Trace:
> [  307.001594]  [<ffffffff8b44511d>] ? find_next_zero_bit+0x1d/0x20
> [  307.002307]  [<ffffffff8b42f819>] idr_get_empty_slot+0x189/0x370
> [  307.003012]  [<ffffffff8b42fa5e>] idr_alloc+0x5e/0x110
> [  307.003631]  [<ffffffff8b70bd88>] ? peernet2id_alloc+0x38/0xb0
> [  307.004321]  [<ffffffff8b70ac3d>] __peernet2id_alloc+0x6d/0x90
> [  307.005003]  [<ffffffff8b70bda5>] peernet2id_alloc+0x55/0xb0
> [  307.005673]  [<ffffffff8b731216>] rtnl_fill_ifinfo+0xaa6/0x10a0
> [  307.006368]  [<ffffffff8b112458>] ? rcu_read_lock_sched_held+0x58/0x60
> [  307.007136]  [<ffffffff8b6ffe2b>] ? __alloc_skb+0x9b/0x1e0
> [  307.007780]  [<ffffffff8b733073>] rtmsg_ifinfo_build_skb+0x73/0xd0
> [  307.008509]  [<ffffffff8b7125d5>] rollback_registered_many+0x295/0x390
> [  307.009282]  [<ffffffff8b712765>] unregister_netdevice_many+0x25/0x80
> [  307.010047]  [<ffffffff8b7138a5>] default_device_exit_batch+0x145/0x170
> [  307.010825]  [<ffffffff8b0e7b10>] ? finish_wait+0x70/0x70
> [  307.011465]  [<ffffffff8b70ae52>] ops_exit_list.isra.4+0x52/0x60
> [  307.012175]  [<ffffffff8b70c17f>] cleanup_net+0x1bf/0x2a0
> [  307.012811]  [<ffffffff8b0b616f>] process_one_work+0x1ff/0x660
> [  307.013548]  [<ffffffff8b0b60f4>] ? process_one_work+0x184/0x660
> [  307.014259]  [<ffffffff8b0b661e>] worker_thread+0x4e/0x480
> [  307.014906]  [<ffffffff8b0b65d0>] ? process_one_work+0x660/0x660
> [  307.015617]  [<ffffffff8b0bd2a4>] kthread+0xf4/0x110
> [  307.016209]  [<ffffffff8b0bd1b0>] ? kthread_park+0x60/0x60
> [  307.016857]  [<ffffffff8b872efa>] ret_from_fork+0x2a/0x40
