Hi Rune,

What kind of timeout do you use in your subscription? Is it 0 or a low value
(< 10 ms)?
Apart from the fixes Jon has suggested, we still have issues, as low
timeouts are very racy.

/Partha

On Wed, Nov 29, 2017 at 3:05 PM, Rune Torgersen <[email protected]> wrote:

> (Resending as I think it got lost somewhere).
>
> A bug that I thought had been fixed is rearing its ugly head again in
> the latest Ubuntu 16.04 LTS kernel (4.4.0-97).
> It is happening to me quite frequently (2-3 times a week).
>
> The application where this happens uses lots of short-lived sockets, and
> also lots of short-lived connections to the topology server.
>
> [151611.149711] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000028
> [151611.149946] IP: [<ffffffffc046a0a2>] tipc_nametbl_unsubscribe+0x72/0x100
> [tipc]
> [151611.150069] PGD 0
> [151611.150104] Oops: 0002 [#1] SMP
> [151611.150160] Modules linked in: tipc ip6_udp_tunnel udp_tunnel
> intel_powerclamp coretemp kvm_intel gpio_ich input_leds joydev kvm
> irqbypass i7core_edac edac_core serio_raw lpc_ich shpchp hpilo 8250_fintek
> ipmi_ssif acpi_power_meter mac_hid lp parport ipmi_watchdog ipmi_si
> ipmi_devintf ipmi_msghandler autofs4 raid10 raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0
> multipath linear amdkfd amd_iommu_v2 radeon hid_generic i2c_algo_bit ttm
> drm_kms_helper syscopyarea sysfillrect psmouse sysimgblt fb_sys_fops usbhid
> drm hid pata_acpi bnx2 hpsa netxen_nic scsi_transport_sas fjes
> [151611.151291] CPU: 1 PID: 14873 Comm: kworker/u64:2 Tainted: G
> I     4.4.0-97-generic #120-Ubuntu
> [151611.151429] Hardware name: HP ProLiant DL360 G6, BIOS P64 05/15/2010
> [151611.151547] Workqueue: tipc_rcv tipc_recv_work [tipc]
> [151611.151631] task: ffff880213c98cc0 ti: ffff8802131b8000 task.ti:
> ffff8802131b8000
> [151611.151740] RIP: 0010:[<ffffffffc046a0a2>]  [<ffffffffc046a0a2>]
> tipc_nametbl_unsubscribe+0x72/0x100 [tipc]
> [151611.151889] RSP: 0018:ffff88021f443e10  EFLAGS: 00010246
> [151611.151967] RAX: ffff880213d87f80 RBX: ffff880213d87f00 RCX:
> 0000000000000020
> [151611.152071] RDX: 000000000000000e RSI: 0000000000000067 RDI:
> ffff8802101a9638
> [151611.152176] RBP: ffff88021f443e30 R08: ffff88021f45a0c0 R09:
> ffff880217003b00
> [151611.152280] R10: ffff8800da043f40 R11: ffff880213c98d20 R12:
> ffff8802101a9600
> [151611.152385] R13: ffff8800d9fa9120 R14: ffff8802101a9638 R15:
> ffff880213d87f00
> [151611.152490] FS:  0000000000000000(0000) GS:ffff88021f440000(0000)
> knlGS:0000000000000000
> [151611.152631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [151611.152730] CR2: 0000000000000028 CR3: 0000000001e0a000 CR4:
> 00000000000006e0
> [151611.152835] Stack:
> [151611.152865]  ffff880213d87f00 ffff8800d9fa8000 ffff880213c499c8
> ffffffffc0468ed0
> [151611.152989]  ffff88021f443e50 ffffffffc04688bf ffff880213d87f00
> ffff880213c499c0
> [151611.153113]  ffff88021f443e78 ffffffffc0468f15 ffff88021f44ddc0
> ffff880213d87f30
> [151611.153237] Call Trace:
> [151611.153275]  <IRQ>
> [151611.153311]  [<ffffffffc0468ed0>] ? tipc_subscrb_shutdown_cb+0xc0/0xc0
> [tipc]
> [151611.153422]  [<ffffffffc04688bf>] tipc_subscrp_delete+0x2f/0x80 [tipc]
> [151611.153523]  [<ffffffffc0468f15>] tipc_subscrp_timeout+0x45/0x70 [tipc]
> [151611.153624]  [<ffffffff810ecfc5>] call_timer_fn+0x35/0x120
> [151611.153735]  [<ffffffffc0468ed0>] ? tipc_subscrb_shutdown_cb+0xc0/0xc0
> [tipc]
> [151611.153846]  [<ffffffff810ed97a>] run_timer_softirq+0x23a/0x2f0
> [151611.153936]  [<ffffffff81085dc1>] __do_softirq+0x101/0x290
> [151611.154017]  [<ffffffff810860c3>] irq_exit+0xa3/0xb0
> [151611.154091]  [<ffffffff818462a2>] smp_apic_timer_interrupt+0x42/0x50
> [151611.154185]  [<ffffffff81844562>] apic_timer_interrupt+0x82/0x90
> [151611.154272]  <EOI>
> [151611.154305]  [<ffffffff81843225>] ? _raw_spin_unlock_irqrestore+0x15/0x20
> [151611.154407]  [<ffffffff810eefef>] mod_timer+0x10f/0x240
> [151611.154489]  [<ffffffffc0468be0>] tipc_subscrb_rcv_cb+0x1c0/0x390
> [tipc]
> [151611.154591]  [<ffffffffc04755e2>] tipc_receive_from_sock+0xc2/0x120
> [tipc]
> [151611.154695]  [<ffffffffc047526b>] tipc_recv_work+0x2b/0x60 [tipc]
> [151611.154809]  [<ffffffff8109a635>] process_one_work+0x165/0x480
> [151611.159008]  [<ffffffff8109a99b>] worker_thread+0x4b/0x4c0
> [151611.163372]  [<ffffffff8109a950>] ? process_one_work+0x480/0x480
> [151611.167622]  [<ffffffff810a0c75>] kthread+0xe5/0x100
> [151611.171755]  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
> [151611.175858]  [<ffffffff81843b8f>] ret_from_fork+0x3f/0x70
> [151611.179999]  [<ffffffff810a0b90>] ? kthread_create_on_node+0x1e0/0x1e0
> [151611.184004] Code: ff ff 48 85 c0 74 56 4c 8d 70 38 49 89 c4 4c 89 f7
> e8 43 92 3d c1 48 8b 8b 80 00 00 00 48 8b 93 88 00 00 00 48 8d 83 80 00 00
> 00 <48> 89 51 08 48 89 0a 48 89 83 80 00 00 00 48 89 83 88 00 00 00
> [151611.192678] RIP  [<ffffffffc046a0a2>] tipc_nametbl_unsubscribe+0x72/0x100
> [tipc]
> [151611.196733]  RSP <ffff88021f443e10>
> [151611.200739] CR2: 0000000000000028
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> tipc-discussion mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/tipc-discussion
>
