Looks like the fix triggered another issue, so we need to see how upstream fixes this one.
Kernel test robot picked up a bug from this fix: FYI, we noticed the below changes on https://github.com/0day-ci/linux Herbert-Xu/rhashtable-Fix-walker-list-corruption/20151216-164833 commit f9f51b8070be3e829100614a7372b219723b864f ("rhashtable: Fix walker list corruption") [ 8.933376] =============================== [ 8.933376] =============================== [ 8.934629] [ INFO: suspicious RCU usage. ] [ 8.934629] [ INFO: suspicious RCU usage. ] [ 8.935941] 4.4.0-rc3-00995-gf9f51b8 #2 Not tainted [ 8.935941] 4.4.0-rc3-00995-gf9f51b8 #2 Not tainted [ 8.937494] ------------------------------- [ 8.937494] ------------------------------- [ 8.938818] lib/rhashtable.c:504 suspicious rcu_dereference_protected() usage! [ 8.938818] lib/rhashtable.c:504 suspicious rcu_dereference_protected() usage! [ 8.941705] [ 8.941705] other info that might help us debug this: [ 8.941705] [ 8.941705] [ 8.941705] other info that might help us debug this: [ 8.941705] [ 8.944161] [ 8.944161] rcu_scheduler_active = 1, debug_locks = 0 [ 8.944161] [ 8.944161] rcu_scheduler_active = 1, debug_locks = 0 [ 8.946244] 1 lock held by swapper/0/1: [ 8.946244] 1 lock held by swapper/0/1: [ 8.947463] #0: [ 8.947463] #0: ( (&(&ht->lock)->rlock&(&ht->lock)->rlock){+.+...}){+.+...}, at: , at: [<ffffffff814b8900>] rhashtable_walk_init+0x70/0x150 [<ffffffff814b8900>] rhashtable_walk_init+0x70/0x150 [ 8.950428] [ 8.950428] stack backtrace: [ 8.950428] [ 8.950428] stack backtrace: [ 8.951770] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc3-00995-gf9f51b8 #2 [ 8.951770] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.4.0-rc3-00995-gf9f51b8 #2 [ 8.954245] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 8.954245] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014 [ 8.956973] 0000000000000001 [ 8.956973] 0000000000000001 ffff880078393d30 ffff880078393d30 ffffffff81493238 ffffffff81493238 ffff88007838c040 ffff88007838c040 [ 8.959333] ffff880078393d60 [ 8.959333] ffff880078393d60 ffffffff8112cb9f ffffffff8112cb9f ffff880078393da0 ffff880078393da0 ffffffff83e9d6c0 ffffffff83e9d6c0 [ 8.961684] ffffffff83e9d7f0 [ 8.961684] ffffffff83e9d7f0 ffff880061720e00 ffff880061720e00 ffff880078393d90 ffff880078393d90 ffffffff814b89c8 ffffffff814b89c8 [ 8.964148] Call Trace: [ 8.964148] Call Trace: [ 8.964955] [<ffffffff81493238>] dump_stack+0x7c/0xb4 [ 8.964955] [<ffffffff81493238>] dump_stack+0x7c/0xb4 [ 8.966728] [<ffffffff8112cb9f>] lockdep_rcu_suspicious+0x14f/0x1c0 [ 8.966728] [<ffffffff8112cb9f>] lockdep_rcu_suspicious+0x14f/0x1c0 [ 8.968753] [<ffffffff814b89c8>] rhashtable_walk_init+0x138/0x150 [ 8.968753] [<ffffffff814b89c8>] rhashtable_walk_init+0x138/0x150 [ 8.970567] [<ffffffff815021d8>] test_bucket_stats+0x22/0x17d [ 8.970567] [<ffffffff815021d8>] test_bucket_stats+0x22/0x17d [ 8.972682] [<ffffffff82dda0fa>] test_rhashtable+0xe0/0x12ac [ 8.972682] [<ffffffff82dda0fa>] test_rhashtable+0xe0/0x12ac [ 8.974746] [<ffffffff816157db>] ? get_random_bytes+0x2b/0x40 [ 8.974746] [<ffffffff816157db>] ? get_random_bytes+0x2b/0x40 [ 8.976467] [<ffffffff814b6c33>] ? bucket_table_alloc+0x173/0x280 [ 8.976467] [<ffffffff814b6c33>] ? bucket_table_alloc+0x173/0x280 [ 8.978548] [<ffffffff82ddb3d5>] test_rht_init+0x10f/0x523 [ 8.978548] [<ffffffff82ddb3d5>] test_rht_init+0x10f/0x523 [ 8.980179] [<ffffffff82ddb2c6>] ? test_rhashtable+0x12ac/0x12ac [ 8.980179] [<ffffffff82ddb2c6>] ? test_rhashtable+0x12ac/0x12ac [ 8.982424] [<ffffffff82d9f757>] do_one_initcall+0x16b/0x248 [ 8.982424] [<ffffffff82d9f757>] do_one_initcall+0x16b/0x248 [ 8.984208] [<ffffffff82d9f9f8>] kernel_init_freeable+0x1c4/0x2b8 [ 8.984208] [<ffffffff82d9f9f8>] kernel_init_freeable+0x1c4/0x2b8 [ 8.986165] [<ffffffff81ed5b60>] ? rest_init+0x200/0x200 [ 8.986165] [<ffffffff81ed5b60>] ? rest_init+0x200/0x200 [ 8.987925] [<ffffffff81ed5b71>] kernel_init+0x11/0x190 [ 8.987925] [<ffffffff81ed5b71>] kernel_init+0x11/0x190 [ 8.989608] [<ffffffff81ee406f>] ret_from_fork+0x3f/0x70 [ 8.989608] [<ffffffff81ee406f>] ret_from_fork+0x3f/0x70 [ 8.991270] [<ffffffff81ed5b60>] ? rest_init+0x200/0x200 [ 8.991270] [<ffffffff81ed5b60>] ? rest_init+0x200/0x200 Thanks, Kernel Test Robot -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/1526811 Title: SRU: walker list corruption while being intensively stressed Status in linux package in Ubuntu: Fix Committed Status in linux source package in Wily: In Progress Status in linux source package in Xenial: Fix Committed Bug description: [SRU Justification][Wily] + [Xenial] While stress testing with the stress-ng procfs stressor I hit a walker list bug. This has been recently fixed by Herbert Xu in commit: The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable: Fix sleeping inside RCU critical section in walk_stop") introduced a new spinlock for the walker list. However, it did not convert all existing users of the list over to the new spin lock. Some continued to use the old mutex for this purpose. This obviously led to corruption of the list. [Fix] Clean upstream cherry pick, commit c6ff5268293ef98e48a99597e765ffc417e39fa5 Will land in Xenial automatically (4.4) [Testcase] Run multiple instances of the attached code on a multicore system. Alternatively, run stress-ng --procfs 0 on a multi-core system Fix will stop the above code corrupting the list and crashing. To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1526811/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp