Re: [selftests] e48d82b67a: BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten

2021-03-22 Thread Vlastimil Babka
On 3/17/21 7:53 PM, David Rientjes wrote:
> On Wed, 17 Mar 2021, Vlastimil Babka wrote:
>> > 
>> > [   22.154049] random: get_random_u32 called from 
>> > __kmem_cache_create+0x23/0x3e0 with crng_init=0 
>> > [   22.154070] random: get_random_u32 called from 
>> > cache_random_seq_create+0x7c/0x140 with crng_init=0 
>> > [   22.154167] random: get_random_u32 called from 
>> > allocate_slab+0x155/0x5e0 with crng_init=0 
>> > [   22.154690] test_slub: 1. kmem_cache: Clobber Redzone 0x12->0x(ptrval)
>> > [   22.164499] 
>> > =
>> > [   22.166629] BUG TestSlub_RZ_alloc (Not tainted): Redzone overwritten
>> > [   22.168179] 
>> > -
>> > [   22.168179]
>> > [   22.168372] Disabling lock debugging due to kernel taint
>> > [   22.168372] INFO: 0x(ptrval)-0x(ptrval) @offset=1064. First byte 0x12 
>> > instead of 0xcc
>> > [   22.168372] INFO: Allocated in resiliency_test+0x47/0x1be age=3 cpu=0 
>> > pid=1 
>> > [   22.168372] __slab_alloc+0x57/0x80 
>> > [   22.168372] kmem_cache_alloc (kbuild/src/consumer/mm/slub.c:2871 
>> > kbuild/src/consumer/mm/slub.c:2915 kbuild/src/consumer/mm/slub.c:2920) 
>> > [   22.168372] resiliency_test (kbuild/src/consumer/lib/test_slub.c:34 
>> > kbuild/src/consumer/lib/test_slub.c:107) 
>> > [   22.168372] test_slub_init (kbuild/src/consumer/lib/test_slub.c:124) 
>> > [   22.168372] do_one_initcall (kbuild/src/consumer/init/main.c:1226) 
>> > [   22.168372] kernel_init_freeable (kbuild/src/consumer/init/main.c:1298 
>> > kbuild/src/consumer/init/main.c:1315 kbuild/src/consumer/init/main.c:1335 
>> > kbuild/src/consumer/init/main.c:1537) 
>> > [   22.168372] kernel_init (kbuild/src/consumer/init/main.c:1426) 
>> > [   22.168372] ret_from_fork 
>> > (kbuild/src/consumer/arch/x86/entry/entry_32.S:856) 
>> > [   22.168372] INFO: Slab 0x(ptrval) objects=16 used=1 fp=0x(ptrval) 
>> > flags=0x4201
>> > [   22.168372] INFO: Object 0x(ptrval) @offset=1000 fp=0x(ptrval)
>> > [   22.168372]
>> > [   22.168372] Redzone (ptrval): cc cc cc cc cc cc cc cc   
>> >
>> > [   22.168372] Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
>> > 6b 6b  
>> > [   22.168372] Object (ptrval): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 
>> > 6b a5  kkk.
>> > [   22.168372] Redzone (ptrval): 12 cc cc cc   
>> >
>> > [   22.168372] Padding (ptrval): 5a 5a 5a 5a 5a 5a 5a 5a   
>> >
>> > [   22.168372] CPU: 0 PID: 1 Comm: swapper/0 Tainted: GB 
>> > 5.12.0-rc2-1-ge48d82b67a2b #1
>> > [   22.168372] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
>> > 1.12.0-1 04/01/2014
>> > [   22.168372] Call Trace:
>> > [   22.168372] dump_stack (kbuild/src/consumer/lib/dump_stack.c:122) 
>> > [   22.168372] print_trailer (kbuild/src/consumer/mm/slub.c:737) 
>> > [   22.168372] check_bytes_and_report.cold 
>> > (kbuild/src/consumer/mm/slub.c:807) 
>> > [   22.168372] check_object (kbuild/src/consumer/mm/slub.c:914) 
>> > [   22.168372] validate_slab (kbuild/src/consumer/mm/slub.c:4635) 
>> 
>> Hm but in this case the output means the tested functionality (slub 
>> debugging)
>> is working as intended. So what can we do? Indicate/teach somehow to the bot
>> that this is OK? Does kselftest have some support for this? Or silence the
>> validation output for testing purposes? (I would prefer not to)
>> 
> 
> Unless you're familiar with everything that CONFIG_TEST_SLUB does, maybe 
> this could be inferred as an actual issue that the test has uncovered that 
> is unexpected?
> 
> I don't have a good way of silencing the check_bytes_and_report() output 
> other than a big hammer: implement {disable,enable}_slub_warnings() that 
> the resiliency test could call into before triggering these checks.

So Oliver has implemented this, but now I got a different idea that should be
much cleaner IMHO. We could add a per-cache flag SLAB_SILENT_ERRORS (similar to
SLAB_RED_ZONE etc) instead of a global bool. The test would just create the
caches with this flag.
The flag should be added to the SLAB_NEVER_MERGE, SLAB_DEBUG_FLAGS,
SLAB_FLAGS_PERMITTED macros as well.

A similar suggestion is that adding the errors counter parameter to all
validate_slab_cache() and relevant functions is tedious - there are more that
had to be modified like this than initially expected.
Instead the error counter can be added to SLUB's struct kmem_cache definition,
incremented by the various checks and the tests can look at that after 
validation.

Thanks,
Vlastimil


Re: [selftests] e48d82b67a: BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten

2021-03-21 Thread Oliver Sang
Hi Vlastimil,

On Wed, Mar 17, 2021 at 12:29:40PM +0100, Vlastimil Babka wrote:
> On 3/17/21 9:36 AM, kernel test robot wrote:
> > 
> > 
> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-9):
> > 
> > commit: e48d82b67a2b760eedf7b95ca15f41267496386c ("[PATCH 1/2] selftests: 
> > add a kselftest for SLUB debugging functionality")
> > url: 
> > https://github.com/0day-ci/linux/commits/glittao-gmail-com/selftests-add-a-kselftest-for-SLUB-debugging-functionality/20210316-204257
> > base: 
> > https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
> > 
> > in testcase: trinity
> > version: trinity-static-i386-x86_64-f93256fb_2019-08-28
> > with following parameters:
> > 
> > group: group-04
> > 
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
> > 
> > 
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire 
> > log/backtrace):
> > 
> > 
> > +---+---++
> > |   
> > | v5.12-rc2 | e48d82b67a |
> > +---+---++
> > | BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten
> > | 0 | 69 |
> > | INFO:0x(ptrval)-0x(ptrval)@offset=#.First_byte#instead_of 
> > | 0 | 69 |
> > | INFO:Allocated_in_resiliency_test_age=#cpu=#pid=  
> > | 0 | 69 |
> > | INFO:Slab0x(ptrval)objects=#used=#fp=0x(ptrval)flags= 
> > | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset=#fp=0x(ptrval)   
> > | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Freechain_corrupt 
> > | 0 | 69 |
> > | INFO:Freed_in_resiliency_test_age=#cpu=#pid=  
> > | 0 | 69 |
> > | 
> > BUG_TestSlub_next_ptr_free(Tainted:G_B):Wrong_object_count.Counter_is#but_counted_were
> > | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Redzone_overwritten   
> > | 0 | 69 |
> > | 
> > BUG_TestSlub_next_ptr_free(Tainted:G_B):Objects_remaining_in_TestSlub_next_ptr_free_on__kmem_cache_shutdown()
> >  | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset= 
> > | 0 | 69 |
> > | BUG_TestSlub_1th_word_free(Tainted:G_B):Poison_overwritten
> > | 0 | 69 |
> > | BUG_TestSlub_50th_word_free(Tainted:G_B):Poison_overwritten   
> > | 0 | 69 |
> > | BUG_TestSlub_RZ_free(Tainted:G_B):Redzone_overwritten 
> > | 0 | 69 |
> > +---+---++
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot 
> > 
> > 
> > 
> > [   22.154049] random: get_random_u32 called from 
> > __kmem_cache_create+0x23/0x3e0 with crng_init=0 
> > [   22.154070] random: get_random_u32 called from 
> > cache_random_seq_create+0x7c/0x140 with crng_init=0 
> > [   22.154167] random: get_random_u32 called from allocate_slab+0x155/0x5e0 
> > with crng_init=0 
> > [   22.154690] test_slub: 1. kmem_cache: Clobber Redzone 0x12->0x(ptrval)
> > [   22.164499] 
> > =
> > [   22.166629] BUG TestSlub_RZ_alloc (Not tainted): Redzone overwritten
> > [   22.168179] 
> > -
> > [   22.168179]
> > [   22.168372] Disabling lock debugging due to kernel taint
> > [   22.168372] INFO: 0x(ptrval)-0x(ptrval) @offset=1064. First byte 0x12 
> > instead of 0xcc
> > [   22.168372] INFO: Allocated in resiliency_test+0x47/0x1be age=3 cpu=0 
> > pid=1 
> > [   22.168372] __slab_alloc+0x57/0x80 
> > [   22.168372] kmem_cache_alloc (kbuild/src/consumer/mm/slub.c:2871 
> > kbuild/src/consumer/mm/slub.c:2915 kbuild/src/consumer/mm/slub.c:2920) 
> > [   

Re: [selftests] e48d82b67a: BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten

2021-03-17 Thread David Rientjes
On Wed, 17 Mar 2021, Vlastimil Babka wrote:

> > Greeting,
> > 
> > FYI, we noticed the following commit (built with gcc-9):
> > 
> > commit: e48d82b67a2b760eedf7b95ca15f41267496386c ("[PATCH 1/2] selftests: 
> > add a kselftest for SLUB debugging functionality")
> > url: 
> > https://github.com/0day-ci/linux/commits/glittao-gmail-com/selftests-add-a-kselftest-for-SLUB-debugging-functionality/20210316-204257
> > base: 
> > https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git next
> > 
> > in testcase: trinity
> > version: trinity-static-i386-x86_64-f93256fb_2019-08-28
> > with following parameters:
> > 
> > group: group-04
> > 
> > test-description: Trinity is a linux system call fuzz tester.
> > test-url: http://codemonkey.org.uk/projects/trinity/
> > 
> > 
> > on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> > 
> > caused below changes (please refer to attached dmesg/kmsg for entire 
> > log/backtrace):
> > 
> > 
> > +---+---++
> > |   
> > | v5.12-rc2 | e48d82b67a |
> > +---+---++
> > | BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten
> > | 0 | 69 |
> > | INFO:0x(ptrval)-0x(ptrval)@offset=#.First_byte#instead_of 
> > | 0 | 69 |
> > | INFO:Allocated_in_resiliency_test_age=#cpu=#pid=  
> > | 0 | 69 |
> > | INFO:Slab0x(ptrval)objects=#used=#fp=0x(ptrval)flags= 
> > | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset=#fp=0x(ptrval)   
> > | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Freechain_corrupt 
> > | 0 | 69 |
> > | INFO:Freed_in_resiliency_test_age=#cpu=#pid=  
> > | 0 | 69 |
> > | 
> > BUG_TestSlub_next_ptr_free(Tainted:G_B):Wrong_object_count.Counter_is#but_counted_were
> > | 0 | 69 |
> > | BUG_TestSlub_next_ptr_free(Tainted:G_B):Redzone_overwritten   
> > | 0 | 69 |
> > | 
> > BUG_TestSlub_next_ptr_free(Tainted:G_B):Objects_remaining_in_TestSlub_next_ptr_free_on__kmem_cache_shutdown()
> >  | 0 | 69 |
> > | INFO:Object0x(ptrval)@offset= 
> > | 0 | 69 |
> > | BUG_TestSlub_1th_word_free(Tainted:G_B):Poison_overwritten
> > | 0 | 69 |
> > | BUG_TestSlub_50th_word_free(Tainted:G_B):Poison_overwritten   
> > | 0 | 69 |
> > | BUG_TestSlub_RZ_free(Tainted:G_B):Redzone_overwritten 
> > | 0 | 69 |
> > +---+---++
> > 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kernel test robot 
> > 
> > 
> > 
> > [   22.154049] random: get_random_u32 called from 
> > __kmem_cache_create+0x23/0x3e0 with crng_init=0 
> > [   22.154070] random: get_random_u32 called from 
> > cache_random_seq_create+0x7c/0x140 with crng_init=0 
> > [   22.154167] random: get_random_u32 called from allocate_slab+0x155/0x5e0 
> > with crng_init=0 
> > [   22.154690] test_slub: 1. kmem_cache: Clobber Redzone 0x12->0x(ptrval)
> > [   22.164499] 
> > =
> > [   22.166629] BUG TestSlub_RZ_alloc (Not tainted): Redzone overwritten
> > [   22.168179] 
> > -
> > [   22.168179]
> > [   22.168372] Disabling lock debugging due to kernel taint
> > [   22.168372] INFO: 0x(ptrval)-0x(ptrval) @offset=1064. First byte 0x12 
> > instead of 0xcc
> > [   22.168372] INFO: Allocated in resiliency_test+0x47/0x1be age=3 cpu=0 
> > pid=1 
> > [   22.168372] __slab_alloc+0x57/0x80 
> > [   22.168372] kmem_cache_alloc (kbuild/src/consumer/mm/slub.c:2871 
> > kbuild/src/consumer/mm/slub.c:2915 kbuild/src/consumer/mm/slub.c:2920) 
> > [   22.168372] resiliency_test (kbuild/src/consumer/lib/test_slub.c:34 
> > 

Re: [selftests] e48d82b67a: BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten

2021-03-17 Thread Vlastimil Babka
On 3/17/21 9:36 AM, kernel test robot wrote:
> 
> 
> Greeting,
> 
> FYI, we noticed the following commit (built with gcc-9):
> 
> commit: e48d82b67a2b760eedf7b95ca15f41267496386c ("[PATCH 1/2] selftests: add 
> a kselftest for SLUB debugging functionality")
> url: 
> https://github.com/0day-ci/linux/commits/glittao-gmail-com/selftests-add-a-kselftest-for-SLUB-debugging-functionality/20210316-204257
> base: https://git.kernel.org/cgit/linux/kernel/git/shuah/linux-kselftest.git 
> next
> 
> in testcase: trinity
> version: trinity-static-i386-x86_64-f93256fb_2019-08-28
> with following parameters:
> 
>   group: group-04
> 
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
> 
> 
> on test machine: qemu-system-i386 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):
> 
> 
> +---+---++
> | 
>   | v5.12-rc2 | e48d82b67a |
> +---+---++
> | BUG_TestSlub_RZ_alloc(Not_tainted):Redzone_overwritten  
>   | 0 | 69 |
> | INFO:0x(ptrval)-0x(ptrval)@offset=#.First_byte#instead_of   
>   | 0 | 69 |
> | INFO:Allocated_in_resiliency_test_age=#cpu=#pid=
>   | 0 | 69 |
> | INFO:Slab0x(ptrval)objects=#used=#fp=0x(ptrval)flags=   
>   | 0 | 69 |
> | INFO:Object0x(ptrval)@offset=#fp=0x(ptrval) 
>   | 0 | 69 |
> | BUG_TestSlub_next_ptr_free(Tainted:G_B):Freechain_corrupt   
>   | 0 | 69 |
> | INFO:Freed_in_resiliency_test_age=#cpu=#pid=
>   | 0 | 69 |
> | 
> BUG_TestSlub_next_ptr_free(Tainted:G_B):Wrong_object_count.Counter_is#but_counted_were
> | 0 | 69 |
> | BUG_TestSlub_next_ptr_free(Tainted:G_B):Redzone_overwritten 
>   | 0 | 69 |
> | 
> BUG_TestSlub_next_ptr_free(Tainted:G_B):Objects_remaining_in_TestSlub_next_ptr_free_on__kmem_cache_shutdown()
>  | 0 | 69 |
> | INFO:Object0x(ptrval)@offset=   
>   | 0 | 69 |
> | BUG_TestSlub_1th_word_free(Tainted:G_B):Poison_overwritten  
>   | 0 | 69 |
> | BUG_TestSlub_50th_word_free(Tainted:G_B):Poison_overwritten 
>   | 0 | 69 |
> | BUG_TestSlub_RZ_free(Tainted:G_B):Redzone_overwritten   
>   | 0 | 69 |
> +---+---++
> 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot 
> 
> 
> 
> [   22.154049] random: get_random_u32 called from 
> __kmem_cache_create+0x23/0x3e0 with crng_init=0 
> [   22.154070] random: get_random_u32 called from 
> cache_random_seq_create+0x7c/0x140 with crng_init=0 
> [   22.154167] random: get_random_u32 called from allocate_slab+0x155/0x5e0 
> with crng_init=0 
> [   22.154690] test_slub: 1. kmem_cache: Clobber Redzone 0x12->0x(ptrval)
> [   22.164499] 
> =
> [   22.166629] BUG TestSlub_RZ_alloc (Not tainted): Redzone overwritten
> [   22.168179] 
> -
> [   22.168179]
> [   22.168372] Disabling lock debugging due to kernel taint
> [   22.168372] INFO: 0x(ptrval)-0x(ptrval) @offset=1064. First byte 0x12 
> instead of 0xcc
> [   22.168372] INFO: Allocated in resiliency_test+0x47/0x1be age=3 cpu=0 
> pid=1 
> [   22.168372] __slab_alloc+0x57/0x80 
> [   22.168372] kmem_cache_alloc (kbuild/src/consumer/mm/slub.c:2871 
> kbuild/src/consumer/mm/slub.c:2915 kbuild/src/consumer/mm/slub.c:2920) 
> [   22.168372] resiliency_test (kbuild/src/consumer/lib/test_slub.c:34 
> kbuild/src/consumer/lib/test_slub.c:107) 
> [   22.168372] test_slub_init (kbuild/src/consumer/lib/test_slub.c:124) 
> [   22.168372] do_one_initcall (kbuild/src/consumer/init/main.c:1226) 
> [