Re: 15c8410c67 ("mm/slob.c: respect list_head abstraction layer"): WARNING: CPU: 0 PID: 1 at lib/list_debug.c:28 __list_add_valid

2019-04-03 Thread Tobin C. Harding
On Wed, Apr 03, 2019 at 03:54:17PM +1100, Tobin C. Harding wrote:
> On Wed, Apr 03, 2019 at 10:00:38AM +0800, kernel test robot wrote:
> > Greetings,
> > 
> > 0day kernel testing robot got the below dmesg and the first bad commit is
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > 
> > commit 15c8410c67adefd26ea0df1f1b86e1836051784b
> > Author: Tobin C. Harding 
> > AuthorDate: Fri Mar 29 10:01:23 2019 +1100
> > Commit: Stephen Rothwell 
> > CommitDate: Sat Mar 30 16:09:41 2019 +1100
> > 
> > mm/slob.c: respect list_head abstraction layer
> > 
> > Currently we reach inside the list_head.  This is a violation of the layer
> > of abstraction provided by the list_head.  It makes the code fragile.
> > More importantly it makes the code wicked hard to understand.
> > 
> > The code logic is based on the page in which an allocation was made, we
> > want to modify the slob_list we are working on to have this page at the
> > front.  We already have a function to check if an entry is at the front of
> > the list.  Recently a function was added to list.h to do the list
> > rotation.  We can use these two functions to reduce line count, reduce
> > code fragility, and reduce cognitive load required to read the code.
> > 
> > Use list_head functions to interact with lists thereby maintaining the
> > abstraction provided by the list_head structure.
> > 
> > Link: http://lkml.kernel.org/r/20190318000234.22049-3-to...@kernel.org
> > Signed-off-by: Tobin C. Harding 
> > Cc: Christoph Lameter 
> > Cc: David Rientjes 
> > Cc: Joonsoo Kim 
> > Cc: Pekka Enberg 
> > Cc: Roman Gushchin 
> > Signed-off-by: Andrew Morton 
> > Signed-off-by: Stephen Rothwell 
> > 
> > 2e1f88301e  include/linux/list.h: add list_rotate_to_front()
> > 15c8410c67  mm/slob.c: respect list_head abstraction layer
> > 05d08e2995  Add linux-next specific files for 20190402
> > +-------------------------------------------------------+------------+------------+---------------+
> > |                                                       | 2e1f88301e | 15c8410c67 | next-20190402 |
> > +-------------------------------------------------------+------------+------------+---------------+
> > | boot_successes                                        | 1009       | 198        | 299           |
> > | boot_failures                                         | 0          | 2          | 44            |
> > | WARNING:at_lib/list_debug.c:#__list_add_valid         | 0          | 2          | 44            |
> > | RIP:__list_add_valid                                  | 0          | 2          | 44            |
> > | WARNING:at_lib/list_debug.c:#__list_del_entry_valid   | 0          | 2          | 25            |
> > | RIP:__list_del_entry_valid                            | 0          | 2          | 25            |
> > | WARNING:possible_circular_locking_dependency_detected | 0          | 2          | 44            |
> > | RIP:_raw_spin_unlock_irqrestore                       | 0          | 2          | 2             |
> > | BUG:kernel_hang_in_test_stage                         | 0          | 0          | 6             |
> > | BUG:unable_to_handle_kernel                           | 0          | 0          | 1             |
> > | Oops:#[##]                                            | 0          | 0          | 1             |
> > | RIP:slob_page_alloc                                   | 0          | 0          | 1             |
> > | Kernel_panic-not_syncing:Fatal_exception              | 0          | 0          | 1             |
> > | RIP:delay_tsc                                         | 0          | 0          | 2             |
> > +-------------------------------------------------------+------------+------------+---------------+
> > 
> > [2.618737] db_root: cannot open: /etc/target
> > [2.620114] mtdoops: mtd device (mtddev=name/number) must be supplied
> > [2.620967] slram: not enough parameters.
> > [2.621614] [ cut here ]
> > [2.622254] list_add corruption. prev->next should be next (aeeb71b0), but was cee1406d3f70. (prev=cee140422508).
> 
> Is this perhaps a false positive because we hackishly move the list_head
> 'head' and insert it back into the list?  Perhaps this is confusing the
> validation functions?

This has got me stumped.  I cannot create a test case where manipulating
a list with list_rotate_to_front() causes the list validation functions
to emit an error.  I also cannot find a way on paper for it to happen.

I don't really know how to go forward from here.  I'll sleep on it and
see if something comes to me.  Any ideas on what to look into, please?
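
A userspace test of the kind described above could look roughly like the
sketch below.  It is not the kernel code: the sketch_* names are
hypothetical stand-ins, the rotation is assumed to amount to unlinking
'head' and re-inserting it just before the chosen entry, and the checks
are modelled only on the wording of the "list_add corruption" warning.

```c
#include <assert.h>	/* for callers that assert on the results */
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void sketch_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Checks modelled on the "list_add corruption" warning text. */
static bool sketch_add_valid(struct list_head *item,
			     struct list_head *prev, struct list_head *next)
{
	return next->prev == prev && prev->next == next &&
	       item != prev && item != next;
}

/* Insert @item just before @head; returns false instead of warning. */
static bool sketch_add_tail(struct list_head *item, struct list_head *head)
{
	struct list_head *prev = head->prev;

	if (!sketch_add_valid(item, prev, head))
		return false;
	head->prev = item;
	item->next = head;
	item->prev = prev;
	prev->next = item;
	return true;
}

/* Assumed rotation: unlink @head, then re-insert it just before @entry. */
static bool sketch_rotate_to_front(struct list_head *entry,
				   struct list_head *head)
{
	head->prev->next = head->next;
	head->next->prev = head->prev;
	return sketch_add_tail(head, entry);
}

/* An entry is first when the list head immediately precedes it. */
static bool sketch_is_first(const struct list_head *entry,
			    const struct list_head *head)
{
	return entry->prev == head;
}

/* Build head -> n0 -> n1 -> n2, then rotate each entry to the front. */
static bool sketch_rotation_test(void)
{
	struct list_head head, node[3];
	int i;

	sketch_init(&head);
	for (i = 0; i < 3; i++)
		if (!sketch_add_tail(&node[i], &head))
			return false;

	for (i = 2; i >= 0; i--) {
		if (!sketch_rotate_to_front(&node[i], &head))
			return false;
		if (!sketch_is_first(&node[i], &head))
			return false;
	}
	return true;
}
```

In this sketch every entry that is rotated is still linked into the list
when the rotation runs, so the checks have nothing to object to.  That
matches the observation that a standalone test does not trigger the
validation.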

thanks,
Tobin.


Re: 15c8410c67 ("mm/slob.c: respect list_head abstraction layer"): WARNING: CPU: 0 PID: 1 at lib/list_debug.c:28 __list_add_valid

2019-04-02 Thread Tobin C. Harding
On Wed, Apr 03, 2019 at 10:00:38AM +0800, kernel test robot wrote:
> Greetings,
> 
> 0day kernel testing robot got the below dmesg and the first bad commit is
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> 
> commit 15c8410c67adefd26ea0df1f1b86e1836051784b
> Author: Tobin C. Harding 
> AuthorDate: Fri Mar 29 10:01:23 2019 +1100
> Commit: Stephen Rothwell 
> CommitDate: Sat Mar 30 16:09:41 2019 +1100
> 
> mm/slob.c: respect list_head abstraction layer
> 
> Currently we reach inside the list_head.  This is a violation of the layer
> of abstraction provided by the list_head.  It makes the code fragile.
> More importantly it makes the code wicked hard to understand.
> 
> The code logic is based on the page in which an allocation was made, we
> want to modify the slob_list we are working on to have this page at the
> front.  We already have a function to check if an entry is at the front of
> the list.  Recently a function was added to list.h to do the list
> rotation.  We can use these two functions to reduce line count, reduce
> code fragility, and reduce cognitive load required to read the code.
> 
> Use list_head functions to interact with lists thereby maintaining the
> abstraction provided by the list_head structure.
> 
> Link: http://lkml.kernel.org/r/20190318000234.22049-3-to...@kernel.org
> Signed-off-by: Tobin C. Harding 
> Cc: Christoph Lameter 
> Cc: David Rientjes 
> Cc: Joonsoo Kim 
> Cc: Pekka Enberg 
> Cc: Roman Gushchin 
> Signed-off-by: Andrew Morton 
> Signed-off-by: Stephen Rothwell 
> 
> 2e1f88301e  include/linux/list.h: add list_rotate_to_front()
> 15c8410c67  mm/slob.c: respect list_head abstraction layer
> 05d08e2995  Add linux-next specific files for 20190402
> +-------------------------------------------------------+------------+------------+---------------+
> |                                                       | 2e1f88301e | 15c8410c67 | next-20190402 |
> +-------------------------------------------------------+------------+------------+---------------+
> | boot_successes                                        | 1009       | 198        | 299           |
> | boot_failures                                         | 0          | 2          | 44            |
> | WARNING:at_lib/list_debug.c:#__list_add_valid         | 0          | 2          | 44            |
> | RIP:__list_add_valid                                  | 0          | 2          | 44            |
> | WARNING:at_lib/list_debug.c:#__list_del_entry_valid   | 0          | 2          | 25            |
> | RIP:__list_del_entry_valid                            | 0          | 2          | 25            |
> | WARNING:possible_circular_locking_dependency_detected | 0          | 2          | 44            |
> | RIP:_raw_spin_unlock_irqrestore                       | 0          | 2          | 2             |
> | BUG:kernel_hang_in_test_stage                         | 0          | 0          | 6             |
> | BUG:unable_to_handle_kernel                           | 0          | 0          | 1             |
> | Oops:#[##]                                            | 0          | 0          | 1             |
> | RIP:slob_page_alloc                                   | 0          | 0          | 1             |
> | Kernel_panic-not_syncing:Fatal_exception              | 0          | 0          | 1             |
> | RIP:delay_tsc                                         | 0          | 0          | 2             |
> +-------------------------------------------------------+------------+------------+---------------+
> 
> [2.618737] db_root: cannot open: /etc/target
> [2.620114] mtdoops: mtd device (mtddev=name/number) must be supplied
> [2.620967] slram: not enough parameters.
> [2.621614] [ cut here ]
> [2.622254] list_add corruption. prev->next should be next (aeeb71b0), but was cee1406d3f70. (prev=cee140422508).

Is this perhaps a false positive because we hackishly move the list_head
'head' and insert it back into the list?  Perhaps this is confusing the
validation functions?
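
To make the question concrete, here is a minimal userspace sketch of the
kind of check the warning text implies.  The sketch_* names and the enum
are hypothetical, and the order of the checks is an assumption drawn from
the message wording rather than from lib/list_debug.c.  It compares
removing 'head' and re-adding it in a consistent list against an add next
to a neighbour whose next pointer has deliberately been made stale.

```c
#include <assert.h>	/* for callers that assert on the results */

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

enum sketch_add_status {
	SKETCH_ADD_OK,
	SKETCH_NEXT_PREV_MISMATCH,	/* next->prev should be prev */
	SKETCH_PREV_NEXT_MISMATCH,	/* prev->next should be next */
	SKETCH_DOUBLE_ADD,		/* item is already prev or next */
};

/* Report which precondition of an insertion fails, if any. */
static enum sketch_add_status
sketch_check_add(const struct list_head *item,
		 const struct list_head *prev, const struct list_head *next)
{
	if (next->prev != prev)
		return SKETCH_NEXT_PREV_MISMATCH;
	if (prev->next != next)
		return SKETCH_PREV_NEXT_MISMATCH;
	if (item == prev || item == next)
		return SKETCH_DOUBLE_ADD;
	return SKETCH_ADD_OK;
}

/* Link three nodes into the ring head -> a -> b -> head. */
static void sketch_ring3(struct list_head *head, struct list_head *a,
			 struct list_head *b)
{
	head->next = a;
	a->prev = head;
	a->next = b;
	b->prev = a;
	b->next = head;
	head->prev = b;
}

/* Consistent case: unlink head, then check re-adding it before b. */
static enum sketch_add_status sketch_readd_head(void)
{
	struct list_head head, a, b;

	sketch_ring3(&head, &a, &b);
	head.prev->next = head.next;
	head.next->prev = head.prev;
	return sketch_check_add(&head, b.prev, &b);
}

/* Inconsistent case: a.next is deliberately pointed away from b. */
static enum sketch_add_status sketch_stale_neighbour(void)
{
	struct list_head head, a, b, stray, item;

	sketch_ring3(&head, &a, &b);
	stray.next = &stray;
	stray.prev = &stray;
	a.next = &stray;
	return sketch_check_add(&item, &a, &b);
}
```

Under these assumptions the first case passes, and only the second case
returns the status that corresponds to the "prev->next should be next"
wording.  The sketch says nothing about which situation the robot
actually hit.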

Tobin