Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

2022-06-27 Thread Alexander Aring
Hi,

On Mon, Jun 27, 2022 at 8:56 PM Alexander Aring  wrote:
>
> Hi Luc and others,
>
> On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck
>  wrote:
> >
> > On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote:
> > > Hi,
> > >
> > > I recently converted to use kref_put_lock() in fs/dlm subsystem and
> > > now I get the following warning in sparse:
> > >
> > > warning: context imbalance in 'put_rsb' - unexpected unlock
> > >
> > > It seems sparse is not able to detect that there is a conditional
> > > requirement that the lock passed to kref_put_lock() (or also
> > > refcount_dec_and_lock()) is locked or not. I evaluate the return value
> > > to check if kref_put_lock() helds the lock and unlock it then. The
> > > idea is that the lock needs only to be held when the refcount is going
> > > to be zero.
> > >
> > > It seems other users unlock the lock at the release callback of
> > > kref_put_lock() and annotate the callback with "__releases()" which
> > > seems to work to avoid the sparse warning. However this works if you
> > > don't have additional stack pointers which you need to pass to the
> > > release callback.
> > >
> > > My question would be is this a known problem and a recommended way to
> > > avoid this sparse warning (maybe just for now)?
> >
> > Hi,
> >
> > I suppose that your case here can be simplified into something like:
> >
> > if (some_condition)
> > take(some_lock);
> >
> > do_stuff();
> >
> > if (some_condition)
> > release(some_lock);
> >
> > Sparse issues the 'context imbalance' warning because, a priori,
> > it can't exclude that some execution will takes the lock and not
> > releases it (or the opposite). In some case, when do_stuff() is
> > very simple, sparse can see that everything is OK, but generally
> > it won't (some whole kernel analysis but the general case is
> > undecidable anyway).
> >
> > The recommended way would be to write things rather like this:
> >
> > if (some_condition) {
> > take(some_lock);
> > do_stuff();
> > release(some_lock);
> > } else {
> > do_stuff();
> > }
> >
>
> This is not an alternative for me because the lock needs to hold
> during the "some_condition". (More background is that we dealing with
> data structures here and cannot allow a get() from this data
> structures during "some_condition", the lock is preventing this)
> It is the refcount code which causes trouble here [0] and I think the
> refcount code should never call the unlock() procedure in any
> condition and leave it to the caller to call unlock() in any case.
>
> I to'ed (hope to get more attention to this) more people related to
> lib/refcount.c implementation (provided by get_maintainers.pl -f).
>
> >
> > The __acquires() and __releases() annotations are needed for sparse
> > to know that the annotated function always take or always release
> > some lock but won't handle conditional locks.
> >
>
> If we change the refcount code to _never_ calling unlock() for the
> specific lock, then all those foo_and_lock_bar() functions can be
> annotated with "__acquires()". This should also end in the same code?

sorry, this will not work because of the first condition of "if
(refcount_dec_not_one(r))" which will never hold the lock if true...
that's what the optimization is all about. However, maybe somebody has
another idea...

- Alex



Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

2022-06-27 Thread Alexander Aring
Hi Luc and others,

On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck
 wrote:
>
> On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote:
> > Hi,
> >
> > I recently converted to use kref_put_lock() in fs/dlm subsystem and
> > now I get the following warning in sparse:
> >
> > warning: context imbalance in 'put_rsb' - unexpected unlock
> >
> > It seems sparse is not able to detect that there is a conditional
> > requirement that the lock passed to kref_put_lock() (or also
> > refcount_dec_and_lock()) is locked or not. I evaluate the return value
> > to check if kref_put_lock() helds the lock and unlock it then. The
> > idea is that the lock needs only to be held when the refcount is going
> > to be zero.
> >
> > It seems other users unlock the lock at the release callback of
> > kref_put_lock() and annotate the callback with "__releases()" which
> > seems to work to avoid the sparse warning. However this works if you
> > don't have additional stack pointers which you need to pass to the
> > release callback.
> >
> > My question would be is this a known problem and a recommended way to
> > avoid this sparse warning (maybe just for now)?
>
> Hi,
>
> I suppose that your case here can be simplified into something like:
>
> if (some_condition)
> take(some_lock);
>
> do_stuff();
>
> if (some_condition)
> release(some_lock);
>
> Sparse issues the 'context imbalance' warning because, a priori,
> it can't exclude that some execution will takes the lock and not
> releases it (or the opposite). In some case, when do_stuff() is
> very simple, sparse can see that everything is OK, but generally
> it won't (some whole kernel analysis but the general case is
> undecidable anyway).
>
> The recommended way would be to write things rather like this:
>
> if (some_condition) {
> take(some_lock);
> do_stuff();
> release(some_lock);
> } else {
> do_stuff();
> }
>

This is not an alternative for me because the lock needs to hold
during the "some_condition". (More background is that we dealing with
data structures here and cannot allow a get() from this data
structures during "some_condition", the lock is preventing this)
It is the refcount code which causes trouble here [0] and I think the
refcount code should never call the unlock() procedure in any
condition and leave it to the caller to call unlock() in any case.

I to'ed (hope to get more attention to this) more people related to
lib/refcount.c implementation (provided by get_maintainers.pl -f).

>
> The __acquires() and __releases() annotations are needed for sparse
> to know that the annotated function always take or always release
> some lock but won't handle conditional locks.
>

If we change the refcount code to _never_ calling unlock() for the
specific lock, then all those foo_and_lock_bar() functions can be
annotated with "__acquires()". This should also end in the same code?
For me it looks like the current implementation of refcount.c is fine
except sparse cannot figure out what's going on and maybe a reason to
change the specific handling to the mentioned one.

> I hope that this is helpful to you.
>

I hope we will find some solution, because I don't like sparse warnings.

- Alex

[0] https://elixir.bootlin.com/linux/v5.19-rc4/source/lib/refcount.c#L144



Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

2022-06-27 Thread Luc Van Oostenryck
On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote:
> Hi,
> 
> I recently converted to use kref_put_lock() in fs/dlm subsystem and
> now I get the following warning in sparse:
> 
> warning: context imbalance in 'put_rsb' - unexpected unlock
> 
> It seems sparse is not able to detect that there is a conditional
> requirement that the lock passed to kref_put_lock() (or also
> refcount_dec_and_lock()) is locked or not. I evaluate the return value
> to check if kref_put_lock() helds the lock and unlock it then. The
> idea is that the lock needs only to be held when the refcount is going
> to be zero.
> 
> It seems other users unlock the lock at the release callback of
> kref_put_lock() and annotate the callback with "__releases()" which
> seems to work to avoid the sparse warning. However this works if you
> don't have additional stack pointers which you need to pass to the
> release callback.
> 
> My question would be is this a known problem and a recommended way to
> avoid this sparse warning (maybe just for now)?

Hi,

I suppose that your case here can be simplified into something like:

if (some_condition)
take(some_lock);

do_stuff();

if (some_condition)
release(some_lock);

Sparse issues the 'context imbalance' warning because, a priori,
it can't exclude that some execution will takes the lock and not
releases it (or the opposite). In some case, when do_stuff() is
very simple, sparse can see that everything is OK, but generally
it won't (some whole kernel analysis but the general case is
undecidable anyway).

The recommended way would be to write things rather like this:

if (some_condition) {
take(some_lock);
do_stuff();
release(some_lock);
} else {
do_stuff();
}


The __acquires() and __releases() annotations are needed for sparse
to know that the annotated function always take or always release
some lock but won't handle conditional locks.

I hope that this is helpful to you.

-- Luc



[Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

2022-06-27 Thread Alexander Aring
Hi,

I recently converted to use kref_put_lock() in fs/dlm subsystem and
now I get the following warning in sparse:

warning: context imbalance in 'put_rsb' - unexpected unlock

It seems sparse is not able to detect that there is a conditional
requirement that the lock passed to kref_put_lock() (or also
refcount_dec_and_lock()) is locked or not. I evaluate the return value
to check if kref_put_lock() helds the lock and unlock it then. The
idea is that the lock needs only to be held when the refcount is going
to be zero.

It seems other users unlock the lock at the release callback of
kref_put_lock() and annotate the callback with "__releases()" which
seems to work to avoid the sparse warning. However this works if you
don't have additional stack pointers which you need to pass to the
release callback.

My question would be is this a known problem and a recommended way to
avoid this sparse warning (maybe just for now)?

Thanks.

- Alex



[Cluster-devel] [syzbot] general protection fault in gfs2_evict_inode (2)

2022-06-27 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:ca1fdab7fd27 Merge tag 'efi-urgent-for-v5.19-1' of git://g..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=102e856008
kernel config:  https://syzkaller.appspot.com/x/.config?x=542d3d75f0e6f36f
dashboard link: https://syzkaller.appspot.com/bug?extid=8a5fc6416c175cecea34
compiler:   gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for 
Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8a5fc6416c175cece...@syzkaller.appspotmail.com

gfs2: fsid=syz:syz.0: first mount done, others may mount
general protection fault, probably for non-canonical address 
0xdc11:  [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0088-0x008f]
CPU: 1 PID: 10573 Comm: syz-executor.0 Not tainted 
5.19.0-rc3-syzkaller-00038-gca1fdab7fd27 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:evict_linked_inode fs/gfs2/super.c:1329 [inline]
RIP: 0010:gfs2_evict_inode+0xbf2/0x2030 fs/gfs2/super.c:1384
Code: 03 80 3c 02 00 0f 85 bd 13 00 00 48 8b 9d 00 09 00 00 48 b8 00 00 00 00 
00 fc ff df 48 8d bb 8c 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 
e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 e0
RSP: 0018:c90005ae7670 EFLAGS: 00010217
RAX: dc00 RBX:  RCX: c90003a83000
RDX: 0011 RSI: 838de301 RDI: 008c
RBP: 88802cd6c000 R08: 0005 R09: 
R10:  R11: 0001 R12: 0001
R13: 88802e1a5160 R14: 88802e1a5698 R15: 88802e1a5610
FS:  7f4c042c5700() GS:8880b9b0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fa58679d090 CR3: 7d0e3000 CR4: 00350ee0
Call Trace:
 
 evict+0x2ed/0x6b0 fs/inode.c:664
 iput_final fs/inode.c:1744 [inline]
 iput.part.0+0x562/0x820 fs/inode.c:1770
 iput+0x58/0x70 fs/inode.c:1760
 init_journal fs/gfs2/ops_fstype.c:870 [inline]
 init_inodes+0x28c/0x2720 fs/gfs2/ops_fstype.c:924
 gfs2_fill_super+0x1b49/0x28a0 fs/gfs2/ops_fstype.c:1242
 get_tree_bdev+0x440/0x760 fs/super.c:1292
 gfs2_get_tree+0x4a/0x270 fs/gfs2/ops_fstype.c:1325
 vfs_get_tree+0x89/0x2f0 fs/super.c:1497
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x1320/0x1fa0 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f4c0308a63a
Code: 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d2 e8 b8 04 00 00 
0f 1f 84 00 00 00 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 
c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:7f4c042c4f88 EFLAGS: 0206 ORIG_RAX: 00a5
RAX: ffda RBX: 2200 RCX: 7f4c0308a63a
RDX: 2000 RSI: 2100 RDI: 7f4c042c4fe0
RBP: 7f4c042c5020 R08: 7f4c042c5020 R09: 2000
R10:  R11: 0206 R12: 2000
R13: 2100 R14: 7f4c042c4fe0 R15: 20047a20
 
Modules linked in:
---[ end trace  ]---
RIP: 0010:evict_linked_inode fs/gfs2/super.c:1329 [inline]
RIP: 0010:gfs2_evict_inode+0xbf2/0x2030 fs/gfs2/super.c:1384
Code: 03 80 3c 02 00 0f 85 bd 13 00 00 48 8b 9d 00 09 00 00 48 b8 00 00 00 00 
00 fc ff df 48 8d bb 8c 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 
e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 e0
RSP: 0018:c90005ae7670 EFLAGS: 00010217
RAX: dc00 RBX:  RCX: c90003a83000
RDX: 0011 RSI: 838de301 RDI: 008c
RBP: 88802cd6c000 R08: 0005 R09: 
R10:  R11: 0001 R12: 0001
R13: 88802e1a5160 R14: 88802e1a5698 R15: 88802e1a5610
FS:  7f4c042c5700() GS:8880b9b0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fa58679d090 CR3: 7d0e3000 CR4: 00350ee0

Code disassembly (best guess):
   0:   03 80 3c 02 00 0f   add0xf00023c(%rax),%eax
   6:   85 bd 13 00 00 48   test   %edi,0x4813(%rbp)
   c:   8b 9d 00 09 00 00   mov0x900(%rbp),%ebx
  12:   48 b8 00 00 00 00 00movabs $0xdc00,%rax
  19:   fc ff df
  1c:   48 8d bb 8c 00 00 00lea0x8c(%rbx),%rdi
  23:   48 89 famov%rdi,%rdx
  26:   48 c1 ea 03 shr$0x3,%rdx
* 2a:   0f b6 14 02 movzbl (%rdx,%rax,1),%edx <-- trapping 
instruction
  2e:   48 89 f8