Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()
Hi, On Mon, Jun 27, 2022 at 8:56 PM Alexander Aring wrote: > > Hi Luc and others, > > On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck > wrote: > > > > On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote: > > > Hi, > > > > > > I recently converted to use kref_put_lock() in fs/dlm subsystem and > > > now I get the following warning in sparse: > > > > > > warning: context imbalance in 'put_rsb' - unexpected unlock > > > > > > It seems sparse is not able to detect that there is a conditional > > > requirement that the lock passed to kref_put_lock() (or also > > > refcount_dec_and_lock()) is locked or not. I evaluate the return value > > > to check if kref_put_lock() helds the lock and unlock it then. The > > > idea is that the lock needs only to be held when the refcount is going > > > to be zero. > > > > > > It seems other users unlock the lock at the release callback of > > > kref_put_lock() and annotate the callback with "__releases()" which > > > seems to work to avoid the sparse warning. However this works if you > > > don't have additional stack pointers which you need to pass to the > > > release callback. > > > > > > My question would be is this a known problem and a recommended way to > > > avoid this sparse warning (maybe just for now)? > > > > Hi, > > > > I suppose that your case here can be simplified into something like: > > > > if (some_condition) > > take(some_lock); > > > > do_stuff(); > > > > if (some_condition) > > release(some_lock); > > > > Sparse issues the 'context imbalance' warning because, a priori, > > it can't exclude that some execution will takes the lock and not > > releases it (or the opposite). In some case, when do_stuff() is > > very simple, sparse can see that everything is OK, but generally > > it won't (some whole kernel analysis but the general case is > > undecidable anyway). > > > > The recommended way would be to write things rather like this: > > > > if (some_condition) { > > take(some_lock); > > do_stuff(); > > release(some_lock); > > } else { > > do_stuff(); > > } > > > > This is not an alternative for me because the lock needs to hold > during the "some_condition". (More background is that we dealing with > data structures here and cannot allow a get() from this data > structures during "some_condition", the lock is preventing this) > It is the refcount code which causes trouble here [0] and I think the > refcount code should never call the unlock() procedure in any > condition and leave it to the caller to call unlock() in any case. > > I to'ed (hope to get more attention to this) more people related to > lib/refcount.c implementation (provided by get_maintainers.pl -f). > > > > > The __acquires() and __releases() annotations are needed for sparse > > to know that the annotated function always take or always release > > some lock but won't handle conditional locks. > > > > If we change the refcount code to _never_ calling unlock() for the > specific lock, then all those foo_and_lock_bar() functions can be > annotated with "__acquires()". This should also end in the same code? sorry, this will not work because of the first condition of "if (refcount_dec_not_one(r))" which will never hold the lock if true... that's what the optimization is all about. However, maybe somebody has another idea... - Alex
Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()
Hi Luc and others, On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck wrote: > > On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote: > > Hi, > > > > I recently converted to use kref_put_lock() in fs/dlm subsystem and > > now I get the following warning in sparse: > > > > warning: context imbalance in 'put_rsb' - unexpected unlock > > > > It seems sparse is not able to detect that there is a conditional > > requirement that the lock passed to kref_put_lock() (or also > > refcount_dec_and_lock()) is locked or not. I evaluate the return value > > to check if kref_put_lock() helds the lock and unlock it then. The > > idea is that the lock needs only to be held when the refcount is going > > to be zero. > > > > It seems other users unlock the lock at the release callback of > > kref_put_lock() and annotate the callback with "__releases()" which > > seems to work to avoid the sparse warning. However this works if you > > don't have additional stack pointers which you need to pass to the > > release callback. > > > > My question would be is this a known problem and a recommended way to > > avoid this sparse warning (maybe just for now)? > > Hi, > > I suppose that your case here can be simplified into something like: > > if (some_condition) > take(some_lock); > > do_stuff(); > > if (some_condition) > release(some_lock); > > Sparse issues the 'context imbalance' warning because, a priori, > it can't exclude that some execution will takes the lock and not > releases it (or the opposite). In some case, when do_stuff() is > very simple, sparse can see that everything is OK, but generally > it won't (some whole kernel analysis but the general case is > undecidable anyway). > > The recommended way would be to write things rather like this: > > if (some_condition) { > take(some_lock); > do_stuff(); > release(some_lock); > } else { > do_stuff(); > } > This is not an alternative for me because the lock needs to hold during the "some_condition". (More background is that we dealing with data structures here and cannot allow a get() from this data structures during "some_condition", the lock is preventing this) It is the refcount code which causes trouble here [0] and I think the refcount code should never call the unlock() procedure in any condition and leave it to the caller to call unlock() in any case. I to'ed (hope to get more attention to this) more people related to lib/refcount.c implementation (provided by get_maintainers.pl -f). > > The __acquires() and __releases() annotations are needed for sparse > to know that the annotated function always take or always release > some lock but won't handle conditional locks. > If we change the refcount code to _never_ calling unlock() for the specific lock, then all those foo_and_lock_bar() functions can be annotated with "__acquires()". This should also end in the same code? For me it looks like the current implementation of refcount.c is fine except sparse cannot figure out what's going on and maybe a reason to change the specific handling to the mentioned one. > I hope that this is helpful to you. > I hope we will find some solution, because I don't like sparse warnings. - Alex [0] https://elixir.bootlin.com/linux/v5.19-rc4/source/lib/refcount.c#L144
Re: [Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()
On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote: > Hi, > > I recently converted to use kref_put_lock() in fs/dlm subsystem and > now I get the following warning in sparse: > > warning: context imbalance in 'put_rsb' - unexpected unlock > > It seems sparse is not able to detect that there is a conditional > requirement that the lock passed to kref_put_lock() (or also > refcount_dec_and_lock()) is locked or not. I evaluate the return value > to check if kref_put_lock() helds the lock and unlock it then. The > idea is that the lock needs only to be held when the refcount is going > to be zero. > > It seems other users unlock the lock at the release callback of > kref_put_lock() and annotate the callback with "__releases()" which > seems to work to avoid the sparse warning. However this works if you > don't have additional stack pointers which you need to pass to the > release callback. > > My question would be is this a known problem and a recommended way to > avoid this sparse warning (maybe just for now)? Hi, I suppose that your case here can be simplified into something like: if (some_condition) take(some_lock); do_stuff(); if (some_condition) release(some_lock); Sparse issues the 'context imbalance' warning because, a priori, it can't exclude that some execution will takes the lock and not releases it (or the opposite). In some case, when do_stuff() is very simple, sparse can see that everything is OK, but generally it won't (some whole kernel analysis but the general case is undecidable anyway). The recommended way would be to write things rather like this: if (some_condition) { take(some_lock); do_stuff(); release(some_lock); } else { do_stuff(); } The __acquires() and __releases() annotations are needed for sparse to know that the annotated function always take or always release some lock but won't handle conditional locks. I hope that this is helpful to you. -- Luc
[Cluster-devel] sparse warnings related to kref_put_lock() and refcount_dec_and_lock()
Hi, I recently converted to use kref_put_lock() in fs/dlm subsystem and now I get the following warning in sparse: warning: context imbalance in 'put_rsb' - unexpected unlock It seems sparse is not able to detect that there is a conditional requirement that the lock passed to kref_put_lock() (or also refcount_dec_and_lock()) is locked or not. I evaluate the return value to check if kref_put_lock() helds the lock and unlock it then. The idea is that the lock needs only to be held when the refcount is going to be zero. It seems other users unlock the lock at the release callback of kref_put_lock() and annotate the callback with "__releases()" which seems to work to avoid the sparse warning. However this works if you don't have additional stack pointers which you need to pass to the release callback. My question would be is this a known problem and a recommended way to avoid this sparse warning (maybe just for now)? Thanks. - Alex
[Cluster-devel] [syzbot] general protection fault in gfs2_evict_inode (2)
Hello, syzbot found the following issue on: HEAD commit:ca1fdab7fd27 Merge tag 'efi-urgent-for-v5.19-1' of git://g.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=102e856008 kernel config: https://syzkaller.appspot.com/x/.config?x=542d3d75f0e6f36f dashboard link: https://syzkaller.appspot.com/bug?extid=8a5fc6416c175cecea34 compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+8a5fc6416c175cece...@syzkaller.appspotmail.com gfs2: fsid=syz:syz.0: first mount done, others may mount general protection fault, probably for non-canonical address 0xdc11: [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0088-0x008f] CPU: 1 PID: 10573 Comm: syz-executor.0 Not tainted 5.19.0-rc3-syzkaller-00038-gca1fdab7fd27 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:evict_linked_inode fs/gfs2/super.c:1329 [inline] RIP: 0010:gfs2_evict_inode+0xbf2/0x2030 fs/gfs2/super.c:1384 Code: 03 80 3c 02 00 0f 85 bd 13 00 00 48 8b 9d 00 09 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d bb 8c 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 e0 RSP: 0018:c90005ae7670 EFLAGS: 00010217 RAX: dc00 RBX: RCX: c90003a83000 RDX: 0011 RSI: 838de301 RDI: 008c RBP: 88802cd6c000 R08: 0005 R09: R10: R11: 0001 R12: 0001 R13: 88802e1a5160 R14: 88802e1a5698 R15: 88802e1a5610 FS: 7f4c042c5700() GS:8880b9b0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fa58679d090 CR3: 7d0e3000 CR4: 00350ee0 Call Trace: evict+0x2ed/0x6b0 fs/inode.c:664 iput_final fs/inode.c:1744 [inline] iput.part.0+0x562/0x820 fs/inode.c:1770 iput+0x58/0x70 fs/inode.c:1760 init_journal fs/gfs2/ops_fstype.c:870 [inline] init_inodes+0x28c/0x2720 fs/gfs2/ops_fstype.c:924 gfs2_fill_super+0x1b49/0x28a0 fs/gfs2/ops_fstype.c:1242 get_tree_bdev+0x440/0x760 fs/super.c:1292 gfs2_get_tree+0x4a/0x270 fs/gfs2/ops_fstype.c:1325 vfs_get_tree+0x89/0x2f0 fs/super.c:1497 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x1320/0x1fa0 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f4c0308a63a Code: 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8 ff ff ff ff eb d2 e8 b8 04 00 00 0f 1f 84 00 00 00 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:7f4c042c4f88 EFLAGS: 0206 ORIG_RAX: 00a5 RAX: ffda RBX: 2200 RCX: 7f4c0308a63a RDX: 2000 RSI: 2100 RDI: 7f4c042c4fe0 RBP: 7f4c042c5020 R08: 7f4c042c5020 R09: 2000 R10: R11: 0206 R12: 2000 R13: 2100 R14: 7f4c042c4fe0 R15: 20047a20 Modules linked in: ---[ end trace ]--- RIP: 0010:evict_linked_inode fs/gfs2/super.c:1329 [inline] RIP: 0010:gfs2_evict_inode+0xbf2/0x2030 fs/gfs2/super.c:1384 Code: 03 80 3c 02 00 0f 85 bd 13 00 00 48 8b 9d 00 09 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d bb 8c 00 00 00 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 e0 RSP: 0018:c90005ae7670 EFLAGS: 00010217 RAX: dc00 RBX: RCX: c90003a83000 RDX: 0011 RSI: 838de301 RDI: 008c RBP: 88802cd6c000 R08: 0005 R09: R10: R11: 0001 R12: 0001 R13: 88802e1a5160 R14: 88802e1a5698 R15: 88802e1a5610 FS: 7f4c042c5700() GS:8880b9b0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fa58679d090 CR3: 7d0e3000 CR4: 00350ee0 Code disassembly (best guess): 0: 03 80 3c 02 00 0f add0xf00023c(%rax),%eax 6: 85 bd 13 00 00 48 test %edi,0x4813(%rbp) c: 8b 9d 00 09 00 00 mov0x900(%rbp),%ebx 12: 48 b8 00 00 00 00 00movabs $0xdc00,%rax 19: fc ff df 1c: 48 8d bb 8c 00 00 00lea0x8c(%rbx),%rdi 23: 48 89 famov%rdi,%rdx 26: 48 c1 ea 03 shr$0x3,%rdx * 2a: 0f b6 14 02 movzbl (%rdx,%rax,1),%edx <-- trapping instruction 2e: 48 89 f8