Re: [4.15-rc9] fs_reclaim lockdep trace
Peter Zijlstra wrote:
> On Mon, Jan 29, 2018 at 08:47:20PM +0900, Tetsuo Handa wrote:
> > Peter Zijlstra wrote:
> > > On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> > > > This warning seems to be caused by commit d92a8cfcb37ecd13
> > > > ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> > > > location of
> > > >
> > > > 	/* this guy won't enter reclaim */
> > > > 	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> > > > 		return false;
> > > >
> > > > check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> > > > (__GFP_NOFS)").
> > >
> > > I'm not entirely sure I get what you mean here. How did I move it? It was
> > > part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
> > > mark the lock as held.
> >
> > d92a8cfcb37ecd13 replaced lockdep_set_current_reclaim_state() with
> > fs_reclaim_acquire(), and removed current->lockdep_recursion handling.
> >
> > --
> > # git show d92a8cfcb37ecd13 | grep recursion
> > -# define INIT_LOCKDEP .lockdep_recursion = 0, .lockdep_reclaim_gfp = 0,
> > +# define INIT_LOCKDEP .lockdep_recursion = 0,
> >  unsigned int lockdep_recursion;
> > - if (unlikely(current->lockdep_recursion))
> > - current->lockdep_recursion = 1;
> > - current->lockdep_recursion = 0;
> > -* context checking code. This tests GFP_FS recursion (a lock taken
> > --
>
> That should not matter at all. The only case that would matter for is if
> lockdep itself would ever call into lockdep again. Not something that
> happens here.
>
> > > The new code has it in fs_reclaim_acquire/release to the same effect, if
> > > __GFP_NOMEMALLOC, we'll not acquire/release the lock.
> >
> > Excuse me, but I can't catch.
> > We currently acquire/release __fs_reclaim_map if __GFP_NOMEMALLOC.
>
> Right, got the case inverted, same difference though. Before we'd do
> mark_held_lock(), now we do acquire/release under the same conditions.
>
> > > > Since __kmalloc_reserve() from __alloc_skb() adds
> > > > __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> > > > failing to return false despite PF_MEMALLOC context (and resulted in
> > > > lockdep warning).
> > >
> > > But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
> > > That's what the name says.
> >
> > __GFP_NOMEMALLOC negates PF_MEMALLOC regarding what watermark that allocation
> > request should use.
>
> Right.
>
> > But at the same time, PF_MEMALLOC negates __GFP_DIRECT_RECLAIM.
>
> Ah indeed.
>
> > Then, how can fs_reclaim contribute to deadlock?
>
> Not sure it can. But if we're going to allow this, it needs to come with
> a clear description on why. Not a few clues to a puzzle.
>

Let's decode Dave's report.

--
stack backtrace:
CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
Call Trace:
 dump_stack+0xbc/0x13f
 __lock_acquire+0xa09/0x2040
 lock_acquire+0x12e/0x350
 fs_reclaim_acquire.part.102+0x29/0x30
 kmem_cache_alloc+0x3d/0x2c0
 alloc_extent_state+0xa7/0x410
 __clear_extent_bit+0x3ea/0x570
 try_release_extent_mapping+0x21a/0x260
 __btrfs_releasepage+0xb0/0x1c0
 btrfs_releasepage+0x161/0x170
 try_to_release_page+0x162/0x1c0
 shrink_page_list+0x1d5a/0x2fb0
 shrink_inactive_list+0x451/0x940
 shrink_node_memcg.constprop.88+0x4c9/0x5e0
 shrink_node+0x12d/0x260
 try_to_free_pages+0x418/0xaf0
 __alloc_pages_slowpath+0x976/0x1790
 __alloc_pages_nodemask+0x52c/0x5c0
 new_slab+0x374/0x3f0
 ___slab_alloc.constprop.81+0x47e/0x5a0
 __slab_alloc.constprop.80+0x32/0x60
 __kmalloc_track_caller+0x267/0x310
 __kmalloc_reserve.isra.40+0x29/0x80
 __alloc_skb+0xee/0x390
 sk_stream_alloc_skb+0xb8/0x340
--

struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp, bool force_schedule) {
  skb = alloc_skb_fclone(size + sk->sk_prot->max_header, gfp) = { // gfp == GFP_KERNEL
    static inline struct sk_buff *alloc_skb_fclone(unsigned int size, gfp_t priority) { // priority == GFP_KERNEL
      return __alloc_skb(size, priority, SKB_ALLOC_FCLONE, NUMA_NO_NODE) = {
        data = kmalloc_reserve(size, gfp_mask, node, &pfmemalloc) = { // gfp_mask == GFP_KERNEL
          obj = kmalloc_node_track_caller(size, flags | __GFP_NOMEMALLOC | __GFP_NOWARN, node) = { // flags == GFP_KERNEL
            __kmalloc_node_track_caller(size, GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN, node) = {
              void *__kmalloc_node_track_caller(size_t size, gfp_t gfpflags, int node, unsigned long caller) { // gfpflags == GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN
                ret = slab_alloc_node(s, gfpflags, node, caller) = { // gfpflags == GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_NOWARN
                  static __always_inline void *slab_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node, unsigned long addr) { // gfpflags == GFP_KERNEL | __GFP_NOMEMALLOC |
Re: [4.15-rc9] fs_reclaim lockdep trace
On Mon, Jan 29, 2018 at 08:47:20PM +0900, Tetsuo Handa wrote:
> Peter Zijlstra wrote:
> > On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> > > This warning seems to be caused by commit d92a8cfcb37ecd13
> > > ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> > > location of
> > >
> > > 	/* this guy won't enter reclaim */
> > > 	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> > > 		return false;
> > >
> > > check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> > > (__GFP_NOFS)").
> >
> > I'm not entirely sure I get what you mean here. How did I move it? It was
> > part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
> > mark the lock as held.
>
> d92a8cfcb37ecd13 replaced lockdep_set_current_reclaim_state() with
> fs_reclaim_acquire(), and removed current->lockdep_recursion handling.
>
> --
> # git show d92a8cfcb37ecd13 | grep recursion
> -# define INIT_LOCKDEP .lockdep_recursion = 0, .lockdep_reclaim_gfp = 0,
> +# define INIT_LOCKDEP .lockdep_recursion = 0,
>  unsigned int lockdep_recursion;
> - if (unlikely(current->lockdep_recursion))
> - current->lockdep_recursion = 1;
> - current->lockdep_recursion = 0;
> -* context checking code. This tests GFP_FS recursion (a lock taken
> --

That should not matter at all. The only case that would matter for is if
lockdep itself would ever call into lockdep again. Not something that
happens here.

> > The new code has it in fs_reclaim_acquire/release to the same effect, if
> > __GFP_NOMEMALLOC, we'll not acquire/release the lock.
>
> Excuse me, but I can't catch.
> We currently acquire/release __fs_reclaim_map if __GFP_NOMEMALLOC.

Right, got the case inverted, same difference though. Before we'd do
mark_held_lock(), now we do acquire/release under the same conditions.

> > > Since __kmalloc_reserve() from __alloc_skb() adds
> > > __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> > > failing to return false despite PF_MEMALLOC context (and resulted in
> > > lockdep warning).
> >
> > But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
> > That's what the name says.
>
> __GFP_NOMEMALLOC negates PF_MEMALLOC regarding what watermark that allocation
> request should use.

Right.

> But at the same time, PF_MEMALLOC negates __GFP_DIRECT_RECLAIM.

Ah indeed.

> Then, how can fs_reclaim contribute to deadlock?

Not sure it can. But if we're going to allow this, it needs to come with
a clear description on why. Not a few clues to a puzzle.

Now, even if it's not strictly a deadlock, there is something to be said
for flagging GFP_FS allocs that lead to nested GFP_FS allocs, do we ever
want to allow that?
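
The two roles of PF_MEMALLOC agreed on in this exchange can be put side by side in a small userspace model. This is only a sketch: the MODEL_* bit values and the model_* function names are made up for illustration and are not the kernel's definitions, and the softirq, interrupt and OOM-reserve cases of __gfp_pfmemalloc_flags() are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this model; NOT the kernel's definitions. */
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_MEMALLOC       0x2u
#define MODEL_GFP_NOMEMALLOC     0x4u
#define MODEL_PF_MEMALLOC        0x1u

/*
 * Role 1: may this allocation ignore the watermarks?
 * Simplified from the process-context path of __gfp_pfmemalloc_flags().
 */
static bool model_no_watermarks(unsigned int task_flags, unsigned int gfp_mask)
{
	if (gfp_mask & MODEL_GFP_NOMEMALLOC)
		return false;	/* __GFP_NOMEMALLOC overrides PF_MEMALLOC here */
	if (gfp_mask & MODEL_GFP_MEMALLOC)
		return true;
	return (task_flags & MODEL_PF_MEMALLOC) != 0;
}

/*
 * Role 2: may this allocation enter direct reclaim?
 * Models the "Avoid recursion of direct reclaim" check in the slowpath.
 */
static bool model_may_direct_reclaim(unsigned int task_flags, unsigned int gfp_mask)
{
	if (!(gfp_mask & MODEL_GFP_DIRECT_RECLAIM))
		return false;	/* caller is not willing to reclaim */
	if (task_flags & MODEL_PF_MEMALLOC)
		return false;	/* PF_MEMALLOC wins, whatever gfp_mask says */
	return true;
}

/*
 * The case from Dave's trace: a task already in reclaim (PF_MEMALLOC)
 * allocating with __GFP_NOMEMALLOC set.
 */
static void model_check_skb_case(void)
{
	unsigned int gfp = MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_NOMEMALLOC;

	assert(!model_no_watermarks(MODEL_PF_MEMALLOC, gfp));
	assert(!model_may_direct_reclaim(MODEL_PF_MEMALLOC, gfp));
}
```

In this model __GFP_NOMEMALLOC only changes the answer to the first question. The second question is decided by PF_MEMALLOC alone, which is the point Tetsuo is making; whether that is enough to justify dropping the annotation is still the open question.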
Re: [4.15-rc9] fs_reclaim lockdep trace
Peter Zijlstra wrote:
> On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> > This warning seems to be caused by commit d92a8cfcb37ecd13
> > ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> > location of
> >
> > 	/* this guy won't enter reclaim */
> > 	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> > 		return false;
> >
> > check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> > (__GFP_NOFS)").
>
> I'm not entirely sure I get what you mean here. How did I move it? It was
> part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
> mark the lock as held.

d92a8cfcb37ecd13 replaced lockdep_set_current_reclaim_state() with
fs_reclaim_acquire(), and removed current->lockdep_recursion handling.

--
# git show d92a8cfcb37ecd13 | grep recursion
-# define INIT_LOCKDEP .lockdep_recursion = 0, .lockdep_reclaim_gfp = 0,
+# define INIT_LOCKDEP .lockdep_recursion = 0,
 unsigned int lockdep_recursion;
- if (unlikely(current->lockdep_recursion))
- current->lockdep_recursion = 1;
- current->lockdep_recursion = 0;
-* context checking code. This tests GFP_FS recursion (a lock taken
--

>
> The new code has it in fs_reclaim_acquire/release to the same effect, if
> __GFP_NOMEMALLOC, we'll not acquire/release the lock.

Excuse me, but I can't catch.
We currently acquire/release __fs_reclaim_map if __GFP_NOMEMALLOC.

--
+static bool __need_fs_reclaim(gfp_t gfp_mask)
+{
(...snipped...)
+	/* this guy won't enter reclaim */
+	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
+		return false;
(...snipped...)
+}
--

>
> > Since __kmalloc_reserve() from __alloc_skb() adds
> > __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> > failing to return false despite PF_MEMALLOC context (and resulted in
> > lockdep warning).
>
> But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
> That's what the name says.

__GFP_NOMEMALLOC negates PF_MEMALLOC regarding what watermark that allocation
request should use.

--
static inline int __gfp_pfmemalloc_flags(gfp_t gfp_mask)
{
	if (unlikely(gfp_mask & __GFP_NOMEMALLOC))
		return 0;
	if (gfp_mask & __GFP_MEMALLOC)
		return ALLOC_NO_WATERMARKS;
	if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
		return ALLOC_NO_WATERMARKS;
	if (!in_interrupt()) {
		if (current->flags & PF_MEMALLOC)
			return ALLOC_NO_WATERMARKS;
		else if (oom_reserves_allowed(current))
			return ALLOC_OOM;
	}

	return 0;
}
--

But at the same time, PF_MEMALLOC negates __GFP_DIRECT_RECLAIM.

--
	/* Attempt with potentially adjusted zonelist and alloc_flags */
	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;

	/* Caller is not willing to reclaim, we can't balance anything */
	if (!can_direct_reclaim)
		goto nopage;

	/* Avoid recursion of direct reclaim */
	if (current->flags & PF_MEMALLOC)
		goto nopage;

	/* Try direct reclaim and then allocating */
	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
							&did_some_progress);
	if (page)
		goto got_pg;

	/* Try direct compaction and then allocating */
	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
					compact_priority, &compact_result);
	if (page)
		goto got_pg;

	/* Do not loop if specifically requested */
	if (gfp_mask & __GFP_NORETRY)
		goto nopage;
--

Then, how can fs_reclaim contribute to deadlock?

>
> > Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
> > __GFP_NOMEMALLOC might make sense. But since this safeguard was added by
> > commit 341ce06f69abfafa ("page allocator: calculate the alloc_flags for
> > allocation only once"), checking __GFP_NOMEMALLOC no longer makes sense.
> > Thus, let's remove __GFP_NOMEMALLOC check and allow __need_fs_reclaim() to
> > return false.
>
> This does not in fact explain what's going on, it just points to
> 'random' patches.
>
> Are you talking about this:
>
> +	/* Avoid recursion of direct reclaim */
> +	if (p->flags & PF_MEMALLOC)
> +		goto nopage;
>
> bit?

Yes.

>
> > Reported-by: Dave Jones
> > Signed-off-by: Tetsuo Handa
> > Cc: Peter Zijlstra
> > Cc: Nick Piggin
> > ---
> >  mm/page_alloc.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index
Re: [4.15-rc9] fs_reclaim lockdep trace
On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> This warning seems to be caused by commit d92a8cfcb37ecd13
> ("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
> location of
>
> 	/* this guy won't enter reclaim */
> 	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> 		return false;
>
> check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
> (__GFP_NOFS)").

I'm not entirely sure I get what you mean here. How did I move it? It was
part of lockdep_trace_alloc(), if __GFP_NOMEMALLOC was set, it would not
mark the lock as held.

The new code has it in fs_reclaim_acquire/release to the same effect, if
__GFP_NOMEMALLOC, we'll not acquire/release the lock.

> Since __kmalloc_reserve() from __alloc_skb() adds
> __GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
> failing to return false despite PF_MEMALLOC context (and resulted in
> lockdep warning).

But that's correct right, __GFP_NOMEMALLOC should negate PF_MEMALLOC.
That's what the name says.

> Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
> __GFP_NOMEMALLOC might make sense. But since this safeguard was added by
> commit 341ce06f69abfafa ("page allocator: calculate the alloc_flags for
> allocation only once"), checking __GFP_NOMEMALLOC no longer makes sense.
> Thus, let's remove __GFP_NOMEMALLOC check and allow __need_fs_reclaim() to
> return false.

This does not in fact explain what's going on, it just points to
'random' patches.

Are you talking about this:

+	/* Avoid recursion of direct reclaim */
+	if (p->flags & PF_MEMALLOC)
+		goto nopage;

bit?

> Reported-by: Dave Jones
> Signed-off-by: Tetsuo Handa
> Cc: Peter Zijlstra
> Cc: Nick Piggin
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 76c9688..7804b0e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3583,7 +3583,7 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
>  		return false;
>
>  	/* this guy won't enter reclaim */
> -	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> +	if (current->flags & PF_MEMALLOC)
>  		return false;

I'm _really_ uncomfortable doing that. Esp. without a solid explanation
of how this really can't possibly lead to trouble. Which the above semi
incoherent rambling is not.

Your backtrace shows the btrfs shrinker doing an allocation, that's the
exact kind of thing we need to be extremely careful with.
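
For readers following along, the nesting in the backtrace (an allocation made from inside reclaim while the outer allocation is still annotated) can be reduced to a toy held-lock model. Everything below is an illustrative sketch: struct model_task, model_acquire() and model_release() are invented names, class names are plain strings, and real lockdep tracks far more than same-class recursion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_HELD 8

/* Toy stand-in for a task's held-lock stack; class names are plain strings. */
struct model_task {
	const char *held[MODEL_MAX_HELD];
	int depth;
};

/*
 * Returns false, without recording the lock, when the task already holds a
 * lock of the same class. That is the situation reported as
 * "possible recursive locking detected".
 */
static bool model_acquire(struct model_task *t, const char *class_name)
{
	for (int i = 0; i < t->depth; i++)
		if (strcmp(t->held[i], class_name) == 0)
			return false;

	assert(t->depth < MODEL_MAX_HELD);	/* toy limit, not a lockdep rule */
	t->held[t->depth++] = class_name;
	return true;
}

/* Drops the most recently acquired lock. */
static void model_release(struct model_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

Replaying the report with this model: the task takes sk_lock-AF_INET6, then fs_reclaim for the skb allocation, and the second fs_reclaim acquire from the allocation under btrfs_releasepage() is the one that gets refused. Whether that second acquire should be annotated at all is exactly what the patch changes.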
Re: [4.15-rc9] fs_reclaim lockdep trace
On Sun, Jan 28, 2018 at 02:55:28PM +0900, Tetsuo Handa wrote:
> Dave, would you try below patch?
>
> >From cae2cbf389ae3cdef1b492622722b4aeb07eb284 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa
> Date: Sun, 28 Jan 2018 14:17:14 +0900
> Subject: [PATCH] lockdep: Fix fs_reclaim warning.

Seems to suppress the warning for me.

Tested-by: Dave Jones
Re: [4.15-rc9] fs_reclaim lockdep trace
Dave, would you try below patch?

>From cae2cbf389ae3cdef1b492622722b4aeb07eb284 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa
Date: Sun, 28 Jan 2018 14:17:14 +0900
Subject: [PATCH] lockdep: Fix fs_reclaim warning.

Dave Jones reported fs_reclaim lockdep warnings.

  WARNING: possible recursive locking detected
  4.15.0-rc9-backup-debug+ #1 Not tainted
  sshd/24800 is trying to acquire lock:
   (fs_reclaim){+.+.}, at: [<84f438c2>] fs_reclaim_acquire.part.102+0x5/0x30

  but task is already holding lock:
   (fs_reclaim){+.+.}, at: [<84f438c2>] fs_reclaim_acquire.part.102+0x5/0x30

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
    lock(fs_reclaim);
    lock(fs_reclaim);

   *** DEADLOCK ***

   May be due to missing lock nesting notation

  2 locks held by sshd/24800:
   #0: (sk_lock-AF_INET6){+.+.}, at: [<1a069652>] tcp_sendmsg+0x19/0x40
   #1: (fs_reclaim){+.+.}, at: [<84f438c2>] fs_reclaim_acquire.part.102+0x5/0x30

  stack backtrace:
  CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
  Call Trace:
   dump_stack+0xbc/0x13f
   __lock_acquire+0xa09/0x2040
   lock_acquire+0x12e/0x350
   fs_reclaim_acquire.part.102+0x29/0x30
   kmem_cache_alloc+0x3d/0x2c0
   alloc_extent_state+0xa7/0x410
   __clear_extent_bit+0x3ea/0x570
   try_release_extent_mapping+0x21a/0x260
   __btrfs_releasepage+0xb0/0x1c0
   btrfs_releasepage+0x161/0x170
   try_to_release_page+0x162/0x1c0
   shrink_page_list+0x1d5a/0x2fb0
   shrink_inactive_list+0x451/0x940
   shrink_node_memcg.constprop.88+0x4c9/0x5e0
   shrink_node+0x12d/0x260
   try_to_free_pages+0x418/0xaf0
   __alloc_pages_slowpath+0x976/0x1790
   __alloc_pages_nodemask+0x52c/0x5c0
   new_slab+0x374/0x3f0
   ___slab_alloc.constprop.81+0x47e/0x5a0
   __slab_alloc.constprop.80+0x32/0x60
   __kmalloc_track_caller+0x267/0x310
   __kmalloc_reserve.isra.40+0x29/0x80
   __alloc_skb+0xee/0x390
   sk_stream_alloc_skb+0xb8/0x340
   tcp_sendmsg_locked+0x8e6/0x1d30
   tcp_sendmsg+0x27/0x40
   inet_sendmsg+0xd0/0x310
   sock_write_iter+0x17a/0x240
   __vfs_write+0x2ab/0x380
   vfs_write+0xfb/0x260
   SyS_write+0xb6/0x140
   do_syscall_64+0x1e5/0xc05
   entry_SYSCALL64_slow_path+0x25/0x25

Since no fs locks are held, doing GFP_KERNEL allocation should be safe
as long as there is PF_MEMALLOC safeguard

	/* Avoid recursion of direct reclaim */
	if (p->flags & PF_MEMALLOC)
		goto nopage;

which prevents infinite recursion.

This warning seems to be caused by commit d92a8cfcb37ecd13
("locking/lockdep: Rework FS_RECLAIM annotation") which moved the
location of

	/* this guy won't enter reclaim */
	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
		return false;

check added by commit cf40bd16fdad42c0 ("lockdep: annotate reclaim context
(__GFP_NOFS)").

Since __kmalloc_reserve() from __alloc_skb() adds
__GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
failing to return false despite PF_MEMALLOC context (and resulted in
lockdep warning).

Since there was no PF_MEMALLOC safeguard as of cf40bd16fdad42c0, checking
__GFP_NOMEMALLOC might make sense. But since this safeguard was added by
commit 341ce06f69abfafa ("page allocator: calculate the alloc_flags for
allocation only once"), checking __GFP_NOMEMALLOC no longer makes sense.
Thus, let's remove __GFP_NOMEMALLOC check and allow __need_fs_reclaim() to
return false.

Reported-by: Dave Jones
Signed-off-by: Tetsuo Handa
Cc: Peter Zijlstra
Cc: Nick Piggin
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 76c9688..7804b0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3583,7 +3583,7 @@ static bool __need_fs_reclaim(gfp_t gfp_mask)
 		return false;
 
 	/* this guy won't enter reclaim */
-	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
+	if (current->flags & PF_MEMALLOC)
 		return false;
 
 	/* We're only interested __GFP_FS allocations for now */
-- 
1.8.3.1
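
To make the effect of the one-line change concrete, here is a minimal userspace sketch of the predicate before and after the patch. It is not the kernel code: the MODEL_* bit values and model_* names are invented for illustration, and the surrounding checks are assumed to be "needs __GFP_DIRECT_RECLAIM" and "needs __GFP_FS", folded into one helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this model; NOT the kernel's definitions. */
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_FS             0x2u
#define MODEL_GFP_NOMEMALLOC     0x4u
#define MODEL_PF_MEMALLOC        0x1u

/* Checks assumed to be shared by both variants (simplified). */
static bool model_wants_fs_reclaim(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_DIRECT_RECLAIM) && (gfp_mask & MODEL_GFP_FS);
}

/* Before the patch: __GFP_NOMEMALLOC cancels the PF_MEMALLOC exemption. */
static bool model_need_fs_reclaim_old(unsigned int task_flags, unsigned int gfp_mask)
{
	if ((task_flags & MODEL_PF_MEMALLOC) && !(gfp_mask & MODEL_GFP_NOMEMALLOC))
		return false;
	return model_wants_fs_reclaim(gfp_mask);
}

/* After the patch: PF_MEMALLOC alone is enough to skip the annotation. */
static bool model_need_fs_reclaim_new(unsigned int task_flags, unsigned int gfp_mask)
{
	if (task_flags & MODEL_PF_MEMALLOC)
		return false;
	return model_wants_fs_reclaim(gfp_mask);
}

/* The only input where the two variants disagree in this model. */
static void model_check_patch_effect(void)
{
	unsigned int gfp = MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_FS |
			   MODEL_GFP_NOMEMALLOC;

	assert(model_need_fs_reclaim_old(MODEL_PF_MEMALLOC, gfp));	/* annotated, so lockdep warns */
	assert(!model_need_fs_reclaim_new(MODEL_PF_MEMALLOC, gfp));	/* skipped, warning gone */
}
```

The variants differ only when PF_MEMALLOC and __GFP_NOMEMALLOC are both set, which is the combination produced when an allocation carrying the __alloc_skb() flags runs inside reclaim.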
Re: [4.15-rc9] fs_reclaim lockdep trace
On 2018/01/28 10:16, Tetsuo Handa wrote:
> Linus Torvalds wrote:
>> On Sat, Jan 27, 2018 at 2:24 PM, Dave Jones wrote:
>>> On Tue, Jan 23, 2018 at 08:36:51PM -0500, Dave Jones wrote:
>>>  > Just triggered this on a server I was rsync'ing to.
>>>
>>> Actually, I can trigger this really easily, even with an rsync from one
>>> disk to another.  Though that also smells a little like networking in
>>> the traces. Maybe netdev has ideas.
>>
>> Is this new to 4.15? Or is it just that you're testing something new?
>>
>> If it's new and easy to repro, can you just bisect it? And if it isn't
>> new, can you perhaps check whether it's new to 4.14 (ie 4.13 being
>> ok)?
>>
>> Because that fs_reclaim_acquire/release() debugging isn't new to 4.15,
>> but it was rewritten for 4.14.. I'm wondering if that remodeling ended
>> up triggering something.
>
> --- linux-4.13.16/mm/page_alloc.c
> +++ linux-4.14.15/mm/page_alloc.c

Oops. This output was inverted.

> @@ -3527,53 +3519,12 @@
>  			return true;
>  	}
>  	return false;
>  }
>  #endif /* CONFIG_COMPACTION */
>  
> -#ifdef CONFIG_LOCKDEP
> -struct lockdep_map __fs_reclaim_map =
> -	STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
> -
> -static bool __need_fs_reclaim(gfp_t gfp_mask)
> -{
> -	gfp_mask = current_gfp_context(gfp_mask);
> -
> -	/* no reclaim without waiting on it */
> -	if (!(gfp_mask & __GFP_DIRECT_RECLAIM))
> -		return false;
> -
> -	/* this guy won't enter reclaim */
> -	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
> -		return false;

Since __kmalloc_reserve() from __alloc_skb() adds
__GFP_NOMEMALLOC | __GFP_NOWARN to gfp_mask, __need_fs_reclaim() is
failing to return false here.

But why check __GFP_NOMEMALLOC here? __alloc_pages_slowpath() skips direct
reclaim if !(gfp_mask & __GFP_DIRECT_RECLAIM) or (current->flags & PF_MEMALLOC),
doesn't it?

--
static inline struct page *
__alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
						struct alloc_context *ac)
{
(...snipped...)
	/* Caller is not willing to reclaim, we can't balance anything */
	if (!can_direct_reclaim)
		goto nopage;

	/* Avoid recursion of direct reclaim */
	if (current->flags & PF_MEMALLOC)
		goto nopage;

	/* Try direct reclaim and then allocating */
	page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags, ac,
							&did_some_progress);
	if (page)
		goto got_pg;
(...snipped...)
}
--
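As a rough illustration of the guard quoted above, the following user-space sketch models an allocation slow path whose reclaim step itself allocates. Everything here is an assumption made for the example: struct sketch_task, the SKETCH_* bits and both functions are hypothetical, and the save/restore of the flag only imitates what memalloc_noreclaim_save() and memalloc_noreclaim_restore() are understood to do around try_to_free_pages().

```c
/*
 * User-space model of the "avoid recursion of direct reclaim" guard.
 * Names and bit values are invented for this sketch; this is not
 * kernel code.
 */
#include <stdbool.h>

#define SKETCH_PF_MEMALLOC        0x1u	/* task flag (placeholder value) */
#define SKETCH_GFP_DIRECT_RECLAIM 0x1u	/* gfp bit (placeholder value) */

struct sketch_task {
	unsigned int flags;	/* models current->flags */
	int reclaim_entries;	/* how often direct reclaim was entered */
};

static bool sketch_alloc_slowpath(struct sketch_task *t, unsigned int gfp_mask);

/* Models direct reclaim: mark the task, then allocate from inside reclaim. */
static void sketch_direct_reclaim(struct sketch_task *t, unsigned int gfp_mask)
{
	/* like memalloc_noreclaim_save(): remember the old bit and set it */
	unsigned int saved = t->flags & SKETCH_PF_MEMALLOC;

	t->flags |= SKETCH_PF_MEMALLOC;
	t->reclaim_entries++;

	/* reclaim paths may allocate, e.g. a filesystem releasing a page */
	(void)sketch_alloc_slowpath(t, gfp_mask);

	/* like memalloc_noreclaim_restore(): put the old bit back */
	t->flags = (t->flags & ~SKETCH_PF_MEMALLOC) | saved;
}

/* Returns true if this call entered direct reclaim, false for "nopage". */
static bool sketch_alloc_slowpath(struct sketch_task *t, unsigned int gfp_mask)
{
	/* Caller is not willing to reclaim */
	if (!(gfp_mask & SKETCH_GFP_DIRECT_RECLAIM))
		return false;

	/* Avoid recursion of direct reclaim */
	if (t->flags & SKETCH_PF_MEMALLOC)
		return false;

	sketch_direct_reclaim(t, gfp_mask);
	return true;
}
```

Starting from a task with no flags set, one call with the reclaim bit enters reclaim exactly once: the nested allocation sees the flag and bails out, and the flag is cleared again afterwards. The guard depends only on the task flag, which is the point of the question about why the lockdep predicate also looks at __GFP_NOMEMALLOC.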
Re: [4.15-rc9] fs_reclaim lockdep trace
Linus Torvalds wrote:
> On Sat, Jan 27, 2018 at 2:24 PM, Dave Jones wrote:
>> On Tue, Jan 23, 2018 at 08:36:51PM -0500, Dave Jones wrote:
>>  > Just triggered this on a server I was rsync'ing to.
>>
>> Actually, I can trigger this really easily, even with an rsync from one
>> disk to another.  Though that also smells a little like networking in
>> the traces. Maybe netdev has ideas.
>
> Is this new to 4.15? Or is it just that you're testing something new?
>
> If it's new and easy to repro, can you just bisect it? And if it isn't
> new, can you perhaps check whether it's new to 4.14 (ie 4.13 being
> ok)?
>
> Because that fs_reclaim_acquire/release() debugging isn't new to 4.15,
> but it was rewritten for 4.14.. I'm wondering if that remodeling ended
> up triggering something.

--- linux-4.13.16/mm/page_alloc.c
+++ linux-4.14.15/mm/page_alloc.c
@@ -3527,53 +3519,12 @@
 			return true;
 	}
 	return false;
 }
 #endif /* CONFIG_COMPACTION */
 
-#ifdef CONFIG_LOCKDEP
-struct lockdep_map __fs_reclaim_map =
-	STATIC_LOCKDEP_MAP_INIT("fs_reclaim", &__fs_reclaim_map);
-
-static bool __need_fs_reclaim(gfp_t gfp_mask)
-{
-	gfp_mask = current_gfp_context(gfp_mask);
-
-	/* no reclaim without waiting on it */
-	if (!(gfp_mask & __GFP_DIRECT_RECLAIM))
-		return false;
-
-	/* this guy won't enter reclaim */
-	if ((current->flags & PF_MEMALLOC) && !(gfp_mask & __GFP_NOMEMALLOC))
-		return false;
-
-	/* We're only interested __GFP_FS allocations for now */
-	if (!(gfp_mask & __GFP_FS))
-		return false;
-
-	if (gfp_mask & __GFP_NOLOCKDEP)
-		return false;
-
-	return true;
-}
-
-void fs_reclaim_acquire(gfp_t gfp_mask)
-{
-	if (__need_fs_reclaim(gfp_mask))
-		lock_map_acquire(&__fs_reclaim_map);
-}
-EXPORT_SYMBOL_GPL(fs_reclaim_acquire);
-
-void fs_reclaim_release(gfp_t gfp_mask)
-{
-	if (__need_fs_reclaim(gfp_mask))
-		lock_map_release(&__fs_reclaim_map);
-}
-EXPORT_SYMBOL_GPL(fs_reclaim_release);
-#endif
-
 /* Perform direct synchronous page reclaim */
 static int
 __perform_reclaim(gfp_t gfp_mask, unsigned int order,
 					const struct alloc_context *ac)
 {
 	struct reclaim_state reclaim_state;
@@ -3582,21 +3533,21 @@
 	cond_resched();
 
 	/* We now go into synchronous reclaim */
 	cpuset_memory_pressure_bump();
 	noreclaim_flag = memalloc_noreclaim_save();
-	fs_reclaim_acquire(gfp_mask);
+	lockdep_set_current_reclaim_state(gfp_mask);
 	reclaim_state.reclaimed_slab = 0;
 	current->reclaim_state = &reclaim_state;
 
 	progress = try_to_free_pages(ac->zonelist, order, gfp_mask,
 								ac->nodemask);
 
 	current->reclaim_state = NULL;
-	fs_reclaim_release(gfp_mask);
+	lockdep_clear_current_reclaim_state();
 	memalloc_noreclaim_restore(noreclaim_flag);
 
 	cond_resched();
 
 	return progress;
 }

>
> Adding PeterZ to the participants list in case he has ideas. I'm not
> seeing what would be the problem in that call chain from hell.
>
>               Linus

Dave Jones wrote:
>
> WARNING: possible recursive locking detected
> 4.15.0-rc9-backup-debug+ #1 Not tainted
>
> sshd/24800 is trying to acquire lock:
>  (fs_reclaim){+.+.}, at: [<84f438c2>]
> fs_reclaim_acquire.part.102+0x5/0x30
>
> but task is already holding lock:
>  (fs_reclaim){+.+.}, at: [<84f438c2>]
> fs_reclaim_acquire.part.102+0x5/0x30
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>
>   lock(fs_reclaim);
>   lock(fs_reclaim);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 2 locks held by sshd/24800:
>  #0: (sk_lock-AF_INET6){+.+.}, at: [<1a069652>] tcp_sendmsg+0x19/0x40
>  #1: (fs_reclaim){+.+.}, at: [<84f438c2>]
> fs_reclaim_acquire.part.102+0x5/0x30
>
> stack backtrace:
> CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
> Call Trace:
>  dump_stack+0xbc/0x13f
>  __lock_acquire+0xa09/0x2040
>  lock_acquire+0x12e/0x350
>  fs_reclaim_acquire.part.102+0x29/0x30
>  kmem_cache_alloc+0x3d/0x2c0
>  alloc_extent_state+0xa7/0x410
>  __clear_extent_bit+0x3ea/0x570
>  try_release_extent_mapping+0x21a/0x260
>  __btrfs_releasepage+0xb0/0x1c0
>  btrfs_releasepage+0x161/0x170
>  try_to_release_page+0x162/0x1c0
>  shrink_page_list+0x1d5a/0x2fb0
>  shrink_inactive_list+0x451/0x940
>  shrink_node_memcg.constprop.88+0x4c9/0x5e0
>  shrink_node+0x12d/0x260
>  try_to_free_pages+0x418/0xaf0
>  __alloc_pages_slowpath+0x976/0x1790
>  __alloc_pages_nodemask+0x52c/0x5c0
>  new_slab+0x374/0x3f0
>  ___slab_alloc.constprop.81+0x47e/0x5a0
>  __slab_alloc.constprop.80+0x32/0x60
>
Re: [4.15-rc9] fs_reclaim lockdep trace
On Sat, Jan 27, 2018 at 2:24 PM, Dave Jones wrote:
> On Tue, Jan 23, 2018 at 08:36:51PM -0500, Dave Jones wrote:
>  > Just triggered this on a server I was rsync'ing to.
>
> Actually, I can trigger this really easily, even with an rsync from one
> disk to another.  Though that also smells a little like networking in
> the traces. Maybe netdev has ideas.

Is this new to 4.15? Or is it just that you're testing something new?

If it's new and easy to repro, can you just bisect it? And if it isn't
new, can you perhaps check whether it's new to 4.14 (ie 4.13 being
ok)?

Because that fs_reclaim_acquire/release() debugging isn't new to 4.15,
but it was rewritten for 4.14.. I'm wondering if that remodeling ended
up triggering something.

Adding PeterZ to the participants list in case he has ideas. I'm not
seeing what would be the problem in that call chain from hell.

              Linus
Re: [4.15-rc9] fs_reclaim lockdep trace
On Tue, Jan 23, 2018 at 08:36:51PM -0500, Dave Jones wrote:
 > Just triggered this on a server I was rsync'ing to.

Actually, I can trigger this really easily, even with an rsync from one
disk to another.  Though that also smells a little like networking in
the traces. Maybe netdev has ideas.

The first instance:

 >
 > WARNING: possible recursive locking detected
 > 4.15.0-rc9-backup-debug+ #1 Not tainted
 >
 > sshd/24800 is trying to acquire lock:
 >  (fs_reclaim){+.+.}, at: [<84f438c2>]
 > fs_reclaim_acquire.part.102+0x5/0x30
 >
 > but task is already holding lock:
 >  (fs_reclaim){+.+.}, at: [<84f438c2>]
 > fs_reclaim_acquire.part.102+0x5/0x30
 >
 > other info that might help us debug this:
 >  Possible unsafe locking scenario:
 >
 >        CPU0
 >
 >   lock(fs_reclaim);
 >   lock(fs_reclaim);
 >
 >  *** DEADLOCK ***
 >
 >  May be due to missing lock nesting notation
 >
 > 2 locks held by sshd/24800:
 >  #0: (sk_lock-AF_INET6){+.+.}, at: [<1a069652>]
 > tcp_sendmsg+0x19/0x40
 >  #1: (fs_reclaim){+.+.}, at: [<84f438c2>]
 > fs_reclaim_acquire.part.102+0x5/0x30
 >
 > stack backtrace:
 > CPU: 3 PID: 24800 Comm: sshd Not tainted 4.15.0-rc9-backup-debug+ #1
 > Call Trace:
 >  dump_stack+0xbc/0x13f
 >  ? _atomic_dec_and_lock+0x101/0x101
 >  ? fs_reclaim_acquire.part.102+0x5/0x30
 >  ? print_lock+0x54/0x68
 >  __lock_acquire+0xa09/0x2040
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  ? mutex_destroy+0x120/0x120
 >  ? hlock_class+0xa0/0xa0
 >  ? kernel_text_address+0x5c/0x90
 >  ? __kernel_text_address+0xe/0x30
 >  ? unwind_get_return_address+0x2f/0x50
 >  ? __save_stack_trace+0x92/0x100
 >  ? graph_lock+0x8d/0x100
 >  ? check_noncircular+0x20/0x20
 >  ? __lock_acquire+0x616/0x2040
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  ? __lock_acquire+0x616/0x2040
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  ? print_irqtrace_events+0x110/0x110
 >  ? active_load_balance_cpu_stop+0x7b0/0x7b0
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  ? mark_lock+0x1b1/0xa00
 >  ? lock_acquire+0x12e/0x350
 >  lock_acquire+0x12e/0x350
 >  ? fs_reclaim_acquire.part.102+0x5/0x30
 >  ? lockdep_rcu_suspicious+0x100/0x100
 >  ? set_next_entity+0x20e/0x10d0
 >  ? mark_lock+0x1b1/0xa00
 >  ? match_held_lock+0x8d/0x440
 >  ? mark_lock+0x1b1/0xa00
 >  ? save_trace+0x1e0/0x1e0
 >  ? print_irqtrace_events+0x110/0x110
 >  ? alloc_extent_state+0xa7/0x410
 >  fs_reclaim_acquire.part.102+0x29/0x30
 >  ? fs_reclaim_acquire.part.102+0x5/0x30
 >  kmem_cache_alloc+0x3d/0x2c0
 >  ? rb_erase+0xe63/0x1240
 >  alloc_extent_state+0xa7/0x410
 >  ? lock_extent_buffer_for_io+0x3f0/0x3f0
 >  ? find_held_lock+0x6d/0xd0
 >  ? test_range_bit+0x197/0x210
 >  ? lock_acquire+0x350/0x350
 >  ? do_raw_spin_unlock+0x147/0x220
 >  ? do_raw_spin_trylock+0x100/0x100
 >  ? iotree_fs_info+0x30/0x30
 >  __clear_extent_bit+0x3ea/0x570
 >  ? clear_state_bit+0x270/0x270
 >  ? count_range_bits+0x2f0/0x2f0
 >  ? lock_acquire+0x350/0x350
 >  ? rb_prev+0x21/0x90
 >  try_release_extent_mapping+0x21a/0x260
 >  __btrfs_releasepage+0xb0/0x1c0
 >  ? btrfs_submit_direct+0xca0/0xca0
 >  ? check_new_page_bad+0x1f0/0x1f0
 >  ? match_held_lock+0xa5/0x440
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  btrfs_releasepage+0x161/0x170
 >  ? __btrfs_releasepage+0x1c0/0x1c0
 >  ? page_rmapping+0xd0/0xd0
 >  ? rmap_walk+0x100/0x100
 >  try_to_release_page+0x162/0x1c0
 >  ? generic_file_write_iter+0x3c0/0x3c0
 >  ? page_evictable+0xcc/0x110
 >  ? lookup_address_in_pgd+0x107/0x190
 >  shrink_page_list+0x1d5a/0x2fb0
 >  ? putback_lru_page+0x3f0/0x3f0
 >  ? save_trace+0x1e0/0x1e0
 >  ? _lookup_address_cpa.isra.13+0x40/0x60
 >  ? debug_show_all_locks+0x2f0/0x2f0
 >  ? kmem_cache_free+0x8c/0x280
 >  ? free_extent_state+0x1c8/0x3b0
 >  ? mark_lock+0x1b1/0xa00
 >  ? page_rmapping+0xd0/0xd0
 >  ? print_irqtrace_events+0x110/0x110
 >  ? shrink_node_memcg.constprop.88+0x4c9/0x5e0
 >  ? shrink_node+0x12d/0x260
 >  ? try_to_free_pages+0x418/0xaf0
 >  ? __alloc_pages_slowpath+0x976/0x1790
 >  ? __alloc_pages_nodemask+0x52c/0x5c0
 >  ? delete_node+0x28d/0x5c0
 >  ? find_held_lock+0x6d/0xd0
 >  ? free_pcppages_bulk+0x381/0x570
 >  ? lock_acquire+0x350/0x350
 >  ? do_raw_spin_unlock+0x147/0x220
 >  ? do_raw_spin_trylock+0x100/0x100
 >  ? __lock_is_held+0x51/0xc0
 >  ? _raw_spin_unlock+0x24/0x30
 >  ? free_pcppages_bulk+0x381/0x570
 >  ? mark_lock+0x1b1/0xa00
 >  ? free_compound_page+0x30/0x30
 >  ? print_irqtrace_events+0x110/0x110
 >  ? __kernel_map_pages+0x2c9/0x310
 >  ? mark_lock+0x1b1/0xa00
 >  ? print_irqtrace_events+0x110/0x110
 >  ? __delete_from_page_cache+0x2e7/0x4e0
 >  ? save_trace+0x1e0/0x1e0
 >  ? __add_to_page_cache_locked+0x680/0x680
 >  ? find_held_lock+0x6d/0xd0
 >  ? __list_add_valid+0x29/0xa0
 >  ? free_unref_page_commit+0x198/0x270
 >  ? drain_local_pages_wq+0x20/0x20
 >  ? stop_critical_timings+0x210/0x210
 >  ? mark_lock+0x1b1/0xa00
 >  ? mark_lock+0x1b1/0xa00
 >  ?