Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, Jan 12, 2021 at 2:44 AM Jan Kara wrote: > > On Fri 08-01-21 18:04:21, Linus Torvalds wrote: > > > > Oh, and Michael Larabel (of phoronix) reports that that one-liner does > > something bad to a few PostgreSQL tests, on the order of 5-10% > > regression on some machines (but apparently not others). > > Do you have more details? From my experience (we do regular pgbench runs > for various kernels in various configs in SUSE) PostgreSQL numbers tend to > be somewhat noisy and more dependent on CPU scheduling and NUMA locality > than anything else. But it very much depends on the exact config passed to > pgbench so that's why I'm asking... No, I don't really have many more details. I don't have things like raw numbers or exact configurations, but Michael has been very responsive if you ask, so if you are interested, just email him at Michael Larabel . It wasn't NUMA - apparently the machines he saw this on were just plain consumer setups, and his larger machines didn't actually show the effect. But yes, I suspect it was some scheduling artifact, and probably just fairly random noise from just changing the scheduling pattern a bit. Linus
Re: kernel BUG at mm/page-writeback.c:LINE!
On Fri 08-01-21 18:04:21, Linus Torvalds wrote: > On Tue, Jan 5, 2021 at 11:53 AM Linus Torvalds > wrote: > > > > I took your "way to go" statement as an ack, and made it all be commit > > c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple > > pending writebacks"). > > Oh, and Michael Larabel (of phoronix) reports that that one-liner does > something bad to a few PostgreSQL tests, on the order of 5-10% > regression on some machines (but apparently not others). Do you have more details? From my experience (we do regular pgbench runs for various kernels in various configs in SUSE) PostgreSQL numbers tend to be somewhat noisy and more dependent on CPU scheduling and NUMA locality than anything else. But it very much depends on the exact config passed to pgbench so that's why I'm asking... > I suspect that's a sign of instability in the benchmark numbers, but > it probably also means that we have some silly condition where > multiple threads want to clean the same page. > > I sent him a patch to try if it ends up being better to just not wake > things up early at all (instead of the "if" -> "while") conversion. > That trivial patch appended here in case anybody has comments. > > Just the fact that that one-liner made a performance impact makes me > go "hmm", though. Michael didn't see the BUG_ON(), so it's presumably > some _other_ user of wait_on_page_writeback() than the > write_cache_pages() one that causes issues. > > Anybody got any suspicions? Honestly, when working on the page wait > queues, I was working under the assumption that it's really just the > page lock that truly matters. > > I'm thinking things like __filemap_fdatawait_range(), which doesn't > hold the page lock at all, so it's all kinds of non-serialized, and > could now be waiting for any number of IO's ro complete.. > > Oh well. This email doesn't really have a point, it's more of a > heads-up that that "wait to see one or multiple writebacks" thing > seems to matter more than I would have expected for some loads.. Honestly I'm surprised your patch made a difference as well. It is pretty common a page gets redirtied while it's being written back but usually it takes time before next writeback of the page is started. But I guess with the DB load it is possible e.g. if we frequently flush out some page for data consistency reasons (I know PostgreSQL is using sync_file_range(2) interface to start flushing pages early and then uses fsync(2) when it really needs the pages written which could create a situation with unfair treatment of PageWriteback bit). Honza -- Jan Kara SUSE Labs, CR
Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, Jan 5, 2021 at 11:53 AM Linus Torvalds wrote: > > I took your "way to go" statement as an ack, and made it all be commit > c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple > pending writebacks"). Oh, and Michael Larabel (of phoronix) reports that that one-liner does something bad to a few PostgreSQL tests, on the order of 5-10% regression on some machines (but apparently not others). I suspect that's a sign of instability in the benchmark numbers, but it probably also means that we have some silly condition where multiple threads want to clean the same page. I sent him a patch to try if it ends up being better to just not wake things up early at all (instead of the "if" -> "while") conversion. That trivial patch appended here in case anybody has comments. Just the fact that that one-liner made a performance impact makes me go "hmm", though. Michael didn't see the BUG_ON(), so it's presumably some _other_ user of wait_on_page_writeback() than the write_cache_pages() one that causes issues. Anybody got any suspicions? Honestly, when working on the page wait queues, I was working under the assumption that it's really just the page lock that truly matters. I'm thinking things like __filemap_fdatawait_range(), which doesn't hold the page lock at all, so it's all kinds of non-serialized, and could now be waiting for any number of IO's ro complete.. Oh well. This email doesn't really have a point, it's more of a heads-up that that "wait to see one or multiple writebacks" thing seems to matter more than I would have expected for some loads.. Linus patch Description: Binary data
Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, Jan 05, 2021 at 01:22:49PM -0800, Linus Torvalds wrote: > On Tue, Jan 5, 2021 at 1:13 PM Hugh Dickins wrote: > > > > I was going to raise a question, whether you should now revert > > 073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)"): > > which would not have gone in like that if c2407cf7d22d were already in. > > Honestly, even if it wasn't for that PageTail issue, I think > 073861ed77b6 is just the right thing to do anyway. It just feels _so_ > much safer to not have the possibility of that page wait thing > following while the page is possibly then being free'd and re-used at > the same time. > > So I think the only reason to revert that commit would be if we were > to find that it's a huge performance problem to raise the page > refcount temporarily. Which I think is very unlikely (since we already > dirty the page structure due to the page flags modification - although > they are far enough apart that it might be a different cacheline). struct pages _tend_ to be 64 bytes on 64-bit platforms (and i suspect you're long past caring about performance on 32-bit platforms).
Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, Jan 5, 2021 at 1:13 PM Hugh Dickins wrote: > > I was going to raise a question, whether you should now revert > 073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)"): > which would not have gone in like that if c2407cf7d22d were already in. Honestly, even if it wasn't for that PageTail issue, I think 073861ed77b6 is just the right thing to do anyway. It just feels _so_ much safer to not have the possibility of that page wait thing following while the page is possibly then being free'd and re-used at the same time. So I think the only reason to revert that commit would be if we were to find that it's a huge performance problem to raise the page refcount temporarily. Which I think is very unlikely (since we already dirty the page structure due to the page flags modification - although they are far enough apart that it might be a different cacheline). Linus
Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, 5 Jan 2021, Linus Torvalds wrote: > On Tue, Jan 5, 2021 at 11:31 AM Linus Torvalds > wrote: > > On Mon, Jan 4, 2021 at 7:29 PM Hugh Dickins wrote: > > > > > > > So the one-liner of changing the "if" to "while" in > > > > wait_on_page_writeback() should get us back to what we used to do. > > > > > > I think that is the realistic way to go. > > > > Yeah, that's what I'll do. > > I took your "way to go" statement as an ack, and made it all be commit > c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple > pending writebacks"). Great, thanks, I see it now. I was going to raise a question, whether you should now revert 073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)"): which would not have gone in like that if c2407cf7d22d were already in. But if it were reverted, we'd need some other fix for the PageTail part of it; and I still cannot think of anywhere else where we knowingly operated on a struct page without holding a reference; and there have been no adverse reports on its extra get_page+put_page. So I think it's safest to leave it in. Hugh
Re: kernel BUG at mm/page-writeback.c:LINE!
On Tue, Jan 5, 2021 at 11:31 AM Linus Torvalds wrote: > > On Mon, Jan 4, 2021 at 7:29 PM Hugh Dickins wrote: > > > > > So the one-liner of changing the "if" to "while" in > > > wait_on_page_writeback() should get us back to what we used to do. > > > > I think that is the realistic way to go. > > Yeah, that's what I'll do. I took your "way to go" statement as an ack, and made it all be commit c2407cf7d22d ("mm: make wait_on_page_writeback() wait for multiple pending writebacks"). Linus
Re: kernel BUG at mm/page-writeback.c:LINE!
On Mon, Jan 4, 2021 at 7:29 PM Hugh Dickins wrote: > > > But I feel it's really that end_page_writeback() itself is > > fundamentally buggy, because the "wakeup" is not atomic with the bit > > clearing _and_ it doesn't actually hold the page lock that is > > allegedly serializing this all. > > And we won't be adding a lock_page() into end_page_writeback()! Right. However, the more I think about this, the more I feel end_page_writeback() is kind of ok, and the real problem is that we should have had a "lock/unlock" pattern for the PG_writeback bit instead. Instead, what we have is basically a special "wait for bit to clear", and then external sychronization - even if lacking - for set/clear bit. And the "wait for bit to clear" and "set bit" aren't even adjacent - the waiting happens in write_cache_pages(), but then the actual setting happens later in the actual "(*writepage)()" function. What _would_ be nicer, I feel, is if write_cache_pages() simply did the equivalent of "lock_page()" except for setting the PG_writeback bit. The whole "wait for bit to clear" is fundamentally a much more ambiguous thing for that whole "what if somebody else got it" reason, but it's also nasty because it can be very unfair (the same way our "lock_page()" itself used to be very unfair). IOW, you can have that situation where one actor continually waits for the bit to clear, but then somebody else comes in and gets the bit instead. Of course, writeback is much less likely than lock_page(), so it probably doesn't happen much in practice. And honestly, while I think it would be good to make the rule be "writepage function was entered with the writeback bit already held", that kind of change would be a major pain. We have at leastr 49 different 'writepage' implementations, and while I think a few of them end up just being block_write_full_page(), it means that we have a lot of that BUG_ON(PageWriteback(page)); set_page_writeback(page); pattern in various random filesystem code that all would have to be changed. So I think I know what I'd _like_ the code to be like, but I don't think it's worth it. > > So the one-liner of changing the "if" to "while" in > > wait_on_page_writeback() should get us back to what we used to do. > > I think that is the realistic way to go. Yeah, that's what I'll do. > > Which makes me think that the best option would actually be to move > > the bit clearing _into_ wake_up_page(). But that looks like a very big > > change. See above - having thought about this overnight, I don't think this is even the best change. > But I expect you to take one look and quickly opt instead for the "while" > in wait_on_page_writeback(): which is not so bad now that you've shown a > scenario which justifies it. [ patch removed - I agree, this pain isn't worth it ] Linus
Re: kernel BUG at mm/page-writeback.c:LINE!
On Mon, 4 Jan 2021, Linus Torvalds wrote: > On Mon, Jan 4, 2021 at 12:41 PM Andrew Morton > wrote: > > > Linus, how confident are you in those wait_on_page_bit_common() > > changes? > > Pretty confident. The atomicity of the bitops themselves is fairly simple. > > But in the writeback bit? No. The old code would basically _loop_ if > it was woken up and the writeback bit was set again, and would hide > any problems with it. > > The new code basically goes "ok, the writeback bit was clear at one > point, so I've waited enough". > > We could easily turn the "if ()" in wait_on_page_writeback() into a "while()". > > But honestly, it does smell to me like the bug is always in the caller > not having serialized with whatever actually starts writeback. High > figured out one such case. > > This code holds the page lock, but I don't see where > set_page_writeback() would always be done with the page lock held. So > what really protects against PG_writeback simply being set again? > > The whole BUG_ON() seems entirely buggy to me. > > In fact, even if you hold the page lock while doing > set_page_writeback(), since the actual IO does *NOT* hold the page > lock, the unlock happens without it. So even if every single case of > setting the page writeback were to hold the page lock, I did an audit when this came up before, and though not 100% confident in my diligence, it certainly looked that way; and others looked too (IIRC Matthew had a patch to add a WARN_ON_ONCE or whatever, but that didn't go upstream). > what keeps this from happening: > > CPU1 = end previous writeback > CPU2 = start new writeback under page lock > CPU3 = write_cache_pages() > > CPU1 CPU2CPU3 > > > end_page_writeback() > test_clear_page_writeback(page) > ... delayed... > > > lock_page(); > set_page_writeback() > unlock_page() > > > lock_page() > wait_on_page_writeback(); > > wake_up_page(page, PG_writeback); > .. wakes up CPU3 .. > > BUG_ON(PageWriteback(page)); > > IOW, that BUG_ON() really feels entirely bogus to me. Notice how it > wasn't actually serialized with the waking up of the _previous_ bit. Well. That looks so obvious now you suggest it, that I feel very stupid for not seeing it before, so have tried hard to disprove you. But I think you're right. > > Could we make the wait_on_page_writeback() just loop if it sees the > page under writeback again? Sure. > > Could we make the wait_on_page_bit_common() code say "if this is > PG_writeback, I won't wake it up after all, because the bit is set > again?" Sure. > > But I feel it's really that end_page_writeback() itself is > fundamentally buggy, because the "wakeup" is not atomic with the bit > clearing _and_ it doesn't actually hold the page lock that is > allegedly serializing this all. And we won't be adding a lock_page() into end_page_writeback()! > > That raciness was what caused the "stale wakeup from previous owner" > thing too. And I think that Hugh fixed the page re-use case, but the > fundamental problem of end_page_writeback() kind of remained. > > And yes, I think this was all hidden by wait_on_page_writeback() > effectively looping over the "PageWriteback(page)" test because of how > wait_on_page_bit() worked. > > So the one-liner of changing the "if" to "while" in > wait_on_page_writeback() should get us back to what we used to do. I think that is the realistic way to go. > > Except I still get the feeling that the bug really is not in > wait_on_page_writeback(), but in the end_page_writeback() side. > > Comments? I'm perfectly happy doing the one-liner. I would just be > _happier_ with end_page_writeback() having the serialization.. > > The real problem is that "wake_up_page(page, bit)" is not the thing > that actually clears the bit. So there's a fundamental race between > clearing the bit and waking something up. > > Which makes me think that the best option would actually be to move > the bit clearing _into_ wake_up_page(). But that looks like a very big > change. I'll be surprised if that direction is even possible, without unpleasant extra locking. If there were only one wakeup to be done, perhaps, but potentially there are many. When I looked before, it seemed that the clear bit needs to come before the wakeup, and the wakeup needs to come before the clear bit. And the BOOKMARK case drops q->lock. It should be possible to rely on the XArray's i_pages lock rather than the page lock for serialization, much as I did in one variant of the patch I sent originally. Updated version appended below for show (most of it rearrangement+cleanup rather than the functional change); but I think it's slightly incomplete (__test_set_page_writeback() should take i_pages lock even in the !mapping_use_writeback_tags case); and
Re: kernel BUG at mm/page-writeback.c:LINE!
On Mon, Jan 4, 2021 at 12:41 PM Andrew Morton wrote: > > > > > kernel BUG at mm/page-writeback.c:2241! > > Call Trace: > > mpage_writepages+0xd8/0x230 fs/mpage.c:714 > > do_writepages+0xec/0x290 mm/page-writeback.c:2352 > > __filemap_fdatawrite_range+0x2a1/0x380 mm/filemap.c:422 > > fat_cont_expand+0x169/0x230 fs/fat/file.c:235 > > fat_setattr+0xac2/0xf40 fs/fat/file.c:501 > > notify_change+0xb60/0x10a0 fs/attr.c:336 > > do_truncate+0x134/0x1f0 fs/open.c:64 > > do_sys_ftruncate+0x703/0x860 fs/open.c:195 > > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > Well that's exciting. write_cache_pages() does: > > if (PageWriteback(page)) { > if (wbc->sync_mode != WB_SYNC_NONE) > wait_on_page_writeback(page); > else > goto continue_unlock; > } > > bang -->> BUG_ON(PageWriteback(page)); This is the same situation that was discussed earlier, and that we thought Hugh commit 073861ed77b6 ("mm: fix VM_BUG_ON(PageTail) and BUG_ON(PageWriteback)") should fix. Basically, the theory was that the sequence if (PageWriteback(page)) wait_on_page_writeback(page); BUG_ON(PageWriteback(page)); could have the wake-up of the _previous_ owner of the page happen, and wake up the wait_on_page_writeback() waiter, but then a new owner of the page would re-allocate it and mark it for writeback. > So either wait_on_page_writeback() simply failed to work (returned > without waiting) or someone came in and unexpectedly set PG_writeback a > second time. So Hugh found at least _one_ way that that "someone came in and unexpectedly set PG_writeback a second time" could happen. But that fix by Hugh is already in that HEAD commit that syzbot is now reporting, so there's something else going on. > Linus, how confident are you in those wait_on_page_bit_common() > changes? Pretty confident. The atomicity of the bitops themselves is fairly simple. But in the writeback bit? No. The old code would basically _loop_ if it was woken up and the writeback bit was set again, and would hide any problems with it. The new code basically goes "ok, the writeback bit was clear at one point, so I've waited enough". We could easily turn the "if ()" in wait_on_page_writeback() into a "while()". But honestly, it does smell to me like the bug is always in the caller not having serialized with whatever actually starts writeback. High figured out one such case. This code holds the page lock, but I don't see where set_page_writeback() would always be done with the page lock held. So what really protects against PG_writeback simply being set again? The whole BUG_ON() seems entirely buggy to me. In fact, even if you hold the page lock while doing set_page_writeback(), since the actual IO does *NOT* hold the page lock, the unlock happens without it. So even if every single case of setting the page writeback were to hold the page lock, what keeps this from happening: CPU1 = end previous writeback CPU2 = start new writeback under page lock CPU3 = write_cache_pages() CPU1 CPU2CPU3 end_page_writeback() test_clear_page_writeback(page) ... delayed... lock_page(); set_page_writeback() unlock_page() lock_page() wait_on_page_writeback(); wake_up_page(page, PG_writeback); .. wakes up CPU3 .. BUG_ON(PageWriteback(page)); IOW, that BUG_ON() really feels entirely bogus to me. Notice how it wasn't actually serialized with the waking up of the _previous_ bit. Could we make the wait_on_page_writeback() just loop if it sees the page under writeback again? Sure. Could we make the wait_on_page_bit_common() code say "if this is PG_writeback, I won't wake it up after all, because the bit is set again?" Sure. But I feel it's really that end_page_writeback() itself is fundamentally buggy, because the "wakeup" is not atomic with the bit clearing _and_ it doesn't actually hold the page lock that is allegedly serializing this all. That raciness was what caused the "stale wakeup from previous owner" thing too. And I think that Hugh fixed the page re-use case, but the fundamental problem of end_page_writeback() kind of remained. And yes, I think this was all hidden by wait_on_page_writeback() effectively looping over the "PageWriteback(page)" test because of how wait_on_page_bit() worked. So the one-liner of changing the "if" to "while" in wait_on_page_writeback() should get us back to what we used to do. Except I still get the feeling that the bug really is not in wait_on_page_writeback(), but in the end_page_writeback() side.
Re: kernel BUG at mm/page-writeback.c:LINE!
On Sun, 03 Jan 2021 06:19:15 -0800 syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit:139711f0 Merge branch 'akpm' (patches from Andrew) > git tree: upstream > console output: https://syzkaller.appspot.com/x/log.txt?x=11648e9350 > kernel config: https://syzkaller.appspot.com/x/.config?x=f3e74a3fb99ae2c2 > dashboard link: https://syzkaller.appspot.com/bug?extid=2fc0712f8f8b8b8fa0ef > compiler: gcc (GCC) 10.1.0-syz 20200507 > > Unfortunately, I don't have any reproducer for this issue yet. > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+2fc0712f8f8b8b8fa...@syzkaller.appspotmail.com > > kernel BUG at mm/page-writeback.c:2241! > invalid opcode: [#1] PREEMPT SMP KASAN > CPU: 1 PID: 26311 Comm: syz-executor.2 Not tainted 5.11.0-rc1-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:write_cache_pages+0xd06/0x1190 mm/page-writeback.c:2241 > Code: 02 00 00 e8 dc 6e da ff 48 c7 c6 80 64 54 89 4c 89 ef e8 0d 04 0b 00 0f > 0b 49 c7 c4 c0 77 f7 8e e9 ce f9 ff ff e8 ba 6e da ff <0f> 0b e8 b3 6e da ff > 49 8d 5c 24 ff e9 c5 f8 ff ff e8 a4 6e da ff > RSP: 0018:c9000931f850 EFLAGS: 00010212 > RAX: ac05 RBX: 8000 RCX: c9000d7c1000 > RDX: 0004 RSI: 819805d6 RDI: 0003 > RBP: R08: R09: eac0ce07 > R10: 8197fee3 R11: R12: ea0002762cc8 > R13: eac0ce00 R14: dc00 R15: > FS: 7f653a526700() GS:8880b9f0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 0052e94c CR3: 15251000 CR4: 00350ee0 > Call Trace: > mpage_writepages+0xd8/0x230 fs/mpage.c:714 > do_writepages+0xec/0x290 mm/page-writeback.c:2352 > __filemap_fdatawrite_range+0x2a1/0x380 mm/filemap.c:422 > fat_cont_expand+0x169/0x230 fs/fat/file.c:235 > fat_setattr+0xac2/0xf40 fs/fat/file.c:501 > notify_change+0xb60/0x10a0 fs/attr.c:336 > do_truncate+0x134/0x1f0 fs/open.c:64 > do_sys_ftruncate+0x703/0x860 fs/open.c:195 > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 > entry_SYSCALL_64_after_hwframe+0x44/0xa9 > RIP: 0033:0x45e299 > Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 > 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f > 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 > RSP: 002b:7f653a525c68 EFLAGS: 0246 ORIG_RAX: 004d > RAX: ffda RBX: 0002 RCX: 0045e299 > RDX: RSI: 0800 RDI: 0003 > RBP: 0119c060 R08: R09: > R10: R11: 0246 R12: 0119c034 > R13: 7fff76bf2e8f R14: 7f653a5269c0 R15: 0119c034 > Modules linked in: > ---[ end trace 23c7a881902c6de9 ]--- > RIP: 0010:write_cache_pages+0xd06/0x1190 mm/page-writeback.c:2241 > Code: 02 00 00 e8 dc 6e da ff 48 c7 c6 80 64 54 89 4c 89 ef e8 0d 04 0b 00 0f > 0b 49 c7 c4 c0 77 f7 8e e9 ce f9 ff ff e8 ba 6e da ff <0f> 0b e8 b3 6e da ff > 49 8d 5c 24 ff e9 c5 f8 ff ff e8 a4 6e da ff > RSP: 0018:c9000931f850 EFLAGS: 00010212 > RAX: ac05 RBX: 8000 RCX: c9000d7c1000 > RDX: 0004 RSI: 819805d6 RDI: 0003 > RBP: R08: R09: eac0ce07 > R10: 8197fee3 R11: R12: ea0002762cc8 > R13: eac0ce00 R14: dc00 R15: > FS: 7f653a526700() GS:8880b9f0() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 001b2f228000 CR3: 15251000 CR4: 00350ee0 Well that's exciting. write_cache_pages() does: if (PageWriteback(page)) { if (wbc->sync_mode != WB_SYNC_NONE) wait_on_page_writeback(page); else goto continue_unlock; } bang -->> BUG_ON(PageWriteback(page)); and filemap_fdatawrite_range() passed WB_SYNC_ALL to __filemap_fdatawrite_range(). So either wait_on_page_writeback() simply failed to work (returned without waiting) or someone came in and unexpectedly set PG_writeback a second time. And because it isn't reproducible, we don't know if it's a -mm patch or it it's a mainline bug or whatever. There isn't much material in -mm at present and I can't see any which would affect these things, so is it reasonable to suspect that this is a mainline bug? Linus, how confident are you in those wait_on_page_bit_common() changes?
kernel BUG at mm/page-writeback.c:LINE!
Hello, syzbot found the following issue on: HEAD commit:139711f0 Merge branch 'akpm' (patches from Andrew) git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=11648e9350 kernel config: https://syzkaller.appspot.com/x/.config?x=f3e74a3fb99ae2c2 dashboard link: https://syzkaller.appspot.com/bug?extid=2fc0712f8f8b8b8fa0ef compiler: gcc (GCC) 10.1.0-syz 20200507 Unfortunately, I don't have any reproducer for this issue yet. IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+2fc0712f8f8b8b8fa...@syzkaller.appspotmail.com kernel BUG at mm/page-writeback.c:2241! invalid opcode: [#1] PREEMPT SMP KASAN CPU: 1 PID: 26311 Comm: syz-executor.2 Not tainted 5.11.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:write_cache_pages+0xd06/0x1190 mm/page-writeback.c:2241 Code: 02 00 00 e8 dc 6e da ff 48 c7 c6 80 64 54 89 4c 89 ef e8 0d 04 0b 00 0f 0b 49 c7 c4 c0 77 f7 8e e9 ce f9 ff ff e8 ba 6e da ff <0f> 0b e8 b3 6e da ff 49 8d 5c 24 ff e9 c5 f8 ff ff e8 a4 6e da ff RSP: 0018:c9000931f850 EFLAGS: 00010212 RAX: ac05 RBX: 8000 RCX: c9000d7c1000 RDX: 0004 RSI: 819805d6 RDI: 0003 RBP: R08: R09: eac0ce07 R10: 8197fee3 R11: R12: ea0002762cc8 R13: eac0ce00 R14: dc00 R15: FS: 7f653a526700() GS:8880b9f0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0052e94c CR3: 15251000 CR4: 00350ee0 Call Trace: mpage_writepages+0xd8/0x230 fs/mpage.c:714 do_writepages+0xec/0x290 mm/page-writeback.c:2352 __filemap_fdatawrite_range+0x2a1/0x380 mm/filemap.c:422 fat_cont_expand+0x169/0x230 fs/fat/file.c:235 fat_setattr+0xac2/0xf40 fs/fat/file.c:501 notify_change+0xb60/0x10a0 fs/attr.c:336 do_truncate+0x134/0x1f0 fs/open.c:64 do_sys_ftruncate+0x703/0x860 fs/open.c:195 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45e299 Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:7f653a525c68 EFLAGS: 0246 ORIG_RAX: 004d RAX: ffda RBX: 0002 RCX: 0045e299 RDX: RSI: 0800 RDI: 0003 RBP: 0119c060 R08: R09: R10: R11: 0246 R12: 0119c034 R13: 7fff76bf2e8f R14: 7f653a5269c0 R15: 0119c034 Modules linked in: ---[ end trace 23c7a881902c6de9 ]--- RIP: 0010:write_cache_pages+0xd06/0x1190 mm/page-writeback.c:2241 Code: 02 00 00 e8 dc 6e da ff 48 c7 c6 80 64 54 89 4c 89 ef e8 0d 04 0b 00 0f 0b 49 c7 c4 c0 77 f7 8e e9 ce f9 ff ff e8 ba 6e da ff <0f> 0b e8 b3 6e da ff 49 8d 5c 24 ff e9 c5 f8 ff ff e8 a4 6e da ff RSP: 0018:c9000931f850 EFLAGS: 00010212 RAX: ac05 RBX: 8000 RCX: c9000d7c1000 RDX: 0004 RSI: 819805d6 RDI: 0003 RBP: R08: R09: eac0ce07 R10: 8197fee3 R11: R12: ea0002762cc8 R13: eac0ce00 R14: dc00 R15: FS: 7f653a526700() GS:8880b9f0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 001b2f228000 CR3: 15251000 CR4: 00350ee0 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot.