[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #71 from Guido (guido.iod...@gmail.com) ---
Since the problem first occurred with the 5.18 kernel, I assume something went wrong starting with that version. I think the analysis of the problem should start with the changes introduced in 5.18. At the moment I am still stuck on version 5.17.9 for this very reason, which prevents me from using the PC with more recent kernels.
-- 
You may reply to this email to add a comment.
You are receiving this mail because:
You are watching the assignee of the bug.
___
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #72 from Guido (guido.iod...@gmail.com) ---
I tried kernel 6.0.6; after 2 days the problem reoccurred :-(
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #73 from Guido (guido.iod...@gmail.com) ---
Just to try it out, I will now give background_gc=sync a chance.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #74 from Matteo Croce (rootki...@yahoo.it) ---
Hi all,

The only way to find where the issue is, is to bisect from the latest working kernel to the first non working one
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #75 from bogdan.nico...@gmail.com ---
Guido just pointed that out in #71: the issue first appeared in 5.18.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #76 from Guido (guido.iod...@gmail.com) ---
(In reply to Matteo Croce from comment #74)
> Hi all,
>
> The only way to find where the issue is, is to bisect from the latest
> working kernel to the first non working one

The last working kernel was 5.17.15; the first with the bug is 5.18.

I had a look at the diff of the gc.c file in the kernel. The changes are very few; maybe the problem is related to the GC_URGENT_HIGH / MID mechanism...
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #77 from Matteo Croce (rootki...@yahoo.it) ---
Great. If you do a bisect, you will find the problem in, let's say, 14 steps. Really worth a try.
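The estimate of about 14 steps follows from bisection halving the candidate range on every round, so the step count is roughly the base-2 logarithm of the number of commits between the good and the bad tag. A minimal sketch, assuming on the order of 16,000 commits between v5.17 and v5.18 (an assumed round figure, not a measured one; `git rev-list --count v5.17..v5.18` reports the real number). The helper name `bisect_steps` is hypothetical and not part of git:

```shell
# bisect_steps N: print ceil(log2(N)), the approximate number of
# build-and-test rounds git bisect needs for N candidate commits.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

# Typical bisect session (commented out: it needs a kernel git tree).
# v5.17 is used as the good end because stable tags such as v5.17.15
# are not ancestors of v5.18.
#   git bisect start
#   git bisect bad v5.18
#   git bisect good v5.17
#   # build, boot, run the workload, then mark the result:
#   git bisect good    # or: git bisect bad
```

With the assumed figure, `bisect_steps 16000` prints 14.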
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #78 from Guido (guido.iod...@gmail.com) ---
I tried background_gc=sync. It doesn't solve the problem...
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #79 from Jaegeuk Kim (jaeg...@kernel.org) ---
I just had some time to think about this issue, and I suspect the loop never gives the CPU a chance to reschedule. Can anyone try this change?

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index a71e818cd67b..c351c3269874 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1325,6 +1325,7 @@ struct page *f2fs_get_lock_data_page(struct inode *inode, pgoff_t index,
 	lock_page(page);
 	if (unlikely(page->mapping != mapping)) {
 		f2fs_put_page(page, 1);
+		f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
 		goto repeat;
 	}
 	if (unlikely(!PageUptodate(page))) {
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #80 from Guido (guido.iod...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #79)

I tried to apply it to kernel 6.0.8, but it failed. I found the code at line 1313, so I can try to apply it there by hand. But there is another identical block at line 3568. Do we need to patch that one as well?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #81 from Jaegeuk Kim (jaeg...@kernel.org) ---
I think 1313 would be enough to avoid this issue first.
3568 case is after submit IO which could have a chance to get another states.
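When a hunk no longer applies because the line numbers have shifted, the candidate sites can be located by searching for the check itself. A small sketch, assuming a local copy of fs/f2fs/data.c; `find_mapping_checks` is a hypothetical helper name, and the line numbers it prints depend on the kernel tree being searched:

```shell
# find_mapping_checks FILE: print the line number of every
# "page->mapping != mapping" test in FILE, one per line.
find_mapping_checks() {
    grep -nF 'page->mapping != mapping' "$1" | cut -d: -f1
}
```

Running `find_mapping_checks fs/f2fs/data.c` in the tree to be patched lists the sites to inspect before editing by hand.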
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #82 from Guido (guido.iod...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #81)
> I think 1313 would be enough to avoid this issue first.
> 3568 case is after submit IO which could have a chance to get another states.

Thank you, I'm testing a patched 6.0.8. I will not turn off the PC for several days, so let's see what happens.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #83 from Guido (guido.iod...@gmail.com) ---
I tried this script to trigger the GC:
https://github.com/LLJY/f2fs-gc/blob/master/f2fs-gc.sh

It's been running for 10 minutes now, but it's stuck at 2503 dirty segments on the root partition. There is no sign of 100% CPU, though; everything looks normal.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #84 from Guido (guido.iod...@gmail.com) ---
This is the output:

Performing GC on /sys/fs/f2fs/nvme0n1p3/
2589
2589
2503
2503
2503
2503
...

and a lot more lines of 2503.
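A stalled run like this one can be recognised mechanically: the dirty-segment count stops changing for several consecutive polls. A minimal sketch, assuming the readings are passed as arguments; `detect_stall` is a hypothetical helper, and treating three equal readings in a row as a stall is an arbitrary choice:

```shell
# detect_stall LIMIT READING...: print the first value that repeats
# LIMIT times in a row and return 0; return 1 if no such run exists.
detect_stall() {
    limit=$1
    shift
    prev=
    run=0
    for value in "$@"; do
        if [ "$value" = "$prev" ]; then
            run=$((run + 1))
        else
            prev=$value
            run=1
        fi
        if [ "$run" -ge "$limit" ]; then
            echo "$value"
            return 0
        fi
    done
    return 1
}
```

For the readings quoted above, `detect_stall 3 2589 2589 2503 2503 2503 2503` prints 2503.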
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #85 from Guido (guido.iod...@gmail.com) ---
I modified the script to run it on a partition of my choice. There is no problem with the home and other partitions. It looks like something is wrong on the root partition. Maybe it is related to the bug?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #86 from Yuriy Garin (yuriy.ga...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #79)

Running this patch (and debug printk) on 6.0.8-arch1-1.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #87 from Guido (guido.iod...@gmail.com) ---
Created attachment 303184
  --> https://bugzilla.kernel.org/attachment.cgi?id=303184&action=edit
kernel log (with patch on data.c applied)
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #88 from Guido (guido.iod...@gmail.com) ---
(In reply to Guido from comment #87)
> Created attachment 303184 [details]
> kernel log (with patch on data.c applied)

After several days of use I still have not had the 100% CPU problem, but things got worse. The system would not go to sleep or shut down (this happened several times and forced me to power the computer off the hard way), so I checked the log and noticed several errors related to f2fs. I attach the log.

An extract of the log follows for your convenience:

nov 15 02:17:13 manjaro kernel: INFO: task f2fs_ckpt-259:3:233 blocked for more than 245 seconds.
nov 15 02:17:13 manjaro kernel: Tainted: G U 6.0.8-1-MANJARO #1
nov 15 02:17:13 manjaro kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nov 15 02:17:13 manjaro kernel: task:f2fs_ckpt-259:3 state:D stack:0 pid: 233 ppid: 2 flags:0x4000
nov 15 02:17:13 manjaro kernel: Call Trace:
nov 15 02:17:13 manjaro kernel:
nov 15 02:17:13 manjaro kernel: __schedule+0x343/0x11c0
nov 15 02:17:13 manjaro kernel: ? update_load_avg+0x7e/0x730
nov 15 02:17:13 manjaro kernel: schedule+0x5e/0xd0
nov 15 02:17:13 manjaro kernel: rwsem_down_write_slowpath+0x336/0x720
nov 15 02:17:13 manjaro kernel: ? psi_task_switch+0xc3/0x1f0
nov 15 02:17:13 manjaro kernel: ? __schedule+0x34b/0x11c0
nov 15 02:17:13 manjaro kernel: ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: ? __checkpoint_and_complete_reqs+0x1b0/0x1b0 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: issue_checkpoint_thread+0x4c/0x110 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: ? cpuacct_percpu_seq_show+0x20/0x20
nov 15 02:17:13 manjaro kernel: kthread+0xdb/0x110
nov 15 02:17:13 manjaro kernel: ? kthread_complete_and_exit+0x20/0x20
nov 15 02:17:13 manjaro kernel: ret_from_fork+0x1f/0x30
nov 15 02:17:13 manjaro kernel:
nov 15 02:17:13 manjaro kernel: INFO: task kworker/u16:10:26736 blocked for more than 245 seconds.
nov 15 02:17:13 manjaro kernel: Tainted: G U 6.0.8-1-MANJARO #1
nov 15 02:17:13 manjaro kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nov 15 02:17:13 manjaro kernel: task:kworker/u16:10 state:D stack:0 pid:26736 ppid: 2 flags:0x4000
nov 15 02:17:13 manjaro kernel: Workqueue: writeback wb_workfn (flush-259:0)
nov 15 02:17:13 manjaro kernel: Call Trace:
nov 15 02:17:13 manjaro kernel:
nov 15 02:17:13 manjaro kernel: __schedule+0x343/0x11c0
nov 15 02:17:13 manjaro kernel: schedule+0x5e/0xd0
nov 15 02:17:13 manjaro kernel: schedule_timeout+0x11c/0x150
nov 15 02:17:13 manjaro kernel: wait_for_completion+0x8a/0x160
nov 15 02:17:13 manjaro kernel: f2fs_issue_checkpoint+0x11f/0x200 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: f2fs_balance_fs_bg+0x119/0x370 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: f2fs_write_node_pages+0x78/0x240 [f2fs 112497ead8e6784e9a6a664ca29672f96820d535]
nov 15 02:17:13 manjaro kernel: do_writepages+0xc1/0x1d0
nov 15 02:17:13 manjaro kernel: ? __wb_calc_thresh+0x4b/0x140
nov 15 02:17:13 manjaro kernel: __writeback_single_inode+0x3d/0x360
nov 15 02:17:13 manjaro kernel: ? inode_io_list_move_locked+0x69/0xc0
nov 15 02:17:13 manjaro kernel: writeback_sb_inodes+0x1ed/0x4a0
nov 15 02:17:13 manjaro kernel: __writeback_inodes_wb+0x4c/0xf0
nov 15 02:17:13 manjaro kernel: wb_writeback+0x204/0x2f0
nov 15 02:17:13 manjaro kernel: wb_workfn+0x31c/0x4f0
nov 15 02:17:13 manjaro kernel: ? __mod_timer+0x289/0x3b0
nov 15 02:17:13 manjaro kernel: process_one_work+0x1c4/0x380
nov 15 02:17:13 manjaro kernel: worker_thread+0x51/0x390
nov 15 02:17:13 manjaro kernel: ? rescuer_thread+0x3b0/0x3b0
nov 15 02:17:13 manjaro kernel: kthread+0xdb/0x110
nov 15 02:17:13 manjaro kernel: ? kthread_complete_and_exit+0x20/0x20
nov 15 02:17:13 manjaro kernel: ret_from_fork+0x1f/0x30
nov 15 02:17:13 manjaro kernel:
nov 15 02:19:16 manjaro kernel: INFO: task kworker/7:1:86 blocked for more than 122 seconds.
nov 15 02:19:16 manjaro kernel: Tainted: G U 6.0.8-1-MANJARO #1
nov 15 02:19:16 manjaro kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
nov 15 02:19:16 manjaro kernel: task:kworker/7:1 state:D stack:0 pid: 86 ppid: 2 flags:0x4000
nov 15 02:19:16 manjaro kernel: Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
nov 15 02:19:16 manjaro kernel: Call Trace:
nov 15 02:19:16 manjaro kernel:
nov 15 02:19:16 manjaro kernel: __schedule+0x343/0x11c0
nov 15 02:19:16 manjaro kernel: ? ttwu_do_wakeup+0x17/0x170
nov 15 02:19:16 manjaro kernel:
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #89 from bogdan.nico...@gmail.com ---
I confirm the bug persists with both background_gc=on and background_gc=sync. It is especially prone to manifest when the machine has been idle for a long time. It almost feels like the GC hangs because it has nothing to collect and therefore enters an infinite loop.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #90 from Guido (guido.iod...@gmail.com) ---
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index a71e818cd67b..c351c3269874 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1325,6 +1325,7 @@ struct page *f2fs_get_lock_data_page(struct inode
> *inode, pgoff_t index,
>  	lock_page(page);
>  	if (unlikely(page->mapping != mapping)) {
>  		f2fs_put_page(page, 1);
> +		f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
>  		goto repeat;
>  	}
>  	if (unlikely(!PageUptodate(page))) {

This patch seems to avoid the 100% CPU occupation but still doesn't solve the bug. I was wrong in my last comment: it is an improvement!

As a workaround I tried to build the f2fs module from 5.17, but I failed. I'm not an expert, so I don't know how to forward-port the module.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #91 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Created attachment 303300
  --> https://bugzilla.kernel.org/attachment.cgi?id=303300&action=edit
debug patch with f2fs_io_schedule_timeout

(In reply to #79)

This debug patch adds the f2fs_io_schedule_timeout call, as proposed in #79. It also prints a message when the problem first occurs (on this call), and another if the problem was "fixed".

In pseudo-code it looks like this:

...
f2fs_get_lock_data_page(...)
{
    int i = 0;
repeat:
    page = f2fs_get_read_data_page(...);
    ...
    if (page->mapping != mapping) {
        if (i++ == 0) /* first time */
            printk("bad ...");
        f2fs_put_page(page, 1);
        f2fs_io_schedule_timeout(DEFAULT_IO_TIMEOUT);
        if (i >= 1)
            return ERR_PTR(-EAGAIN); /* cannot resolve problem */
        goto repeat;
    }
    if (i > 0) /* resolved problem successfully */
        printk("fix ...");
    ...
    return page;
}

Thus, ideally, good output should consist of pairs of lines:

bad ...
fix ...

In short, that does *not* happen. I'm attaching a detailed dmesg log in the next post.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #92 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Created attachment 303301
  --> https://bugzilla.kernel.org/attachment.cgi?id=303301&action=edit
dmesg log for patch f2fs_io_schedule_timeout #91

As you can see, there are a lot of "bad" lines for the same address, and there are no corresponding "fix" lines. It's all like this:

f2fs_get_lock_data_page: bad: 19327, a70291ac, 70d90d71
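The pairing can be checked by counting both kinds of line in a saved dmesg log. A sketch, assuming the log has been saved to a file and that a "fix" line would carry the same `f2fs_get_lock_data_page:` prefix as the "bad" line shown above. The exact "fix" format is an assumption, since no such line appears in the thread, and `count_bad_fix` is a hypothetical helper:

```shell
# count_bad_fix FILE: report how many "bad" and "fix" debug lines
# FILE contains. grep -c exits non-zero on a zero count, hence "|| true".
count_bad_fix() {
    bad=$(grep -c 'f2fs_get_lock_data_page: bad:' "$1" || true)
    fix=$(grep -c 'f2fs_get_lock_data_page: fix:' "$1" || true)
    echo "bad=$bad fix=$fix"
}
```

A healthy log would show equal counts; a log like the one described here would show a large `bad` count next to `fix=0`.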
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #93 from Yuriy Garin (yuriy.ga...@gmail.com) ---
It's running on 6.0.9-arch1-1:

$ uname -a
Linux ... 6.0.9-arch1-1 #2 SMP PREEMPT_DYNAMIC Wed, 23 Nov 2022 05:14:08 + x86_64 GNU/Linux
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #94 from Yuriy Garin (yuriy.ga...@gmail.com) ---
(In reply to Yuriy Garin from comment #93)
> It's running on 6.0.9-arch1-1:
>
> $ uname -a
> Linux ... 6.0.9-arch1-1 #2 SMP PREEMPT_DYNAMIC Wed, 23 Nov 2022 05:14:08
> + x86_64 GNU/Linux

Got the same result on 6.0.10-arch2-1.

See the timing below; maybe it helps. Once the problem occurs, it repeats every 4 minutes for about 1 1/2 hours.

[Wed Nov 30 15:54:15 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
[Wed Nov 30 15:58:08 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
[Wed Nov 30 16:02:02 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
[Wed Nov 30 16:05:55 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
[Wed Nov 30 16:09:48 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
...
[Wed Nov 30 17:27:35 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
[Wed Nov 30 17:31:29 2022] f2fs_get_lock_data_page: bad: 1032147, be98c3cd, d0321d1e
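The spacing between these messages can be computed from the timestamps. A small sketch that converts the HH:MM:SS field to seconds and subtracts; the helper names are hypothetical, and both timestamps are assumed to fall on the same day:

```shell
# hms_to_seconds HH:MM:SS: print the number of seconds since midnight.
hms_to_seconds() {
    h=${1%%:*}
    rest=${1#*:}
    m=${rest%%:*}
    s=${rest#*:}
    # Strip one leading zero so values like "08" are not read as octal.
    echo $(( ${h#0} * 3600 + ${m#0} * 60 + ${s#0} ))
}

# interval_seconds START END: print END minus START in seconds.
interval_seconds() {
    echo $(( $(hms_to_seconds "$2") - $(hms_to_seconds "$1") ))
}
```

For the first two lines of the log, `interval_seconds 15:54:15 15:58:08` prints 233, i.e. just under four minutes.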
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #95 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Maybe it is worth injecting a printk "upstream", to see where the condition page->mapping != mapping arises in the first place? Any ideas?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #96 from bogdan.nico...@gmail.com ---
Well there's also a possibility that the mapping of the inode changes since it was initialized in the beginning:
struct address_space *mapping = inode->i_mapping;

How about printing all three: page->mapping, mapping and inode->i_mapping.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #97 from Yuriy Garin (yuriy.ga...@gmail.com) ---
(In reply to bogdan.nicolae from comment #96)
> Well there's also a possibility that the mapping of the inode changes since
> it was initialized in the beginning:
> struct address_space *mapping = inode->i_mapping;
>
> How about printing all three: page->mapping, mapping and inode->i_mapping.

Good point, thanks!
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #98 from Yuriy Garin (yuriy.ga...@gmail.com) ---
It would be funny if inode->i_mapping had actually been fixed up correctly already, and we were spinning for nothing. :)
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #99 from Guido (guido.iod...@gmail.com) ---
Well, I tried to force f2fs_gc on my partitions (with an unpatched 6.0.11 kernel).

It seems that the 100% CPU problem arises only on nvme0n1p3 (my root). The dirty segments remain at 1417 and do not go down, and the CPU is 100% occupied (from the start, not only once the count reaches 1417).
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #100 from Guido (guido.iod...@gmail.com) ---
And I cannot stop f2fs_gc with:

[manjaro tmp]# echo 500 > /sys/fs/f2fs/nvme0n1p3/gc_urgent_sleep_time
[manjaro tmp]# echo 0 > /sys/fs/f2fs/nvme0n1p3/gc_urgent
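The same sysfs nodes can be driven from a loop that gives up after a fixed number of polls, so that a stalled run is abandoned rather than polled forever. This is a sketch and not the LLJY script: `run_gc_bounded` and its argument order are assumptions, it needs root on a real system, and, as reported above, writing 0 to `gc_urgent` may not actually stop a spinning f2fs_gc thread.

```shell
# run_gc_bounded DIR THRESHOLD MAX_POLLS INTERVAL
# DIR is a directory such as /sys/fs/f2fs/nvme0n1p3 that contains the
# gc_urgent and dirty_segments nodes. Returns 0 once dirty_segments
# drops to THRESHOLD or below, 1 if MAX_POLLS polls pass without that.
run_gc_bounded() {
    dir=$1 threshold=$2 max_polls=$3 interval=$4
    polls=0
    status=1
    echo 1 > "$dir/gc_urgent"            # request urgent GC
    while [ "$polls" -lt "$max_polls" ]; do
        dirty=$(cat "$dir/dirty_segments")
        echo "$dirty"
        if [ "$dirty" -le "$threshold" ]; then
            status=0                     # cleaned enough
            break
        fi
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo 0 > "$dir/gc_urgent"            # always switch urgent GC off
    return "$status"
}
```

A call such as `run_gc_bounded /sys/fs/f2fs/nvme0n1p3 100 120 5` would poll every 5 seconds for at most 10 minutes; the threshold of 100 is only an example value.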
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #101 from Guido (guido.iod...@gmail.com) ---
Very interesting: I ran the script with kernel 5.15.81 and it works well on my root partition:

sudo bash ./f2fs-gc.sh
[sudo] password di guido:
Performing GC on /sys/fs/f2fs/nvme0n1p3/
1849
425
330
307
1
GC completed for /sys/fs/f2fs/nvme0n1p3/
Performing GC on /sys/fs/f2fs/nvme0n1p4/
472
118
47
GC completed for /sys/fs/f2fs/nvme0n1p4/
Performing GC on /sys/fs/f2fs/nvme1n1/
GC completed for /sys/fs/f2fs/nvme1n1/
guido~tmp$
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #102 from Guido (guido.iod...@gmail.com) ---
Interestingly enough, after the script run on the 5.15 kernel had successfully reduced the dirty segments, I booted the system with the 6.0.11 kernel and relaunched the script (after waiting for the dirty segments to climb back above 100). The script also worked without a problem on my root partition under 6.0.11.

As a precaution, I will run the script every 8 hours. Let's see if this keeps the partition clean and avoids the problem with kernels > 5.17.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #103 from bogdan.nico...@gmail.com ---
Guido, so if I understand correctly, your theory is that something in the GC strategy changed starting with 5.17, and normally this wouldn't be a problem for a fresh partition but old partitions that were upgraded may be affected (and can be fixed by running the GC offline or with an older kernel)?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #104 from Guido (guido.iod...@gmail.com) ---
(In reply to bogdan.nicolae from comment #103)
> Guido, so if I understand correctly, your theory is that something in the GC
> strategy changed starting with 5.17, and normally this wouldn't be a problem
> for a fresh partition but old partitions that were upgraded may be affected
> (and can be fixed by running the GC offline or with an older kernel)?

It seems so to me.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #105 from Yuriy Garin (yuriy.ga...@gmail.com) ---
I'm running next debug patch, but problem is not happening for 4 days at this time.

Can anybody suggest a way to increase chances of this GC problem?

Sometimes it happens twice a day, usually once in a 2-3 days, but sometimes it runs well for month - with the same work pattern - development compilations all day, never turn computer off, no hibernation.

By "way", I mean not scary, dangerous, intrusive way, like LLJY script in #83, something "more natural", less intrusive.

Thanks!
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #106 from Guido (guido.iod...@gmail.com) ---
(In reply to Yuriy Garin from comment #105)
> I'm running next debug patch, but problem is not happening for 4 days at
> this time.
>
> Can anybody suggest a way to increase chances of this GC problem?
>
> Sometimes it happens twice a day, usually once in a 2-3 days, but sometimes
> it runs well for month - with the same work pattern - development
> compilations all day, never turn computer off, no hibernation.
>
> By "way", I mean not scary, dangerous, intrusive way, like LLJY script in
> #83, something "more natural", less intrusive.
>
> Thanks!

Running the script on a 5.15 (LTS) kernel should be safe (it is not really intrusive; GC is a supported operation). Anyway, I obviously do not take responsibility :-)
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #107 from bogdan.nico...@gmail.com ---
I found that letting the machine go to sleep tends to trigger the bug more often after it wakes up. You could try starting an I/O intensive task like bonnie++, put the machine to sleep, then wake it up.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #108 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Thanks!

How can you tell which disk it happens on?

I have two NVMe drives: one is a "plain" f2fs root, the other is f2fs on dm-crypt. That one is home, where a lot of compilation happens.

From the logs and stats I cannot tell where the f2fs GC problem occurs. What should I look for?

If I knew which disk is the problematic one, I would increase the load on that disk.

Second question: /sys/fs/f2fs atgc_age_threshold has the value 604800. That's 1 week. Would changing it to one day or 4 hours really help to trigger the problem? If so, it would be a "safe" way.

Thanks again.
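One hint for identifying the disk is the kernel thread name: the hung-task log earlier in this thread shows `f2fs_ckpt-259:3`, and the numbers after the dash look like a block device major:minor pair. Assuming the f2fs_gc thread follows the same `f2fs_gc-MAJOR:MINOR` naming (an assumption worth checking against `ps` output on the affected machine), the pair can be extracted and looked up under /sys/dev/block. The helper names below are hypothetical:

```shell
# thread_devnum NAME: print the MAJOR:MINOR suffix of a thread name
# such as "f2fs_gc-259:3".
thread_devnum() {
    echo "${1##*-}"
}

# devnum_to_device MAJOR:MINOR: resolve the pair to a device name via
# the /sys/dev/block symlinks (only meaningful on a Linux system).
devnum_to_device() {
    basename "$(readlink -f "/sys/dev/block/$1")"
}

# seconds_to_days SECONDS: whole days contained in a sysfs age value.
seconds_to_days() {
    echo $(( $1 / 86400 ))
}
```

`seconds_to_days 604800` prints 7, which confirms the one-week reading of atgc_age_threshold; one day would be 86400 and four hours 14400.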
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050
--- Comment #109 from Thomas (v10la...@myway.de) ---
For me it seems that applying the debug patch with f2fs_io_schedule_timeout, running the f2fs-gc.sh script once, and then rebooting fixed the problem. In my case this was on the root partition, which is on an NVMe SSD. Before executing the f2fs-gc.sh script, I also edited it so that it runs on that partition only instead of on all f2fs partitions it finds.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #110 from Guido (guido.iod...@gmail.com) --- I deactivated the f2fs-gc script for two days and... again 100% CPU on the f2fs_gc process :-(
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #111 from Guido (guido.iod...@gmail.com) --- Even worse, although I reactivated the script to force GC, I had the CPU at 100 percent again, even though I had done the 'cleaning' with the 5.15 kernel earlier. So at the moment I'm unfortunately forced to use 5.15 all the time.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #112 from Jaegeuk Kim (jaeg...@kernel.org) ---
I feel that this may be a subtle page cache issue, whose root cause is really hard to find. That being said, we might have two options: 1) bisecting the kernel version, 2) trying 5.15 with all the f2fs upstream patches.
1) This requires lots of effort between 5.15 and 5.18 though; is it doable?
2) https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-stable.git/log/?h=linux-5.15.y Is it doable to test this kernel? If this issue happens with this kernel, we can bisect among f2fs changes easily.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #113 from Guido (guido.iod...@gmail.com) --- (In reply to Jaegeuk Kim from comment #112) Now I'm trying another solution: I used fstransform to format the partition and upgrade the filesystem to f2fs 1.15. So now I'm testing kernel 6.1. If it works, good. If not, I'll try your kernel version. Maybe other users can test your kernel earlier.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #114 from Thomas (v10la...@myway.de) --- (In reply to Guido from comment #113) Why not test the "f2fs_io_schedule_timeout" kernel patch in combination with running the manual GC script once? (It doesn't seem to matter whether you run it on an unpatched or a patched kernel; all that's important is that you boot into the patched kernel without any garbage, so reboot right after executing the script.) I did this because I read between the lines that this combination worked for others too, and I have had no more issues for around 5 days. And yes, I'm doing a lot to try to trigger this bug again. Also, is it safe to assume that this issue only occurs on root partitions that are on NVMe drives? I see a pattern here but I'm still not 100% sure.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #115 from Guido (guido.iod...@gmail.com) ---
(In reply to Thomas from comment #114)
> (In reply to Guido from comment #113)
>
> Why not test the "f2fs_io_schedule_timeout" kernel patch in combination with
> running the manual GC script once? (It doesn't seem to matter whether you run
> it on an unpatched or a patched kernel; all that's important is that you boot
> into the patched kernel without any garbage, so reboot right after executing
> the script.)
>
> I did this because I read between the lines that this combination worked for
> others too, and I have had no more issues for around 5 days. And yes, I'm
> doing a lot to try to trigger this bug again.
>
> Also, is it safe to assume that this issue only occurs on root partitions
> that are on NVMe drives? I see a pattern here but I'm still not 100% sure.

I already tried the patch (but not in combination with the script). It solved the 100% CPU problem, but f2fs_gc still remains stuck and doesn't finish the garbage collection, so the user can't shut down safely. Anyway, if I hit the 100% CPU problem in the coming days, I'll try it.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #116 from Thomas (v10la...@myway.de) ---
(In reply to Jaegeuk Kim from comment #112)
> this requires lots of effort between 5.15 vs. 5.18 tho, is it doable?

Really good question. I think it is doable, but only with a lot of time and passion. After all, there is no easy way to reproduce the issue; you need to run the kernel for days to see whether it's stable.

(In reply to Guido from comment #115)
> I already tried the patch (but not in combination with the script) and it
> solved the problem of 100% cpu but still f2fs_gc remains stuck and doesnt
> end the garbage collection, so the user cant shutdown in a safe manner.

I must have overlooked that statement, sorry. For me both problems seem to be solved with the script and patch combination though, so it might be worth a try (of course after you have finished your current test).
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #117 from Thomas (v10la...@myway.de) ---
(In reply to Guido from comment #115)
> it solved the problem of 100% cpu but still f2fs_gc remains stuck

You're right, this just happened for me, too. So no more 100% CPU, but the partition's I/O freezes.

[28731.336375] f2fs_get_lock_data_page: bad: 825453, 657faa62, ba8a2fe3
[28952.126658] f2fs_get_lock_data_page: bad: 825453, 657faa62, ba8a2fe3
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #118 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Created attachment 303439 --> https://bugzilla.kernel.org/attachment.cgi?id=303439&action=edit debug patch - print page/folio/ref_count

This debug patch prints the page, the folio and the folio reference count. As far as I understand the logic behind 'f2fs_put_page(page, 1); goto repeat;', it's an attempt to "unlock" the page, release it from the page cache and reload it. (I found it hard to distinguish between the page and folio pointers. It's a C union, sometimes used as a page, sometimes as a folio, so this definitely requires more kernel expertise. Please tell me what could be done better.) After two weeks of running this patch, I caught the GC problem and have a log. I'm attaching it in the next message.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #119 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Created attachment 303440 --> https://bugzilla.kernel.org/attachment.cgi?id=303440&action=edit debug patch log - page, folio and ref count

As you can see, the folio pointer is valid. And ref_count is not 1 before going to f2fs_put_page(); I guess that's why it does not work. Silly thought :) Interestingly, the ref count is 514, which looks suspiciously like a binary flag pattern: 1000000010 (0x202). Is it possible that during the 5.17/5.18 implementation of a "pin", a binary flag was somehow written to the ref count, or something like '1 << ...' happens?
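The bit pattern of the reference count reported in comment #119 can be checked with a few lines of C. This is only an arithmetic illustration: 514 is 512 + 2, so exactly bits 9 and 1 are set, which is 1000000010 in binary. The helper name to_binary is hypothetical, and the sketch says nothing about what those bits mean inside the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical helper: write the binary digits of v into buf, most
 * significant bit first and without leading zeros. Output is truncated
 * if buf is too small; buf is always NUL-terminated when len > 0.
 */
static void to_binary(unsigned int v, char *buf, size_t len)
{
	char tmp[sizeof(v) * 8];
	size_t n = 0, i = 0;

	if (len == 0)
		return;
	/* Collect digits least significant first. */
	do {
		tmp[n++] = (char)('0' + (v & 1u));
		v >>= 1;
	} while (v != 0);
	/* Reverse them into the caller's buffer. */
	while (n > 0 && i + 1 < len)
		buf[i++] = tmp[--n];
	buf[i] = '\0';
}
```

Calling to_binary(514, buf, sizeof buf) yields the string "1000000010", matching (1 << 9) + (1 << 1).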
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #120 from Yuriy Garin (yuriy.ga...@gmail.com) --- What I'm saying is what was pointed out in #112: "I feel that this may be a subtle page cache issue".
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #121 from Yuriy Garin (yuriy.ga...@gmail.com) ---
(In reply to Yuriy Garin from comment #119)
Forgot to add a note:

$ uname -a
Linux ... 6.1.0-arch1-1 #1 SMP PREEMPT_DYNAMIC Wed, 14 Dec 2022 04:55:09 + x86_64 GNU/Linux
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #122 from Yuriy Garin (yuriy.ga...@gmail.com) ---
Created attachment 303441 --> https://bugzilla.kernel.org/attachment.cgi?id=303441&action=edit debug patch log - page, folio and ref count - #2

Today is a lucky day. After two weeks of waiting I've hit this GC problem a second time. It's on a different inode, but the page mapping is the same.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #123 from Guido (guido.iod...@gmail.com) ---
(In reply to Guido from comment #113)
> (In reply to Jaegeuk Kim from comment #112)
>
> Now I'm trying another solution: I used fstransform to format the partition
> and upgrade the filesystem to f2fs 1.15. So now I'm testing kernel 6.1. If
> it work, well. If not, I'll try you kernel version.
>
> Maybe other users can test your kernel early.

After 1 month the problem is here again, using kernel 6.1.6...
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050

Guido (guido.iod...@gmail.com) changed:

What     |Removed |Added
Severity |normal  |high

--- Comment #124 from Guido (guido.iod...@gmail.com) ---
I took the liberty of raising the importance of the bug because it renders the operating system unusable. I have not set 'blocking' only because not all users are affected. In any case, my experiment of reformatting the partition did not eliminate the problem, which suggests that it is probably more common than a corner case. The 5.15 LTS kernel will go EOL in October; I hope the bug will be fixed by then. Aside from that, I wonder whether an analysis of the differences between the 5.17 and 5.18 kernels would show where the problem lies. I don't have the expertise to do that.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #125 from Guido (guido.iod...@gmail.com) --- Can I ask the other reporters what distro they use? I use Manjaro, but the problem also occurs with the Arch Linux kernel. Maybe it's related to CONFIG_F2FS_UNFAIR_RWSEM=y?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #126 from Matteo Croce (rootki...@yahoo.it) --- The only way to find the issue is by doing a bisect. It's a long operation, but in the time we spent commenting, we would have found it already.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #127 from Thomas (v10la...@myway.de) ---
(In reply to Guido from comment #125)
> Can I ask to other reporters what distro they use?

Gentoo Linux

> Maybe it's related to CONFIG_F2FS_UNFAIR_RWSEM=y ?

Don't think so. My config:

CONFIG_F2FS_FS=y
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
CONFIG_F2FS_FS_SECURITY=y
# CONFIG_F2FS_CHECK_FS is not set
# CONFIG_F2FS_FAULT_INJECTION is not set
CONFIG_F2FS_FS_COMPRESSION=y
CONFIG_F2FS_FS_LZO=y
CONFIG_F2FS_FS_LZORLE=y
CONFIG_F2FS_FS_LZ4=y
CONFIG_F2FS_FS_LZ4HC=y
CONFIG_F2FS_FS_ZSTD=y
CONFIG_F2FS_IOSTAT=y
# CONFIG_F2FS_UNFAIR_RWSEM is not set

(In reply to Matteo Croce from comment #126)
> The only way to find the issue is by doing a bisect.

Bisecting this is impossible: there are 16205 commits between 5.17 and 5.18. To make sure a commit is bug-free you would need to test it for around 2 months. This means one would need 2 years and 4 months to bisect this (worst-case scenario).
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #128 from Guido (guido.iod...@gmail.com) ---
(In reply to Thomas from comment #127)
> Bisecting this is impossible: There are 16205 commits between 5.17 and 5.18.

Well, we need to check only the commits related to F2FS between the last 5.17.x and the first 5.18.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #129 from Matteo Croce (rootki...@yahoo.it) ---
> Bisecting this is impossible: There are 16205 commits between 5.17 and 5.18.

This will take roughly 14 steps. Long but not impossible.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #130 from Thomas (v10la...@myway.de) ---
(In reply to Matteo Croce from comment #129)
> > Bisecting this is impossible: There are 16205 commits between 5.17 and
> > 5.18.
>
> This will take roughly 14 steps. Long but not impossible.

Exactly: 14 steps * 2 months = 28 months = 2 years and 4 months. This of course assumes you're bisecting 24/7...
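The step count and time estimate traded in comments #127 to #130 follow from the fact that each bisection round halves the candidate range, so the number of rounds is the smallest k with 2^k >= number of commits. A minimal C sketch of that arithmetic is shown here; the function names are hypothetical, and the 2 months of soak time per round is the figure assumed in the thread, not a measured value.

```c
/*
 * Number of bisection rounds needed to isolate one commit out of
 * `commits` candidates: the smallest k such that 2^k >= commits.
 */
static unsigned int bisect_steps(unsigned long commits)
{
	unsigned int steps = 0;
	unsigned long span = 1;

	while (span < commits) {
		span *= 2;
		steps++;
	}
	return steps;
}

/* Total soak time when every round needs months_per_step of testing. */
static unsigned int bisect_months(unsigned long commits,
				  unsigned int months_per_step)
{
	return bisect_steps(commits) * months_per_step;
}
```

For 16205 commits this gives 14 rounds (2^14 = 16384), and 28 months at 2 months per round. Restricting the range to f2fs-related commits, as suggested in comment #128, would shrink the count; for a purely hypothetical range of 100 commits, bisect_steps(100) returns 7.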
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #131 from Jaegeuk Kim (jaeg...@kernel.org) ---
Re Comment #122,

By any chance, could you add code to print "page->mapping->host->i_ino" if page->mapping->host exists, and the status of PageUptodate(page)?

When GC tries to move the valid block, if the block was truncated and somehow MM gives a stale page, we may hit a loop?

How about this to report the error to GC? GC will skip this migration and will do it later or skip it, if the block was really truncated.

--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1325,18 +1325,14 @@ struct page *f2fs_get_lock_data_page(struct inode *inode, pgoff_t index,
 {
 	struct address_space *mapping = inode->i_mapping;
 	struct page *page;
-repeat:
+
 	page = f2fs_get_read_data_page(inode, index, 0, for_write, NULL);
 	if (IS_ERR(page))
 		return page;

 	/* wait for read completion */
 	lock_page(page);
-	if (unlikely(page->mapping != mapping)) {
-		f2fs_put_page(page, 1);
-		goto repeat;
-	}
-	if (unlikely(!PageUptodate(page))) {
+	if (unlikely(page->mapping != mapping || !PageUptodate(page))) {
 		f2fs_put_page(page, 1);
 		return ERR_PTR(-EIO);
 	}
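The difference between the two control flows in the patch from comment #131 can be illustrated with a userspace toy model. This is a sketch under stated assumptions, not kernel code: the toy_* types and functions are hypothetical stand-ins, the lookup is assumed to keep returning the same stale page (the scenario the comment speculates about), and the retry variant is given an artificial bound so the demonstration terminates.

```c
#include <errno.h>

/* Toy stand-ins for the kernel structures; names are illustrative only. */
struct toy_mapping { int id; };
struct toy_page { struct toy_mapping *mapping; int uptodate; };

/* Hypothetical lookup that always hands back the same cached page. */
static struct toy_page *toy_lookup(struct toy_page *cached)
{
	return cached;
}

/*
 * Old control flow: retry while the page belongs to another mapping.
 * max_tries is an artificial bound for this demo; the loop removed by
 * the patch had none. Returns the number of lookups on success, or -1
 * when the bound is exhausted.
 */
static int get_page_retry(struct toy_page *cached, struct toy_mapping *want,
			  int max_tries)
{
	int tries = 0;

	while (tries < max_tries) {
		struct toy_page *page = toy_lookup(cached);

		tries++;
		if (page->mapping == want)
			return tries;
	}
	return -1;
}

/*
 * Proposed control flow: a mismatched or not-up-to-date page is
 * reported as -EIO after a single lookup, so the caller can skip it.
 */
static int get_page_fail_fast(struct toy_page *cached, struct toy_mapping *want)
{
	struct toy_page *page = toy_lookup(cached);

	if (page->mapping != want || !page->uptodate)
		return -EIO;
	return 0;
}
```

With a page whose mapping never matches, get_page_retry exhausts any bound it is given, while get_page_fail_fast returns -EIO immediately. That mirrors the intent stated in the comment: GC gets an error and can skip or postpone the migration instead of spinning.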
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #132 from Guido (guido.iod...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #131)
> Re Comment #122,
>
> By any chance, could you add a code to print "page->mapping->host->i_ino" if
> page->mapping->host exists, and the status of PageUptodate(page)?
>
> When GC tries to move the valid block, if the block was truncated and
> somehow MM gives a stale page, we may hit a loop?
>
> How about this to report the error to GC? GC will skip this migration and
> will do it later or skip it, if the block was really truncated.
>
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1325,18 +1325,14 @@ struct page *f2fs_get_lock_data_page(struct inode *inode, pgoff_t index,
>  {
>  	struct address_space *mapping = inode->i_mapping;
>  	struct page *page;
> -repeat:
> +
>  	page = f2fs_get_read_data_page(inode, index, 0, for_write, NULL);
>  	if (IS_ERR(page))
>  		return page;
>
>  	/* wait for read completion */
>  	lock_page(page);
> -	if (unlikely(page->mapping != mapping)) {
> -		f2fs_put_page(page, 1);
> -		goto repeat;
> -	}
> -	if (unlikely(!PageUptodate(page))) {
> +	if (unlikely(page->mapping != mapping || !PageUptodate(page))) {
>  		f2fs_put_page(page, 1);
>  		return ERR_PTR(-EIO);
>  	}

I want to try this patch later. Does the patch try to solve the problem, or does it only serve to produce a log?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #133 from Guido (guido.iod...@gmail.com) --- I tried to apply the patch on 6.2, but it failed because the repeat label is missing at line 1328.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #134 from bogdan.nico...@gmail.com --- Well, the lines got shifted a bit. It's now line 1336 instead of 1325.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #135 from Guido (guido.iod...@gmail.com) ---
(In reply to bogdan.nicolae from comment #134)
> Well lines got shifted a bit. It's now #1336 instead of #1325.

Yes, in the meantime I corrected the patch. I'm building the kernel now.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #136 from Guido (guido.iod...@gmail.com) --- OK, I am testing the new kernel. I ran the script to force GC and noticed that on the root partition it used 10% CPU, while on the home partition the CPU usage was almost negligible (0.7-1%). The process finished without any problems on all partitions. I will keep you updated on any problems in the coming days.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #137 from bogdan.nico...@gmail.com --- @Guido: any news? Did it work? I didn't see any issues with this patch so far.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #138 from Guido (guido.iod...@gmail.com) ---
(In reply to bogdan.nicolae from comment #137)
> @Guido: any news? Did it work? I did't see any issues with this patch so far.

For me too, so far so good, but I think we still have to wait to be sure. Anyway, I am beginning to hope that the bug will be fixed with this patch.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #139 from Guido (guido.iod...@gmail.com) --- I have been using the kernel with this patch for a month now and so far no problems. Out of superstition (I am Italian!), I'm afraid to say that the bug is fixed, but it seems plausible.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #140 from Jaegeuk Kim (jaeg...@kernel.org) --- Cool, it seems there is no reason not to merge this patch. Thanks,
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #141 from Guido (guido.iod...@gmail.com) --- Today I forced the gc on all partitions. No problem at all.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #142 from Jaegeuk Kim (jaeg...@kernel.org) --- I've reviewed the refcount of the path and found one suspicious routine when handling page->private. By any chance, can we try this patch instead of the above workaround? https://lore.kernel.org/lkml/20230405204321.2056498-1-jaeg...@kernel.org/T/#u
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #143 from Guido (guido.iod...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #142)
> I've reviewed the refcount of the path and found one suspicious routine when
> handling page->private.
>
> By any chance, can we try this patch instead of the above workaround?
>
> https://lore.kernel.org/lkml/20230405204321.2056498-1-jaeg...@kernel.org/T/#u

What kernel version? 6.3 RC5?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #144 from Jaegeuk Kim (jaeg...@kernel.org) --- You can apply it to any kernel version that you're able to build. Let me know if there's a merge conflict.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #145 from Guido (guido.iod...@gmail.com) --- I'm not able to patch 6.2.9. I get an error for hunk #2 in both data.c and f2fs.c. I tried to change the patch entry point, but it still fails. Can you help me?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #146 from Jaegeuk Kim (jaeg...@kernel.org) --- By any chance, does this work? This is the backport to 6.1. https://github.com/jaegeuk/f2fs-stable/commit/a0ba9030bd28c01b3e308499df5daec94414f4fb
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #147 from Jaegeuk Kim (jaeg...@kernel.org) --- Ok, I prepared the patches in v6.2. https://github.com/jaegeuk/f2fs-stable/commits/linux-6.2.y Please apply *two* patches on top of the tree.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #148 from Guido (guido.iod...@gmail.com) --- Thank you, I'm building 6.2.10 with both patches and I will try it in the next days/weeks.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #149 from Guido (guido.iod...@gmail.com) --- The build process fails, but not on f2fs (it fails on a driver for some reason). Is there a way to build only the patched f2fs module against the stock kernel?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #150 from Guido (guido.iod...@gmail.com) ---
Created attachment 304096 --> https://bugzilla.kernel.org/attachment.cgi?id=304096&action=edit build error

OK, I found out how to do it in the documentation, but I get errors during the build (see the attached build.log).
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #151 from Jaegeuk Kim (jaeg...@kernel.org) --- Thanks. I found one mistake in the previous backport of the first patch. Could you please re-download them? https://github.com/jaegeuk/f2fs-stable/commits/linux-6.2.y
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #152 from Guido (guido.iod...@gmail.com) --- Done. I built it against my current kernel (6.2.7), then rebuilt the initramfs and rebooted the system. Then I forced GC with a script and it worked without problems. I will test this kernel in the coming days and weeks. I hope other people can do the same.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050

Ryotaro Ko (pikate...@gmail.com) changed:

What |Removed |Added
CC | |pikate...@gmail.com

--- Comment #153 from Ryotaro Ko (pikate...@gmail.com) ---
I applied the patch on the latest archlinux kernel (6.2.10-arch1 https://github.com/pikatenor/linux/tree/archlinux-6.2.10-f2fs) and tried it, but f2fs_gc still hangs around 2 hours after boot.

[0.00] Linux version 6.2.10-arch1-1-test-507874-g453da3ddc42a (linux-test@archlinux) (gcc (GCC) 12.2.1 20230201, GNU ld (GNU Binutils) 2.40) #1 SMP PREEMPT_DYNAMIC Tue, 11 Apr 2023 16:26:44 +
[0.00] Command line: initrd=\initramfs-linux-test.img cryptdevice=UUID=b5b188ee-8355-4638-b192-111ee6371c79:Homie root=UUID=ca2eb962-9af0-4d5c-869d-9c1916f32a2e rw quiet i915.enable_psr=0
[ 9584.264309] INFO: task f2fs_ckpt-259:4:213 blocked for more than 122 seconds.
[ 9584.264313] Tainted: G U 6.2.10-arch1-1-test-507874-g453da3ddc42a #1
[ 9584.264314] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9584.264315] task:f2fs_ckpt-259:4 state:D stack:0 pid:213 ppid:2 flags:0x4000
[ 9584.264318] Call Trace:
[ 9584.264319]
[ 9584.264321] __schedule+0x3c8/0x12e0
[ 9584.264326] ? select_task_rq_fair+0x16c/0x1c00
[ 9584.264329] ? update_load_avg+0x7e/0x780
[ 9584.264332] schedule+0x5e/0xd0
[ 9584.264333] rwsem_down_write_slowpath+0x329/0x700
[ 9584.264338] ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264366] __checkpoint_and_complete_reqs+0x7a/0x1b0 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264390] ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264411] issue_checkpoint_thread+0x4c/0x110 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264433] ? __pfx_autoremove_wake_function+0x10/0x10
[ 9584.264437] kthread+0xdb/0x110
[ 9584.264438] ? __pfx_kthread+0x10/0x10
[ 9584.264440] ret_from_fork+0x29/0x50
[ 9584.264445]
[ 9584.264508] INFO: task kworker/u16:2:19587 blocked for more than 122 seconds.
[ 9584.264509] Tainted: G U 6.2.10-arch1-1-test-507874-g453da3ddc42a #1
[ 9584.264510] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 9584.264510] task:kworker/u16:2 state:D stack:0 pid:19587 ppid:2 flags:0x4000
[ 9584.264514] Workqueue: writeback wb_workfn (flush-259:0)
[ 9584.264517] Call Trace:
[ 9584.264518]
[ 9584.264519] __schedule+0x3c8/0x12e0
[ 9584.264521] ? ttwu_queue_wakelist+0xef/0x110
[ 9584.264524] ? try_to_wake_up+0xd9/0x540
[ 9584.264527] schedule+0x5e/0xd0
[ 9584.264528] schedule_timeout+0x151/0x160
[ 9584.264531] wait_for_completion+0x8a/0x160
[ 9584.264534] f2fs_issue_checkpoint+0x11f/0x200 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264558] f2fs_balance_fs_bg+0x12e/0x390 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264582] f2fs_write_node_pages+0x78/0x240 [f2fs 137a18329c9b4a66b7d5836126aee7155321bd82]
[ 9584.264606] do_writepages+0xc1/0x1d0
[ 9584.264610] __writeback_single_inode+0x3d/0x360
[ 9584.264614] writeback_sb_inodes+0x1ed/0x4a0
[ 9584.264618] __writeback_inodes_wb+0x4c/0xf0
[ 9584.264621] wb_writeback+0x204/0x2f0
[ 9584.264625] wb_workfn+0x354/0x4f0
[ 9584.264627] ? ttwu_queue_wakelist+0xef/0x110
[ 9584.264630] process_one_work+0x1c5/0x3c0
[ 9584.264633] worker_thread+0x51/0x390
[ 9584.264636] ? __pfx_worker_thread+0x10/0x10
[ 9584.264638] kthread+0xdb/0x110
[ 9584.264639] ? __pfx_kthread+0x10/0x10
[ 9584.264641] ret_from_fork+0x29/0x50
[ 9584.264645]
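For anyone collecting similar reports, the hung-task headers in a log like the one above can be pulled out mechanically. This is a small sketch, not part of any f2fs tooling; the regular expression is written against the "INFO: task <name>:<pid> blocked for more than <N> seconds" lines as they appear in this thread, and the function name is made up for illustration.

```python
import re

# Task names may themselves contain ':' (e.g. "f2fs_ckpt-259:4"), so the name
# is matched lazily and the pid is the last number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>\S+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return a list of (task name, pid, seconds) for every hung-task report."""
    return [
        (m["name"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python hung_tasks.py
    for name, pid, secs in find_hung_tasks(sys.stdin.read()):
        print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

Applied to the log in comment #153 it reports two tasks, the f2fs checkpoint thread and a writeback kworker that is waiting for that checkpoint to complete.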
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #154 from Jaegeuk Kim (jaeg...@kernel.org) --- Could you please reapply and test three patches here again? https://github.com/jaegeuk/f2fs-stable/commits/linux-6.2.y
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #155 from Guido (guido.iod...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #154)
> Could you please reapply and test three patches here again?
>
> https://github.com/jaegeuk/f2fs-stable/commits/linux-6.2.y

I see only two patches now.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #156 from Ryotaro Ko (pikate...@gmail.com) ---
I fetched the archlinux kernel (https://github.com/archlinux/linux/tree/v6.2.10-arch1) and rebased f2fs-stable onto it, so if the pre-existing stable tree did not contain that (third) patch, I applied only two patches.

(In reply to Jaegeuk Kim from comment #154)
> Could you please reapply and test three patches here again?

Are you referring to the patch in comment #131?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #157 from Jaegeuk Kim (jaeg...@kernel.org) --- Sorry, I found some issues in the original patches. Could you try the two patches that are now on top of the tree? https://github.com/jaegeuk/f2fs-stable/commits/linux-6.2.y
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #158 from Ryotaro Ko (pikate...@gmail.com) --- Thanks, I am now trying it out and it seems to be working fine with my root partition mounted using background_gc=on. https://github.com/pikatenor/linux/commits/archlinux-6.2.10-f2fs2 I will continue to use it for a while and let you know how it turns out.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #159 from Guido (guido.iod...@gmail.com) --- I patched too (this time using kernel 6.2.10). I also ran the script to force GC. I will use this kernel in the coming weeks.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #160 from Guido (guido.iod...@gmail.com) --- After several weeks, no problem. I also forced GC just now, with no problem. Now I would like to switch to kernel 6.3; which patch should I use?
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #161 from Jaegeuk Kim (jaeg...@kernel.org) --- From Linus' tree, could you please try this patch, which was merged in 6.4-rc1? https://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git/commit/?id=635a52da8605e5d300ec8c18fdba8d6f8491755d
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #162 from Guido (guido.iod...@gmail.com) --- I'll try ASAP. I tried to patch 6.3.1 with the patches for 6.2.x, but it fails, saying they are already in place. Looking at the code, that seems to be the case.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #163 from Guido (guido.iod...@gmail.com) --- To be clear: should I apply the patch merged in 6.4-rc1 to the 6.3.1 kernel? If so, I would prefer to try kernel 6.4-rc1 instead, with that patch already in place.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #164 from Jaegeuk Kim (jaeg...@kernel.org) --- Yup, 6.4-rc1 should have all the patches, so it is worth giving it a try.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #165 from Guido (guido.iod...@gmail.com) --- Thank you. For now I'm trying linux-next-git 20230504.r0.g145e5cddfe8b-1 from the AUR; it should have the patch already applied.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050

Matias (lp61...@gmail.com) changed:

What |Removed |Added
CC | |lp61...@gmail.com

--- Comment #166 from Matias (lp61...@gmail.com) ---
I've been using 6.3.1 with the 6.4-rc1 patch for a few days now. With no extra GC parameters, f2fs_gc-8:1 starts using 17.8% of the CPU and the system basically becomes unusable: I can't open anything, like a soft freeze. After setting background_gc=sync (although it might not be ideal), it did not happen again. I hope this extra information helps, Jaegeuk. Love this filesystem. Regards
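When comparing reports it helps to know which `background_gc` mode each f2fs mount is actually using. The sketch below reads that from text in the /proc/mounts format (device, mount point, filesystem type, comma-separated options). The function name and the sample devices are illustrative only.

```python
def f2fs_background_gc_modes(mounts_text):
    """Map each f2fs mount point to its background_gc mode.

    mounts_text is expected in /proc/mounts format. The value is None when
    the option does not appear in the mount's option list.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "f2fs":
            continue  # not an f2fs mount (or a malformed line)
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("background_gc="):
                mode = opt.split("=", 1)[1]
        modes[fields[1]] = mode
    return modes


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for mount_point, mode in f2fs_background_gc_modes(fh.read()).items():
            print(f"{mount_point}: background_gc={mode}")
```

Including this output in a report makes it clear whether a hang occurred with `background_gc=on` or `background_gc=sync`.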
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #167 from Jaegeuk Kim (jaeg...@kernel.org) ---
Matias, you saw the issue with the f2fs updates in 6.4-rc1, right? If so, we may need to consider [1] back.

[1] https://github.com/jaegeuk/f2fs/commit/400dc2a4d7ec96a1fc4168652a0862e7edab3671
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #168 from Matias (lp61...@gmail.com) ---
I removed background_gc=sync and it happened again. I hope this message gets sent so you can take a look; this is the journalctl log from after it happened.

Kernel: 6.3.1 with the f2fs updates from 6.4-rc1

May 05 20:13:44 cachyos-x8664 kernel: INFO: task f2fs_ckpt-8:1:204 blocked for more than 122 seconds.
May 05 20:13:44 cachyos-x8664 kernel: Not tainted 6.3.1-1-cachyos #1
May 05 20:13:44 cachyos-x8664 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 05 20:13:44 cachyos-x8664 kernel: task:f2fs_ckpt-8:1 state:D stack:0 pid:204 ppid:2 flags:0x4000
May 05 20:13:44 cachyos-x8664 kernel: Call Trace:
May 05 20:13:44 cachyos-x8664 kernel:
May 05 20:13:44 cachyos-x8664 kernel: __schedule+0x441/0x17b0
May 05 20:13:44 cachyos-x8664 kernel: ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
May 05 20:13:44 cachyos-x8664 kernel: schedule_preempt_disabled+0x65/0xe0
May 05 20:13:44 cachyos-x8664 kernel: rwsem_down_write_slowpath+0x22b/0x6c0
May 05 20:13:44 cachyos-x8664 kernel: ? psi_task_switch+0x12f/0x340
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: down_write+0x5b/0x60
May 05 20:13:44 cachyos-x8664 kernel: __checkpoint_and_complete_reqs+0x7c/0x1b0 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_issue_checkpoint_thread+0x10/0x10 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: issue_checkpoint_thread+0x4c/0x110 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_autoremove_wake_function+0x10/0x10
May 05 20:13:44 cachyos-x8664 kernel: kthread+0xdb/0x110
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_kthread+0x10/0x10
May 05 20:13:44 cachyos-x8664 kernel: ret_from_fork+0x29/0x50
May 05 20:13:44 cachyos-x8664 kernel:
May 05 20:13:44 cachyos-x8664 kernel: INFO: task kworker/u16:0:5392 blocked for more than 122 seconds.
May 05 20:13:44 cachyos-x8664 kernel: Not tainted 6.3.1-1-cachyos #1
May 05 20:13:44 cachyos-x8664 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 05 20:13:44 cachyos-x8664 kernel: task:kworker/u16:0 state:D stack:0 pid:5392 ppid:2 flags:0x4000
May 05 20:13:44 cachyos-x8664 kernel: Workqueue: writeback wb_workfn (flush-8:0)
May 05 20:13:44 cachyos-x8664 kernel: Call Trace:
May 05 20:13:44 cachyos-x8664 kernel:
May 05 20:13:44 cachyos-x8664 kernel: __schedule+0x441/0x17b0
May 05 20:13:44 cachyos-x8664 kernel: ? blk_mq_submit_bio+0x396/0x760
May 05 20:13:44 cachyos-x8664 kernel: ? ttwu_queue_wakelist+0xef/0x110
May 05 20:13:44 cachyos-x8664 kernel: schedule+0x5e/0xd0
May 05 20:13:44 cachyos-x8664 kernel: schedule_timeout+0x329/0x390
May 05 20:13:44 cachyos-x8664 kernel: ? autoremove_wake_function+0x32/0x60
May 05 20:13:44 cachyos-x8664 kernel: wait_for_completion+0x86/0x160
May 05 20:13:44 cachyos-x8664 kernel: f2fs_issue_checkpoint+0x11f/0x200 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: f2fs_balance_fs_bg+0x12e/0x3b0 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: f2fs_write_node_pages+0x85/0xa00 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_ata_scsi_rw_xlat+0x10/0x10
May 05 20:13:44 cachyos-x8664 kernel: ? ata_qc_issue+0x138/0x270
May 05 20:13:44 cachyos-x8664 kernel: ? ata_scsi_queuecmd+0xe4/0x170
May 05 20:13:44 cachyos-x8664 kernel: ? select_task_rq_fair+0x15d/0x2880
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_f2fs_write_node_pages+0x10/0x10 [f2fs d2333fc34706e39c1a83271e8b382b177aae887d]
May 05 20:13:44 cachyos-x8664 kernel: do_writepages+0x8c/0x610
May 05 20:13:44 cachyos-x8664 kernel: ? blk_mq_do_dispatch_sched+0xa7/0x3c0
May 05 20:13:44 cachyos-x8664 kernel: ? _flat_send_IPI_mask+0x1f/0x30
May 05 20:13:44 cachyos-x8664 kernel: ? ttwu_queue_wakelist+0xef/0x110
May 05 20:13:44 cachyos-x8664 kernel: ? try_to_wake_up+0xd9/0xcb0
May 05 20:13:44 cachyos-x8664 kernel: __writeback_single_inode+0x3d/0x360
May 05 20:13:44 cachyos-x8664 kernel: writeback_sb_inodes+0x1ed/0x530
May 05 20:13:44 cachyos-x8664 kernel: ? __wake_up+0x8b/0xc0
May 05 20:13:44 cachyos-x8664 kernel: __writeback_inodes_wb+0x4c/0xf0
May 05 20:13:44 cachyos-x8664 kernel: wb_writeback+0x1fe/0x390
May 05 20:13:44 cachyos-x8664 kernel: wb_workfn+0x412/0x600
May 05 20:13:44 cachyos-x8664 kernel: ? __schedule+0x449/0x17b0
May 05 20:13:44 cachyos-x8664 kernel: process_one_work+0x24b/0x460
May 05 20:13:44 cachyos-x8664 kernel: worker_thread+0x55/0x4f0
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_worker_thread+0x10/0x10
May 05 20:13:44 cachyos-x8664 kernel: kthread+0xdb/0x110
May 05 20:13:44 cachyos-x8664 kernel: ? __pfx_
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #169 from Matias (lp61...@gmail.com) ---
(In reply to Jaegeuk Kim from comment #167)
> Matias, you saw the issue with the f2fs updates in 6.4-rc1, right? If so, we
> may need to consider [1] back..
>
> [1]
> https://github.com/jaegeuk/f2fs/commit/400dc2a4d7ec96a1fc4168652a0862e7edab3671

Since rc1 was released today, I'll try again to see if this issue comes back. Maybe it was just a regression in the 6.3.x kernel, but we'll see.
[f2fs-dev] [Bug 216050] f2fs_gc occupies 100% cpu
https://bugzilla.kernel.org/show_bug.cgi?id=216050 --- Comment #170 from Ryotaro Ko (pikate...@gmail.com) --- Since posting comment #158, I have been using the patched 6.2.10 kernel for a while. Initially it seemed stable, but in the last few days the problem has recurred - again f2fs_gc occupies 100% of the CPU core and blocks other kernel tasks. I am going to switch to the 6.4-rc1 kernel from now on, however I suspect that this bug has probably not been fully fixed. If some kind of patch or logs are needed for debugging, please let me know so I can help out.