Re: Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

I have tested this extreme case (no free PEBs left after creating the test
volumes) 100 times on different types of machines. The largest number of
attempts observed on each is shown below:

          x86_64   arm64
2-core    4        4
4-core    8        4
8-core    4        4

So, setting the number of attempts to 10 is fine. May I send another patch to
improve it? Planned revision:

--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -221,12 +221,12 @@ int ubi_wl_get_peb(struct ubi_device *ubi)
 
 	if (pool->used == pool->size) {
 		spin_unlock(&ubi->wl_lock);
-		if (retried) {
+		retried++;
+		if (retried == 10) {
 			ubi_err(ubi, "Unable to get a free PEB from user WL pool");
 			ret = -ENOSPC;
 			goto out;
 		}
-		retried = 1;
 		up_read(&ubi->fm_eba_sem);
 		ret = produce_free_peb(ubi);
 		if (ret < 0) {

-----Original Message-----
From: Richard Weinberger [mailto:rich...@nod.at]
Sent: August 1, 2019 17:40
To: chengzhihao
Cc: zhangyi (F); linux-mtd; linux-kernel
Subject: Re: Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

----- Original Message -----
>> Do you have numbers how many attempts were needed to get a free block?

> I tested it dozens of times. The biggest number of attempts I've ever
> had so far is 6. In most cases, it only takes 2 or 3 times.

So raising the retry count to, let's say, 10 would work too?
Having it unbounded feels dangerous because it may hide other problems.

Thanks,
//richard
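The bounded retry discussed in this message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct toy_pool, toy_refill() and toy_get_peb() are hypothetical stand-ins invented for illustration and are not kernel APIs; only the limit of 10 and the "increment, then compare" order are taken from the planned revision.

```c
#include <assert.h>
#include <errno.h>

#define MAX_RETRIES 10	/* bound proposed in the thread */

/* Hypothetical stand-in for the fastmap pool, not struct ubi_fm_pool. */
struct toy_pool {
	int available;		/* PEBs currently in the pool */
	int refills_left;	/* refills that can still produce a PEB */
	int stolen;		/* refills whose PEB a competing thread takes first */
};

/* Hypothetical refill step, loosely modeling produce_free_peb() plus a pool refill. */
static void toy_refill(struct toy_pool *p)
{
	if (p->refills_left <= 0)
		return;
	p->refills_left--;
	if (p->stolen > 0) {
		p->stolen--;	/* another writer consumed the new PEB */
		return;
	}
	p->available++;
}

/* Returns 0 on success or -ENOSPC once the retry bound is reached. */
static int toy_get_peb(struct toy_pool *p, int *attempts)
{
	int retried = 0;

	assert(p && attempts);
	for (;;) {
		if (p->available > 0) {
			p->available--;
			*attempts = retried;
			return 0;
		}
		/* Same order as the planned revision: count first, then check. */
		retried++;
		if (retried == MAX_RETRIES) {
			*attempts = retried;
			return -ENOSPC;
		}
		toy_refill(p);
	}
}
```

With this ordering the caller performs at most MAX_RETRIES - 1 refills before giving up, so a persistent failure to obtain a PEB is still reported instead of being hidden behind an endless loop.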
Re: Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

----- Original Message -----
>> Do you have numbers how many attempts were needed to get a free block?

> I tested it dozens of times. The biggest number of attempts I've ever had so
> far is 6. In most cases, it only takes 2 or 3 times.

So raising the retry count to, let's say, 10 would work too?
Having it unbounded feels dangerous because it may hide other problems.

Thanks,
//richard
Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

> You sent this patch three times, I guess your mail setup has issues? :-)

Sorry, I thought I hadn't sent the first two e-mails. (The Patchwork website
refreshes slowly.)

> Do you have numbers how many attempts were needed to get a free block?

I tested it dozens of times. The biggest number of attempts I've ever had so
far is 6. In most cases, it only takes 2 or 3 times.

-----Original Message-----
From: Richard Weinberger [mailto:rich...@nod.at]
Sent: August 1, 2019 17:21
To: chengzhihao
Cc: zhangyi (F); linux-mtd; linux-kernel
Subject: Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

----- Original Message -----
> From: "chengzhihao1"
> To: "richard", "yi zhang"
> CC: "linux-mtd", "linux-kernel"
> Sent: Thursday, August 1, 2019 11:13:20
> Subject: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number
> of attempts with polling while getting PEB

> I don't quite understand why a limited number of attempts have been
> made to get a free PEB in ubi_wl_get_peb (in fastmap-wl.c). I proposed
> this PATCH with reference to the implementation of ubi_wl_get_peb (in
> wl.c). As far as I know, getting PEB by polling probably won't fall into
> soft-lockup.
> ubi_update_fastmap may add new tasks (including erase task or wl
> task, wl tasks generally do not generate additional free PEBs) to
> ubi->works, and produce_free_peb will eventually complete all tasks in
> ubi->works or obtain a free PEB that can be filled into pool.

You sent this patch three times, I guess your mail setup has issues? :-)

This is one of the darkest corners of Fastmap where things get messy.
The number of retry attempts was limited to avoid a live lock.
I agree that allowing only one retry is a little too few.
With nandsim, a small nand and a fast PC you can hit that.

Do you have numbers how many attempts were needed to get a free block?

Thanks,
//richard
Re: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

----- Original Message -----
> From: "chengzhihao1"
> To: "richard", "yi zhang"
> CC: "linux-mtd", "linux-kernel"
> Sent: Thursday, August 1, 2019 11:13:20
> Subject: Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of
> attempts with polling while getting PEB

> I don't quite understand why a limited number of attempts have been made to
> get a free PEB in ubi_wl_get_peb (in fastmap-wl.c). I proposed this PATCH with
> reference to the implementation of ubi_wl_get_peb (in wl.c). As far as I know,
> getting PEB by polling probably won't fall into soft-lockup.
> ubi_update_fastmap may add new tasks (including erase task or wl task, wl
> tasks generally do not generate additional free PEBs) to ubi->works, and
> produce_free_peb will eventually complete all tasks in ubi->works or obtain a
> free PEB that can be filled into pool.

You sent this patch three times, I guess your mail setup has issues? :-)

This is one of the darkest corners of Fastmap where things get messy.
The number of retry attempts was limited to avoid a live lock.
I agree that allowing only one retry is a little too few.
With nandsim, a small nand and a fast PC you can hit that.

Do you have numbers how many attempts were needed to get a free block?

Thanks,
//richard
Re: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

I don't quite understand why a limited number of attempts have been made to
get a free PEB in ubi_wl_get_peb (in fastmap-wl.c). I proposed this PATCH with
reference to the implementation of ubi_wl_get_peb (in wl.c). As far as I know,
getting a PEB by polling probably won't fall into soft-lockup.
ubi_update_fastmap may add new tasks (including erase tasks or wl tasks; wl
tasks generally do not generate additional free PEBs) to ubi->works, and
produce_free_peb will eventually complete all tasks in ubi->works or obtain a
free PEB that can be filled into the pool.

-----Original Message-----
From: chengzhihao
Sent: August 1, 2019 17:18
To: rich...@nod.at; zhangyi (F)
Cc: linux-...@lists.infradead.org; linux-kernel@vger.kernel.org; chengzhihao
Subject: [PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

Running the stress test io_paral (a UBI stress test in mtd-utils) on a UBI
device with few PEBs (fastmap enabled) may cause ENOSPC errors and make the
UBI device read-only, even though there are still free PEBs on the UBI device.

This problem can be easily reproduced by performing the following steps on a
2-core machine:
  $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80
  $ modprobe ubi mtd="0,0" fm_autoconvert
  $ ./io_paral /dev/ubi0

We may see the following output:
(output)
  [io_paral] update_volume():105: function write() failed with error 30 (Read-only file system)
  [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2
  [io_paral] update_volume():109: update: 97088 bytes
  [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device)
  [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1

(dmesg)
  ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool
  ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode
  ubi0 error: ubi_io_write [ubi]: read-only mode
  CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-1-g5986cd0 #9
  ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
  Call Trace:
    dump_stack+0x85/0xba
    ubi_eba_write_leb+0xa1e/0xa40 [ubi]
    vol_cdev_write+0x307/0x520 [ubi]
  ubi0 error: vol_cdev_write [ubi]: cannot accept more 380 bytes of data, error -30
    vfs_write+0xfa/0x280
    ksys_pwrite64+0xc5/0xe0
    __x64_sys_pwrite64+0x22/0x30
    do_syscall_64+0xbf/0x440

In function ubi_wl_get_peb, the operation of filling the pool
(ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is
not atomic. After thread A fills the pool with free PEBs, a free PEB may be
taken away by thread B. When thread A checks the condition again, it is still
not satisfied. At this time, there may still be free PEBs on UBI that can be
filled into the pool. So, ubi_wl_get_peb (in fastmap-wl.c) should obtain a
free PEB by polling. The polling exit condition is that there are no free PEBs
on UBI, no free PEBs in the pool, and ubi->works_count is 0.

Signed-off-by: Zhihao Cheng
---
 drivers/mtd/ubi/fastmap-wl.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index d9e2e3a..c5512cf 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -196,7 +196,7 @@ static int produce_free_peb(struct ubi_device *ubi)
  */
 int ubi_wl_get_peb(struct ubi_device *ubi)
 {
-	int ret, retried = 0;
+	int ret;
 	struct ubi_fm_pool *pool = &ubi->fm_pool;
 	struct ubi_fm_pool *wl_pool = &ubi->fm_wl_pool;
 
@@ -220,13 +220,14 @@ int ubi_wl_get_peb(struct ubi_device *ubi)
 	}
 
 	if (pool->used == pool->size) {
-		spin_unlock(&ubi->wl_lock);
-		if (retried) {
+		if (!ubi->free.rb_node && ubi->works_count == 0) {
 			ubi_err(ubi, "Unable to get a free PEB from user WL pool");
+			ubi_assert(list_empty(&ubi->works));
+			spin_unlock(&ubi->wl_lock);
 			ret = -ENOSPC;
 			goto out;
 		}
-		retried = 1;
+		spin_unlock(&ubi->wl_lock);
 		up_read(&ubi->fm_eba_sem);
 		ret = produce_free_peb(ubi);
 		if (ret < 0) {
-- 
2.7.4
[PATCH RFC] ubi: ubi_wl_get_peb: Replace a limited number of attempts with polling while getting PEB

Running the stress test io_paral (a UBI stress test in mtd-utils) on a UBI
device with few PEBs (fastmap enabled) may cause ENOSPC errors and make the
UBI device read-only, even though there are still free PEBs on the UBI device.

This problem can be easily reproduced by performing the following steps on a
2-core machine:
  $ modprobe nandsim first_id_byte=0x20 second_id_byte=0x33 parts=80
  $ modprobe ubi mtd="0,0" fm_autoconvert
  $ ./io_paral /dev/ubi0

We may see the following output:
(output)
  [io_paral] update_volume():105: function write() failed with error 30 (Read-only file system)
  [io_paral] update_volume():108: failed to write 380 bytes at offset 95920 of volume 2
  [io_paral] update_volume():109: update: 97088 bytes
  [io_paral] write_thread():227: function pwrite() failed with error 28 (No space left on device)
  [io_paral] write_thread():229: cannot write 15872 bytes to offs 31744, wrote -1

(dmesg)
  ubi0 error: ubi_wl_get_peb [ubi]: Unable to get a free PEB from user WL pool
  ubi0 warning: ubi_eba_write_leb [ubi]: switch to read-only mode
  ubi0 error: ubi_io_write [ubi]: read-only mode
  CPU: 0 PID: 2027 Comm: io_paral Not tainted 5.3.0-rc2-1-g5986cd0 #9
  ubi0 warning: try_write_vid_and_data [ubi]: failed to write VID header to LEB 2:5, PEB 18
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
  Call Trace:
    dump_stack+0x85/0xba
    ubi_eba_write_leb+0xa1e/0xa40 [ubi]
    vol_cdev_write+0x307/0x520 [ubi]
  ubi0 error: vol_cdev_write [ubi]: cannot accept more 380 bytes of data, error -30
    vfs_write+0xfa/0x280
    ksys_pwrite64+0xc5/0xe0
    __x64_sys_pwrite64+0x22/0x30
    do_syscall_64+0xbf/0x440

In function ubi_wl_get_peb, the operation of filling the pool
(ubi_update_fastmap) with free PEBs and fetching a free PEB from the pool is
not atomic. After thread A fills the pool with free PEBs, a free PEB may be
taken away by thread B. When thread A checks the condition again, it is still
not satisfied. At this time, there may still be free PEBs on UBI that can be
filled into the pool. So, ubi_wl_get_peb (in fastmap-wl.c) should obtain a
free PEB by polling. The polling exit condition is that there are no free PEBs
on UBI, no free PEBs in the pool, and ubi->works_count is 0.

Signed-off-by: Zhihao Cheng
---
 drivers/mtd/ubi/fastmap-wl.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index d9e2e3a..c5512cf 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -196,7 +196,7 @@ static int produce_free_peb(struct ubi_device *ubi)
  */
 int ubi_wl_get_peb(struct ubi_device *ubi)
 {
-	int ret, retried = 0;
+	int ret;
 	struct ubi_fm_pool *pool = &ubi->fm_pool;
 	struct ubi_fm_pool *wl_pool = &ubi->fm_wl_pool;
 
@@ -220,13 +220,14 @@ int ubi_wl_get_peb(struct ubi_device *ubi)
 	}
 
 	if (pool->used == pool->size) {
-		spin_unlock(&ubi->wl_lock);
-		if (retried) {
+		if (!ubi->free.rb_node && ubi->works_count == 0) {
 			ubi_err(ubi, "Unable to get a free PEB from user WL pool");
+			ubi_assert(list_empty(&ubi->works));
+			spin_unlock(&ubi->wl_lock);
 			ret = -ENOSPC;
 			goto out;
 		}
-		retried = 1;
+		spin_unlock(&ubi->wl_lock);
 		up_read(&ubi->fm_eba_sem);
 		ret = produce_free_peb(ubi);
 		if (ret < 0) {
-- 
2.7.4
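The race and the polling exit condition described in this patch can be modeled in user space. The following is a rough sketch, not the kernel code: struct toy_ubi and the toy_* functions are hypothetical names made up for illustration, each pending work is assumed to yield exactly one free PEB, and the competing thread B is simulated by a deterministic "steals" counter rather than real concurrency.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state the RFC checks, not struct ubi_device. */
struct toy_ubi {
	int pool_pebs;		/* PEBs left in the fastmap pool */
	int free_pebs;		/* PEBs in the free tree (ubi->free in the patch) */
	int works_count;	/* pending works, each assumed to yield one free PEB */
};

/* Exit condition from the RFC: nothing in the pool, nothing free, nothing pending. */
static bool toy_out_of_space(const struct toy_ubi *u)
{
	return u->pool_pebs == 0 && u->free_pebs == 0 && u->works_count == 0;
}

/* One unit of progress: run a pending work if needed, then refill the pool. */
static void toy_produce_and_refill(struct toy_ubi *u)
{
	if (u->free_pebs == 0 && u->works_count > 0) {
		u->works_count--;
		u->free_pebs++;
	}
	u->pool_pebs += u->free_pebs;
	u->free_pebs = 0;
}

/* steals: how many times "thread B" empties the pool right after our refill. */
static int toy_get_peb_polling(struct toy_ubi *u, int steals, int *rounds)
{
	assert(u && rounds);
	*rounds = 0;
	for (;;) {
		if (u->pool_pebs > 0) {
			u->pool_pebs--;
			return 0;
		}
		if (toy_out_of_space(u))
			return -ENOSPC;
		toy_produce_and_refill(u);
		(*rounds)++;
		if (steals > 0) {
			steals--;
			u->pool_pebs = 0;	/* B took what A just refilled */
		}
	}
}
```

In this model every round strictly reduces free_pebs + works_count, so the loop always terminates. Whether the real ubi->works list shrinks the same way is exactly the live-lock question raised later in the thread.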