On 2020/5/30 9:49, Jaegeuk Kim wrote:
> On 05/30, Chao Yu wrote:
>> On 2020/5/30 6:34, Jaegeuk Kim wrote:
>>> On 05/29, Chao Yu wrote:
>>>> Under heavy fsstress, we may trigger a panic while issuing discard,
>>>> because __check_sit_bitmap() detects that a discard command may erase
>>>> valid data blocks. The root cause is the race described in the stack
>>>> below: since we removed the lock (cp_rwsem) when flushing quota data,
>>>> quota data writeback may race with write_checkpoint(), causing an
>>>> inconsistency between the cached discard entry and the segment bitmap.
>>>>
>>>> - f2fs_write_checkpoint
>>>>  - block_operations
>>>>   - set_sbi_flag(sbi, SBI_QUOTA_SKIP_FLUSH)
>>>>  - f2fs_flush_sit_entries
>>>>   - add_discard_addrs
>>>>    - __set_bit_le(i, (void *)de->discard_map);
>>>>                                            - f2fs_write_data_pages
>>>>                                             - f2fs_write_single_data_page
>>>>                                               : inode is quota one,
>>>>                                                 cp_rwsem won't be locked
>>>>                                              - f2fs_do_write_data_page
>>>>                                               - f2fs_allocate_data_block
>>>>                                                - f2fs_wait_discard_bio
>>>>                                                  : discard entry has not
>>>>                                                    been added yet.
>>>>                                                - update_sit_entry
>>>>  - f2fs_clear_prefree_segments
>>>>   - f2fs_issue_discard
>>>>   : add discard entry
>>>>
>>>> This patch fixes this issue by reverting 435cbab95e39 ("f2fs: fix
>>>> quota_sync failure due to f2fs_lock_op").
>>>>
>>>> Fixes: 435cbab95e39 ("f2fs: fix quota_sync failure due to f2fs_lock_op")
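
FWIW, the window above is a plain check-then-act race: the checkpoint thread
snapshots the currently-invalid blocks into the discard map while the quota
writeback path, no longer serialized by cp_rwsem, makes one of those blocks
valid. A minimal userspace sketch of the same pattern (generic names and
pthreads only, not f2fs code) looks like:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NBLOCKS 8

static bool valid[NBLOCKS];    /* stands in for the SIT validity bitmap */
static bool discard[NBLOCKS];  /* stands in for de->discard_map */

static void *checkpoint_thread(void *arg)
{
	/* like add_discard_addrs(): queue every currently-invalid block */
	for (int i = 0; i < NBLOCKS; i++)
		if (!valid[i])
			discard[i] = true;
	return NULL;
}

static void *quota_write_thread(void *arg)
{
	/* like the quota writeback path with cp_rwsem skipped:
	 * block 3 becomes valid while the snapshot above is being taken */
	valid[3] = true;
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, checkpoint_thread, NULL);
	pthread_create(&b, NULL, quota_write_thread, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	for (int i = 0; i < NBLOCKS; i++)
		if (valid[i] && discard[i])
			printf("block %d is valid but queued for discard\n", i);
	return 0;
}

Depending on scheduling it may print nothing at all, which matches why the
real race only shows up under heavy fsstress.
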
>>>
>>> The previous patch fixes quota_sync getting EAGAIN all the time.
>>> How about this? It seems to work for the fsstress test.
>>>
> 
> Then this?
> 
> ---
>  fs/f2fs/segment.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index ebbadde6cbced..ed11dcf2d69ed 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -3107,6 +3107,14 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
>               type = CURSEG_COLD_DATA;
>       }
>  
> +     /*
> +      * We need to wait for node_write to avoid block allocation during
> +      * checkpoint. This can only happen to quota writes which can cause
> +      * the below discard race condition.
> +      */
> +     if (IS_DATASEG(type))
> +             down_write(&sbi->node_write);
> +
>       down_read(&SM_I(sbi)->curseg_lock);
>  
>       mutex_lock(&curseg->curseg_mutex);
> @@ -3174,6 +3182,9 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,

Minor thing: unlock order.

if (IS_DATASEG(type))
        up_write(&sbi->node_write);
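
i.e. moved up so that the releases mirror the reverse of the acquisition
order, since pin_sem (when held) is taken before the new node_write; the
tail would then look something like this (just a sketch):

	if (IS_DATASEG(type))
		up_write(&sbi->node_write);

	if (put_pin_sem)
		up_read(&sbi->pin_sem);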

Could you merge the diff into this patch?

>  
>       if (put_pin_sem)
>               up_read(&sbi->pin_sem);
> +
> +     if (IS_DATASEG(type))
> +             up_write(&sbi->node_write);
>  }
>  
>  static void update_device_state(struct f2fs_io_info *fio)
> 
