On 2020/7/10 18:07, Sahitya Tummala wrote:
> On Fri, Jul 10, 2020 at 04:40:19PM +0800, Chao Yu wrote:
>> On 2020/7/10 11:39, Sahitya Tummala wrote:
>>> Hi Chao,
>>>
>>> On Fri, Jul 10, 2020 at 10:54:13AM +0800, Chao Yu wrote:
>>>> Hi Sahitya,
>>>>
>>>> It looks block plug has already been removed by Jaegeuk with
>>>> below commit:
>>>>
>>>> commit 1f5f11a3c41e2b23288b2769435a00f74e02496b
>>>> Author: Jaegeuk Kim <jaeg...@kernel.org>
>>>> Date:   Fri May 8 12:25:45 2020 -0700
>>>>
>>>>     f2fs: remove blk_plugging in block_operations
>>>>
>>>>     blk_plugging doesn't seem to give any benefit.
>>>>
>>>> How about backporting this patch?
>>>
>>> Yes, I have noticed that patch. But we have nested pluglists in
>>> the block_operations path. Hence, I was not sure if that patch alone
>>> can help.
>>> 1. At the start of  block_operations
>>> 2. Inside __f2fs_write_data_pages() that gets called from
>>> f2fs_sync_dirty_inodes()->filemap_fdatawrite()
>>
>> Would be more safe to call io_schedule() in f2fs_sync_dirty_inodes()?
>>
>> - io_schedule()
>>  - schedule()
>>   - sched_submit_work()
>>    - blk_schedule_flush_plug()
>>     - blk_flush_plug_list()
>>
> 
> With Jaegeuk's patch, we will only have the plug list in CP path - inside
> __f2fs_write_data_pages(), which can now be flushed as its not nested
> anymore.
> 
>         blk_start_plug(&plug);
>         ret = f2fs_write_cache_pages(mapping, wbc, io_type);
>         blk_finish_plug(&plug);
> 
> Earlier this blk_finish_plug() will not flush the plug list as the bios

Oh, correct, because blk_finish_plug() will check caller passed &plug and
plug assigned in current structure...

void blk_finish_plug(struct blk_plug *plug)
{
        if (plug != current->plug)
                return;
        blk_flush_plug_list(plug, false);

        current->plug = NULL;
}

> are attached to the outer plug from block_operations. So I think Jaegeuk's
> patch alone also can help this issue.
> 
>>>
>>> Do you know the possible path for this issue scenario to happen?> Where 
>>> does in the CP path before even f2fs_sync_node_pages() is done, the
>>> node pages cab be submitted for io and get attached to CP plug list?
>>
>> Maybe:
>>
>> writeback_inodes_wb
>>  blk_start_plug                     ---plugged
>>   __writeback_inodes_wb
>>    writeback_sb_inodes
>>     __writeback_single_inode
>>      do_writepages
>>       f2fs_write_node_pages
>>        blk_start_plug               ---plugged
>>        f2fs_sync_node_pages
>>         __write_node_page(do_balance=true)  --submit node page to plug
>>          f2fs_balance_fs
>>           f2fs_balance_fs_bg
>>            f2fs_sync_fs
>>             f2fs_write_checkpoint
>>           or
>>           f2fs_gc
>>            f2fs_write_checkpoint
>>        blk_start_plug
>>  blk_finish_plug
>>
> 
> Hmmm, but from ramdumps the CP thread stack is shown as below.
> 
> f2fs_sync_file
> f2fs_do_sync_file
> f2fs_sync_fs
> f2fs_write_checkpoint

Alright, then, I don't think there is another plug in block_operation()'s
caller.

Thanks,

> 
> Thanks,
> 
>> Thanks,
>>
>>>
>>> Thanks,
>>>
>>>>
>>>> On 2020/7/10 10:30, Sahitya Tummala wrote:
>>>>> Hi Chao, Jaegeuk,
>>>>>
>>>>> I have received an issue report that indicates that system is stuck
>>>>> on IO due to f2fs checkpoint and writeback stuck waiting on each other
>>>>> as explained below.
>>>>>
>>>>> WB thread -
>>>>> ----------
>>>>>
>>>>> io_schedule
>>>>> wait_on_page_bit
>>>>> f2fs_wait_on_page_writeback -> It is waiting for node
>>>>>                   node page writeback whose bio is in the
>>>>>                   plug list of CP thread below.
>>>>> f2fs_update_data_blkaddr
>>>>> f2fs_outplace_write_data
>>>>> f2fs_do_write_data_page
>>>>> __write_data_page
>>>>> __f2fs_write_data_pages
>>>>> f2fs_write_data_pages
>>>>> do_writepages
>>>>>
>>>>> CP thread -
>>>>> -----------
>>>>>
>>>>> __f2fs_write_data_pages -> It is for the same inode above that is under 
>>>>> WB (which
>>>>>   is waiting for node page writeback). In this context, there is nothing 
>>>>> to
>>>>>   be written as the data is already under WB. 
>>>>> filemap_fdatawrite
>>>>> f2fs_sync_dirty_inodes -> It just loops here in f2fs_sync_dirty_inodes() 
>>>>> until
>>>>>                   f2fs_remove_dirty_inode() has been done by the WB 
>>>>> thread above.
>>>>> block_operations
>>>>> f2fs_write_checkpoint
>>>>>
>>>>> The CP thread somehow has the node page bio in its plug list that cannot 
>>>>> be submitted 
>>>>> until end of block_operations() and CP thread is blocked on WB of an 
>>>>> inode who is again
>>>>> waiting for io pending in CP plug list. Both the stacks are stuck on for 
>>>>> each other.
>>>>>
>>>>> The below patch helped to solve the issue, please review and suggest if 
>>>>> this seems to 
>>>>> be okay. Since anyways we are doing cond_resched(), I thought it will be 
>>>>> good to flush
>>>>> the plug list as well (in this issue case, it will loop for the same 
>>>>> inode again and again).
>>>>>
>>>>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
>>>>> index e460d90..152df48 100644
>>>>> --- a/fs/f2fs/checkpoint.c
>>>>> +++ b/fs/f2fs/checkpoint.c
>>>>> @@ -1071,10 +1071,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info 
>>>>> *sbi, enum inode_type type)
>>>>>
>>>>>                 iput(inode);
>>>>>                 /* We need to give cpu to another writers. */
>>>>> -               if (ino == cur_ino)
>>>>> +               if (ino == cur_ino) {
>>>>> +                       blk_flush_plug(current);
>>>>>                         cond_resched();
>>>>> -               else
>>>>> +                } else {
>>>>>                         ino = cur_ino;
>>>>> +                }
>>>>>         } else {
>>>>>                 /*
>>>>>                  * We should submit bio, since it exists several
>>>>>
>>>>> Thanks,
>>>>>
>>>
> 


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Reply via email to