Re: [PATCH v5 2/8] btrfs: only let one thread pre-flush delayed refs in commit

Nikolay Borisov Tue, 12 Jan 2021 01:18:45 -0800

On 11.01.21 г. 23:50 ч., David Sterba wrote:
> On Mon, Jan 11, 2021 at 10:33:42AM +0200, Nikolay Borisov wrote:
>> On 8.01.21 г. 18:01 ч., David Sterba wrote:
>>> On Fri, Dec 18, 2020 at 02:24:20PM -0500, Josef Bacik wrote:
>>>> @@ -2043,23 +2043,22 @@ int btrfs_commit_transaction(struct 
>>>> btrfs_trans_handle *trans)
>>>>    btrfs_trans_release_metadata(trans);
>>>>    trans->block_rsv = NULL;
>>>>  
>>>> -  /* make a pass through all the delayed refs we have so far
>>>> -   * any runnings procs may add more while we are here
>>>> -   */
>>>> -  ret = btrfs_run_delayed_refs(trans, 0);
>>>> -  if (ret) {
>>>> -          btrfs_end_transaction(trans);
>>>> -          return ret;
>>>> -  }
>>>> -
>>>> -  cur_trans = trans->transaction;
>>>> -
>>>>    /*
>>>> -   * set the flushing flag so procs in this transaction have to
>>>> -   * start sending their work down.
>>>> +   * We only want one transaction commit doing the flushing so we do not
>>>> +   * waste a bunch of time on lock contention on the extent root node.
>>>>     */
>>>> -  cur_trans->delayed_refs.flushing = 1;
>>>> -  smp_wmb();
>>>
>>> This barrier obviously separates the flushing = 1 and the rest of the
>>> code, now implemented as test_and_set_bit, which implies full barrier.
>>>
>>> However, hunk in btrfs_should_end_transaction removes the barrier and
>>> I'm not sure whether this is correct:
>>>
>>> -   smp_mb();
>>>     if (cur_trans->state >= TRANS_STATE_COMMIT_START ||
>>> -       cur_trans->delayed_refs.flushing)
>>> +       test_bit(BTRFS_DELAYED_REFS_FLUSHING,
>>> +                &cur_trans->delayed_refs.flags))
>>>             return true;
>>>
>>> This is never called under locks so we don't have complete
>>> synchronization of neither the transaction state nor the flushing bit.
>>> btrfs_should_end_transaction is merely a hint and not called in critical
>>> places so we could probably afford to keep it without a barrier, or keep
>>> it with comment(s).
>>
>> I think the point is moot in this case, because the test_bit either sees
>> the flag or it doesn't. It's not possible for the flag to be set AND
>> should_end_transaction return false that would be gross violation of
>> program correctness.
> 
> So that's for the flushing part, but what about cur_trans->state?

Looking at the code, the barrier was there to order the publishing of
the delayed_ref.flushing (now replaced by the bit flag) against
surrounding code.

So independently of this patch, let's reason about trans state. In
should_end_transaction it's read without holding any locks. (U)

It's modified in btrfs_cleanup_transaction without holding the
fs_info->trans_lock (U), but the STATE_ERROR flag is going to be set.

set in cleanup_transaction under fs_info->trans_lock (L)
set in btrfs_commit_trans to COMMIT_START under fs_info->trans_lock.(L)
set in btrfs_commit_trans to COMMIT_DOING under fs_info->trans_lock.(L)
set in btrfs_commit_trans to COMMIT_UNBLOCK under fs_info->trans_lock.(L)

set in btrfs_commit_trans to COMMIT_COMPLETED without locks but at this
point the transaction is finished and fs_info->running_trans is NULL (U
but irrelevant).

So by the looks of it we can have a concurrent READ race with a Write,
due to reads not taking a lock. In this case what we want to ensure is
we either see new or old state. I consulted with Will Deacon and he said
that in such a case we'd want to annotate the accesses to ->state with
(READ|WRITE)_ONCE so as to avoid a theoretical tear, in this case I
don't think this could happen but I imagine at some point kcsan would
flag such an access as racy (which it is).

>
Re: [PATCH v5 2/8] btrfs: only let one thread pre-flush delayed refs in commit

Reply via email to