On 21.02.19 г. 10:22 ч., Qu Wenruo wrote:
> There are a lot of error reports complaining about transid error in the
> mail list.
> 
> Under most case, the on-disk transid is lower than expected transid.
> This may indicate that some tree blocks are not written back to disk
> before writing super blocks.
> 
> This patch will add a safe net for developers, by calling
> btrfs_write_and_wait_transaction() before setting transaction unblocked
> and double check btree_inode and dirty_pages io_tree, to ensure no tree
> blocks are still dirty or under writeback.
> 
> Signed-off-by: Qu Wenruo <w...@suse.com>
> ---
> The reason for RFC is, I'm not sure why we currently call
> btrfs_write_and_wait_transaction() after setting transaction UNBLOCKED.
> 
> It looks like an optimization, but I don't see much performance
> difference during regression test.
> 
> I hope to move the call before we unblock transaction so we can do such
> sanity check for all builds and hope to catch some clue of transid
> error.

Even current code ensures that all allocated blocks in the current
transaction (which is what all those EXTENT_DIRTY extents in the
dirty_pages tree ) are written before the new superblocks are.

Slight offtopic: In fact instead of playing games with the flags and
having an extent_io_tree called dirty_pages o_O it can be replaced with
a simple linked list that holds all newly allocated buffers so writing
all such buffers will result in simply iterating the list.

In any case this patch is buggy, see below on why

> ---
>  fs/btrfs/transaction.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
> index 4ec2b660d014..30b7ed0bf873 100644
> --- a/fs/btrfs/transaction.c
> +++ b/fs/btrfs/transaction.c
> @@ -2213,6 +2213,44 @@ int btrfs_commit_transaction(struct btrfs_trans_handle 
> *trans)
>  
>       btrfs_trans_release_chunk_metadata(trans);
>  
> +     /* Last safenet or developer to catch any unwritten tree blocks */
> +     if (IS_ENABLED(CONFIG_BTRFS_DEBUG)) {
> +             u64 found_start = 0;
> +             u64 found_end = 0;
> +
> +             ret = btrfs_write_and_wait_transaction(trans);
> +             if (ret) {
> +                     btrfs_handle_fs_error(fs_info, ret,
> +                                           "Error while writing out 
> transaction");
> +                     mutex_unlock(&fs_info->tree_log_mutex);
> +                     goto scrub_continue;
> +             }
> +
> +             /* No dirty extent should exist in btree inode */
> +             ret = test_range_bit(&trans->transaction->dirty_pages, 0,
> +                             (u64)-1, EXTENT_DIRTY | EXTENT_WRITEBACK,

Why do you check EXTENT_WRITEBACK, AFAICS that flag is not currently
used in the code and should perhahps be deleted? I don't see anything
setting it, it's only being checked for (as part of EXTENT_IOBITS).

Additionally this check is pointless because
btrfs_write_and_wait_transaction calls clear_btree_io_tree which purges
the tree.

> +                             0, NULL);
> +             if (ret > 0) {
> +                     WARN(1,
> +             "dirty_pages not fully written back, start=%llu len=%llu\n",
> +                          found_start, found_end + 1 - found_start);
> +                     ret = -EUCLEAN;
> +                     mutex_unlock(&fs_info->tree_log_mutex);
> +                     goto scrub_continue;
> +             }
> +             ret = test_range_bit(&BTRFS_I(fs_info->btree_inode)->io_tree, 0,
> +                                  (u64)-1, EXTENT_DIRTY | EXTENT_WRITEBACK,
> +                                  0, NULL);
> +             if (ret > 0) {
> +                     WARN(1,
> +             "btree io_tree not fully written back, start=%llu len=%llu\n",
> +                          found_start, found_end + 1 - found_start);
> +                     ret = -EUCLEAN;
> +                     mutex_unlock(&fs_info->tree_log_mutex);
> +                     goto scrub_continue;
> +             }
> +     }
> +
>       spin_lock(&fs_info->trans_lock);
>       cur_trans->state = TRANS_STATE_UNBLOCKED;
>       fs_info->running_transaction = NULL;
> 

Reply via email to