I noticed in a giant dbench run that we spent a lot of time on lock contention while running transaction commit. This is because dbench results in a lot of fsync()'s that do a btrfs_transaction_commit(), and they all run the delayed refs first thing, so they all contend with each other. This leads to seconds of 0 throughput. Change this to only run the delayed refs if we're the ones committing the transaction. This makes the latency go away and we get no more lock contention.
Reviewed-by: Omar Sandoval <osan...@fb.com> Signed-off-by: Josef Bacik <jo...@toxicpanda.com> --- fs/btrfs/transaction.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index a0f19ca0bd6c..39a2bddb0b29 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1925,15 +1925,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) btrfs_trans_release_metadata(trans); trans->block_rsv = NULL; - /* make a pass through all the delayed refs we have so far - * any runnings procs may add more while we are here - */ - ret = btrfs_run_delayed_refs(trans, 0); - if (ret) { - btrfs_end_transaction(trans); - return ret; - } - cur_trans = trans->transaction; /* @@ -1946,12 +1937,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) if (!list_empty(&trans->new_bgs)) btrfs_create_pending_block_groups(trans); - ret = btrfs_run_delayed_refs(trans, 0); - if (ret) { - btrfs_end_transaction(trans); - return ret; - } - if (!test_bit(BTRFS_TRANS_DIRTY_BG_RUN, &cur_trans->flags)) { int run_it = 0; @@ -2022,6 +2007,15 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans) spin_unlock(&fs_info->trans_lock); } + /* + * We are now the only one in the commit area, we can run delayed refs + * without hitting a bunch of lock contention from a lot of people + * trying to commit the transaction at once. + */ + ret = btrfs_run_delayed_refs(trans, 0); + if (ret) + goto cleanup_transaction; + extwriter_counter_dec(cur_trans, trans->type); ret = btrfs_start_delalloc_flush(fs_info); -- 2.14.3