On Tue, Nov 27, 2018 at 07:43:39PM +0000, Filipe Manana wrote:
> On Tue, Nov 27, 2018 at 7:22 PM Josef Bacik <jo...@toxicpanda.com> wrote:
> >
> > On Fri, Nov 23, 2018 at 04:59:32PM +0000, Filipe Manana wrote:
> > > On Thu, Nov 22, 2018 at 12:35 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> > > >
> > > > I noticed in a giant dbench run that we spent a lot of time on lock
> > > > contention while running transaction commit.  This is because dbench
> > > > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > > > they all run the delayed refs first thing, so they all contend with
> > > > each other.  This leads to seconds of 0 throughput.  Change this to only
> > > > run the delayed refs if we're the ones committing the transaction.  This
> > > > makes the latency go away and we get no more lock contention.
> > >
> > > Can you share the following in the changelog please?
> > >
> > > 1) How did you ran dbench (parameters, config).
> > >
> > > 2) What results did you get before and after this change. So that we all 
> > > get
> > >     an idea of how good the impact is.
> > >
> > > While the reduced contention makes all sense and seems valid, I'm not
> > > sure this is always a win.
> > > It certainly is when multiple tasks are calling
> > > btrfs_commit_transaction() simultaneously, but,
> > > what about when only one does it?
> > >
> > > By running all delayed references inside the critical section of the
> > > transaction commit
> > > (state == TRANS_STATE_COMMIT_START), instead of running most of them
> > > outside/before,
> > > we will be blocking for a longer a time other tasks calling
> > > btrfs_start_transaction() (which is used
> > > a lot - creating files, unlinking files, adding links, etc, and even 
> > > fsync).
> > >
> > > Won't there by any other types of workload and tests other then dbench
> > > that can get increased
> > > latency and/or smaller throughput?
> > >
> > > I find that sort of information useful to have in the changelog. If
> > > you verified that or you think
> > > it's irrelevant to measure/consider, it would be great to have it
> > > mentioned in the changelog
> > > (and explained).
> > >
> >
> > Yeah I thought about the delayed refs being run in the critical section now,
> > that's not awesome.  I'll drop this for now, I think just having a mutex 
> > around
> > running delayed refs will be good enough, since we want people who care 
> > about
> > flushing delayed refs to wait around for that to finish happening.  Thanks,
> 
> Well, I think we can have a solution that doesn't bring such trade-off
> nor introducing a mutex.
> We could do like what is currently done for writing space caches, to
> make sure only the first task
> calling commit transaction does the work and all others do nothing
> except waiting for the commit to finish:
> 
> btrfs_commit_transaction()
>    if (!test_and_set_bit(BTRFS_TRANS_COMMIT_START, &cur_trans->flags)) {
>        run delayed refs before entering critical section
>    }
> 

That was my first inclination but then one of the other committers goes past
this and gets into the start TRANS_STATE_COMMIT_START place and has to wait for
the guy running the delayed refs to finish before moving forward, essentially
extending the time in the critical section for the same reason.  The mutex means
that if 9000 people try to commit the transaction, 1 guy gets to run the delayed
refs and everybody else waits for him to finish, and in the meantime the
critical section is small.  Thanks,

Josef

Reply via email to