On Tue, Nov 27, 2018 at 7:22 PM Josef Bacik <jo...@toxicpanda.com> wrote:
>
> On Fri, Nov 23, 2018 at 04:59:32PM +0000, Filipe Manana wrote:
> > On Thu, Nov 22, 2018 at 12:35 AM Josef Bacik <jo...@toxicpanda.com> wrote:
> > >
> > > I noticed in a giant dbench run that we spent a lot of time on lock
> > > contention while running transaction commit.  This is because dbench
> > > results in a lot of fsync()'s that do a btrfs_transaction_commit(), and
> > > they all run the delayed refs first thing, so they all contend with
> > > each other.  This leads to seconds of 0 throughput.  Change this to only
> > > run the delayed refs if we're the ones committing the transaction.  This
> > > makes the latency go away and we get no more lock contention.
> >
> > Can you share the following in the changelog please?
> >
> > 1) How did you ran dbench (parameters, config).
> >
> > 2) What results did you get before and after this change. So that we all get
> >     an idea of how good the impact is.
> >
> > While the reduced contention makes all sense and seems valid, I'm not
> > sure this is always a win.
> > It certainly is when multiple tasks are calling
> > btrfs_commit_transaction() simultaneously, but,
> > what about when only one does it?
> >
> > By running all delayed references inside the critical section of the
> > transaction commit
> > (state == TRANS_STATE_COMMIT_START), instead of running most of them
> > outside/before,
> > we will be blocking for a longer a time other tasks calling
> > btrfs_start_transaction() (which is used
> > a lot - creating files, unlinking files, adding links, etc, and even fsync).
> >
> > Won't there by any other types of workload and tests other then dbench
> > that can get increased
> > latency and/or smaller throughput?
> >
> > I find that sort of information useful to have in the changelog. If
> > you verified that or you think
> > it's irrelevant to measure/consider, it would be great to have it
> > mentioned in the changelog
> > (and explained).
> >
>
> Yeah I thought about the delayed refs being run in the critical section now,
> that's not awesome.  I'll drop this for now, I think just having a mutex 
> around
> running delayed refs will be good enough, since we want people who care about
> flushing delayed refs to wait around for that to finish happening.  Thanks,

Well, I think we can have a solution that doesn't bring such trade-off
nor introducing a mutex.
We could do like what is currently done for writing space caches, to
make sure only the first task
calling commit transaction does the work and all others do nothing
except waiting for the commit to finish:

btrfs_commit_transaction()
   if (!test_and_set_bit(BTRFS_TRANS_COMMIT_START, &cur_trans->flags)) {
       run delayed refs before entering critical section
   }

thanks

>
> Josef



-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

Reply via email to