+1 to Jacques, it'd be nice if we had the core changes easily
viewable/reviewable. Also, would it make sense at this point to split the
change set into smaller patches, since there seems to be more work to do now?


H+

On Fri, Jul 31, 2015 at 7:06 PM, Jacques Nadeau <jacq...@dremio.com> wrote:

> That sounds frustrating.
>
> I agree that we need to get this merged.  The old allocator is sloppy about
> accounting at best.  Let's work together on trying to come up with a
> solution. Can you point us at the current branch so other people can
> provide some brainstorming?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Thu, Jul 30, 2015 at 4:00 PM, Chris Westin <chriswesti...@gmail.com>
> wrote:
>
> > Short version: I'll call it quits on the merge moratorium for now. Thank
> > you to everyone for participating. Merge away.
> >
> > In the precommit suite, one query fails with an illegal reference
> > counting exception from the external sort, and Steven has found that
> > for me. This is the closest I've ever gotten. On future attempts to
> > commit after rebasing, I'm going to be counting on other file owners a
> > lot more to get through that quickly, rather than trying to find all
> > the newly introduced problems myself.
> >
> > Long version: when I run the performance suite, the results with the
> > non-locking version of the allocator are terrible. Worse than the
> > locking implementation of the allocator (I still have both on separate
> > branches). When we ran this on the locking implementation, there was
> > roughly a 20% performance degradation, and consensus was that this was
> > too much to accept the change. The locking implementation uses a
> > single lock for all allocators. (Yes, I know that sounds heavy-handed,
> > but it wasn't the first choice. There was a prior implementation that
> > used a lock per allocator, but that one got deadlocks all the time
> > because it couldn't ensure consistent lock acquisition orders when
> > allocators went to their parents to get more space, combined with
> > allocators locking each other to transfer or share buffer ownership.)
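> >
> > To make the lock-ordering problem concrete, here is a minimal, purely
> > illustrative sketch (the class and method names are hypothetical, not
> > the old allocator's actual API) of how the two code paths can take the
> > same pair of locks in opposite orders:
> >
> >   // Hypothetical per-allocator locking, for illustration only.
> >   class LockedAllocator {
> >     final Object lock = new Object();
> >     final LockedAllocator parent;
> >
> >     LockedAllocator(LockedAllocator parent) { this.parent = parent; }
> >
> >     // Path 1: a child asks its parent for more space.
> >     // Lock order: child, then parent.
> >     void reserveFromParent(long bytes) {
> >       synchronized (lock) {
> >         synchronized (parent.lock) {
> >           // move 'bytes' of budget from the parent to this child
> >         }
> >       }
> >     }
> >
> >     // Path 2: this allocator transfers buffer ownership to another one
> >     // (e.g. a parent sharing a buffer with a child).
> >     // Lock order: this allocator, then the target -- which can be the
> >     // reverse of path 1, so two threads can deadlock.
> >     void transferTo(LockedAllocator target, long bytes) {
> >       synchronized (lock) {
> >         synchronized (target.lock) {
> >           // move 'bytes' of accounting from this allocator to target
> >         }
> >       }
> >     }
> >   }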
> >
> > I thought I'd solve this with a non-locking implementation. In this
> > version, the variables that are used to track the state of an
> > allocator regarding its available space, and how it is used, are kept
> > in a small inner class; the allocator has an AtomicReference to that.
> > A space allocation consists of getting that reference, making a clone
> > of it, and then making all the necessary changes to the clone. To
> > commit the space transaction, I try to swap it in with
> > AtomicReference.compareAndSet(). If that fails, the transaction is
> > retried. I expected that there would be no failures with leaf
> > allocators, because they're only used by the thread doing the
> > fragment's work. Only the root should have seen contention. But the
> > performance cluster test showed the performance for this
> > implementation to be five times worse than the current master (yes 5x,
> > not just 20% worse like the locking implementation). I've done some
> > quick sanity checks today, but don't see anything obviously silly. I
> > will investigate a little further -- I've already come up with a
> > couple of potential issues, but I need to do a couple of experiments
> > with it over the next few hours (which won't leave enough time to do
> > the merge by the 48 hour deadline).
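> >
> > For anyone who wants to picture the scheme, here's a minimal sketch of
> > the clone-and-compareAndSet idea (illustrative only; the class and
> > field names are made up, not the actual code in the patch):
> >
> >   import java.util.concurrent.atomic.AtomicReference;
> >
> >   class NonLockingAllocator {
> >     // Immutable snapshot of the allocator's accounting state.
> >     static final class State {
> >       final long allocated;
> >       final long limit;
> >       State(long allocated, long limit) {
> >         this.allocated = allocated;
> >         this.limit = limit;
> >       }
> >     }
> >
> >     private final AtomicReference<State> state =
> >         new AtomicReference<>(new State(0, 1L << 30));
> >
> >     // Reserve space: read the current state, build a modified copy,
> >     // and try to swap it in; if another thread changed the state
> >     // first, compareAndSet fails and the "transaction" is retried.
> >     boolean reserve(long bytes) {
> >       while (true) {
> >         State current = state.get();
> >         if (current.allocated + bytes > current.limit) {
> >           return false; // over the limit; caller would go to the parent
> >         }
> >         State updated = new State(current.allocated + bytes, current.limit);
> >         if (state.compareAndSet(current, updated)) {
> >           return true; // committed
> >         }
> >         // lost the race: loop and retry against the new state
> >       }
> >     }
> >   }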
> >
> > If I can't overcome those issues, then I will at least go for
> > obtaining the root allocator from a factory, and set things up so that
> > the current and new allocator can co-exist, because the new one
> > definitely catches a lot more problems -- we should be running tests
> > with it on. Hopefully I can overcome the issues shortly, because I
> > think the accounting is much better (that's why it catches more
> > problems), and we need that in order to find our ongoing slow memory
> > leak.
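> >
> > Roughly what I have in mind for the co-existence piece, sketched for
> > illustration only (these class names are hypothetical stand-ins, not
> > the actual Drill classes):
> >
> >   // Hypothetical: both allocators behind one interface, chosen by a flag.
> >   interface Allocator {
> >     boolean reserve(long bytes);
> >   }
> >
> >   final class OldRootAllocator implements Allocator {
> >     // stand-in for the current implementation
> >     public boolean reserve(long bytes) { return true; }
> >   }
> >
> >   final class NewRootAllocator implements Allocator {
> >     // stand-in for the rewritten allocator with stricter accounting
> >     public boolean reserve(long bytes) { return true; }
> >   }
> >
> >   final class RootAllocatorFactory {
> >     static Allocator newRoot(boolean useNewAllocator) {
> >       return useNewAllocator ? new NewRootAllocator() : new OldRootAllocator();
> >     }
> >   }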
> >
> > On Wed, Jul 29, 2015 at 4:00 PM, Jacques Nadeau <jacq...@dremio.com>
> > wrote:
> >
> > > Makes sense.
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Wed, Jul 29, 2015 at 3:32 PM, Chris Westin <chriswesti...@gmail.com>
> > > wrote:
> > >
> > > > Ordinarily, I would agree. However, in this particular case, some
> > > > other folks wanted me closer to master so they could use my branch
> > > > to track down problems in new code. Also, the problems I was seeing
> > > > were in code I'm not familiar with, but there had been several
> > > > recent commits claiming to fix memory issues there. So I wanted to
> > > > see if the problems I was seeing had been taken care of. Sure
> > > > enough, my initial testing shows that the problems I was trying to
> > > > fix had already been fixed by others -- they went away after I
> > > > rebased. In this case, chasing master saved me from having to track
> > > > all of those down myself, and duplicating the work. I'm hoping that
> > > > there weren't any significant new ones introduced. Testing is
> > > > proceeding.
> > > >
> > > > On Wed, Jul 29, 2015 at 1:59 PM, Parth Chandra <par...@apache.org>
> > > > wrote:
> > > >
> > > > > I think the idea (some of the idea is mine I'm afraid) is to
> > > > > allow Chris to catch up and rebase, not to have it reviewed and
> > > > > merged in two days.
> > > > > At the moment the problem is that every time he rebases, some
> > > > > test breaks and while he's chasing that down, the master moves
> > > > > ahead.
> > > > > If we get this 2 day break, we can get close enough to master and
> > > > > share the changes (a pre-review perhaps).
> > > > > Also, agree that perhaps the patch could be split into smaller
> > > > > pieces. Not renaming the allocator would in fact save several
> > > > > files from being included in the patch.
> > > > >
> > > > >
> > > > > P
> > > > >
> > > > >
> > > > > On Wed, Jul 29, 2015 at 12:17 PM, Jacques Nadeau <jacq...@dremio.com>
> > > > > wrote:
> > > > >
> > > > > > In general, this type of request makes a lot of sense.  That
> > > > > > said, we should get to +1 before we hold master.
> > > > > >
> > > > > > The last changes that were posted on DRILL-1942 are more than
> > > > > > a month old.  The patch touched ~100 files and was thousands of
> > > > > > lines.  It had an issue that we identified.  Since then, you
> > > > > > exposed very little to the community on your progress.  It
> > > > > > seems unrealistic to provide a large update to this patch and
> > > > > > expect review and merge within 48 hours.  Had you been exposing
> > > > > > & discussing your work over the last month and the community
> > > > > > agreed that it was ready for merge, your ask here would seem
> > > > > > more feasible.
> > > > > >
> > > > > > My suggestion is to stop chasing master AND break your patch
> > > > > > into smaller pieces.  Looking at the old patch, the vast
> > > > > > majority of changes have very little to do with the new
> > > > > > allocator functionality.  For example, renaming the allocator
> > > > > > could be merged without changing the underlying implementation
> > > > > > (that would substantially reduce the patch size).
> > > > > >
> > > > > > For the allocator specifically, let's get the code right
> > > > > > first.  Then we can work as a community to minimize the effort
> > > > > > to get it merged.
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Jacques Nadeau
> > > > > > CTO and Co-Founder, Dremio
> > > > > >
> > > > > > On Wed, Jul 29, 2015 at 11:41 AM, Chris Westin <chriswesti...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I've got a large patch that includes a completely rewritten
> > > > > > > direct memory allocator (replaces TopLevelAllocator).
> > > > > > >
> > > > > > > The space accounting is much tighter than with the current
> > > > > > > implementation, and it catches a lot more problems than the
> > > > > > > current implementation does. It also fixes issues with
> > > > > > > accounting around the use of shared buffers, and buffer
> > > > > > > ownership transfer (used by the RPC layer to hand off buffers
> > > > > > > to fragments that will do work).
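> > > > > > >
> > > > > > > For the ownership-transfer accounting, a minimal illustrative
> > > > > > > sketch (hypothetical names, not the patch's actual API): when
> > > > > > > a buffer is handed off, its bytes stop being charged to the
> > > > > > > sender and start being charged to the receiver, so they are
> > > > > > > counted exactly once.
> > > > > > >
> > > > > > >   // Hypothetical accounting class, for illustration only.
> > > > > > >   final class Accountant {
> > > > > > >     long accountedBytes;
> > > > > > >     void add(long bytes)    { accountedBytes += bytes; }
> > > > > > >     void remove(long bytes) { accountedBytes -= bytes; }
> > > > > > >
> > > > > > >     // Hand a buffer's bytes off to another allocator's accounting.
> > > > > > >     void transferTo(Accountant receiver, long bytes) {
> > > > > > >       remove(bytes);       // the sender stops accounting for the buffer
> > > > > > >       receiver.add(bytes); // the receiver starts accounting for it
> > > > > > >     }
> > > > > > >   }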
> > > > > > >
> > > > > > > It's been an ongoing battle to get this in, because every
> > > > > > > time I get close, I rebase, and it finds more new problems
> > > > > > > (apparently introduced by other work done since my last
> > > > > > > rebase). These take time to track down and fix, because
> > > > > > > they're often in areas of the code I don't know.
> > > > > > >
> > > > > > > It looks like I'm very close right now. I rebased against
> > > > > > > apache/master on Friday. All the unit tests passed. All of
> > > > > > > our internal tests passed except for one query, which takes
> > > > > > > an IllegalReferenceCountException (it looks like a DrillBuf
> > > > > > > is being released one more time than it should be).
> > > > > > >
> > > > > > > So, in order to keep the gap from getting wide again (it
> > > > > > > looks like I'm already a couple of commits behind, but
> > > > > > > hopefully they don't introduce more issues), I'm asking that
> > > > > > > folks hold off on merging into master for 48 hours from now
> > > > > > > -- that's until about noon on Friday PST. I'm hoping that
> > > > > > > will give me the time needed to finally get this in. If
> > > > > > > things go wrong with my current patching, or I discover other
> > > > > > > problems, or can't find the illegal reference count issue by
> > > > > > > then, I'll post a message and open things up again.
> > > > > > > Meanwhile, you can still pull, do work, make pull requests,
> > > > > > > and get them reviewed; just don't merge them to master.
> > > > > > >
> > > > > > > Can we agree to this?
> > > > > > >
> > > > > > > Chris
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
