If you are running the query from DRILL-1162, one of the join columns is
defined as binary in the parquet file:

> inner join `lineitem2.parquet` f on a.l_comment = f.l_comment


On Fri, Sep 25, 2015 at 9:24 AM, Chris Westin <[email protected]>
wrote:

> That's interesting, because it is the reallocation of a VarCharVector that
> is the one that is failing. I'll look at the file history, but do you
> recall offhand what changed there?
>
> On Fri, Sep 25, 2015 at 9:20 AM, Aman Sinha <[email protected]> wrote:
>
> > Right, the HashJoin and HashTable code hasn't changed significantly in
> > terms of memory allocation in the last several releases.  You might want
> > to look at the change history for underlying vector allocations...I recall
> > that variable length vector allocations went through some changes.
> > However DRILL-1162 does not seem to be using varchar columns (I think..).
> >
> > On Fri, Sep 25, 2015 at 6:51 AM, Jacques Nadeau <[email protected]>
> > wrote:
> >
> > > I don't think anyone has done much there in quite some time. I'd guess
> > > something external has changed that affects it. The last substantive
> > > change around that code (I think) was the introduction of the
> > > multiplexing work that Venki and Yuliya did early this year.
> > > On Sep 25, 2015 6:32 AM, "Chris Westin" <[email protected]> wrote:
> > >
> > > > I've been looking into DRILL-1162, and found that a query that used
> > > > to run within certain constraints (DRILL_MAX_DIRECT_MEMORY=32G) no
> > > > longer does, even though it looks like there should be plenty of
> > > > memory. I took the query in that report, and removed the last ten
> > > > (redundant) join elements, and it now fails with 32G direct memory,
> > > > even though it previously ran (although it produced the wrong
> > > > results). When I check the query profile, it only consumed around 9G
> > > > -- so there should be plenty of space left before it fails. I
> > > > started looking at it in the debugger, and the allocation failure
> > > > occurs during an attempt to resize the output vector. The allocator
> > > > being used believes there's no memory left, even though its parent
> > > > has more than enough to satisfy the request.
> > > >
> > > > I've also found another ticket with a HashJoin that fails in a
> > > > similar way even though there is plenty of memory.
> > > >
> > > > Has the execution of HashJoin or its use of its result vector
> > > > changed in some way recently?
> > > >
> > >
> >
>
