> 1) confirm your beeline java process is indeed running with expanded memory
The OOM error is clearly coming from the HiveServer2 CBO codepath, on the
server side rather than in beeline itself.
at org.apache.calcite.rel.AbstractRelNode$1.explain_(AbstractRelNode.java:409)
at org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:157)
at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:308)
at org.apache.calcite.rel.AbstractRelNode.computeDigest(AbstractRelNode.java:416)
at org.apache.calcite.rel.AbstractRelNode.recomputeDigest(AbstractRelNode.java:352)
at org.apache.calcite.plan.hep.HepPlanner.buildFinalPlan(HepPlanner.java:881)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:199)
In the 1.x branch, this would've failed for completely different reasons - the
OR expressions were left-leaning, so the expression a or b or c or d would be
parsed as (((a or b) or c) or d) instead of being a balanced tree like (a or
b) or (c or d). The difference was a tree of depth LOG2(N) vs depth N.
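
If it helps to make that depth difference concrete, here's a minimal,
self-contained Java sketch (not Hive's actual parser or AST classes) that
builds both shapes over N terms and reports their depth:

import java.util.ArrayList;
import java.util.List;

public class OrTreeDepth {

    // Toy expression node: all we track here is tree depth.
    interface Expr { int depth(); }

    static Expr leaf() {
        return () -> 1;
    }

    static Expr or(Expr l, Expr r) {
        return () -> 1 + Math.max(l.depth(), r.depth());
    }

    // a or b or c or d parsed as (((a or b) or c) or d): depth grows as N.
    static Expr leftLeaning(List<Expr> terms) {
        Expr acc = terms.get(0);
        for (int i = 1; i < terms.size(); i++) {
            acc = or(acc, terms.get(i));
        }
        return acc;
    }

    // (a or b) or (c or d): depth grows as LOG2(N).
    static Expr balanced(List<Expr> terms) {
        if (terms.size() == 1) {
            return terms.get(0);
        }
        int mid = terms.size() / 2;
        return or(balanced(terms.subList(0, mid)),
                  balanced(terms.subList(mid, terms.size())));
    }

    public static void main(String[] args) {
        List<Expr> terms = new ArrayList<>();
        for (int i = 0; i < 1024; i++) {
            terms.add(leaf());
        }
        System.out.println(leftLeaning(terms).depth()); // 1024
        System.out.println(balanced(terms).depth());    // 11
    }
}

Any recursive tree walk (like the computeDigest frames above) needs a level
of recursion per level of tree, so the left-leaning shape is far more
fragile on long predicates than the balanced one.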
> I didn't bother though it would be good information since I found a work
> around and troubleshooting beeline wasn't my primary goal :)
The CBO error is likely to show up anyway - this might be a scenario where your
HiveServer2 has been started up with a 2GB heap and keeps dying while logging
the query plans.
Loading the points off HDFS is pretty much the ideal solution, particularly if
you pre-compute an ST_Envelope for the small table side while loading (as an
R-Tree would) to reduce the total number of actual intersection checks for
complex polygons.
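
If it's useful, here's a rough sketch of that envelope pre-filter idea in
plain Java - the Envelope/Polygon types and exactContains are hypothetical
stand-ins for illustration, not the ESRI Hive UDF API:

import java.util.ArrayList;
import java.util.List;

public class EnvelopePrefilter {

    // Axis-aligned bounding box, standing in for an ST_Envelope result.
    static final class Envelope {
        final double xmin, ymin, xmax, ymax;
        Envelope(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin; this.ymin = ymin;
            this.xmax = xmax; this.ymax = ymax;
        }
        boolean contains(double x, double y) {
            return xmin <= x && x <= xmax && ymin <= y && y <= ymax;
        }
    }

    // Hypothetical complex polygon: the envelope is computed once at load
    // time, exactContains is the expensive point-in-polygon test.
    interface Polygon {
        Envelope envelope();
        boolean exactContains(double x, double y);
    }

    // Only polygons whose cheap box check passes pay for the exact test.
    static List<Polygon> matches(List<Polygon> smallSide, double x, double y) {
        List<Polygon> out = new ArrayList<>();
        for (Polygon p : smallSide) {
            if (p.envelope().contains(x, y) && p.exactContains(x, y)) {
                out.add(p);
            }
        }
        return out;
    }
}

An R-Tree built over those envelopes takes the same idea further by skipping
most of the box checks as well, instead of scanning the small side linearly.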
Cheers,
Gopal