> 1) confirm your beeline java process is indeed running with expanded memory
The OOM error is clearly coming from the HiveServer2 CBO codepath, on the
server side rather than in beeline itself.
at org.apache.calcite.rel.AbstractRelNode$1.explain_(AbstractRelNode.java:409)
at org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:157)
at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:308)
at org.apache.calcite.rel.AbstractRelNode.computeDigest(AbstractRelNode.java:416)
at org.apache.calcite.rel.AbstractRelNode.recomputeDigest(AbstractRelNode.java:352)
at org.apache.calcite.plan.hep.HepPlanner.buildFinalPlan(HepPlanner.java:881)
at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:199)
In the 1.x branch, this would've failed for completely different reasons - the
OR expressions were left-leaning, so the expression a or b or c or d would be
parsed as (((a or b) or c) or d) instead of being a balanced tree like (a or
b) or (c or d). The difference was a tree of depth LOG2(N) vs depth N.
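
If it helps to make that depth difference concrete, here's a minimal,
self-contained Java sketch (not Hive's actual parser or AST classes) that
builds both shapes over N terms and reports their depth:

import java.util.ArrayList;
import java.util.List;

public class OrTreeDepth {

    // Toy expression node: all we track here is tree depth.
    interface Expr { int depth(); }

    static Expr leaf() {
        return () -> 1;
    }

    static Expr or(Expr l, Expr r) {
        return () -> 1 + Math.max(l.depth(), r.depth());
    }

    // a or b or c or d parsed as (((a or b) or c) or d): depth grows as N.
    static Expr leftLeaning(List<Expr> terms) {
        Expr acc = terms.get(0);
        for (int i = 1; i < terms.size(); i++) {
            acc = or(acc, terms.get(i));
        }
        return acc;
    }

    // (a or b) or (c or d): depth grows as LOG2(N).
    static Expr balanced(List<Expr> terms) {
        if (terms.size() == 1) {
            return terms.get(0);
        }
        int mid = terms.size() / 2;
        return or(balanced(terms.subList(0, mid)),
                  balanced(terms.subList(mid, terms.size())));
    }

    public static void main(String[] args) {
        List<Expr> terms = new ArrayList<>();
        for (int i = 0; i < 1024; i++) {
            terms.add(leaf());
        }
        System.out.println(leftLeaning(terms).depth()); // 1024
        System.out.println(balanced(terms).depth());    // 11
    }
}

Any recursive tree walk (like the computeDigest frames above) needs a level
of recursion per level of tree, so the left-leaning shape is far more
fragile on long predicates than the balanced one.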
> I didn't bother though it would be good information since I found a work
> around and troubleshooting beeline wasn't my primary goal :)
The CBO error is likely to show up anyway - this might be a scenario where your
HiveServer2 has been started up with a 2GB heap and keeps dying while logging
the query plans.
Loading the points off HDFS is pretty much the ideal solution, particularly if
you pre-compute an ST_Envelope for the small table side while loading (as an
R-Tree would) to reduce the total number of actual intersection checks for
complex polygons.
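
If it's useful, here's a rough sketch of that envelope pre-filter idea in
plain Java - the Envelope/Polygon types and exactContains are hypothetical
stand-ins for illustration, not the ESRI Hive UDF API:

import java.util.ArrayList;
import java.util.List;

public class EnvelopePrefilter {

    // Axis-aligned bounding box, standing in for an ST_Envelope result.
    static final class Envelope {
        final double xmin, ymin, xmax, ymax;
        Envelope(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin; this.ymin = ymin;
            this.xmax = xmax; this.ymax = ymax;
        }
        boolean contains(double x, double y) {
            return xmin <= x && x <= xmax && ymin <= y && y <= ymax;
        }
    }

    // Hypothetical complex polygon: the envelope is computed once at load
    // time, exactContains is the expensive point-in-polygon test.
    interface Polygon {
        Envelope envelope();
        boolean exactContains(double x, double y);
    }

    // Only polygons whose cheap box check passes pay for the exact test.
    static List<Polygon> matches(List<Polygon> smallSide, double x, double y) {
        List<Polygon> out = new ArrayList<>();
        for (Polygon p : smallSide) {
            if (p.envelope().contains(x, y) && p.exactContains(x, y)) {
                out.add(p);
            }
        }
        return out;
    }
}

An R-Tree built over those envelopes takes the same idea further by skipping
most of the box checks as well, instead of scanning the small side linearly.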
Cheers,
Gopal