Github user jinfengni commented on the issue:

    https://github.com/apache/drill/pull/905
  
    This OOM exposes two problems. The first is at planning time, where we choose a sub-optimal plan due to an inaccurate row count estimate caused by missing statistics. The second is at execution time, where we need to understand whether Drill uses too much memory and whether spilling to disk is an option. I think the two are complementary; even if we have spill to disk for hash join, a sub-optimal plan chosen by the planner could still make the query take a very long time to complete.
    It looks like the PR is addressing the first issue. I agree that the root cause is row count estimation, which is more appropriately deferred to the statistics-support enhancement. For the swap-join logic, the getMaxRowCount() proposal seems to be along the lines of adjusting the row count estimate. I prefer the idea of combining row count and column count, which is essentially what swapInputs() in LoptOptimizeJoinRule does.
    For the hash-table build-side cost, the hash table itself only has to hold the join keys. However, since hash join is a blocking operator, it has to hold all of the records on the build side, so the total memory requirement (hash table plus non-join-key columns) depends on both row count and column count. The cost model of hash join should reflect that. Can we use a similar idea in SwapHashJoinVisitor?
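    To make the memory argument concrete, here is a back-of-the-envelope sketch (the class, method, and width parameter are hypothetical, not taken from Drill) of how the build-side footprint grows with both row count and column count:

```java
// Illustrative estimate of the hash join build-side footprint. Hash-table
// bookkeeping overhead is ignored, and avgFieldWidthBytes is an assumed average.
public final class BuildSideMemoryEstimate {

  static long estimateBytes(long buildRowCount,
                            int joinKeyColumnCount,
                            int nonKeyColumnCount,
                            int avgFieldWidthBytes) {
    // The hash table itself only stores the join keys.
    long hashTableBytes = buildRowCount * joinKeyColumnCount * (long) avgFieldWidthBytes;
    // Because the join is blocking, every build-side record (all columns) is held as well.
    long heldRecordBytes =
        buildRowCount * (joinKeyColumnCount + nonKeyColumnCount) * (long) avgFieldWidthBytes;
    return hashTableBytes + heldRecordBytes;
  }
}
```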
    One further improvement would be to modify HashJoinPrel.computeSelfCost(). Today we only consider the join key width; it makes sense to adjust that logic by also considering the total column count on the build side. Such logic could be extracted into a common place, and SwapHashJoinVisitor could then call the same shared logic to decide whether it is cost-wise optimal to swap the input sides (a rough sketch follows). Thoughts?
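    As a sketch of what that shared logic could look like (hypothetical names, not existing Drill classes or methods), the memory component of the cost would live in one helper, and the swap decision would simply compare that helper's result for the two possible build sides:

```java
// Hypothetical shared helper: the idea is that HashJoinPrel.computeSelfCost()
// and SwapHashJoinVisitor would consult the same estimate, so the plan cost
// and the swap decision cannot diverge.
public final class HashJoinCostHelper {

  /** Memory component of the cost when building on a side of the given shape. */
  static double buildSideMemoryCost(double rowCount, int totalColumnCount) {
    // Count all columns, not just the join keys, since the whole record is held.
    return rowCount * totalColumnCount;
  }

  /** True if building on the left input would be cheaper than on the right. */
  static boolean swapIsCheaper(double leftRows, int leftCols,
                               double rightRows, int rightCols) {
    // Assumes the right input is currently the build side, as in Drill's hash join.
    return buildSideMemoryCost(leftRows, leftCols)
        < buildSideMemoryCost(rightRows, rightCols);
  }
}
```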


---
