GitHub user jinfengni commented on the issue: https://github.com/apache/drill/pull/905

@vvysotskyi, the example you listed (three tables a, b, c that all have the same values) seems to be essentially a cross join. For such cases, the current rowCount estimate is clearly far off from the real number, which would affect the estimated hash join memory cost, and hence the proposed idea would not work. However, I feel it is not very common to have two tables joined like a cross join. The question is: does it make sense to modify cost estimation for a seemingly uncommon use case?
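To make the size of the gap concrete, here is a minimal sketch of the degenerate case. It assumes, purely for illustration, an estimator that takes the larger input's row count as the join output size; `naive_estimate` and `equi_join_keys` are hypothetical helpers, not Drill code, and the actual estimation logic in Drill may differ.

```python
from collections import Counter

def equi_join_keys(left_keys, right_keys):
    """Return the join key of every row produced by an inner equi-join."""
    right_counts = Counter(right_keys)
    out = []
    for key, left_count in Counter(left_keys).items():
        # Each left row with this key pairs with every matching right row.
        out.extend([key] * (left_count * right_counts[key]))
    return out

def naive_estimate(left_rows, right_rows):
    # Hypothetical estimator (an assumption, not Drill's formula):
    # the output is taken to be as large as the bigger input.
    return max(left_rows, right_rows)

n = 10
a = [1] * n  # every row carries the same join key
b = [1] * n
c = [1] * n

ab = equi_join_keys(a, b)    # behaves like a cross join: n * n rows
abc = equi_join_keys(ab, c)  # n * n * n rows

estimated_ab = naive_estimate(len(a), len(b))
estimated_abc = naive_estimate(estimated_ab, len(c))

print("actual rows:", len(ab), len(abc))
print("estimated rows:", estimated_ab, estimated_abc)
```

With identical keys the actual output grows as the product of the input sizes, while an estimate of this kind stays at the size of a single input, so any memory cost derived from it would be too low by the same factor.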
---