Github user jinfengni commented on the issue:
https://github.com/apache/drill/pull/905
@vvysotskyi, the example you listed (three tables a, b, c that all have the same
values) seems to be essentially a cross join. For such cases, clearly the current
rowCount estimation is way off from the real number, which would impact the
estimation of hash join memory cost, and hence the proposed idea would not
work. However, I feel it's not a very common case to have two tables joined
like a cross join. The question is: does it make sense to modify cost
estimation for a seemingly uncommon use case?
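
To make the gap concrete, here is a minimal sketch in Python (not Drill code). `naive_estimate` is a hypothetical stand-in that assumes roughly one match per row; it is not Drill's actual rowCount formula. It only shows how far an estimate of that kind can drift when every join key has the same value:

```python
from collections import Counter


def actual_join_rows(left_keys, right_keys):
    """Exact row count of an inner equi-join on the given key columns."""
    right_counts = Counter(right_keys)
    # Each left row matches every right row that carries the same key.
    return sum(right_counts[k] for k in left_keys)


def naive_estimate(left_rows, right_rows):
    """Hypothetical estimator (an assumption, not Drill's formula):
    assumes about one match per row, so the output is no larger
    than the bigger input."""
    return max(left_rows, right_rows)


n = 1000
distinct_keys = list(range(n))  # typical case: keys are unique
same_keys = [1] * n             # degenerate case: every key is identical

# Unique keys: actual and estimated row counts agree.
print(actual_join_rows(distinct_keys, distinct_keys), naive_estimate(n, n))

# Identical keys: the join degenerates into a cross join (n * n rows),
# while the estimate stays at n.
print(actual_join_rows(same_keys, same_keys), naive_estimate(n, n))
```

With identical keys the real output grows quadratically while this kind of estimate stays linear, which is the mismatch that would throw off a hash join memory cost derived from it.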
---