Hi all,
While reading the code of HashTableSinkOperator, I got confused about why we
monitor memory usage only when the mapjoin key has never been seen before. If
there are far fewer distinct mapjoin keys than table rows, there is a chance
that HashTableSinkOperator will almost never check the memory, because the
rowContainer mapped to the incoming join key is already non-null for nearly
every row. In this case, MapRedLocalTask runs into an OOM error and brings down
HiveServer2 when running in the same process.
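
To make the concern concrete, here is a minimal sketch of the pattern as I understand it. It is not the actual Hive code: the class MemoryCheckSketch, the method countMemoryChecks, and the plain HashMap standing in for the hash table and rowContainer are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Hive source: counts how often a "memory check"
// would run if it is triggered only when a join key is seen for the first time.
public class MemoryCheckSketch {

    static int countMemoryChecks(List<String> joinKeys) {
        // Stand-in for the hash table: join key -> container of rows.
        Map<String, List<Integer>> table = new HashMap<>();
        int checks = 0;
        int rowId = 0;
        for (String key : joinKeys) {
            List<Integer> rowContainer = table.get(key);
            if (rowContainer == null) {
                // New key: create the container and (only here) check memory.
                rowContainer = new ArrayList<>();
                table.put(key, rowContainer);
                checks++;
            }
            // Existing key: the row is still stored, but no check happens.
            rowContainer.add(rowId++);
        }
        return checks;
    }

    public static void main(String[] args) {
        // One million rows, but only three distinct join keys.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            keys.add("k" + (i % 3));
        }
        System.out.println(countMemoryChecks(keys)); // prints 3
    }
}
```

In this sketch the check runs 3 times for 1,000,000 stored rows, and all of those checks happen within the first three rows, so the memory growth afterwards goes unobserved.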
Any insight on this?
Thanks
--
Regards,
Zhihua