Make the reducer limit-aware
----------------------------
Key: HIVE-1239
URL: https://issues.apache.org/jira/browse/HIVE-1239
Project: Hadoop Hive
Issue Type: Improvement
Affects Versions: 0.6.0
Reporter: Ning Zhang
Fix For: 0.6.0
Currently, if a join is followed by a limit operator, the reducer still needs to do
a lot of work even after the limit is reached.
A plan could look like:
ExecReducer -> ExtractOperator -> Limit Operator -> ...
In Hadoop 0.20, we can override the reduce API to stop taking rows from the
underlying file, but in pre-0.20 versions it is not overridable. What we can do
instead is put the limit number into the ExecReducer metadata during the Hive
optimization phase, so the reducer itself can short-circuit once the limit is hit.
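The pre-0.20 approach above can be sketched as follows. This is a minimal, hypothetical illustration, not Hive's actual ExecReducer code: the class name, the boolean return convention, and the driver loop are all assumptions made for the sketch. The idea is that the limit propagated from the plan lets the reducer signal its caller to stop feeding rows.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a limit-aware reducer (not Hive's real API).
 * The row limit is assumed to have been stashed in the reducer's
 * metadata by the optimizer, as the issue proposes.
 */
public class LimitAwareReducer {
    private final int rowLimit;        // limit taken from the plan's LimitOperator
    private int rowsForwarded = 0;
    private final List<String> output = new ArrayList<>();

    public LimitAwareReducer(int rowLimit) {
        this.rowLimit = rowLimit;
    }

    /** Returns false once the limit is reached, so the driver can stop feeding rows. */
    public boolean reduce(String row) {
        if (rowsForwarded >= rowLimit) {
            return false;              // short-circuit: ignore remaining rows
        }
        output.add(row);               // stand-in for forwarding to downstream operators
        rowsForwarded++;
        return rowsForwarded < rowLimit;
    }

    public List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        LimitAwareReducer reducer = new LimitAwareReducer(3);
        int consumed = 0;
        for (String row : new String[] {"a", "b", "c", "d", "e"}) {
            consumed++;
            if (!reducer.reduce(row)) {
                break;                 // pre-0.20 style: the outer loop honors the signal
            }
        }
        // Only 3 of the 5 available rows were consumed and forwarded.
        System.out.println(reducer.getOutput().size() + " rows forwarded, "
                + consumed + " rows consumed");
    }
}
```

In Hadoop 0.20 and later, the same effect could be achieved more directly by overriding the reduce API so the framework itself stops pulling rows from the underlying file.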