Make the reducer limit-aware
----------------------------

                 Key: HIVE-1239
                 URL: https://issues.apache.org/jira/browse/HIVE-1239
             Project: Hadoop Hive
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Ning Zhang
             Fix For: 0.6.0


Currently, if a join is followed by a limit operator, the reducer still needs 
to do a lot of work even after the limit is reached. 

A plan could look like:

ExecReducer -> ExtractOperator -> Limit Operator -> ... 

In Hadoop 0.20, we can override the reduce API to stop taking rows from the 
underlying file, but in pre-0.20 versions it cannot be overridden. What we can 
do instead is put the limit number in the ExecReducer metadata during the Hive 
optimization phase, so that the reducer itself knows when it can stop.
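
A minimal sketch of the idea, not Hive's actual code: the class below is a 
hypothetical stand-in for ExecReducer, and the names LimitAwareReducerSketch, 
limitReached and the "negative means no limit" convention are assumptions made 
for illustration. It only shows how a limit carried in the reducer's metadata 
could let it stop pulling rows early.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for ExecReducer; it does not use any Hadoop or Hive API.
class LimitAwareReducerSketch {
    // Assumed to be filled in by the optimizer; a negative value means "no limit".
    private final int limit;
    // Number of rows forwarded to the downstream operators so far.
    private int forwarded = 0;
    // Stand-in for the downstream operator chain (ExtractOperator -> Limit Operator -> ...).
    private final List<String> output = new ArrayList<>();

    LimitAwareReducerSketch(int limit) {
        this.limit = limit;
    }

    boolean limitReached() {
        return limit >= 0 && forwarded >= limit;
    }

    // Called once per key group; returns how many rows were actually consumed.
    int reduce(Iterator<String> rows) {
        int consumed = 0;
        // Check the limit before touching the iterator so no extra row is read.
        while (!limitReached() && rows.hasNext()) {
            output.add(rows.next());
            consumed++;
            forwarded++;
        }
        return consumed;
    }

    List<String> output() {
        return output;
    }
}
```

With this shape, a framework that keeps calling reduce() for every remaining 
key (as assumed here for pre-0.20) still pays for the calls, but each call 
returns immediately once the limit has been reached.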

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
