Make the reducer limit-aware
----------------------------

                 Key: HIVE-1239
                 URL: https://issues.apache.org/jira/browse/HIVE-1239
             Project: Hadoop Hive
          Issue Type: Improvement
    Affects Versions: 0.6.0
            Reporter: Ning Zhang
             Fix For: 0.6.0
Currently, if a join is followed by a limit operator, the reducer still needs to do a lot of work even after the limit is reached. A plan could look like: ExecReducer -> ExtractOperator -> LimitOperator -> ... In Hadoop 0.20 we can override the reduce API to stop taking rows from the underlying file, but in pre-0.20 versions it is not overridable. What we can do is put the limit number in the ExecReducer metadata during the Hive optimization phase.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
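To illustrate the idea, here is a minimal sketch (not the actual Hive code) of a reduce loop that checks a limit propagated at optimization time and signals the driver to stop feeding rows once the limit is hit. The class name, fields, and the boolean-return convention are all hypothetical; in real Hive the limit would live in the ExecReducer's plan metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a reducer that knows its downstream LIMIT and
// short-circuits instead of forwarding every row to a LimitOperator.
public class LimitAwareReducerSketch {
    private final int limit;              // assumed: set from plan metadata at optimization time
    private int emitted = 0;
    private final List<String> output = new ArrayList<>();

    public LimitAwareReducerSketch(int limit) {
        this.limit = limit;
    }

    // Processes one row; returns false once the limit is reached so the
    // caller (the reduce driver) can stop pulling rows from the input.
    public boolean reduce(String row) {
        if (emitted >= limit) {
            return false;
        }
        output.add(row);
        emitted++;
        return emitted < limit;
    }

    public List<String> getOutput() {
        return output;
    }

    public static void main(String[] args) {
        LimitAwareReducerSketch reducer = new LimitAwareReducerSketch(2);
        for (String row : Arrays.asList("a", "b", "c", "d")) {
            if (!reducer.reduce(row)) {
                break;  // rows "c" and "d" are never processed
            }
        }
        System.out.println(reducer.getOutput());
    }
}
```

The key point is the early `break` in the driver loop: for pre-0.20 Hadoop, where the reduce API cannot be overridden to stop reading input, the limit check has to live in the reducer logic itself, fed by metadata attached during optimization.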