Xuefu Zhang created HIVE-15682:
----------------------------------

             Summary: Eliminate the dummy iterator and optimize the per row based reducer-side processing
                 Key: HIVE-15682
                 URL: https://issues.apache.org/jira/browse/HIVE-15682
             Project: Hive
          Issue Type: Improvement
          Components: Spark
    Affects Versions: 2.2.0
            Reporter: Xuefu Zhang
            Assignee: Xuefu Zhang


HIVE-15580 introduced a dummy iterator per input row, which can be eliminated 
because {{SparkReduceRecordHandler}} is able to handle single (key, value) 
pairs. We can refactor this part of the code to (1) remove the need for an 
iterator and (2) optimize the code path for per-(key, value) processing instead 
of per-(key, value iterator) processing. It would also be great to measure 
performance after the optimizations and compare it with performance prior to 
HIVE-15580.
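
To make the two code paths concrete, here is a minimal, self-contained sketch. 
It is not the actual Hive code: the {{Handler}} class, its {{processRow}} 
overloads, and the string-typed keys and values are hypothetical stand-ins 
chosen only to show the difference between wrapping each value in a throwaway 
one-element iterator and passing the (key, value) pair directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class ReducerPathSketch {

    // Hypothetical stand-in for a reduce-side record handler (assumption,
    // not Hive's real class or signatures).
    static class Handler {
        final List<String> out = new ArrayList<>();

        // Iterator-based path: one key together with an iterator of values.
        void processRow(String key, Iterator<String> values) {
            while (values.hasNext()) {
                processRow(key, values.next());
            }
        }

        // Per-pair path: one key with exactly one value, no iterator involved.
        void processRow(String key, String value) {
            out.add(key + "=" + value);
        }
    }

    // "Before": every single value is wrapped in a one-element (dummy)
    // iterator just to satisfy the iterator-based signature.
    static List<String> withDummyIterator(List<String[]> pairs) {
        Handler handler = new Handler();
        for (String[] kv : pairs) {
            Iterator<String> dummy = Collections.singletonList(kv[1]).iterator();
            handler.processRow(kv[0], dummy);
        }
        return handler.out;
    }

    // "After": each (key, value) pair goes straight to the per-pair method,
    // so no wrapper list or iterator is allocated per row.
    static List<String> direct(List<String[]> pairs) {
        Handler handler = new Handler();
        for (String[] kv : pairs) {
            handler.processRow(kv[0], kv[1]);
        }
        return handler.out;
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
                new String[] {"a", "1"},
                new String[] {"a", "2"},
                new String[] {"b", "3"});
        // Both paths are expected to produce identical output.
        System.out.println(withDummyIterator(pairs));
        System.out.println(direct(pairs));
    }
}
```

In this sketch both methods yield the same rows; the direct path simply avoids 
one list and one iterator allocation per input row. Whether that saving is 
measurable in Hive on Spark is exactly what the proposed benchmark would need 
to show.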



