Xuefu Zhang created HIVE-15682:
----------------------------------
Summary: Eliminate the dummy iterator and optimize the per row
based reducer-side processing
Key: HIVE-15682
URL: https://issues.apache.org/jira/browse/HIVE-15682
Project: Hive
Issue Type: Improvement
Components: Spark
Affects Versions: 2.2.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
HIVE-15580 introduced a dummy iterator for each input row, which can be
eliminated because {{SparkReduceRecordHandler}} is able to handle single
key-value pairs. We can refactor this part of the code (1) to remove the need
for an iterator and (2) to optimize the code path for per-(key, value)
processing instead of (key, value iterator) processing. It would also be great
to measure the performance after the optimizations and compare it with the
performance prior to HIVE-15580.
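
To illustrate the difference between the two code paths, here is a minimal, self-contained sketch. It is not the Hive code: the class {{PerRowReduceSketch}}, the nested {{Handler}}, and the methods {{processGroup}}, {{processRow}}, {{feedWithDummyIterator}} and {{feedPerRow}} are all hypothetical names made up for this example, and rows are simplified to (String key, String value) pairs. The sketch only shows that wrapping each row in a one-element iterator gives the same result as passing the pair directly, while the direct path avoids allocating an iterator per row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class PerRowReduceSketch {

    /** Hypothetical stand-in for a reduce-side record handler (not the Hive class). */
    static class Handler {
        final List<String> seen = new ArrayList<>();

        // Group-based path: one key together with an iterator over its values.
        void processGroup(String key, Iterator<String> values) {
            while (values.hasNext()) {
                processRow(key, values.next());
            }
        }

        // Per-row path: one key with exactly one value, no iterator involved.
        void processRow(String key, String value) {
            seen.add(key + "=" + value);
        }
    }

    // "Before" shape: every input row is wrapped in a single-element (dummy) iterator.
    static List<String> feedWithDummyIterator(List<String[]> rows) {
        Handler handler = new Handler();
        for (String[] row : rows) {
            handler.processGroup(row[0], Collections.singletonList(row[1]).iterator());
        }
        return handler.seen;
    }

    // "After" shape: every input row is handed to the handler directly.
    static List<String> feedPerRow(List<String[]> rows) {
        Handler handler = new Handler();
        for (String[] row : rows) {
            handler.processRow(row[0], row[1]);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"k1", "v1"},
                new String[] {"k1", "v2"},
                new String[] {"k2", "v3"});
        // Both paths observe the same (key, value) pairs in the same order.
        System.out.println(feedWithDummyIterator(rows).equals(feedPerRow(rows)));
    }
}
```

Whether the saved allocation is measurable in practice is exactly what the proposed before/after performance comparison would need to establish.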