[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lefty Leverenz updated HIVE-15104:
--
Labels: (was: TODOC3.0)
> Hive on Spark generate more shuffle data than hive on mr
> -
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lefty Leverenz updated HIVE-15104:
--
Labels: TODOC3.0 (was: )
> Hive on Spark generate more shuffle data than hive on mr
> -
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Resolution: Fixed
Fix Version/s: 3.0.0
Status: Resolved (was: Patch Available)
Pushed to mast
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.10.patch
Update to address review comments. Also changed the default switch back to
fals
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.9.patch
> Hive on Spark generate more shuffle data than hive on mr
>
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.8.patch
Fix dependencies
> Hive on Spark generate more shuffle data than hive on mr
> --
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.7.patch
> Hive on Spark generate more shuffle data than hive on mr
>
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.6.patch
Update patch v6 based on Xuefu's suggestions.
> Hive on Spark generate more shuf
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: (was: HIVE-15104.5.patch)
> Hive on Spark generate more shuffle data than hive on mr
> -
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.5.patch
> Hive on Spark generate more shuffle data than hive on mr
>
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.5.patch
Run tests with the switch on.
> Hive on Spark generate more shuffle data than hi
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.4.patch
Update patch v4:
1. Moved the registrator code to a resource file. Hopefully the
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: TPC-H 100G.xlsx
Attaching TPC-H benchmark result. It shows the improvement is more obvious for
long
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.3.patch
> Hive on Spark generate more shuffle data than hive on mr
>
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.2.patch
The CNF is due to how kryo is loaded in {{KryoMessageCodec}}. W/ relocation,
kry
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Attachment: HIVE-15104.1.patch
Spark needs the hash code on reducer side for the groupBy shuffling. Since
group
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rui Li updated HIVE-15104:
--
Status: Patch Available (was: Open)
> Hive on Spark generate more shuffle data than hive on mr
> --
[
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
wangwenli updated HIVE-15104:
-
Priority: Major (was: Critical)
> Hive on Spark generate more shuffle data than hive on mr
>