[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-31 Thread Lefty Leverenz (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-15104: -- Labels: (was: TODOC3.0)

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-25 Thread Lefty Leverenz (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-15104: -- Labels: TODOC3.0 (was: )

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-24 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to mast

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-23 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.10.patch Update to address review comments. Also changed the default switch back to fals

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.9.patch

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.8.patch Fix dependencies

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.7.patch

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.6.patch Update patch v6 based on Xuefu's suggestions.

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-16 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: (was: HIVE-15104.5.patch)

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-24 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.5.patch

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-08-21 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.5.patch Run tests with the switch on.

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-07-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.4.patch Update patch v4: 1. Moved the registrator code to a resource file. Hopefully the

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-05-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: TPC-H 100G.xlsx Attaching TPC-H benchmark result. It shows the improvement is more obvious for long

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-05-19 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.3.patch

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-05-15 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.2.patch The CNF is due to how kryo is loaded in {{KryoMessageCodec}}. W/ relocation, kry

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-05-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Attachment: HIVE-15104.1.patch Spark needs the hash code on reducer side for the groupBy shuffling. Since group

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-05-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rui Li updated HIVE-15104: -- Status: Patch Available (was: Open)

[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2016-11-01 Thread wangwenli (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangwenli updated HIVE-15104: - Priority: Major (was: Critical)