[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-9251:
------------------------------
    Resolution: Fixed
    Fix Version/s: spark-branch
    Status: Resolved (was: Patch Available)

Committed to spark branch. Thanks, Rui.

> SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Rui Li
> Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, HIVE-9251.3-spark.patch, HIVE-9251.4-spark.patch, HIVE-9251.5-spark.patch, HIVE-9251.6-spark.patch
>
> This may hurt performance or even lead to task failures. For example, spark's netty-based shuffle limits the max frame size to be 2G.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.6-spark.patch

Rebased the patch and included more updates.
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.5-spark.patch

I missed some updates to optimize_nullscan.q. Updated the patch.
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.4-spark.patch

Update more golden files.
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.3-spark.patch

Addressed RB comments and updated golden files. Some notes about the reducer count changes:
* Most queries changed from 3 to 2. We're using total cores to set the number of reducers here. Previously we counted the driver as an executor, so we counted one more executor than we really have. Actually neither count is correct, because in fact we have 4 cores (2 executors, each with 2 cores); however, we can't get the cores-per-executor info.
* Some queries changed from 3 to 1. That's because {{hive.exec.reducers.max}} is set to 1 but we previously didn't respect it.
* Some queries need deterministic results, and that's tracked by HIVE-9290.
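
The reducer count changes described in the HIVE-9251.3 notes can be illustrated with a minimal sketch. This is not the code from the patch: the function name, its parameters, and the rule "use at least one reducer per core, then cap at the configured maximum" are assumptions made for illustration, and only {{hive.exec.reducers.max}} comes from the thread.

```python
def estimate_reducers(total_input_bytes, bytes_per_reducer, total_cores, max_reducers):
    """Hypothetical sketch of a reducer estimate; not Hive's actual implementation."""
    # Size-based estimate, rounded up (integer ceiling division).
    by_size = (total_input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Assumption: use at least as many reducers as there are cores.
    reducers = max(by_size, total_cores)
    # Respect the configured upper bound (hive.exec.reducers.max).
    reducers = min(reducers, max_reducers)
    # Never return fewer than one reducer.
    return max(1, reducers)


SMALL_INPUT = 1024                # tiny test table, illustrative value
BYTES_PER_REDUCER = 256 * 2**20   # illustrative value

# Driver wrongly counted as an executor: 3 "cores".
print(estimate_reducers(SMALL_INPUT, BYTES_PER_REDUCER, 3, 1009))  # 3
# Driver no longer counted: 2 executors, one core assumed each.
print(estimate_reducers(SMALL_INPUT, BYTES_PER_REDUCER, 2, 1009))  # 2
# Upper bound of 1 is now respected.
print(estimate_reducers(SMALL_INPUT, BYTES_PER_REDUCER, 2, 1))     # 1
```

With the true core count from the notes (2 executors, each with 2 cores) the same sketch would return 4 for the small input, which is why neither 3 nor 2 is described as correct.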
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.2-spark.patch

Thanks [~jxiang] and [~xuefuz]. Uploaded another patch.
I didn't remove the memory per task data in case we need it in the future. For now, it's only used to print a warning if bytes per reducer is much larger than memory per reducer.
I would like to know your opinions.
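
The warning mentioned for the HIVE-9251.2 patch could look roughly like the sketch below. The thread does not say how much larger counts as "much larger", so the function name, the `factor` threshold, and the message text are all hypothetical.

```python
def reducer_memory_warning(bytes_per_reducer, memory_per_task, factor=2):
    """Hypothetical check; returns a warning string, or None if no warning applies."""
    # Without a usable memory figure there is nothing to compare against.
    if memory_per_task <= 0:
        return None
    # Assumption: "much larger" means more than `factor` times the task memory.
    if bytes_per_reducer > factor * memory_per_task:
        return (
            "bytes per reducer (%d) is much larger than memory per task (%d)"
            % (bytes_per_reducer, memory_per_task)
        )
    return None


# Illustrative values: 1 GiB per reducer against 256 MiB of task memory.
message = reducer_memory_warning(2**30, 256 * 2**20)
if message is not None:
    print("WARN:", message)
```

Note that the check only reports the mismatch and does not change the reducer count, which matches the description that the memory data is used only for a warning.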
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Attachment: HIVE-9251.1-spark.patch

Submitted a patch for review.
BTW, maybe we don't need memory per task to calculate the number of reducers. Any ideas?
[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-9251:
-------------------------
    Description: This may hurt performance or even lead to task failures. For example, spark's netty-based shuffle limits the max frame size to be 2G.

> SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Reporter: Rui Li
>
> This may hurt performance or even lead to task failures. For example, spark's netty-based shuffle limits the max frame size to be 2G.