[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-09 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-9251:
--
   Resolution: Fixed
Fix Version/s: spark-branch
   Status: Resolved  (was: Patch Available)

Committed to spark branch. Thanks, Rui.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Fix For: spark-branch
>
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch, HIVE-9251.4-spark.patch, HIVE-9251.5-spark.patch, 
> HIVE-9251.6-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.
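
To make the 2G limit concrete: if a stage shuffles D bytes into N reducers, each 
reducer fetches roughly D/N bytes, and Spark's Netty-based shuffle cannot move a 
block larger than Integer.MAX_VALUE bytes in a single frame. A minimal sketch of 
that arithmetic (a hypothetical helper for illustration, not Hive or Spark code):

{code:java}
public class FrameLimitSketch {
  // Spark's Netty-based shuffle cannot transfer a block larger than
  // Integer.MAX_VALUE bytes (~2G) in one frame.
  private static final long MAX_FRAME_BYTES = Integer.MAX_VALUE;

  /** Fewest reducers that keep every (evenly sized) partition under the limit. */
  static long minReducers(long totalShuffleBytes) {
    return (totalShuffleBytes + MAX_FRAME_BYTES - 1) / MAX_FRAME_BYTES;
  }

  public static void main(String[] args) {
    long total = 500L << 30;                // assume 500G of shuffle data
    System.out.println(minReducers(total)); // 251; with only 100 reducers,
                                            // ~5G partitions would fail the fetch
  }
}
{code}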





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-09 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.6-spark.patch

Rebased the patch and included more updates.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch, HIVE-9251.4-spark.patch, HIVE-9251.5-spark.patch, 
> HIVE-9251.6-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-09 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.5-spark.patch

I missed some updates to optimize_nullscan.q. Updated the patch.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch, HIVE-9251.4-spark.patch, HIVE-9251.5-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-08 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.4-spark.patch

Update more golden files.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch, HIVE-9251.4-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-07 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.3-spark.patch

Addressed RB comments and updated golden files. Some notes about the reducer 
count changes:
* Most queries changed from 3 to 2. We're using the total number of cores to set 
the number of reducers here, but previously we counted the driver as an 
executor, so we had one more executor than we really do. Actually, neither count 
is correct: we in fact have 4 cores (2 executors with 2 cores each), but we 
can't get the cores-per-executor info (see the sketch after this list).
* Some queries changed from 3 to 1. That's because {{hive.exec.reducers.max}} 
is set to 1, but we previously didn't respect it.
* Some queries need deterministic results; that's tracked by HIVE-9290.
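
For context, here is a hedged sketch of how a cores-based estimate and the 
{{hive.exec.reducers.max}} cap can interact; the method and parameter names are 
invented for illustration, and this is not the actual SetSparkReducerParallelism 
code:

{code:java}
public class ReducerCountSketch {
  /**
   * @param estimatedBytes  estimated input of the reduce stage
   * @param bytesPerReducer target size per reducer (hive.exec.reducers.bytes.per.reducer)
   * @param totalCores      cores across executors as reported to Hive
   *                        (the driver must not be counted as an executor)
   * @param maxReducers     hive.exec.reducers.max, which must always cap the result
   */
  static int estimateReducers(long estimatedBytes, long bytesPerReducer,
                              int totalCores, int maxReducers) {
    // Data-driven estimate: enough reducers that each handles ~bytesPerReducer.
    long byData = (estimatedBytes + bytesPerReducer - 1) / bytesPerReducer;
    // Use the available cores as a floor so the cluster isn't under-utilized...
    long estimate = Math.max(byData, totalCores);
    // ...but always respect the configured maximum (the fix noted above).
    return (int) Math.max(1, Math.min(estimate, maxReducers));
  }
}
{code}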

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch, 
> HIVE-9251.3-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-06 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Status: Patch Available  (was: Open)

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-06 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.2-spark.patch

Thanks [~jxiang] and [~xuefuz]. Uploaded another patch. I didn't remove the 
memory-per-task data in case we need it in the future. For now, it's only used 
to print a warning if bytes per reducer is much larger than memory per reducer. 
I'd like to know your opinions.
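
As a rough illustration of that warning (a sketch under assumptions: the class, 
threshold, and message are invented here, not the patch's actual code):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ReducerMemoryWarning {
  private static final Logger LOG = LoggerFactory.getLogger(ReducerMemoryWarning.class);

  // What counts as "much larger" is an assumption for illustration only.
  private static final long WARN_FACTOR = 10;

  static void warnIfUndersized(long bytesPerReducer, long memoryPerReducer) {
    // The memory-per-task data is retained; for now it only backs this warning.
    if (memoryPerReducer > 0 && bytesPerReducer > WARN_FACTOR * memoryPerReducer) {
      LOG.warn("Average bytes per reducer ({}) greatly exceeds memory per reducer ({});"
          + " consider increasing reducer parallelism.", bytesPerReducer, memoryPerReducer);
    }
  }
}
{code}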

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Attachment: HIVE-9251.1-spark.patch

Submitted a patch for review.
BTW, maybe we don't need memory per task to calculate the number of reducers. 
Any ideas?

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-9251.1-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.





[jira] [Updated] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]

2015-01-04 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-9251:
-
Description: This may hurt performance or even lead to task failures. For 
example, Spark's Netty-based shuffle limits the max frame size to 2G.

> SetSparkReducerParallelism is likely to set too small number of reducers 
> [Spark Branch]
> ---
>
> Key: HIVE-9251
> URL: https://issues.apache.org/jira/browse/HIVE-9251
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Rui Li
>
> This may hurt performance or even lead to task failures. For example, Spark's 
> Netty-based shuffle limits the max frame size to 2G.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)