[
https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730669#comment-15730669
]
Xianda Ke commented on PIG-4952:
--------------------------------
Hi [~kellyzly] & [~nkollar],
How about this:
{code}
// If spark.default.parallelism is set in the Spark conf, use it
if (sc.conf().contains("spark.default.parallelism")) {
    parallelism = sc.defaultParallelism();
} else {
    // Otherwise use the largest partition count among the parent RDDs
    int maxPartitions = -1;
    for (int i = 0; i < predRDDs.size(); i++) {
        if (predRDDs.get(i).partitions().length > maxPartitions) {
            maxPartitions = predRDDs.get(i).partitions().length;
        }
    }
    parallelism = maxPartitions;
}
{code}
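To make the rule in the snippet above easier to try out without a Spark cluster, here is a minimal standalone sketch of the same selection logic. It is only an illustration under stated assumptions: the class name {{ParallelismSketch}} and the method {{chooseParallelism}} are hypothetical and not part of Pig or Spark, the {{OptionalInt}} argument stands in for "spark.default.parallelism is set", and the list of integers stands in for the partition counts of the parent RDDs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.OptionalInt;

// Hypothetical helper for illustration only; not part of Pig or Spark.
final class ParallelismSketch {

    // configuredDefault: stand-in for spark.default.parallelism (empty if not set).
    // parentPartitionCounts: stand-in for predRDDs.get(i).partitions().length.
    static int chooseParallelism(OptionalInt configuredDefault,
                                 List<Integer> parentPartitionCounts) {
        if (configuredDefault.isPresent()) {
            // A configured default takes precedence.
            return configuredDefault.getAsInt();
        }
        // Otherwise take the largest partition count among the parents.
        // As in the snippet above, this stays -1 when there are no parents.
        int maxPartitions = -1;
        for (int count : parentPartitionCounts) {
            maxPartitions = Math.max(maxPartitions, count);
        }
        return maxPartitions;
    }

    public static void main(String[] args) {
        // Default configured: prints 16
        System.out.println(chooseParallelism(OptionalInt.of(16), Arrays.asList(4, 8)));
        // No default configured: prints 8
        System.out.println(chooseParallelism(OptionalInt.empty(), Arrays.asList(4, 8, 2)));
        // No default and no parents: prints -1
        System.out.println(chooseParallelism(OptionalInt.empty(), Collections.<Integer>emptyList()));
    }
}
```

The last case shows that the result is -1 when there is no configured default and no parent RDD, so a caller would probably need to handle that value separately.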
> Calculate the value of parallelism for spark mode
> -------------------------------------------------
>
> Key: PIG-4952
> URL: https://issues.apache.org/jira/browse/PIG-4952
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4952.patch, PIG-4952_1.patch
>
>
> Calculate the value of parallelism for spark mode, as
> org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter
> does for Tez.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)