[ https://issues.apache.org/jira/browse/PIG-4952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15689310#comment-15689310 ]

liyunzhang_intel commented on PIG-4952:
---------------------------------------

[~nkollar]: thanks for taking on this big piece of work. The goal of this JIRA is to 
calculate a proper value of parallelism according to the CPU or memory resources 
available on the cluster. The article [How-to: Tune Your Apache Spark Jobs (Part 
2)|http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/] 
says:
{quote}
The primary concern is that the number of tasks will be too small. If there are 
fewer tasks than slots available to run them in, the stage won’t be taking 
advantage of all the CPU available.
{quote}

Currently I am not very clear on how to calculate the best value of parallelism 
from the CPU or memory resources available. Do you have any thoughts?
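
For discussion, here is a minimal sketch of one possible CPU-based heuristic. It 
assumes we size parallelism off the total cores Spark reports and schedule a 
couple of task waves per core; the multiplier follows the rule of thumb in the 
Cloudera article, and the class name and constant below are hypothetical, not 
anything Pig currently has:

{code:java}
import org.apache.spark.api.java.JavaSparkContext;

public class ParallelismEstimator {
    // Assumption: aim for ~2 waves of tasks per core, per the Cloudera
    // article's advice to err on the side of more, smaller tasks.
    private static final int TASKS_PER_CORE = 2;

    public static int estimateParallelism(JavaSparkContext sc) {
        // defaultParallelism is the total number of executor cores on a
        // cluster manager (or the local thread count in local mode).
        int totalCores = sc.defaultParallelism();
        return Math.max(1, totalCores * TASKS_PER_CORE);
    }
}
{code}

Sizing from memory seems harder: it would need estimates of the input size per 
task at plan time, which we may not have. Interested in other ideas here.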

> Calculate the value of parallelism for spark mode
> -------------------------------------------------
>
>                 Key: PIG-4952
>                 URL: https://issues.apache.org/jira/browse/PIG-4952
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>
>         Attachments: PIG-4952_1.patch
>
>
> Calculate the value of parallelism for spark mode like what 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.optimizer.ParallelismSetter does.


