[ https://issues.apache.org/jira/browse/SPARK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14206263#comment-14206263 ]
Hong Shen commented on SPARK-4341:
----------------------------------

There must be some relation between inputSplits, num-executors and Spark parallelism. For example, if the number of inputSplits (which determines the number of partitions of the input RDD) is less than num-executors, resources are wasted; if the number of inputSplits is much larger than num-executors, the job will take a long time. The same holds between num-executors and Spark parallelism. So if we want Spark to be widely used, these values should be set by Spark automatically.

> Spark need to set num-executors automatically
> ---------------------------------------------
>
>                 Key: SPARK-4341
>                 URL: https://issues.apache.org/jira/browse/SPARK-4341
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.1.0
>            Reporter: Hong Shen
>
> A MapReduce job can set the number of map tasks automatically, but in Spark we have to set
> num-executors, executor memory and cores. It is difficult for users to set
> these arguments, especially for users who want to use Spark SQL. So when the user
> hasn't set num-executors, Spark should set num-executors automatically
> according to the input partitions.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
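
The relation described in the comment above can be illustrated with a small sketch. This is not how Spark chooses executors; it is a hypothetical heuristic, assuming one task slot per core and one task per input partition. The function name `suggest_num_executors` and the parameters `cores_per_executor` and `max_executors` are made up for illustration and are not Spark APIs.

```python
import math


def suggest_num_executors(num_input_partitions, cores_per_executor=2, max_executors=100):
    """Hypothetical heuristic, not Spark behaviour.

    Assumes each executor core runs one task at a time and each input
    partition becomes one task, so the whole input fits in a single wave
    when executors * cores >= partitions.
    """
    if num_input_partitions < 0:
        raise ValueError("num_input_partitions must be non-negative")
    if cores_per_executor < 1 or max_executors < 1:
        raise ValueError("cores_per_executor and max_executors must be >= 1")

    # Enough executors to give every partition its own task slot.
    needed = math.ceil(num_input_partitions / cores_per_executor)

    # Never ask for fewer than one executor or more than the assumed cluster cap.
    return max(1, min(needed, max_executors))


if __name__ == "__main__":
    # Few partitions: a small request avoids idle executors.
    print(suggest_num_executors(10, cores_per_executor=4))
    # Many partitions: the request is capped, so tasks run in several waves.
    print(suggest_num_executors(1000, cores_per_executor=4, max_executors=50))
```

With few partitions the suggestion stays small, which avoids the wasted resources mentioned above; with many partitions it grows until it reaches the cap, after which the job simply runs in more waves.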