[ https://issues.apache.org/jira/browse/SPARK-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-5319:
-----------------------------
    Component/s: Spark Core

> Choosing partition size instead of count
> ----------------------------------------
>
>                 Key: SPARK-5319
>                 URL: https://issues.apache.org/jira/browse/SPARK-5319
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: Spark Core
>            Reporter: Idan Zalzberg
>
> With the current API, there are multiple places where you can set the 
> partition count when reading from sources.
> However, in my experience it is sometimes more useful to set the partition 
> size (in MB) and infer the count from that.
> In my experience, Spark is sensitive to partition size: if partitions are 
> too big, the memory needed per core goes up, and if they are too small, 
> stage times increase significantly. So I'd like to stay in the "sweet spot" 
> of partition size without fiddling with the partition count until I find it.
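> As a minimal sketch of what I mean (the helper name and the idea of passing 
> in the total input size are hypothetical, not an existing API), a partition 
> count could be derived from a target partition size like this:
>
> // Hypothetical helper: derive a partition count from a target
> // partition size (MB), given the total input size in bytes.
> def partitionCountForSize(totalSizeBytes: Long, targetPartitionSizeMB: Int): Int = {
>   val targetBytes = targetPartitionSizeMB.toLong * 1024 * 1024
>   // Round up so no partition exceeds the target; always at least one.
>   math.max(1, math.ceil(totalSizeBytes.toDouble / targetBytes).toInt)
> }
>
> // Usage with the existing count-based API, e.g.:
> // val numPartitions = partitionCountForSize(fileSizeBytes, 128)
> // val rdd = sc.textFile(path, minPartitions = numPartitions)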


