Github user mridulm commented on the issue:

    https://github.com/apache/spark/pull/21589
  
    
    I am not convinced by the rationale given for adding the new APIs in the JIRA.
    The examples given there can easily be modeled using `defaultParallelism` (to get the current state) and executor events (to get the number of cores and the memory per executor).
    For example: `df.repartition(5 * sc.defaultParallelism)`
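    
    As a sketch of the executor-events approach (the tracker class and its names here are hypothetical, while `SparkListener`, `onExecutorAdded`/`onExecutorRemoved`, and `ExecutorInfo.totalCores` are existing Spark APIs), something along these lines gives the per-executor core counts without any new API:
    
    ```scala
    import scala.collection.concurrent.TrieMap
    
    import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorAdded, SparkListenerExecutorRemoved}
    
    // Hypothetical helper: track cores of live executors from executor events.
    class ExecutorCoreTracker extends SparkListener {
      private val coresByExecutor = TrieMap.empty[String, Int]
    
      override def onExecutorAdded(event: SparkListenerExecutorAdded): Unit = {
        coresByExecutor.put(event.executorId, event.executorInfo.totalCores)
      }
    
      override def onExecutorRemoved(event: SparkListenerExecutorRemoved): Unit = {
        coresByExecutor.remove(event.executorId)
      }
    
      // Current total cores across live executors.
      def totalCores: Int = coresByExecutor.values.sum
    }
    
    // Usage sketch:
    // val tracker = new ExecutorCoreTracker
    // sc.addSparkListener(tracker)
    // ...
    // df.repartition(5 * sc.defaultParallelism)
    ```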
    
    The other argument seems to be that users can override this value and set it to a static constant.
    Users are not expected to override it unless they want fine-grained control over the value, and Spark is expected to honor it when specified.
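    
    For reference, the static override in question is typically the `spark.default.parallelism` setting (the value below is just an illustration):
    
    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    
    // Pin defaultParallelism to a static constant for the whole application.
    val conf = new SparkConf().set("spark.default.parallelism", "200")
    val sc = new SparkContext(conf)
    ```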
    
    One thing to keep in mind is that dynamic resource allocation kicks in after tasks are submitted (when there are insufficient resources available), so trying to fine-tune an application with these APIs in the presence of DRA is not going to be effective anyway.
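    
    For context, DRA here is the standard dynamic allocation configuration (values below are illustrative); executors are requested only after tasks back up, so a resource count sampled up front can already be stale by the time the stage runs:
    
    ```scala
    import org.apache.spark.SparkConf
    
    // Typical dynamic resource allocation settings.
    val conf = new SparkConf()
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.minExecutors", "2")
      .set("spark.dynamicAllocation.maxExecutors", "50")
    ```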
    
    If there are corner cases where `defaultParallelism` is not accurate, we should fix those so that it reflects the current value.

