Github user MaxGekk commented on the issue:

    https://github.com/apache/spark/pull/21589
  
    > Unless there is some other compelling reason for introducing this which I 
have missed; I am -1 on introducing this change.
    
    I would like to describe one class of use cases that I think has not been
considered seriously here. You are mostly talking about cases where a cluster
is shared among many apps/users/jobs, and not all resources are available to
submitted jobs. In those cases the proposed methods are, no doubt, useless.
But there is another trend nowadays.
    
    Creating a cluster is becoming pretty cheap; the process takes a few
seconds. Our clients create a new cluster per job, and in the typical use case
one job occupies all of the cluster's resources. A cluster becomes like a
container for one job; the analogy between virtual machines and containers is
direct here. I would ask you to look at this class of use cases more
seriously. Users can spin up a new cluster for any activity in their app - one
for machine/deep learning (`numExecutors` is useful here), another one for
crunching inputs (fine-tuning of CPU usage is needed here). I believe our
users/customers are smart enough to use the proposed methods in the right way.
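    To illustrate the "one cluster per job" case: assuming the PR exposes
something like `sc.numExecutors`, a job that owns the whole cluster could
derive its shuffle parallelism from it. The helper below is a hypothetical
sketch (the name `suggestedPartitions` and the default of two task slots per
core are illustrative, not part of the proposal):

    ```scala
    object ClusterSizing {
      // Hypothetical sizing helper: given the executor count a method like
      // sc.numExecutors could report, pick a partition count for the job.
      // slotsPerCore = 2 is an illustrative oversubscription factor.
      def suggestedPartitions(numExecutors: Int,
                              coresPerExecutor: Int,
                              slotsPerCore: Int = 2): Int =
        math.max(1, numExecutors * coresPerExecutor * slotsPerCore)
    }
    ```

    A caller could then do, e.g., `rdd.repartition(ClusterSizing.suggestedPartitions(sc.numExecutors, 4))`
instead of hard-coding a partition count per cluster size.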
    
    Can we add the methods as experimental, and if we observe problems in the
upcoming releases, just remove them? /cc @gatorsmile @rxin


---
