[ 
https://issues.apache.org/jira/browse/SPARK-33864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948007#comment-17948007
 ] 

Ramesha Bhatta commented on SPARK-33864:
----------------------------------------

Livy has two modes, and its batch job submission is similar to spark-submit and 
does not share a session.  Sharing a session is not even workable when resources 
must be allocated per job.  With Livy we could host multiple Livy servers for 
scaling; however, what I am pointing out in this post is a general efficiency issue.

> How can we submit or initiate multiple Spark applications with a single JVM 
> (or a few JVMs)?
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-33864
>                 URL: https://issues.apache.org/jira/browse/SPARK-33864
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 2.4.5
>            Reporter: Ramesha Bhatta
>            Priority: Major
>
> How can a single JVM (or a few JVM processes) submit multiple applications to 
> the cluster?
> It is observed that each spark-submit opens up to 400 JARs of more than 1 GB 
> in size, creates a _spark_conf_XXXX.zip in /tmp, and copies it into the 
> application-specific .staging directory.  When submissions run concurrently, 
> the number of JVMs a server can support is limited, and the CPU stays at 100% 
> from job submission until the client Java processes start exiting.
> Initially we thought that creating the zip file and distributing it to HDFS 
> for each application was the source of the issue.  However, reducing the zip 
> file size by 50% made little difference, which indicates that the main source 
> of the problem is the number of Java processes on the client side.
> The direct impact is that any submission with concurrency above 40 (the 
> number of hyperthreaded cores) leads to failures and CPU overload on the 
> gateway.  We tried Livy, but noticed that in the background it also runs a 
> spark-submit, so the same problem persists: we get "response code 404" and 
> observe the same CPU overload on the server running Livy.  The concurrency 
> comes from mini-batches arriving over REST, and we are trying to support 
> 2000+ concurrent requests as long as the cluster has the resources.  For 
> that, spark-submit is the major bottleneck because of the situation explained 
> above.  For JAR distribution we have more than one work-around: (1) 
> pre-distribute the JARs to a specified folder and refer to them with the 
> local: scheme, or (2) stage the JARs in an HDFS location and specify the HDFS 
> reference, so there is no file copy per application.
> Is there a way to create a service (or services) that stays running and 
> submits jobs to the cluster?  For running an application in client mode it 
> makes sense to open 400+ JARs; however, just for submitting an application to 
> the cluster we could have a simple, lightweight process that runs as a 
> service.
> Regards,
> -Ramesh
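Work-around (2) above, staging the JARs in HDFS, can be expressed through Spark-on-YARN configuration. A minimal sketch, assuming the jars have already been copied to a site-specific HDFS directory (the paths and class name below are placeholders, not from the ticket):

```shell
# Point Spark at a pre-staged jar directory on HDFS so spark-submit
# does not zip and upload the jars for every application.
# hdfs:///apps/spark/jars/ and the app jar path are assumed, site-specific paths.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.jars="hdfs:///apps/spark/jars/*.jar" \
  --class com.example.MyApp \
  hdfs:///apps/myapp/myapp.jar
```

An equivalent option is `spark.yarn.archive`, which points at a single pre-built archive of the Spark jars on HDFS; either setting avoids the per-application upload, but neither removes the per-submission client JVM that the ticket is mainly about.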
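On the suggestion of a long-running submission service: Spark ships a programmatic launcher API, and since Spark 2.3 the `InProcessLauncher` runs the submission logic inside the calling JVM instead of forking a spark-submit child process, so one resident JVM can submit many applications. A hedged sketch, assuming the Spark launcher classes are on the classpath; the HDFS paths and `com.example.MyApp` are placeholders:

```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

public class SubmitService {
    public static void main(String[] args) throws Exception {
        // InProcessLauncher (Spark 2.3+) performs the submission in the
        // current JVM rather than spawning a spark-submit subprocess,
        // avoiding the per-application client JVM described in the ticket.
        SparkAppHandle handle = new InProcessLauncher()
            .setAppResource("hdfs:///apps/myapp/myapp.jar")  // placeholder path
            .setMainClass("com.example.MyApp")               // placeholder class
            .setMaster("yarn")
            .setDeployMode("cluster")
            .startApplication();

        // The handle reports state transitions (SUBMITTED, RUNNING, ...)
        // without keeping a heavyweight client process alive per application.
        while (!handle.getState().isFinal()) {
            Thread.sleep(1000);
        }
    }
}
```

A service built this way could accept the REST mini-batch requests mentioned above and fan them out through a bounded pool of launcher calls, rather than forking one spark-submit JVM per request.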



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
