[ https://issues.apache.org/jira/browse/SPARK-47475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiale Tan updated SPARK-47475:
------------------------------
    Description: 
{*}Context{*}:
To submit Spark jobs to Kubernetes in cluster mode, {{spark-submit}} is triggered twice. 
The first time, {{SparkSubmit}} runs in k8s cluster mode: it appends the primary resource to {{spark.jars}} and calls {{KubernetesClientApplication::start}} to create a driver pod. 
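
For illustration, here is a minimal Scala sketch of the effective {{spark.jars}} change in this first pass. The jar URLs and the helper are hypothetical, and the sketch does not reproduce the actual {{SparkSubmit}} code:
{code:scala}
import org.apache.spark.SparkConf

// Simplified simulation of the first (cluster-mode) pass: the primary resource
// jar is appended to whatever the user already put in spark.jars via --jars.
object FirstPassSketch {
  def appendPrimaryResource(conf: SparkConf, primaryResource: String): SparkConf = {
    val existing = conf.get("spark.jars", "")
    val merged   = if (existing.isEmpty) primaryResource else s"$existing,$primaryResource"
    conf.set("spark.jars", merged)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf(false).set("spark.jars", "s3a://bucket/dep.jar") // hypothetical --jars entry
    appendPrimaryResource(conf, "s3a://bucket/app.jar")                       // hypothetical primary resource
    println(conf.get("spark.jars"))
    // s3a://bucket/dep.jar,s3a://bucket/app.jar
  }
}
{code}
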
The driver pod then runs {{spark-submit}} again with the same primary resource jar. This time, however, {{SparkSubmit}} runs in client mode, with {{spark.kubernetes.submitInDriver}} set to {{true}} and with the updated {{spark.jars}}. In this mode, all the jars in {{spark.jars}} are downloaded to the driver, and their URLs are replaced with the corresponding driver-local paths. 
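
A rough sketch of the URL rewrite this in-driver pass effectively performs; the working directory and jar names are hypothetical, and the real code also downloads each jar before rewriting its entry:
{code:scala}
// Simplified simulation of the second (client-mode, in-driver) pass: every jar
// listed in spark.jars is fetched to a local working directory and its entry is
// rewritten to the driver-local path of the downloaded copy.
object DriverDownloadSketch {
  def rewriteToLocal(jars: Seq[String], workDir: String): Seq[String] =
    jars.map { url =>
      val fileName = url.substring(url.lastIndexOf('/') + 1)
      s"file:$workDir/$fileName" // the real code downloads the jar before rewriting
    }

  def main(args: Array[String]): Unit = {
    val jars = Seq("s3a://bucket/dep.jar", "s3a://bucket/app.jar") // state after the first pass
    println(rewriteToLocal(jars, "/opt/spark/work-dir"))
    // List(file:/opt/spark/work-dir/dep.jar, file:/opt/spark/work-dir/app.jar)
  }
}
{code}
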
{{SparkSubmit}} then appends the same primary resource to {{spark.jars}} again. As a result, {{spark.jars}} contains two entries for the same primary resource: one with the original URL the user submitted, the other with the driver-local file path. 
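
Putting the two passes together, the resulting {{spark.jars}} looks roughly like the hypothetical end state below (all paths are made up for illustration):
{code:scala}
// Hypothetical end state of spark.jars after the in-driver pass appends the
// primary resource a second time: the same application jar appears twice,
// once as the downloaded driver-local copy and once under its original URL.
object DuplicatedJarsSketch {
  def main(args: Array[String]): Unit = {
    val afterDriverRewrite = Seq(
      "file:/opt/spark/work-dir/dep.jar",
      "file:/opt/spark/work-dir/app.jar")                        // driver-local copy of the primary resource
    val sparkJars = afterDriverRewrite :+ "s3a://bucket/app.jar" // primary resource appended again
    println(sparkJars.mkString(","))
    // file:/opt/spark/work-dir/dep.jar,file:/opt/spark/work-dir/app.jar,s3a://bucket/app.jar
  }
}
{code}
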
Later, when the driver starts the {{SparkContext}}, it copies all of {{spark.jars}} into {{spark.app.initial.jar.urls}} and replaces the driver-local jar paths in {{spark.app.initial.jar.urls}} with driver file service URLs. 
At this point, every jar passed via {{--jars}} or {{spark.jars}} in the original user submission has been replaced with a driver file service URL and added to {{spark.app.initial.jar.urls}}, while the primary resource jar from the original submission shows up in {{spark.app.initial.jar.urls}} twice: once with the original path from the user submission and once with a driver file service URL. 
When executors start, they download all the jars listed in {{spark.app.initial.jar.urls}}. 
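
So by the time executors request jars, {{spark.app.initial.jar.urls}} looks roughly like the sketch below. The driver file service hostname, port, and URL scheme are only indicative assumptions, not values taken from this report:
{code:scala}
// Hypothetical end state of spark.app.initial.jar.urls: driver-local paths have
// been replaced with driver file service URLs, while the re-appended original
// URL of the primary resource is kept as-is, so every executor fetches the same
// application jar twice.
object InitialJarUrlsSketch {
  def main(args: Array[String]): Unit = {
    val sparkJars = Seq(
      "file:/opt/spark/work-dir/dep.jar",
      "file:/opt/spark/work-dir/app.jar",
      "s3a://bucket/app.jar")
    val initialJarUrls = sparkJars.map {
      case local if local.startsWith("file:") =>
        "spark://driver-svc.ns.svc:7078/jars/" + local.substring(local.lastIndexOf('/') + 1)
      case remote => remote
    }
    initialJarUrls.foreach(println)
    // spark://driver-svc.ns.svc:7078/jars/dep.jar
    // spark://driver-svc.ns.svc:7078/jars/app.jar   <- served by the driver
    // s3a://bucket/app.jar                          <- same jar again, original URL
  }
}
{code}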

{*}Issues{*}:
 - The executor downloads two copies of the same primary resource, one from the original URL the user submitted and one from the copy served by the driver, which wastes resources.
 - When the jars are large and the application requests many executors, the massive concurrent jar downloads from the driver saturate the network. The executors' jar downloads then time out, and the executors are terminated. From the user's point of view, the application is stuck in a loop of massive executor loss and re-provisioning and never reaches the requested number of live executors, which leads to SLA breaches or, sometimes, outright failure (see the rough arithmetic sketch below).
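
As a back-of-envelope illustration of the second issue, all numbers below (jar size, executor count, driver bandwidth, timeout) are hypothetical, not measurements from this report:
{code:scala}
// Rough arithmetic only: with many executors fetching large jars from the driver
// at the same time, each executor's share of the driver's egress bandwidth can
// make a single download take far longer than a typical fetch timeout.
object SaturationSketch {
  def main(args: Array[String]): Unit = {
    val jarBytes        = 500L * 1024 * 1024          // hypothetical 500 MB application jar
    val copiesPerExec   = 2                            // duplicated primary resource
    val executors       = 400                          // hypothetical executor count
    val driverEgressBps = 10L * 1000 * 1000 * 1000 / 8 // hypothetical 10 Gbit/s driver NIC

    val perExecBps   = driverEgressBps.toDouble / executors
    val downloadSecs = jarBytes.toDouble * copiesPerExec / perExecBps
    println(f"estimated per-executor download time: $downloadSecs%.0f s")
    // ~336 s, far above e.g. a 60 s fetch timeout, so executors are killed and re-provisioned
  }
}
{code}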

  was:
{*}Issues{*}:
 - The executor downloads two copies of the same primary resource, one from the original URL the user submitted and one from the copy served by the driver, which wastes resources.
 - When the jars are large and the application requests many executors, the massive concurrent jar downloads from the driver saturate the network. The executors' jar downloads then time out, and the executors are terminated. From the user's point of view, the application is stuck in a loop of massive executor loss and re-provisioning and never reaches the requested number of live executors, which leads to SLA breaches or, sometimes, outright failure.

{*}Root Cause{*}:
To submit Spark jobs to Kubernetes in cluster mode, {{spark-submit}} is triggered twice. 
The first time, {{SparkSubmit}} runs in k8s cluster mode: it appends the primary resource to {{spark.jars}} and calls {{KubernetesClientApplication::start}} to create a driver pod. 
The driver pod then runs {{spark-submit}} again with the same primary resource jar. This time, however, {{SparkSubmit}} runs in client mode, with {{spark.kubernetes.submitInDriver}} set to {{true}} and with the updated {{spark.jars}}. In this mode, all the jars in {{spark.jars}} are downloaded to the driver, and their URLs are replaced with the corresponding driver-local paths. 
{{SparkSubmit}} then appends the same primary resource to {{spark.jars}} again. As a result, {{spark.jars}} contains two entries for the same primary resource: one with the original URL the user submitted, the other with the driver-local file path. 
Later, when the driver starts the {{SparkContext}}, it copies all of {{spark.jars}} into {{spark.app.initial.jar.urls}} and replaces the driver-local jar paths in {{spark.app.initial.jar.urls}} with driver file service URLs. 
At this point, every jar passed via {{--jars}} or {{spark.jars}} in the original user submission has been replaced with a driver file service URL and added to {{spark.app.initial.jar.urls}}, while the primary resource jar from the original submission shows up in {{spark.app.initial.jar.urls}} twice: once with the original path from the user submission and once with a driver file service URL. 
When executors start, they download all the jars listed in {{spark.app.initial.jar.urls}}. 


> Jar Download Under K8s Cluster Mode Causes Executors Scaling Issues 
> --------------------------------------------------------------------
>
>                 Key: SPARK-47475
>                 URL: https://issues.apache.org/jira/browse/SPARK-47475
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Kubernetes
>    Affects Versions: 3.4.0, 3.5.0
>            Reporter: Jiale Tan
>            Priority: Major
>


