[ https://issues.apache.org/jira/browse/SPARK-52185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Roczei updated SPARK-52185:
---------------------------------
    Description: 
When a Java program runs for a long time without producing any feedback or 
output, how do you determine what it might be doing and whether it is stuck? 
Thread dumps can help in such cases. A thread dump shows the status of each 
thread (running, waiting, or blocked) and which part of the code each thread 
is executing, which is important for detecting deadlocks and understanding 
what the program is actually doing.

It is best to obtain a series of thread dumps at regular intervals. Why? A 
single thread dump is only a snapshot of the threads; taking several allows us 
to see whether threads are making progress by comparing their states across 
dumps.
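As a hedged sketch of that comparison (not existing Spark tooling), two jstack-style snapshots can be diffed to flag threads that stayed in the same non-runnable state; the sample dump text and the helper names below are illustrative assumptions:

```python
import re

# Minimal jstack-style snapshots (illustrative sample data, not real output).
DUMP_1 = '''\
"main" #1 prio=5
   java.lang.Thread.State: RUNNABLE
"worker-1" #12 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
'''

DUMP_2 = '''\
"main" #1 prio=5
   java.lang.Thread.State: RUNNABLE
"worker-1" #12 prio=5
   java.lang.Thread.State: BLOCKED (on object monitor)
'''

def parse_states(dump: str) -> dict:
    """Map thread name -> state from a jstack-style dump."""
    states = {}
    name = None
    for line in dump.splitlines():
        m = re.match(r'^"([^"]+)"', line)
        if m:
            name = m.group(1)
            continue
        m = re.search(r'java\.lang\.Thread\.State: (\w+)', line)
        if m and name:
            states[name] = m.group(1)
    return states

def stuck_threads(dump_a: str, dump_b: str) -> list:
    """Threads whose non-RUNNABLE state is unchanged between two dumps."""
    a, b = parse_states(dump_a), parse_states(dump_b)
    return sorted(t for t in a if b.get(t) == a[t] and a[t] != "RUNNABLE")
```

With the samples above, stuck_threads(DUMP_1, DUMP_2) reports "worker-1", since it stayed BLOCKED across both snapshots.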

Collecting thread dump samples from slow Spark executors or drivers can be 
challenging, especially in YARN or Kubernetes environments.

The solutions currently available for debugging:

1)

Find out where the Java Virtual Machine (JVM) is running, then run the jstack 
command manually on that host.
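One possible automation of approach 1, sketched in Python under the assumption that the JDK's jstack is on the PATH of the host where the JVM runs; the function and parameter names are illustrative, and the sampling loop is defined but not executed here:

```python
import subprocess
import time

def jstack_cmd(pid: int) -> list:
    # -l adds lock/synchronizer details to the dump
    return ["jstack", "-l", str(pid)]

def collect_dumps(pid: int, count: int = 5, interval_s: float = 10.0) -> list:
    """Take `count` thread dumps of JVM `pid`, `interval_s` seconds apart."""
    dumps = []
    for i in range(count):
        out = subprocess.run(jstack_cmd(pid), capture_output=True, text=True)
        dumps.append(out.stdout)
        if i < count - 1:
            time.sleep(interval_s)
    return dumps
```

The series of dumps returned by collect_dumps can then be compared pairwise to see whether threads are progressing.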

2)

Download the thread dumps from the Spark UI. For example:

[http://localhost:4040/executors/threadDump/?executorId=driver]

3)

Download the thread dumps via Spark API. For example:
{code:bash}
curl "http://localhost:4040/api/v1/applications/local-1747400853731/executors/driver/threads"
{code}
 

The purpose of this feature request is to automate the thread dump collection.
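Approach 3 could be automated in the same spirit; a hedged standard-library sketch, where the host and application id are the example values from this issue rather than a fixed API:

```python
import json
import urllib.request

def threads_url(host: str, app_id: str, executor_id: str) -> str:
    """Build the Spark REST API endpoint for an executor's thread dump."""
    return f"{host}/api/v1/applications/{app_id}/executors/{executor_id}/threads"

def fetch_thread_dump(host: str, app_id: str, executor_id: str = "driver"):
    """Fetch one thread-dump sample (requires a running Spark UI)."""
    with urllib.request.urlopen(threads_url(host, app_id, executor_id)) as resp:
        return json.load(resp)
```

Calling fetch_thread_dump in a timed loop would yield the series of samples described above without manual curl invocations.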

  was:
Obtaining thread dumps for hung Spark executors and Spark drivers is not a 
simple task in YARN or Kubernetes environments.

The solutions currently available for debugging:

1)

Find out where the Java Virtual Machine (JVM) is running, then run the jstack 
command manually on that host.

2)

Download the thread dumps from the Spark UI. For example:

http://localhost:4040/executors/threadDump/?executorId=driver

3)

Download the thread dumps via Spark API. For example:
{code:bash}
curl "http://localhost:4040/api/v1/applications/local-1747400853731/executors/driver/threads"
{code}
 

The purpose of this feature request is to automate the thread dump collection.


> Automate the thread dump collection for Spark applications
> ----------------------------------------------------------
>
>                 Key: SPARK-52185
>                 URL: https://issues.apache.org/jira/browse/SPARK-52185
>             Project: Spark
>          Issue Type: New Feature
>          Components: Spark Core
>    Affects Versions: 4.1.0
>            Reporter: Gabor Roczei
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
