[ https://issues.apache.org/jira/browse/SPARK-56326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved SPARK-56326.
----------------------------------
    Fix Version/s: 4.2.0
       Resolution: Fixed

Issue resolved by pull request 55166
[https://github.com/apache/spark/pull/55166]

> Add Streaming query id and batch id to task scheduling logs
> -----------------------------------------------------------
>
>                 Key: SPARK-56326
>                 URL: https://issues.apache.org/jira/browse/SPARK-56326
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler, Structured Streaming
>    Affects Versions: 4.2.0
>            Reporter: Brooks Walls
>            Assignee: Brooks Walls
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 4.2.0
>
>
> Currently, task scheduling logs do not include the query id or batch id 
> for streaming queries. This makes streaming queries hard to debug, 
> especially when multiple queries are running at once. Let's add the query 
> id and batch id to every task scheduling log emitted while processing a 
> streaming query.
> The current logs involving task scheduling look like this:
> {code:java}
> 6/03/11 22:03:29 INFO FairSchedulableBuilder: Added task set TaskSet_486190.0 
> tasks to pool 1772179380933
> 6/03/11 22:03:29 INFO TaskSetManager: Starting task 0.0 in stage 486190.0 
> (TID 3075017) (10.68.141.175,executor 13, partition 0, PROCESS_LOCAL, 
> 6/03/11 22:03:29 INFO TaskSetManager: Finished task 0.0 in stage 486190.0 
> (TID 3075017) in 52 ms on 10.68.141.175 (executor 13) (1/1){code}
> Let's add query and batch information:
> {code:java}
> 6/03/11 22:03:29 INFO FairSchedulableBuilder: [queryId = 71c67] [batchId = 
> 685] Added task set TaskSet_486190.0 tasks to pool 1772179380933
> 6/03/11 22:03:29 INFO TaskSetManager: [queryId = 71c67] [batchId = 685] 
> Starting task 0.0 in stage 486190.0 (TID 3075017) (10.68.141.175,executor 13, 
> partition 0, PROCESS_LOCAL,
> 6/03/11 22:03:29 INFO TaskSetManager: [queryId = 71c67] [batchId = 685] 
> Finished task 0.0 in stage 486190.0 (TID 3075017) in 52 ms on 10.68.141.175 
> (executor 13) (1/1){code}
> *Impact:* This change is strictly additive to logs and does not change any 
> public APIs or internal scheduling logic.
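
The prefix format proposed above can be illustrated with a minimal Python sketch. It is not the code from the pull request: the helper names `with_streaming_context` and `parse_streaming_context` are hypothetical, and the only thing taken from the issue is the `[queryId = ...] [batchId = ...]` layout shown in the sample logs. The first helper builds a prefixed message, and the second recovers the ids from a log line, which is one way to separate interleaved logs from several queries.

```python
import re
from typing import Optional, Tuple

# Hypothetical helpers for illustration only; not Spark's implementation.

def with_streaming_context(message: str,
                           query_id: Optional[str],
                           batch_id: Optional[int]) -> str:
    """Prefix a scheduling log message with streaming ids when both are known."""
    if query_id is None or batch_id is None:
        # Assumption: non-streaming jobs keep the original message unchanged.
        return message
    return f"[queryId = {query_id}] [batchId = {batch_id}] {message}"


# Matches the prefix layout used in the sample logs of this issue.
PREFIX = re.compile(
    r"\[queryId = (?P<query_id>[^\]]+)\] \[batchId = (?P<batch_id>\d+)\]"
)


def parse_streaming_context(line: str) -> Optional[Tuple[str, int]]:
    """Return (query_id, batch_id) if the line carries the prefix, else None."""
    match = PREFIX.search(line)
    if match is None:
        return None
    return match.group("query_id"), int(match.group("batch_id"))
```

With helpers like these, a log file could be filtered to one query by keeping only the lines for which `parse_streaming_context` returns the query id of interest.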



--
This message was sent by Atlassian Jira
(v8.20.10#820010)