Hi all,
I have written a program and overridden two event callbacks, onStageCompleted
and onTaskEnd. However, these two events do not provide information on when a
Task/Stage is completed.
What I want to know is which Task corresponds to which Stage of the DAG (the
Spark history server only tells me how
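
For reference, a minimal Scala sketch of those two overrides (the class name
is illustrative): SparkListenerTaskEnd carries the stageId of the finished
task, which ties each task back to its stage in the DAG, and the
StageInfo/TaskInfo objects expose completion timestamps.

import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted, SparkListenerTaskEnd}

class StageTaskListener extends SparkListener {

  override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    val s = event.stageInfo
    // completionTime is an Option[Long] (epoch millis), set once the stage finishes
    println(s"Stage ${s.stageId} completed at ${s.completionTime.getOrElse(-1L)}")
  }

  override def onTaskEnd(event: SparkListenerTaskEnd): Unit =
    // event.stageId links the task to its stage; finishTime is epoch millis
    println(s"Task ${event.taskInfo.taskId} of stage ${event.stageId} " +
      s"finished at ${event.taskInfo.finishTime}")
}

Register it with sparkContext.addSparkListener(new StageTaskListener), or via
the spark.extraListeners configuration property.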
Hi,
Start by intercepting stage completions using SparkListenerStageCompleted
[1]. That's Spark Core (jobs, stages and tasks).
Go up the execution chain to Spark SQL with SparkListenerSQLExecutionStart
[2] and SparkListenerSQLExecutionEnd [3], and correlate the information.
You may want to look at how
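
A minimal sketch of that correlation (the class name and log lines are
illustrative): Spark SQL stamps every job it submits with the owning
execution id via the spark.sql.execution.id local property, so a listener can
tie jobs, and through them stages, back to a SQL execution.

import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent, SparkListenerJobStart, SparkListenerStageCompleted}
import org.apache.spark.sql.execution.ui.{SparkListenerSQLExecutionEnd, SparkListenerSQLExecutionStart}

class SqlStageCorrelator extends SparkListener {

  override def onJobStart(event: SparkListenerJobStart): Unit =
    // the local property is only set for jobs spawned by Spark SQL
    for {
      props  <- Option(event.properties)
      execId <- Option(props.getProperty("spark.sql.execution.id"))
    } println(s"Job ${event.jobId} (stages ${event.stageIds.mkString(", ")}) " +
        s"belongs to SQL execution $execId")

  override def onStageCompleted(event: SparkListenerStageCompleted): Unit =
    println(s"Stage ${event.stageInfo.stageId} completed: ${event.stageInfo.name}")

  // Spark SQL execution events arrive through onOtherEvent
  override def onOtherEvent(event: SparkListenerEvent): Unit = event match {
    case e: SparkListenerSQLExecutionStart =>
      println(s"SQL execution ${e.executionId} started: ${e.description}")
    case e: SparkListenerSQLExecutionEnd =>
      println(s"SQL execution ${e.executionId} ended")
    case _ => ()
  }
}

Register it with spark.sparkContext.addSparkListener(new SqlStageCorrelator)
before running the query, then match the stage ids from onStageCompleted
against the job-to-execution mapping.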
Hi,
Can you give me more details or point me to a tutorial on "You'd have to
intercept execution events and correlate them. Not an easy task yet doable"?
Thank
On Wed, Apr 12, 2023 at 21:04 Jacek Laskowski
wrote:
> Hi,
>
> tl;dr it's not possible to "reverse-engineer" tasks to
Hi,
I was wondering: if it's not possible to map tasks to functions, is it
still possible to easily figure out which job and stage completed which
part of the query from the UI?
For example, in the SQL tab of the Spark UI, I am able to see the query and
the Job IDs for that query. However,
Hi,
tl;dr it's not possible to "reverse-engineer" tasks to functions.
In essence, Spark SQL is an abstraction layer over the RDD API, which is made
up of partitions and tasks. Tasks are Scala functions (possibly with some
Python for PySpark). A simple-looking high-level operator like
DataFrame.join can
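
To see that expansion concretely, here is a self-contained sketch (the data
and names are illustrative) that prints the plans a single DataFrame.join
compiles into; none of the generated tasks carries a reference back to join
itself.

import org.apache.spark.sql.SparkSession

object JoinPlanDemo extends App {
  val spark = SparkSession.builder()
    .appName("join-plan-demo")
    .master("local[*]")
    .getOrCreate()
  import spark.implicits._

  val left  = Seq((1, "a"), (2, "b")).toDF("id", "l")
  val right = Seq((1, "x"), (3, "y")).toDF("id", "r")

  // explain(true) prints the parsed, analyzed, optimized and physical plans;
  // the physical plan (exchanges, sorts, joins) is what becomes stages and tasks
  left.join(right, "id").explain(true)

  spark.stop()
}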