Hello,

While looking through the Spark physical plans recorded in the Spark history 
server log to find bottlenecks in my code, I came across an ID that shows up 
in a partitioning stage.
My goal is to use the history server log to produce a meaningful analysis of 
my Spark system's performance. With this goal in mind, I am trying to connect 
Spark physical plans to stage IDs, which carry useful information that I can 
tie back to my code. Below is a snippet from one of the physical plans.
+- *(2) Sort [Column#46 ASC NULLS FIRST], true, 0
    +- Exchange hashpartitioning(ColumnId#329, 200), ENSURE_REQUIREMENTS, [id=#278]


What exactly does [id=#278] refer to?
I have seen some examples that say this ID refers to a specific partition, 
a stage ID, or a plan_id, but I have not been able to confirm which one it is.
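
For context, the kind of extraction I have in mind can be sketched roughly as 
below. This is only an illustration in Python: the regular expression and the 
helper name extract_node_ids are hypothetical, and it assumes the wrapped plan 
line has been joined back into one line. It pulls out the operator name and the 
bracketed ID from the plan text; it does not say anything about what the ID 
means, which is still my question.

```python
import re

# Plan text copied from the snippet above, with the wrapped line joined.
plan_text = """\
+- *(2) Sort [Column#46 ASC NULLS FIRST], true, 0
    +- Exchange hashpartitioning(ColumnId#329, 200), ENSURE_REQUIREMENTS, [id=#278]
"""

# Hypothetical pattern: operator name at the start of a plan line, followed
# later on the same line by a bracketed [id=#N] marker.
NODE_ID = re.compile(r"\+- (?:\*\(\d+\) )?(\w+).*\[id=#(\d+)\]")


def extract_node_ids(text):
    """Return (operator, id) pairs for every plan line carrying [id=#N]."""
    return [(m.group(1), int(m.group(2))) for m in NODE_ID.finditer(text)]


print(extract_node_ids(plan_text))
```

Only the Exchange line is reported here, since the Sort line carries no 
[id=#...] marker.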

Thank you,
Tahj
