Tengfei Huang created SPARK-52006:
-------------------------------------

             Summary: CollectMetricsExec's accumulator should be excluded from 
heartbeats and event logs
                 Key: SPARK-52006
                 URL: https://issues.apache.org/jira/browse/SPARK-52006
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.5.5
            Reporter: Tengfei Huang


CollectMetricsExec uses an accumulator(AggregatingAccumulator) for emitting 
metrics in a side-channel that can be collected to the driver.

Unfortunately, this inherits many default accumulator behaviors, some of which 
are undesirable:
 * This is a named accumulator but isn’t an internal accumulator (i.e. its name 
doesn’t start with the internal task metrics prefix), so its result is shown in 
the Spark UI. But that result isn’t meaningful to users because it’s the 
`.toString` representation of an UnsafeRow representing an aggregation buffer.
 * This same string representation is also logged in event logs on a per-task 
basis, slowing down event logging and bloating log size.
 * The underlying accumulator value is also retained by the live UI even though 
it’s not displayed on a per-task basis, bloating driver memory consumption.
 * Even though our code tries to only publish a metric update once at the end 
of the task, there is a race condition between task completion listeners and 
metric heartbeater deregistration(stemming from the division of 
responsibilities between Task.scala and Executor.scala) which will lead to 
accumulator serialization issues.

Based on the above undesirable facts, we should give this metric a special 
internal accumulator name and should use that name to explicitly exclude it 
from heartbeats and from event logs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to