[ 
https://issues.apache.org/jira/browse/SPARK-26329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745313#comment-16745313
 ] 

Wing Yew Poon commented on SPARK-26329:
---------------------------------------

I am working on this. In the executor, we can track what stages have a positive 
number of running tasks (when a task starts, we can get its stage id and 
attempt number). When polling the executor memory metrics, we attribute the 
memory to the active stage(s), and update the peaks. In a heartbeat, we send 
the per-stage peaks (for stages active at that time), and then reset the peaks. 
The semantics would be that the per-stage peaks sent in each heartbeat are the 
peaks since the last heartbeat. In addition, we keep a map of task ids to 
metrics (for running tasks), which tracks the peaks of the metrics and the 
polling thread updates this as well. At task end, we send the peak values 
associated with the task in the task result. These are the peak values of the 
executor metrics during the lifetime of the task. (Of course, this does not 
mean that that task alone contributed to those peaks, only that those were the 
peak memory values seen while that task was running.)
 If between heartbeats, a stage completes, so there are no more running tasks 
for that stage, then in the next heartbeat, there are no metrics sent for that 
stage; however, at the end of a task that belonged to that stage, the metrics 
would have been sent in the task result, so we do not lose those peaks.
 We continue to do the stage-level aggregation in the EventLoggingListener.
 For the driver, I do not plan to poll more frequently. I think this is ok 
since most memory issues are with executors rather than with the driver. We 
will still poll when the driver heartbeats. What the driver sends will be the 
current values of the metrics in the driver at the time of the heartbeat. This 
is semantically the same as before.

> ExecutorMetrics should poll faster than heartbeats
> --------------------------------------------------
>
>                 Key: SPARK-26329
>                 URL: https://issues.apache.org/jira/browse/SPARK-26329
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Web UI
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> We should allow faster polling of the executor memory metrics (SPARK-23429 / 
> SPARK-23206) without requiring a faster heartbeat rate.  We've seen the 
> memory usage of executors pike over 1 GB in less than a second, but 
> heartbeats are only every 10 seconds (by default).  Spark needs to enable 
> fast polling to capture these peaks, without causing too much strain on the 
> system.
> In the current implementation, the metrics are polled along with the 
> heartbeat, but this leads to a slow rate of polling metrics by default.  If 
> users were to increase the rate of the heartbeat, they risk overloading the 
> driver on a large cluster, with too many messages and too much work to 
> aggregate the metrics.  But, the executor could poll the metrics more 
> frequently, and still only send the *max* since the last heartbeat for each 
> metric.  This keeps the load on the driver the same, and only introduces a 
> small overhead on the executor to grab the metrics and keep the max.
> The downside of this approach is that we still need to wait for the next 
> heartbeat for the driver to be aware of the new peak.   If the executor dies 
> or is killed before then, then we won't find out.  A potential future 
> enhancement would be to send an update *anytime* there is an increase by some 
> percentage, but we'll leave that out for now.
> Another possibility would be to change the metrics themselves to track peaks 
> for us, so we don't have to fine-tune the polling rate.  For example, some 
> jvm metrics provide a usage threshold, and notification: 
> https://docs.oracle.com/javase/7/docs/api/java/lang/management/MemoryPoolMXBean.html#UsageThreshold
> But, that is not available on all metrics.  This proposal gives us a generic 
> way to get a more accurate peak memory usage for *all* metrics.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to