Thanks David for the feedback!

> CheckpointStatsCache is also populated using the "cached execution graph,"
> so there is nothing to gain from the "staleness" pov; see
> AbstractCheckpointHandler for more details.


You are right about the CheckpointStatisticsCache. Sorry I was referring to the 
“caching” done in the CheckpointStatsTracker directly, I’m not sure why I 
confusingly said “CheckpointStatsCache”. Let me clarify!

1. At the moment, the only thing that the CheckpointingStatisticsHandler 
requires from the ExecutionGraph is the CheckpointStatsSnapshot object.
2. This ExecutionGraph is the object that is cached, and is not “refreshed” 
when the contents change. This means that the CheckpointStatsSnapshot can be up 
to 3s stale.
3. We could overcome this “staleness” by reducing the cache period of the 
ExecutionGraph, however, this same cache object is used by many other handlers. 
[2] This means reducing the cache would have the following performance impact:
  - Increased RPC messages (from all handlers)
  - Incur reconstruction of the entire job graph, can be expensive for large 
graphs.
  - Also increases the Flink dashboard refresh rate (can be overcome by 
separating out the config)
4. Given the above, we could simplify the internals of 
CheckpointingStatisticsHandler to retrieve just the updated copy of the 
CheckpointStatsSnapshot object from the JobMaster directly. Since there is a 
caching in the CheckpointStatsTracker [3], we will only insure increased RPC 
messages that will be processed quickly, since there is a cache that is 
invalidated when a new checkpoint is triggered.

One concern I can think of is the increased RPC message call, but since the 
request will be resolved quickly, this should be ok.

Let me know what you think! 

[1] 
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rest/handler/job/checkpoints/CheckpointingStatisticsHandler.java#L104
[2] 
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/webmonitor/WebMonitorEndpoint.java#L343-L432
[3] 
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointStatsTracker.java#L117-L141

> This sounds reasonable as long as it falls back to "web.refresh-interval"
> when not defined. For consistency reasons, it should be also named
> "rest.cache-timeout”

Yep, the fallback sounds good, to maintain backwards compatibility. I reckon we 
could just start with `rest.cache-timeout.default` (for future compatibility, 
for example, we could have timeouts for different caches 
`rest.cache-timeout.execution-graph` or 
`rest.cache-timeout.checkpoint-statistics`).  

> In general, I'd be in favor of this ("rest.cache-timeout" would then need
> to become "rest.default-cache-timeout"), but I need to see a detailed FLIP
> because in my mind this could get quite complicated.

I like that it uses the industry standards, but I agree, we need to think 
carefully about the multiple layers of cache we have included in the Flink JM. 
Will take a look at this.

Let me know your thoughts!

Regards,
Hong




> On 26 Jun 2023, at 13:26, David Morávek <d...@apache.org> wrote:
> 
> Hi Hong,
> 
> Thanks for starting the discussion.
> 
> seems to be using the cached version of the entire Execution graph (stale
>> data), when it could just use the CheckpointStatsCache directly
> 
> 
> CheckpointStatsCache is also populated using the "cached execution graph,"
> so there is nothing to gain from the "staleness" pov; see
> AbstractCheckpointHandler for more details.
> 
> Anyone aware of a reason we don’t do this already?
>> 
> 
> The CheckpointStatsCache is populated lazily on the request for a
> particular checkpoint (so it might not have a full view); the used data
> structure is also slightly different; one more thing is that
> CheckpointStatsCache is meant for different purpose -> keeping a particular
> checkpoint around while it's being investigated. Otherwise, it might
> expire; using it for "overview" would break this.
> 
> Configuration for web.refresh-interval controls both dashboard refresh rate
>> and ExecutionGraph cache
>> 
> 
> This sounds reasonable as long as it falls back to "web.refresh-interval"
> when not defined. For consistency reasons, it should be also named
> "rest.cache-timeout"
> 
> 
>> Cache-Control on the HTTP headers.
>> 
> 
> In general, I'd be in favor of this ("rest.cache-timeout" would then need
> to become "rest.default-cache-timeout"), but I need to see a detailed FLIP
> because in my mind this could get quite complicated.
> 
> Best,
> D.
> 
> On Fri, Jun 23, 2023 at 6:26 PM Teoh, Hong <lian...@amazon.co.uk.invalid>
> wrote:
> 
>> Hi all,
>> 
>> I have been looking at the Flink REST API implementation, and had some
>> question on potential improvements. Looking to gather some thoughts:
>> 
>> 1. Only use what is necessary. The GET /checkpoints API seems to be using
>> the cached version of the entire Execution graph (stale data), when it
>> could just use the CheckpointStatsCache directly. I am thinking of doing
>> this refactoring. Anyone aware of a reason we don’t do this already?
>> 2. Configuration for web.refresh-interval controls both dashboard refresh
>> rate and ExecutionGraph cache. I am thinking of introducing a new
>> configuration, rest.cache.timeout
>> 3. Cache-Control on the HTTP headers. Seems like we are using caches in
>> our REST endpoint. It would be step in the right direction to introduce
>> cache-control in our REST API headers, so that we can improve the
>> programmatic access of the Flink REST API.
>> 
>> 
>> Looking forwards to hearing people’s thoughts.
>> 
>> Regards,
>> Hong
>> 
>> 

Reply via email to