Gaël Renoux created FLINK-20390:
-----------------------------------

             Summary: Programmatic access to the back-pressure
                 Key: FLINK-20390
                 URL: https://issues.apache.org/jira/browse/FLINK-20390
             Project: Flink
          Issue Type: New Feature
          Components: API / Core
            Reporter: Gaël Renoux


It would be useful to access the back-pressure monitoring from within functions.

Here is our use case: we have a real-time Flink job, which makes decisions
based on input data. Sometimes, we have traffic spikes on the input and the
decision process cannot process records fast enough. Back-pressure starts
mounting, all the way back to the Source. What we want to do is to start
dropping records in this case, because it's better to make decisions based on
just a sample of the data rather than accumulate too much lag.

Right now, the only way is to have a filter with a hard limit on the number of
records per time interval, and to drop records once we are over this limit.
However, this requires a lot of tuning to find out what the correct limit is,
especially since it may depend on the nature of the inputs (some decisions take
longer to make than others). It's also heavily dependent on the buffers: the
limit needs to be low enough that all records that pass the limit can fit in
the downstream buffers, or the back-pressure will go back past the
filtering task and we're back to square one. Finally, it's not very resilient
to change: whenever we scale the infrastructure up, we need to redo the whole
tuning.
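
For illustration, the hard-limit workaround described above could look roughly
like the sketch below. It is plain Java with no Flink dependency; the class
name FixedWindowLimiter, its parameters and the injected clock are all
hypothetical and only stand in for whatever the real filter does. In a job,
tryAcquire() would be the predicate of a filter, and maxPerWindow is the value
that has to be re-tuned whenever inputs or infrastructure change.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical fixed-window limiter: admits at most maxPerWindow records
 * per window of windowMillis milliseconds and rejects the rest.
 * Not thread-safe; intended to be used from a single task thread.
 */
final class FixedWindowLimiter {
    private final long maxPerWindow;
    private final long windowMillis;
    private final LongSupplier clock; // injected so the window can be tested
    private long windowStart;
    private long count;

    FixedWindowLimiter(long maxPerWindow, long windowMillis, LongSupplier clock) {
        if (maxPerWindow < 0 || windowMillis <= 0) {
            throw new IllegalArgumentException("invalid limit or window");
        }
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the record may pass, false if it should be dropped. */
    boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            // A new window begins: reset the counter.
            windowStart = now;
            count = 0;
        }
        if (count < maxPerWindow) {
            count++;
            return true;
        }
        return false;
    }
}
```

With new FixedWindowLimiter(1000, 1000, System::currentTimeMillis), roughly
1000 records per second pass and the rest are dropped, whether or not the
downstream tasks are actually struggling.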

With programmatic access to the back-pressure, we could simply start dropping 
records based on its current level. No tuning, and adjusted to the actual 
issue. For performance, I assume it would be better if it reused the existing 
back-pressure monitoring mechanism, rather than looking directly into the 
buffer. A sampling of the back-pressure should be enough, and if more precision 
is needed you can simply change the existing back-pressure configuration.
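
As a rough sketch of what a function could do with such access: the code below
assumes a hypothetical supplier that returns the sampled back-pressure as a
ratio between 0.0 and 1.0. No such accessor is available to user functions
today (that is the point of this request), so the supplier, the class name
BackPressureSampler and the linear drop policy are all assumptions made for
illustration.

```java
import java.util.function.DoubleSupplier;

/**
 * Hypothetical sampler: keeps every record while back-pressure is at or
 * below a threshold, then drops an increasing share of records as the
 * back-pressure ratio approaches 1.0.
 */
final class BackPressureSampler {
    private final DoubleSupplier backPressureRatio; // assumed accessor, 0.0 to 1.0
    private final double threshold;                 // start dropping above this
    private final DoubleSupplier random;            // uniform values in [0, 1)

    BackPressureSampler(DoubleSupplier backPressureRatio,
                        double threshold,
                        DoubleSupplier random) {
        if (threshold < 0.0 || threshold >= 1.0) {
            throw new IllegalArgumentException("threshold must be in [0, 1)");
        }
        this.backPressureRatio = backPressureRatio;
        this.threshold = threshold;
        this.random = random;
    }

    /** Returns true if the current record should be kept. */
    boolean shouldKeep() {
        // Clamp in case the (assumed) monitor reports a value outside [0, 1].
        double ratio = Math.max(0.0, Math.min(1.0, backPressureRatio.getAsDouble()));
        if (ratio <= threshold) {
            return true;
        }
        // Drop probability grows linearly from 0 at the threshold to 1 at 1.0.
        double dropProbability = (ratio - threshold) / (1.0 - threshold);
        return random.getAsDouble() >= dropProbability;
    }
}
```

In practice the random source could be () -> ThreadLocalRandom.current().nextDouble().
The threshold is the only knob, and it is expressed in terms of the actual
symptom rather than a record count.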



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
