[ https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823182#comment-15823182 ]

Pramod Immaneni commented on APEXCORE-570:
------------------------------------------

I am not sure what is left to conclude in the discussion, but I will add this.
There are two scenarios here: when buffer spooling is enabled and when it is
disabled. This addresses the case where buffer spooling is enabled (the
default). The case where buffer spooling is disabled has some other basic
issues, even before we get to back pressure, and therefore requires a different
treatment; I am planning to look at that next. Those issues are captured in a
separate JIRA, APEXCORE-609.

Let me describe the approach by first describing what happens today. The data
in the buffer server is stored in blocks. For example, a 512MB buffer (the
default) is divided into 8 blocks of 64MB each. The problem is that as the
publisher publishes more data, the blocks it is done with are spooled to disk
and their in-memory data invalidated, regardless of how far the subscribers
have read, and new blocks are allocated, keeping total data memory usage at
512MB (the configured capacity). If a subscriber then hits a block that has
been spooled, its data is loaded back into memory. My approach is simply to
not release blocks as soon as the publisher is done with them, but only after
all subscribers have read them; if there are no more blocks left for the
publisher, the publisher is suspended until the slowest subscriber has gone
past the earliest block and released it.
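To make the invariant concrete, here is a minimal sketch (hypothetical names,
not the actual buffer server code): the publisher may be at most the configured
number of blocks ahead of the slowest subscriber, otherwise it suspends until a
subscriber moves past the earliest block.

```java
// Simplified sketch of the proposed block-release rule. Names and structure
// are illustrative only; the real buffer server tracks bytes and spooling.
class SpoolSketch {
    static final int NUM_BLOCKS = 8;    // e.g. 512MB capacity / 64MB blocks

    long publisherBlock = 0;            // next block the publisher will fill
    final long[] subscriberBlock;       // next block each subscriber will read

    SpoolSketch(int subscribers) {
        subscriberBlock = new long[subscribers];
    }

    // Position of the slowest subscriber; blocks before it are releasable.
    long slowest() {
        long min = Long.MAX_VALUE;
        for (long b : subscriberBlock) min = Math.min(min, b);
        return min;
    }

    // The publisher may fill another block only if doing so does not evict
    // a block some subscriber has not yet read.
    boolean publisherMayAdvance() {
        return publisherBlock - slowest() < NUM_BLOCKS;
    }

    // Publish one block's worth of data, or do nothing (i.e. suspend).
    boolean publish() {
        if (!publisherMayAdvance()) return false;   // suspended
        publisherBlock++;
        return true;
    }

    // A subscriber reads one block, potentially releasing the earliest one.
    void read(int sub) {
        if (subscriberBlock[sub] < publisherBlock) subscriberBlock[sub]++;
    }
}
```

With one idle subscriber the publisher fills all 8 blocks and then suspends;
a single subscriber read releases the earliest block and lets it resume.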

If you now have an app where the input operator brings data into the DAG at a
much faster rate than, say, an output operator can write to a store, you will
see the individual buffers in the intermediate containers fill up and those
operators pause as they reach the limit. The back pressure propagates all the
way back to the input operator, eventually pausing it as well, and as the
output operator makes progress the upstream operators resume. So the buffer
memory limit specified by the configurable attribute controls when the back
pressure kicks in. Of course, this is not an ideal way to design or run your
app; you want an impedance match between your input and output, or at least
avg(input rate) <= avg(output rate). But in scenarios where the application is
not designed that way, this prevents a runaway with the upstream operators. By
the way, the feature can be disabled if not desired.
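The propagation can be illustrated with a toy simulation (assumed structure,
not Apex code): a three-stage chain with bounded buffers between stages. When
the output stage stalls, the downstream buffer fills, then the upstream buffer
fills, and the input pauses; draining one item at the output lets each
upstream stage resume in turn.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy back-pressure simulation: input -> middle -> output, with a bounded
// buffer between each pair of stages. A stage "pauses" (its step returns
// false) when its outgoing buffer is full or its incoming buffer is empty.
class ChainSketch {
    static final int CAPACITY = 4;               // per-buffer limit

    final Deque<Integer> q1 = new ArrayDeque<>(); // input -> middle
    final Deque<Integer> q2 = new ArrayDeque<>(); // middle -> output
    int produced = 0;

    boolean stepInput() {
        if (q1.size() >= CAPACITY) return false;  // back pressure: pause
        q1.add(produced++);
        return true;
    }

    boolean stepMiddle() {
        if (q1.isEmpty() || q2.size() >= CAPACITY) return false;
        q2.add(q1.poll());
        return true;
    }

    boolean stepOutput() {
        if (q2.isEmpty()) return false;
        q2.poll();                                // e.g. write to a store
        return true;
    }
}
```

Running input and middle without the output fills both buffers and stalls the
input; one output step frees capacity that ripples back upstream.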

> Prevent upstream operators from getting too far ahead when downstream 
> operators are slow
> ----------------------------------------------------------------------------------------
>
>                 Key: APEXCORE-570
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-570
>             Project: Apache Apex Core
>          Issue Type: Improvement
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>
> If the downstream operators are slower than upstream operators then the 
> upstream operators will get ahead and the gap can continue to increase. 
> Provide an option to slow down or temporarily pause the upstream operators 
> when they get too far ahead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)