[ 
https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823292#comment-15823292
 ] 

Pramod Immaneni commented on APEXCORE-570:
------------------------------------------

I will go into the topic-specific comments first and then address your concern 
about my approach to the process.

"Now back to the topic: I think this is a good approach. It will avoid the fast 
producing operator running ahead at the speed of writing to disk indefinitely. 
With this we will not need to limit the spooling at all?"

Correct, a spooling limit is not needed. A spooling size limit will not work 
as a back pressure mechanism because we do not know how much data will be 
generated between two commits (committed windows). Also, since the checkpoint 
length is configurable, we cannot simply set the limit to some "reasonable" 
high value. Hence a spooling size limit would not be practical.

"What about the case where you have two subscribers (and those could be 
different operators) where one can keep up with the rate at which data is 
published and the other one may be slow, albeit maybe temporarily? This will 
slow down the fast subscriber and introduce latency."

Yes, you are correct, the slowest subscriber will slow down the publisher 
(unless it is a parallel partition all the way through). But this is expected 
with back pressure, isn't it?
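
To illustrate the point, here is a minimal sketch of block-based back pressure 
in which the slowest subscriber gates the publisher. It is not the code in the 
PR; the class BlockBackPressure, its methods and the maxBlocksAhead setting 
are hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: the publisher may run at most maxBlocksAhead blocks
 * ahead of the slowest subscriber. Not the actual Apex buffer server code.
 */
class BlockBackPressure {
    // Assumed tunable for this example, not a real Apex attribute.
    private final int maxBlocksAhead;
    // Index of the last block the publisher has written.
    private long publishedBlock = 0;
    // Last block each subscriber has fully consumed.
    private final Map<String, Long> consumedBlock = new HashMap<>();

    BlockBackPressure(int maxBlocksAhead) {
        this.maxBlocksAhead = maxBlocksAhead;
    }

    /** Registers a subscriber that starts at the current publisher position. */
    void addSubscriber(String id) {
        consumedBlock.put(id, publishedBlock);
    }

    /** Records that a subscriber has finished reading up to the given block. */
    void markConsumed(String id, long block) {
        consumedBlock.put(id, block);
    }

    /** The slowest subscriber determines how far behind the readers are. */
    long slowestConsumedBlock() {
        long min = publishedBlock;
        for (long block : consumedBlock.values()) {
            min = Math.min(min, block);
        }
        return min;
    }

    /** True while the gap to the slowest subscriber is below the limit. */
    boolean canPublish() {
        return publishedBlock - slowestConsumedBlock() < maxBlocksAhead;
    }

    /**
     * Advances the publisher by one block if allowed. Returns false when the
     * publisher has to pause; a real implementation would suspend reading
     * from the publisher until a subscriber catches up.
     */
    boolean tryPublishBlock() {
        if (!canPublish()) {
            return false;
        }
        publishedBlock++;
        return true;
    }
}
```

With a limit of 2 blocks and two subscribers, the publisher is paused as soon 
as one subscriber is 2 blocks behind, even if the other has consumed 
everything. That is the latency effect on the fast subscriber described above.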

"Let’s first address the process issue (it may warrant a separate discussion 
and additions to the contributor guidelines also). If you think there was a 
conclusion then this may indicate that there was offline discussion that isn’t 
captured here or anywhere else. Just by looking at this ticket it is everything 
but clear what led to your PR. This is not how the community can work, 
discussion has to be in the open."

I think you have misunderstood my approach. When I created the JIRA that 
started the discussion, I was facing a problem with a production application 
and proposed using the window difference as the way to create back pressure 
and block the publisher. Limiting spooling was suggested as an approach by 
both you and David, and in my comment on 02nd Nov 16th at 22:53 I mentioned 
that it won't work because it will cause a deadlock. David had another comment 
on this approach, about suspending the publisher at the spool limit until 
commit, which is effectively the same deadlock problem because the commit will 
not happen until the publisher moves forward. No other approaches were 
suggested, so I proceeded with attempting to solve the problem via the 
proposed window difference approach.

As I got into the weeds of the implementation and figured out all the details 
of how the current implementation works, I realized that using the block 
difference instead of the window difference was a better way to accomplish 
this. To me, the fundamental approach I originally suggested, blocking the 
publisher until the subscriber catches up, had not changed; only an 
implementation detail had. Second, most of the implementation time was spent 
on how to accomplish the task under the original assumption of window 
difference and on reaching the conclusion to use blocks instead of windows. 
The actual coding took a day or two, so what you see in the PR is a relatively 
new discovery. In your comments, you have made a couple of statements. The 
first is that there may have been offline discussions on the implementation. 
This has not happened, I assure you. You are all seeing the implementation at 
the same time, including the reviewers. The second statement is stronger: that 
this is detrimental to the community and that discussions have to be open. I 
take personal offense at this statement. I know you want the best for the 
community, but I suggest you ascertain the truth before making such strong 
statements.

> Prevent upstream operators from getting too far ahead when downstream 
> operators are slow
> ----------------------------------------------------------------------------------------
>
>                 Key: APEXCORE-570
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-570
>             Project: Apache Apex Core
>          Issue Type: Improvement
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>
> If the downstream operators are slower than upstream operators then the 
> upstream operators will get ahead and the gap can continue to increase. 
> Provide an option to slow down or temporarily pause the upstream operators 
> when they get too far ahead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
