[ https://issues.apache.org/jira/browse/APEXCORE-570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823292#comment-15823292 ]
Pramod Immaneni commented on APEXCORE-570:
------------------------------------------

I will go into the topic-specific comments first and then address your concern about my approach to the process.

"Now back to the topic: I think this is a good approach. It will avoid the fast producing operator running ahead at the speed of writing to disk indefinitely. With this we will not need to limit the spooling at all?"

Correct, a spooling limit is not needed. A spooling size limit will not work as a back pressure mechanism because we do not know how much data will be generated between two commits (committed windows). Also, since the checkpoint length is configurable, we cannot set the limit to some "reasonable" high value. Hence a spooling size limit would not be practical.

"What about the case where you have two subscribers (and those could be different operators) where one can keep up with the rate at which data is published and the other one may be slow, albeit maybe temporarily? This will slow down the fast subscriber and introduce latency."

Yes, you are correct: the slowest subscriber will slow down the publisher (unless it is parallel partition all the way through). But this is expected with back pressure, isn't it?

"Let’s first address the process issue (it may warrant a separate discussion and additions to the contributor guidelines also). If you think there was a conclusion then this may indicate that there was offline discussion that isn’t captured here or anywhere else. Just by looking at this ticket it is everything but clear what lead to your PR. This is not how the community can work, discussion has to be in the open."

I think you have misunderstood my approach. When I created the JIRA that started the discussion, I was facing a problem with a production application and had proposed using window difference as the way to create back pressure and block the publisher.
Limiting spooling was suggested as an approach by both you and David, and in my comment on 02nd Nov 16th at 22:53 I mentioned that it won't work because it will cause a deadlock. David had another comment on this approach, about suspending the publisher at the spool limit until the commit, which is effectively the same deadlock problem because the commit will not happen until the publisher moves forward. No other approaches were suggested, so I proceeded with attempting to solve the problem via the proposed window difference approach.

As I got into the weeds of the implementation and worked out the details of how the current implementation works, I figured that using the block difference instead of the window difference was a better way to accomplish this. To me, the fundamental approach I originally suggested, blocking the publisher until the subscriber catches up, hadn't changed; only an implementation detail had. Second, the majority of the implementation time was spent on trying to accomplish the task under the original assumption of window difference and on coming to the conclusion to use blocks instead of windows. The actual coding took a day or two, so what you see in the PR is a relatively new discovery.

In your comments you have made a couple of statements. The first is that there may have been offline discussions on the implementation. This has not happened, I assure you. Everyone, including the reviewers, is seeing the implementation at the same time. The second statement is stronger: that this is detrimental to the community and that discussions have to be in the open. I take personal offense at this statement. I know you want the best for the community, but I suggest you ascertain the truth before making such strong statements.
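
For readers following the thread, the idea of blocking the publisher until the slowest subscriber catches up, measured in blocks, can be illustrated with a minimal sketch. This is not the code in the PR or the Apex buffer server; the class name BlockBackPressure, its methods, and the maxBlockDifference parameter are all hypothetical and exist only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative only: a gate that stops a publisher once it is
 * maxBlockDifference blocks ahead of its slowest subscriber.
 * All names here are assumptions, not Apex APIs.
 */
final class BlockBackPressure {
    private final int maxBlockDifference;
    // Number of blocks each subscriber has fully consumed so far.
    private final Map<String, Long> consumed = new HashMap<>();
    // Number of blocks the publisher has written so far.
    private long published = 0;

    BlockBackPressure(int maxBlockDifference, String... subscribers) {
        this.maxBlockDifference = maxBlockDifference;
        for (String s : subscribers) {
            consumed.put(s, 0L);
        }
    }

    // Position of the slowest subscriber; with no subscribers nothing lags.
    private long slowest() {
        return consumed.isEmpty() ? published : Collections.min(consumed.values());
    }

    // Non-blocking variant: publish one block only if the gap allows it.
    synchronized boolean tryPublishBlock() {
        if (published - slowest() >= maxBlockDifference) {
            return false;
        }
        published++;
        return true;
    }

    // Blocking variant: suspend the publisher until the slowest subscriber catches up.
    synchronized void publishBlock() throws InterruptedException {
        while (published - slowest() >= maxBlockDifference) {
            wait();
        }
        published++;
    }

    // Called when a subscriber finishes a block; wakes a suspended publisher.
    synchronized void blockConsumed(String subscriber) {
        Long current = consumed.get(subscriber);
        if (current == null) {
            throw new IllegalArgumentException("unknown subscriber: " + subscriber);
        }
        consumed.put(subscriber, current + 1);
        notifyAll();
    }
}
```

In this sketch the gap is always measured against the slowest subscriber, so a fast subscriber is throttled along with the publisher, which is the behaviour discussed above for the two-subscriber case.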

> Prevent upstream operators from getting too far ahead when downstream operators are slow
> ----------------------------------------------------------------------------------------
>
>                 Key: APEXCORE-570
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-570
>             Project: Apache Apex Core
>          Issue Type: Improvement
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>
> If the downstream operators are slower than upstream operators then the
> upstream operators will get ahead and the gap can continue to increase.
> Provide an option to slow down or temporarily pause the upstream operators
> when they get too far ahead.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)