[ https://issues.apache.org/jira/browse/BEAM-690?focusedWorklogId=141928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-141928 ]
ASF GitHub Bot logged work on BEAM-690:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Sep/18 20:19
            Start Date: 06/Sep/18 20:19
    Worklog Time Spent: 10m
      Work Description: janotav commented on issue #6303: [BEAM-690] Backoff in the DirectRunner if no work is available
URL: https://github.com/apache/beam/pull/6303#issuecomment-419227607

   Thanks for the feedback, guys. To be honest, I'm no longer convinced this is the right thing to do.

   It does indeed decrease CPU consumption significantly; however, at least in our case, it is not enough. It turns out that even if the pipeline is completely empty, the driver goes THROTTLE, THROTTLE, CONTINUE, THROTTLE, THROTTLE, CONTINUE, and so on. So effectively the active loop becomes a loop with a 15 ms sleep (the average of 10 and 20 ms). Because the code executed in the active phase is itself non-trivial, this still puts an easily measurable load on the CPU.

   I was able to achieve some further minor improvements with low-level changes to how the driver works with collections, but it became obvious that (at least in my quite specific use case) this leads nowhere.

   I was able to come up with an alternative, application-level solution that simply blocks the DirectRunner threads when the pipeline is empty and resumes the DirectRunner loop only when new data enters the pipeline.

   I'll keep thinking about this for a while yet and then probably close this PR unless I figure out how to make it really useful...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 141928)
    Time Spent: 0.5h  (was: 20m)

> Backoff in the DirectRunner Monitor if no work is Available
> -----------------------------------------------------------
>
>                 Key: BEAM-690
>                 URL: https://issues.apache.org/jira/browse/BEAM-690
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>            Reporter: Thomas Groh
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a Pipeline has no elements available to process, the Monitor Runnable
> will be repeatedly scheduled. Given that there is no work to be done, this
> will loop over the steps in the transform looking for timers, and prompt the
> sources to perform additional work, even though there is no work to be done.
> This consumes the entirety of a single core.
> Add a bounded backoff to rescheduling the monitor runnable if no work has
> been done since it last ran. This will reduce resource consumption on
> low-throughput Pipelines.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
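
For readers unfamiliar with the term, the "bounded backoff" requested in the issue can be sketched roughly as below. This is only an illustration under stated assumptions: BoundedBackoff, nextDelayMillis, and the doubling policy are hypothetical names and choices made for this example, not the DirectRunner's actual code and not what PR #6303 implements.

```java
/**
 * Hypothetical helper illustrating a bounded backoff; it is not part of Beam.
 * While no work is done, the delay doubles from initialMillis up to maxMillis.
 * As soon as work is done, the delay resets to zero.
 */
final class BoundedBackoff {
  private final long initialMillis;
  private final long maxMillis;
  private long currentMillis = 0;

  BoundedBackoff(long initialMillis, long maxMillis) {
    if (initialMillis <= 0 || maxMillis < initialMillis) {
      throw new IllegalArgumentException("require 0 < initialMillis <= maxMillis");
    }
    this.initialMillis = initialMillis;
    this.maxMillis = maxMillis;
  }

  /**
   * Returns how long to wait before rescheduling the monitor.
   *
   * @param workDone whether any work was done since the monitor last ran
   */
  long nextDelayMillis(boolean workDone) {
    if (workDone) {
      currentMillis = 0; // pipeline is active again: reschedule immediately
    } else if (currentMillis == 0) {
      currentMillis = initialMillis; // first idle pass: start backing off
    } else {
      currentMillis = Math.min(maxMillis, currentMillis * 2); // grow, but stay bounded
    }
    return currentMillis;
  }
}
```

A caller would pass the returned delay to whatever scheduler reruns the monitor. Note that any fixed upper bound still wakes an idle pipeline periodically, which is the residual load described in the comment above; blocking until new data arrives avoids that at the cost of needing an explicit wake-up signal.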