[ 
https://issues.apache.org/jira/browse/BEAM-690?focusedWorklogId=141928&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-141928
 ]

ASF GitHub Bot logged work on BEAM-690:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Sep/18 20:19
            Start Date: 06/Sep/18 20:19
    Worklog Time Spent: 10m 
      Work Description: janotav commented on issue #6303: [BEAM-690] Backoff in 
the DirectRunner if no work is available
URL: https://github.com/apache/beam/pull/6303#issuecomment-419227607
 
 
   Thanks for the feedback, guys. To be honest, I'm no longer convinced this is
the right thing to do. It does indeed decrease CPU consumption significantly;
however, at least in our case, it is not enough. It turns out that even if the
pipeline is completely empty, the driver cycles through these states:
   
   THROTTLE, THROTTLE, CONTINUE, THROTTLE, THROTTLE, CONTINUE, ... and so on ...
   
   So effectively the active loop becomes a loop with a 15 ms sleep (the
average of 10 and 20 ms). Because the code executed in the active phase is
itself non-trivial, this still puts an easily measurable load on the CPU. I was
able to achieve some further minor improvements by making some low-level
changes to how the driver works with collections, but it became obvious that
(at least in my quite specific use case) this leads nowhere.
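
   To make the 15 ms figure concrete, here is a minimal sketch in Java. It is
not the DirectRunner code: the class `ThrottlePatternSketch`, the `State` enum,
and both constants are hypothetical, and it assumes a doubling backoff that
starts at 10 ms and resets whenever the driver reports CONTINUE, which is only
one reading of the pattern above.

```java
// Hypothetical model of the observed driver pattern; not taken from Beam.
final class ThrottlePatternSketch {
    enum State { THROTTLE, CONTINUE }

    // Assumption: first sleep of 10 ms, matching the figure in the comment.
    static final long INITIAL_SLEEP_MS = 10;
    // Assumption: an arbitrary upper bound for the backoff.
    static final long MAX_SLEEP_MS = 1000;

    /** Average sleep per THROTTLE step for the given sequence of states. */
    static double averageSleepMs(State... states) {
        long nextSleepMs = INITIAL_SLEEP_MS;
        long totalSleepMs = 0;
        int throttles = 0;
        for (State state : states) {
            if (state == State.THROTTLE) {
                // Sleep for the current delay, then double it up to the bound.
                totalSleepMs += nextSleepMs;
                throttles++;
                nextSleepMs = Math.min(nextSleepMs * 2, MAX_SLEEP_MS);
            } else {
                // CONTINUE resets the backoff to its initial delay.
                nextSleepMs = INITIAL_SLEEP_MS;
            }
        }
        return throttles == 0 ? 0.0 : (double) totalSleepMs / throttles;
    }
}
```

   Under these assumptions the sequence THROTTLE, THROTTLE, CONTINUE repeated
twice yields sleeps of 10, 20, 10 and 20 ms, so `averageSleepMs` returns 15.0:
the backoff never grows past its second step.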
   
   I was able to come up with an alternative (application-level) solution that
simply blocks the DirectRunner threads when the pipeline is empty and only
resumes the DirectRunner loop when new data enters the pipeline.
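
   A rough Java sketch of that blocking idea follows. The actual solution is
not shown in this comment, so everything here is an assumption: `WorkGate`,
`signalWork` and `awaitWork` are hypothetical names, not Beam or DirectRunner
APIs, and the sketch relies only on `ReentrantLock` and `Condition` from
`java.util.concurrent.locks`.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical gate that parks a loop thread until work is signalled.
final class WorkGate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition workAvailable = lock.newCondition();
    private int pending = 0; // signalled but not yet claimed units of work

    /** Called by the producer when new data enters the pipeline. */
    void signalWork() {
        lock.lock();
        try {
            pending++;
            workAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Blocks until work is pending or the timeout elapses.
     * Returns true if one unit of work was claimed, false otherwise.
     */
    boolean awaitWork(long timeoutMillis) {
        long remainingNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (pending == 0) {
                if (remainingNanos <= 0L) {
                    return false; // timed out with nothing to do
                }
                // awaitNanos returns the remaining time; loop guards against
                // spurious wakeups.
                remainingNanos = workAvailable.awaitNanos(remainingNanos);
            }
            pending--;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

   In such a design a producer would call `signalWork()` when an element
enters the pipeline, and the loop thread would call `awaitWork(timeoutMillis)`
instead of spinning. The timeout is there so the thread can still wake up
periodically, for example to check timers.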
   
   I'll keep thinking about this for a while and then probably close this PR
unless I figure out how to make it really useful.
   
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 141928)
    Time Spent: 0.5h  (was: 20m)

> Backoff in the DirectRunner Monitor if no work is Available
> -----------------------------------------------------------
>
>                 Key: BEAM-690
>                 URL: https://issues.apache.org/jira/browse/BEAM-690
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>            Reporter: Thomas Groh
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When a Pipeline has no elements available to process, the Monitor Runnable 
> will be repeatedly scheduled. Given that there is no work to be done, this 
> will loop over the steps in the transform looking for timers, and prompt the 
> sources to perform additional work, even though there is no work to be done. 
> This consumes the entirety of a single core.
> Add a bounded backoff to rescheduling the monitor runnable if no work has 
> been done since it last ran. This will reduce resource consumption on 
> low-throughput Pipelines.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
