gonzojive commented on issue #21817:
URL: https://github.com/apache/beam/issues/21817#issuecomment-1166335979

   To let users override the default buffering behavior, I updated the 
`exec` package so that it can spill to disk when the memory usage reported by 
`runtime.ReadMemStats` exceeds some threshold. Would something like what's in 
the pull request be acceptable? Hopefully it doesn't introduce too much 
complexity.
   
   Justification for this approach: This change allows the harness to handle 
the disk spilling of a large iterator. It would probably be better for the 
runner to handle this sort of spilling, but in practice I found this faster to 
implement because modifying the Flink runner is onerous. (It requires learning 
Flink concepts and Gradle, and submitting a robust upstream change that I don't 
have time to put together.) A hook into the harness code that constructs a 
ReStream is much faster for me to implement and should work with all runners. 
It also happens to achieve the same end result for a single-machine use 
case.
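
To make the idea concrete, here is a minimal sketch of a buffer that keeps elements in memory until the heap size reported by `runtime.ReadMemStats` passes a threshold, then appends the rest to a temp file. This is not the code in the pull request and not the `exec` package API: `spillBuffer`, `heapOverLimit`, and the choice of `HeapAlloc` as the metric are all hypothetical names and assumptions made for illustration.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"io"
	"os"
	"runtime"
)

// spillBuffer is a hypothetical type for illustration only.
// It is not part of Beam's exec package.
type spillBuffer struct {
	limit   uint64   // assumed policy: spill once HeapAlloc exceeds this many bytes
	mem     [][]byte // elements kept in memory before the first spill
	file    *os.File // temp file; nil until the first spill
	w       *bufio.Writer
	spilled int // number of elements written to disk
}

// heapOverLimit reports whether the live heap exceeds limit.
// ReadMemStats stops the world, so a real implementation would
// probably sample it less often than once per element.
func heapOverLimit(limit uint64) bool {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc > limit
}

// Add stores elem in memory, or on disk once spilling has started.
// After the first spill every later element goes to disk, which
// keeps iteration order equal to insertion order.
func (b *spillBuffer) Add(elem []byte) error {
	if b.file == nil && !heapOverLimit(b.limit) {
		b.mem = append(b.mem, elem)
		return nil
	}
	if b.file == nil {
		f, err := os.CreateTemp("", "spill-*")
		if err != nil {
			return err
		}
		b.file, b.w = f, bufio.NewWriter(f)
	}
	// Length-prefix each element so it can be read back.
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(elem)))
	if _, err := b.w.Write(hdr[:]); err != nil {
		return err
	}
	if _, err := b.w.Write(elem); err != nil {
		return err
	}
	b.spilled++
	return nil
}

// Each calls fn on every element in insertion order. It may be called
// repeatedly, but Add must not be called after the first Each.
func (b *spillBuffer) Each(fn func([]byte) error) error {
	for _, e := range b.mem {
		if err := fn(e); err != nil {
			return err
		}
	}
	if b.file == nil {
		return nil
	}
	if err := b.w.Flush(); err != nil {
		return err
	}
	if _, err := b.file.Seek(0, io.SeekStart); err != nil {
		return err
	}
	r := bufio.NewReader(b.file)
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(r, hdr[:]); err == io.EOF {
			return nil // clean end of the spill file
		} else if err != nil {
			return err
		}
		elem := make([]byte, binary.BigEndian.Uint32(hdr[:]))
		if _, err := io.ReadFull(r, elem); err != nil {
			return err
		}
		if err := fn(elem); err != nil {
			return err
		}
	}
}

// Close removes the temp file, if one was created.
func (b *spillBuffer) Close() error {
	if b.file == nil {
		return nil
	}
	name := b.file.Name()
	b.file.Close()
	return os.Remove(name)
}
```

A caller would construct `&spillBuffer{limit: ...}`, call `Add` for each incoming element, iterate with `Each`, and `Close` when done. Setting `limit` to 0 forces every element to disk, which is handy for testing the spill path.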


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
