PR for this is now open: https://github.com/apache/beam/pull/10313
Hey Max,
Thanks for the feedback.
On Sun, Nov 24, 2019 at 2:04 PM Maximilian Michels wrote:
Load-balancing the worker selection for bundle execution sounds like the
solution to uneven work distribution across the workers. Some comments:
(1) I could imagine that in the case of long-running bundle execution (e.g.
model execution), this could stall upstream operators because their busy
JIRA: https://issues.apache.org/jira/browse/BEAM-8816
On Thu, Nov 21, 2019 at 10:44 AM Thomas Weise wrote:
Hi Luke,
Thanks for the background; it is exciting to see the progress on the SDF
side. It will help with this use case and many other challenges. I
imagine the Python user code would be able to determine that it is bogged
down with high latency record processing (based on the duration it
Dataflow has run into this issue as well. Dataflow has "work items" that
are converted into bundles that are executed on the SDK. Each work item
does a greedy assignment to the SDK worker with the fewest work items
assigned. As you surmised, we use SDF splitting in batch pipelines to
balance work.
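The greedy "fewest work items" assignment described above can be sketched roughly as follows. This is an illustrative toy, not Dataflow's or Beam's actual implementation; the `LeastLoadedAssigner` class and its method names are hypothetical.

```python
# Illustrative sketch (not a Beam/Dataflow API) of greedy assignment:
# each new work item goes to the SDK worker with the fewest outstanding items.
import heapq


class LeastLoadedAssigner:
    """Greedily assigns work items to the least-loaded SDK worker."""

    def __init__(self, worker_ids):
        # Min-heap of (outstanding_item_count, worker_id); the worker with
        # the smallest count pops first (ties broken by worker id).
        self._heap = [(0, w) for w in worker_ids]
        heapq.heapify(self._heap)

    def assign(self, work_item):
        # Pick the least-loaded worker and bump its outstanding count.
        count, worker = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (count + 1, worker))
        return worker

    def complete(self, worker):
        # Decrement the outstanding count when a worker finishes an item.
        self._heap = [(c - 1 if w == worker else c, w) for c, w in self._heap]
        heapq.heapify(self._heap)


assigner = LeastLoadedAssigner(["sdk-0", "sdk-1"])
print(assigner.assign("bundle-a"))  # → sdk-0
print(assigner.assign("bundle-b"))  # → sdk-1
```

One known weakness, which the comments in this thread get at, is that counting outstanding items treats all bundles as equally expensive; a long-running bundle (e.g. model execution) still counts as one item, so the count alone does not reflect actual worker busyness.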
We found a problem with uneven utilization of SDK workers causing excessive
latency with Streaming/Python/Flink. Remember that with Python, we need to
execute multiple worker processes on a machine instead of relying on
threads in a single worker, which requires the runner to make a decision to