Hi David,

Thanks for sharing. I've been investigating something similar recently. What's the size of your data?

Best
-P.
On Wed, Mar 24, 2021, 7:52 AM David Sánchez <[email protected]> wrote:
> Hi folks!
>
> I'm testing the Dataflow v2 runner in a batch pipeline (Apache Beam Python
> 3.7 SDK 2.27.0) that reads many millions of rows from BigQuery and writes to
> PubSub and BigQuery using the flag "--experiments=use_runner_v2".
>
> The same job used to scale up immediately to over 50 workers, but in v2 it
> never scales beyond 5-6 workers, so it's much slower. I can see,
> however, that the total vCPU and memory are about half of what they were
> before, which is promising. Any clue as to why the scaling behaves differently?
>
> Many thanks
>
