The only downside of this approach is that, at least in the Flink runner, it consumes a task slot :(

On Wed, Jun 12, 2024 at 9:38 AM, Robert Bradshaw <rober...@google.com> wrote:

> On Wed, Jun 12, 2024 at 7:56 AM Ruben Vargas <ruben.var...@metova.com> wrote:
> >
> > The approach looks good, but one question:
> >
> > My understanding is that this will schedule, for example, 8 operators
> > across the workers, but only one of them will be processing and the
> > others remain idle? Are those consuming resources in some way? I'm
> > assuming it may not be significant.
>
> That is correct, but the resources consumed by an idle operator should
> be negligible.
>
> > Thanks.
> >
> > On Fri, Jun 7, 2024 at 3:56 PM, Robert Bradshaw via user
> > <user@beam.apache.org> wrote:
> > >
> > > You can always limit the parallelism by assigning a single key to
> > > every element and then doing a grouping or reshuffle[1] on that key
> > > before processing the elements. Even if the operator parallelism for
> > > that step is technically, say, eight, your effective parallelism will
> > > be exactly one.
> > >
> > > [1] https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/Reshuffle.html
> > >
> > > On Fri, Jun 7, 2024 at 2:13 PM Ruben Vargas <ruben.var...@metova.com> wrote:
> > > >
> > > > Hello guys
> > > >
> > > > One question: I have a side input which fetches an endpoint every 30
> > > > min. I pretty much copied the example here:
> > > > https://beam.apache.org/documentation/patterns/side-inputs/ but added
> > > > some logic to fetch the endpoint and parse the payload.
> > > >
> > > > My question is: is it possible to control the parallelism of this
> > > > single ParDo that does the fetch/transform? I don't think I need a lot
> > > > of parallelism for that one. I'm currently using the Flink runner and I
> > > > see the parallelism is 8 (which is the general parallelism for my
> > > > Flink cluster).
> > > >
> > > > Is it possible to set it to 1, for example?
> > > >
> > > >
> > > > Regards.