Hi John, Thanks for the information.
Eleanore On Wed, May 6, 2020 at 11:37 PM John Youngseok Yang <johnya...@gmail.com> wrote: > Hi, > > The Nemo runner supports setting parallelism for each PTransform. > You can configure a Nemo optimization pass that traverses the operators of > your Beam program, and annotates the ParallelismProperty of operators of > your choice. > > Here is an example pass for setting parallelism: > > https://github.com/apache/incubator-nemo/blob/master/compiler/optimizer/src/main/java/org/apache/nemo/compiler/optimizer/pass/compiletime/annotating/DefaultParallelismPass.java > > Thanks, > John > > On 2020/05/06 16:27:19, Alexey Romanenko <a...@gmail.com> wrote: > > One of the option when reading from topics with a small number of > partitions could be to do a Reshuffle right after read transform to > parallelize better other pipeline steps.> > > > > We had a discussion in this Jira about that a while ago:> > > https://issues.apache.org/jira/browse/BEAM-8121 < > https://issues.apache.org/jira/browse/BEAM-8121>> > > > > > On 30 Apr 2020, at 03:56, Eleanore Jin <el...@gmail.com> wrote:> > > > > > > > Thanks all for the information! > > > > > > > > Eleanore > > > > > > > > On Wed, Apr 29, 2020 at 6:36 PM Ankur Goenka <goe...@google.com > <ma...@google.com>> wrote:> > > > Beam does support parallelism for the job which applies to all the > transforms in the job when executing on Flink using the "--parallelism" > flag.> > > > > > > > From the usecase you mentioned, Kafka read operations will be over > parallelised but it should be ok as they will only have a small amount of > memory impact in loading some state for kafka client etc.> > > > Also flink can run multiple operations for the same Job in a single > task slot so having higher parallelism for lightweight operations should > not be a problem.> > > > > > > > On Wed, Apr 29, 2020 at 6:28 PM Luke Cwik <lc...@google.com <ma...@ > google.com>> wrote:> > > > Beam doesn't expose such a thing directly but the FlinkRunner may be > able to take some pipeline options to configure this.> > > > > > > > On Wed, Apr 29, 2020 at 5:51 PM Eleanore Jin <eleanore....@gmail.com > <ma...@gmail.com>> wrote:> > > > Hi Kyle, > > > > > > > > I am using Flink Runner (v1.8.2)> > > > > > > > Thanks!> > > > Eleanore> > > > > > > > On Wed, Apr 29, 2020 at 10:33 AM Kyle Weaver <kcwea...@google.com > <ma...@google.com>> wrote:> > > > Which runner are you using?> > > > > > > > On Wed, Apr 29, 2020 at 1:32 PM Eleanore Jin <eleanore....@gmail.com > <ma...@gmail.com>> wrote:> > > > Hi all, > > > > > > > > I just wonder can Beam allow to set parallelism for each operator > (PTransform) separately? Flink provides such feature. > > > > > > > > The usecase I have is the source is kafka topics, which has less > partitions, while we have heavy PTransform and would like to scale it with > more parallelism. > > > > > > > > Thanks a lot!> > > > Eleanore> > > > > >