You should be able to configure the number of partition like this: https://github.com/GoogleCloudPlatform/dataflow-cookbook/blob/main/Java/src/main/java/jdbc/ReadPartitionsJdbc.java#L132
The code to auto infer the number of partitions seems to be unreachable (I haven't checked this carefully). More details are here: https://issues.apache.org/jira/browse/BEAM-12456 On Fri, May 31, 2024 at 7:40 AM Vardhan Thigle via user < user@beam.apache.org> wrote: > Hi Beam Experts,I have a small query about `JdbcIO#readWithPartitions` > > > ContextJdbcIO#readWithPartitions seems to always default > to 200 partitions (DEFAULT_NUM_PARTITIONS). This is set by default when the > object is constructed here > <https://github.com/apache/beam/blob/master/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L362> > There seems to be no way to override this with a null value. Hence it > seems that, the code > <https://github.com/apache/beam/blob/b50ad0fe8fc168eaded62efb08f19cf2aea341e2/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L1398> > that > checks the null value and tries to auto infer the number of partitions > based on the never runs.I am trying to use this for reading a tall table > of unknown size, and the pipeline always defaults to 200 if the value is > not set. The default of 200 seems to fall short as worker goes out of > memory in reshuffle stage. Running with higher number of partitions like 4K > helps for my test setup.Since the size is not known at the time of > implementing the pipeline, the auto-inference might help > setting maxPartitions to a reasonable value as per the heuristic decided by > Beam code. > Request for help > > Could you please clarify a few doubts around this? > > 1. Is this behavior intentional? > 2. Could you please explain the rationale behind the heuristic in L1398 > > <https://github.com/apache/beam/blob/b50ad0fe8fc168eaded62efb08f19cf2aea341e2/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L1398> > and DEFAULT_NUM_PARTITIONS=200? > > > I have also raised this as issues/31467 incase it needs any changes in > the implementation. > > > Regards and Thanks, > Vardhan Thigle, > +919535346204 <+91%2095353%2046204> >