[ https://issues.apache.org/jira/browse/BEAM-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440364#comment-16440364 ]
Jan Peuker edited comment on BEAM-4096 at 4/17/18 4:19 AM:
-----------------------------------------------------------

Hi, this is Jan, all set up with Jira now. A small addition here: we also need to change withNumFileShards, which is a required option right now, to accept a ValueProvider. The default of 1000 mentioned in the JavaDoc is incorrect and tends to cause OutOfMemoryError in DataflowRunner. From my current, native, benchmarks, 100 shards seems a more sensible suggestion for most cases (easy to calculate shard on powers of 2 and reaches common chunk sizes earlier).

was (Author: janpeuker):
Hi this is Jan, all set up with Jira now. Small addition here: We also need to be change withNumFileShards to a ValueProviders which is a required option right now. The default 1000 mentioned in the JavaDoc is incorrect and tends to cause OutOfMemoryError in DataflowRunner. From my current, native, benchmarks it seems a more sensible suggestion for most cases seems to have 100 shards (easy to calculate shard on powers of 2 and reaches common chunk sizes earlier).

> BigQueryIO ValueProvider support for Method and Triggering Frequency
> --------------------------------------------------------------------
>
>                 Key: BEAM-4096
>                 URL: https://issues.apache.org/jira/browse/BEAM-4096
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>    Affects Versions: 2.4.0
>            Reporter: Ryan McDowell
>            Priority: Minor
>             Fix For: 2.5.0
>
> Enhance BigQueryIO to accept ValueProviders for:
> * withMethod(..)
> * withTriggeringFrequency(..)
> It would allow Dataflow templates to accept these parameters at runtime
> instead of being hardcoded. This opens up the ability to create Dataflow
> templates which allow users to flip back-and-forth between batch and
> streaming inserts.
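
To make the request concrete, here is a minimal plain-Java sketch of the idea behind it: options such as the write method, the triggering frequency, and the number of file shards are held as deferred values and only resolved once the template is launched. Everything in it (DeferredValue, FixedValue, LaunchTimeValue, WriteMode, DeferredWriteConfig) is a hypothetical stand-in invented for illustration. It is not Beam's ValueProvider or BigQueryIO API, and the 5-minute frequency is an arbitrary example value; only the figure of 100 shards comes from the comment above.

```java
import java.time.Duration;

/** Hypothetical stand-in for a runtime-resolved option; NOT Beam's ValueProvider. */
interface DeferredValue<T> {
    T get();                 // resolve the value; only valid once accessible
    boolean isAccessible();  // true once the value is known
}

/** A value that is already fixed when the pipeline is constructed. */
final class FixedValue<T> implements DeferredValue<T> {
    private final T value;
    FixedValue(T value) { this.value = value; }
    public T get() { return value; }
    public boolean isAccessible() { return true; }
}

/** A value that is supplied only when the template is launched. */
final class LaunchTimeValue<T> implements DeferredValue<T> {
    private T value;
    private boolean supplied;
    void supply(T value) { this.value = value; this.supplied = true; }
    public T get() {
        if (!supplied) {
            throw new IllegalStateException("value not available before launch");
        }
        return value;
    }
    public boolean isAccessible() { return supplied; }
}

/** Illustrative write modes, mirroring the batch/streaming choice in the issue. */
enum WriteMode { FILE_LOADS, STREAMING_INSERTS }

/** Hypothetical sink configuration whose options are resolved lazily. */
final class DeferredWriteConfig {
    private final DeferredValue<WriteMode> mode;
    private final DeferredValue<Duration> triggeringFrequency;
    private final DeferredValue<Integer> numFileShards;

    DeferredWriteConfig(DeferredValue<WriteMode> mode,
                        DeferredValue<Duration> triggeringFrequency,
                        DeferredValue<Integer> numFileShards) {
        this.mode = mode;
        this.triggeringFrequency = triggeringFrequency;
        this.numFileShards = numFileShards;
    }

    /** Called at execution time, after the launch parameters are known. */
    String resolve() {
        if (mode.get() == WriteMode.STREAMING_INSERTS) {
            return "STREAMING_INSERTS";
        }
        return "FILE_LOADS every " + triggeringFrequency.get().getSeconds()
                + "s into " + numFileShards.get() + " shards";
    }
}

final class DeferredWriteConfigDemo {
    public static void main(String[] args) {
        LaunchTimeValue<WriteMode> mode = new LaunchTimeValue<>();
        DeferredWriteConfig config = new DeferredWriteConfig(
                mode,
                new FixedValue<>(Duration.ofMinutes(5)),  // arbitrary example frequency
                new FixedValue<>(100));                   // shard count suggested in the comment
        mode.supply(WriteMode.FILE_LOADS);                // chosen at template launch
        System.out.println(config.resolve());
    }
}
```

Because the configuration keeps the deferred values rather than their contents, the same constructed object can resolve to either write mode depending on what is supplied at launch, which is the batch-versus-streaming flexibility the issue asks for.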