[ https://issues.apache.org/jira/browse/BEAM-1269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948013#comment-15948013 ]
Solomon Duskis commented on BEAM-1269:
--------------------------------------

BigtableIO should not set data channel pool counts for reads. This is the current code:

    // Set data channel count to one because there is only 1 scanner in this session
    BigtableOptions.Builder clonedBuilder = options.toBuilder()
        .setDataChannelCount(1);
    BigtableOptions optionsWithAgent =
        clonedBuilder.setUserAgent(getBeamSdkPartOfUserAgent()).build();

It should be more like:

    BigtableOptions optionsWithAgent = options
        .toBuilder()
        .setUserAgent(getBeamSdkPartOfUserAgent())
        .setUseCachedDataPool(true)
        .setDataHost(BigtableOptions.BIGTABLE_BATCH_DATA_HOST_DEFAULT)
        .build();

> BigtableIO should make more efficient use of connections
> --------------------------------------------------------
>
>                 Key: BEAM-1269
>                 URL: https://issues.apache.org/jira/browse/BEAM-1269
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-gcp
>            Reporter: Daniel Halperin
>              Labels: newbie, starter
>
> Right now, {{BigtableIO}} opens up a new Bigtable session for every DoFn, in
> the {{@Setup}} function. However, sessions can support multiple connections,
> so perhaps this code should be modified to open up a smaller session pool and
> then allocate connections in {{@StartBundle}}.
> This would likely make more efficient use of resources, especially for highly
> multithreaded workers.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)