[ https://issues.apache.org/jira/browse/BEAM-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismaël Mejía reassigned BEAM-8306:
----------------------------------

    Assignee: Derek He

> improve estimation of data byte size reading from source in ElasticsearchIO
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-8306
>                 URL: https://issues.apache.org/jira/browse/BEAM-8306
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-elasticsearch
>    Affects Versions: 2.14.0
>            Reporter: Derek He
>            Assignee: Derek He
>            Priority: Major
>
> ElasticsearchIO splits a BoundedSource based on the Elasticsearch index size.
> It would be more accurate to split based on the query result size.
> We currently have a large Elasticsearch index, but the query result contains
> only a few of its documents. ElasticsearchIO splits it into up to 1024
> BoundedSources on Google Dataflow, so processing that small number of
> Elasticsearch documents takes a long time.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)