[ https://issues.apache.org/jira/browse/BEAM-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487207#comment-16487207 ]
Etienne Chauchot commented on BEAM-4389:
----------------------------------------

Indeed, by default the ES mapping for a given type is guessed at first insertion (if not using a template). So any further incompatible inserts into the same index/type will fail with the current implementation; it will be the same for updates with your suggestion, so, fair enough.

Will it support things like upsert?

> Enable partial updates for Elasticsearch
> ----------------------------------------
>
>                 Key: BEAM-4389
>                 URL: https://issues.apache.org/jira/browse/BEAM-4389
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-java-elasticsearch
>    Affects Versions: 2.4.0
>            Reporter: Tim Robertson
>            Assignee: Tim Robertson
>            Priority: Major
>
> Expose a configuration option on the {{ElasticsearchIO}} to enable partial
> updates rather than full document inserts.
> Rationale: We have the case where different pipelines process different
> categories of information of the target entity (e.g. one for taxonomic
> processing, another for geospatial processing). A read and merge is not
> possible inside the batch call, meaning the only way to do it is through a
> join. The join approach is slow, and also stops the ability to run a single
> process in isolation (e.g. reprocess the geospatial component of all docs).
> This configuration parameter has to be used in conjunction with
> controlling the document ID (possible since BEAM-3201) to make sense.
> The client API would include a {{withUsePartialUpdate(true)}} such as:
> {code}
> source.apply(
>   ElasticsearchIO.write()
>     .withConnectionConfiguration(connectionConfiguration)
>     .withIdFn(new ExtractValueFn("id"))
>     .withUsePartialUpdate(true)
> );
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
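To make the upsert question concrete, here is a minimal sketch of how the Elasticsearch Bulk API lines for a full insert differ from those for a partial update. This is not the ElasticsearchIO implementation: the class `BulkPayloadSketch` and its methods are hypothetical, and the sketch assumes the `id` and `json` arguments are already valid and need no escaping. The `index`, `update`, `doc` and `doc_as_upsert` keys are the ones the Bulk and Update APIs define.

```java
/** Hypothetical helper, not part of ElasticsearchIO: builds Bulk API lines by hand. */
final class BulkPayloadSketch {

  /** Full-document insert: the stored document is replaced by {@code json}. */
  static String indexAction(String id, String json) {
    // Action line, then the document itself, each terminated by a newline.
    return "{ \"index\" : { \"_id\" : \"" + id + "\" } }\n" + json + "\n";
  }

  /**
   * Partial update: the fields in {@code json} are merged into the existing document.
   * With {@code docAsUpsert} set, a missing document is created from {@code json};
   * without it, updating a missing document is reported as an error for that item.
   */
  static String updateAction(String id, String json, boolean docAsUpsert) {
    String body = docAsUpsert
        ? "{ \"doc\" : " + json + ", \"doc_as_upsert\" : true }"
        : "{ \"doc\" : " + json + " }";
    return "{ \"update\" : { \"_id\" : \"" + id + "\" } }\n" + body + "\n";
  }
}
```

Under these assumptions, a geospatial pipeline could emit `updateAction(id, geoJson, true)` for each record and leave the fields written by a taxonomic pipeline untouched, which is why the document ID has to be controlled for partial updates to make sense.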