[ 
https://issues.apache.org/jira/browse/FLINK-18359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rinkako updated FLINK-18359:
----------------------------
    Affects Version/s:     (was: 1.10.0)

> Improve error-log strategy for Elasticsearch sink for large data documentId 
> conflict when using create mode for `insert ignore` semantics
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-18359
>                 URL: https://issues.apache.org/jira/browse/FLINK-18359
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / ElasticSearch
>            Reporter: rinkako
>            Priority: Major
>              Labels: usability
>
> The scenario: a Flink job ingests a large number of records from data 
> sources, processes them, and indexes them through the Elasticsearch sink. 
> When the job fails, we may restart it from a specific data set, much of 
> which has already been written to ES.
> In this case, `INSERT IGNORE` semantics are necessary. We call `public 
> IndexRequest create(boolean create)` with `true` and ignore the 409 
> restStatusCode in a custom ActionRequestFailureHandler to make it work.
> However, the `BulkProcessorListener` always logs an error event before it 
> calls the `failureHandler` in its `afterBulk` method. This produces a huge 
> volume of error logs for document ID conflicts, which we already know about 
> and handle in the custom ActionRequestFailureHandler.
> Therefore, it seems more flexible to perform the error logging in the 
> ActionRequestFailureHandler (either the default IgnoringFailureHandler or a 
> custom handler) instead?
>  
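
For illustration only, the handler behaviour described above might look roughly
like the following sketch. It is not the reporter's code. To stay compilable
without Flink on the classpath it uses a hypothetical stand-in interface,
`BulkItemFailureHandler`, reduced to the failure and the REST status code; the
real `ActionRequestFailureHandler#onFailure` also receives the `ActionRequest`
and a `RequestIndexer`. The class name `ConflictIgnoringHandler` and the
counter are likewise assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for Flink's ActionRequestFailureHandler, reduced to
// the two arguments this sketch needs (assumption, not the real interface).
interface BulkItemFailureHandler {
    void onFailure(Throwable failure, int restStatusCode) throws Throwable;
}

// Treats HTTP 409 (document ID conflict in create mode) as "already indexed"
// and rethrows every other failure.
final class ConflictIgnoringHandler implements BulkItemFailureHandler {
    static final int CONFLICT = 409;

    private final AtomicLong ignoredConflicts = new AtomicLong();

    @Override
    public void onFailure(Throwable failure, int restStatusCode) throws Throwable {
        if (restStatusCode == CONFLICT) {
            // Expected when replaying data: count it rather than logging
            // one error line per document.
            ignoredConflicts.incrementAndGet();
            return;
        }
        // Anything else is unexpected; propagate it to the caller.
        throw failure;
    }

    // Number of conflicts swallowed so far, e.g. for a periodic summary log.
    long ignoredConflicts() {
        return ignoredConflicts.get();
    }
}
```

With a handler of this shape, whether a 409 deserves an error line is decided
in one place, which is the flexibility the issue asks for. Today the listener
logs the failure before the handler is ever consulted.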



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
