[
https://issues.apache.org/jira/browse/FLINK-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814010#comment-15814010
]
ASF GitHub Bot commented on FLINK-5122:
---------------------------------------
Github user tzulitai commented on the issue:
https://github.com/apache/flink/pull/2861
Hi @static-max, thank you for working on this, it'll be an important fix
for proper at least once support for the ES connector.
Recently, the community has agreed to first restructure the multiple ES
connector version, so that important fixes like this one can be done once and
for all across all versions (1.x, 2.x, and 5.x which is currently pending).
Here's the discussion:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ElasticSearch-in-Flink-Strategy-td15049.html.
Could we wait just a little a bit on this PR, and once the ES connector
refactoring is complete, we can come back and rebase this PR on that? You can
follow the progress here: #2767. I'm trying to come up with the restructure PR
within the next day.
Very sorry for the extra wait needed on this, but it'll be good for the
long run, hope you can understand :)
> Elasticsearch Sink loses documents when cluster has high load
> -------------------------------------------------------------
>
> Key: FLINK-5122
> URL: https://issues.apache.org/jira/browse/FLINK-5122
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.2.0
> Reporter: static-max
> Assignee: static-max
>
> My cluster had high load and documents got not indexed. This violates the "at
> least once" semantics in the ES connector.
> I gave pressure on my cluster to test Flink, causing new indices to be
> created and balanced. On those errors the bulk should be tried again instead
> of being discarded.
> Primary shard not active because ES decided to rebalance the index:
> 2016-11-15 15:35:16,123 ERROR
> org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink -
> Failed to index document in Elasticsearch:
> UnavailableShardsException[[index-name][3] primary shard is not active
> Timeout: [1m], request: [BulkShardRequest to [index-name] containing [20]
> requests]]
> Bulk queue on node full (I set queue to a low value to reproduce error):
> 22:37:57,702 ERROR
> org.apache.flink.streaming.connectors.elasticsearch2.ElasticsearchSink -
> Failed to index document in Elasticsearch:
> RemoteTransportException[[node1][192.168.1.240:9300][indices:data/write/bulk[s][p]]];
> nested: EsRejectedExecutionException[rejected execution of
> org.elasticsearch.transport.TransportService$4@727e677c on
> EsThreadPoolExecutor[bulk, queue capacity = 1,
> org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@51322d37[Running,
> pool size = 2, active threads = 2, queued tasks = 1, completed tasks =
> 2939]]];
> I can try to propose a PR for this.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)