[ https://issues.apache.org/jira/browse/BEAM-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194722#comment-16194722 ]
Tim Robertson commented on BEAM-3026:
-------------------------------------

I only looked quickly, but I _think_ the ES client only detects truly dead nodes, i.e. those returning 5xx: https://github.com/elastic/elasticsearch/blob/5.4/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L503

What I have seen is an overloaded cluster returning 429 (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html#_use_multiple_workers_threads_to_send_data_to_elasticsearch), which I don't _think_ is retried. The intention here was to suggest a more lenient retry mechanism in the Beam layer before failing tasks. It would be fair to push that feature request to the ES client too, though.

> Improve retrying in ElasticSearch client
> ----------------------------------------
>
>          Key: BEAM-3026
>          URL: https://issues.apache.org/jira/browse/BEAM-3026
>      Project: Beam
>   Issue Type: Improvement
>   Components: sdk-java-extensions
>     Reporter: Tim Robertson
>     Assignee: Jean-Baptiste Onofré
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses. Perhaps something like this:
> {code}
> ElasticsearchIO.ConnectionConfiguration conn =
>     ElasticsearchIO.ConnectionConfiguration
>         .create(new String[]{"http://...:9200"}, "test", "test")
>         .retryWithWaitStrategy(
>             WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>         .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
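For illustration only, the proposed behaviour (sleep with exponential backoff on HTTP 429, give up after a fixed number of attempts) could be sketched outside any particular API like this. All names here (`BackoffSketch`, `BulkSender`, `sendWithRetry`, `backoffMillis`) are made up for the sketch; they are not part of ElasticsearchIO or the ES REST client:

```java
// Hypothetical sketch of backoff-then-give-up retrying on an overloaded cluster.
public class BackoffSketch {

    /** Wait before attempt n (0-based): baseMillis * 2^n, capped at maxMillis. */
    static long backoffMillis(long baseMillis, long maxMillis, int attempt) {
        // Cap the shift so the multiplication cannot overflow for large attempt counts.
        return Math.min(baseMillis << Math.min(attempt, 20), maxMillis);
    }

    /** Stand-in for the real bulk HTTP call; returns an HTTP status code. */
    interface BulkSender {
        int sendBulk();
    }

    /** Retry on HTTP 429, sleeping between attempts; fail after maxAttempts. */
    static void sendWithRetry(BulkSender sender, int maxAttempts, long baseMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int status = sender.sendBulk();
            if (status != 429) {
                return; // success, or an error left to the ES client's own handling
            }
            Thread.sleep(backoffMillis(baseMillis, 10_000, attempt));
        }
        throw new RuntimeException("Gave up after " + maxAttempts + " attempts (HTTP 429)");
    }
}
```

With a 1000 ms base and a 10 s cap, the waits grow 1 s, 2 s, 4 s, 8 s, 10 s, 10 s, ... which matches the spirit of the `exponentialBackoff(1000, TimeUnit.MILLISECONDS)` plus `stopAfterAttempt(10)` suggestion in the issue.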