[ 
https://issues.apache.org/jira/browse/BEAM-3026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194722#comment-16194722
 ] 

Tim Robertson commented on BEAM-3026:
-------------------------------------

I only looked quickly, but I _think_ the ES client only detects truly dead 
nodes, i.e. those returning 5xx: 
  
https://github.com/elastic/elasticsearch/blob/5.4/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L503

What I have seen is an overloaded cluster returning 429 
(https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html#_use_multiple_workers_threads_to_send_data_to_elasticsearch),
and I don't _think_ those responses are retried.

The intention here was to suggest a more lenient retry mechanism in the Beam 
layer, so that tasks back off and retry before failing.  It would also be fair 
to push that feature request to the ES client, though.
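
To make the idea concrete, here is a rough sketch of the kind of retry loop I 
have in mind. It is plain Java and does not use ElasticsearchIO or the ES REST 
client; the class BackoffRetry, its two methods and the fake request in main 
are all hypothetical names made up for illustration. The only assumption taken 
from above is that an overloaded cluster answers with HTTP 429.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of Beam or the Elasticsearch client.
class BackoffRetry {

    /** Delay before the given retry (1-based): doubles each time, capped at maxDelayMs. */
    public static long backoffDelayMs(int retry, long initialDelayMs, long maxDelayMs) {
        long delay = initialDelayMs;
        for (int i = 1; i < retry && delay < maxDelayMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxDelayMs);
    }

    /**
     * Calls the request until it returns a status other than 429 or until
     * maxAttempts calls have been made. Returns the last status code seen.
     */
    public static int executeWithRetry(
            IntSupplier request, int maxAttempts, long initialDelayMs, long maxDelayMs) {
        int status = request.getAsInt();
        for (int attempt = 1; status == 429 && attempt < maxAttempts; attempt++) {
            try {
                Thread.sleep(backoffDelayMs(attempt, initialDelayMs, maxDelayMs));
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up early with the last status.
                Thread.currentThread().interrupt();
                return status;
            }
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        // Stand-in for a bulk request: rejected twice with 429, then accepted.
        int[] calls = {0};
        IntSupplier fakeBulk = () -> ++calls[0] <= 2 ? 429 : 200;
        int status = executeWithRetry(fakeBulk, 10, 10, 1000);
        System.out.println("status=" + status + " after " + calls[0] + " calls");
    }
}
```

Only 429 is retried here; any other status is handed straight back to the 
caller, so a genuinely failing request would still fail fast. Whether the 
delays and the attempt limit should be configurable is exactly the open 
question in this issue.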

> Improve retrying in ElasticSearch client
> ----------------------------------------
>
>                 Key: BEAM-3026
>                 URL: https://issues.apache.org/jira/browse/BEAM-3026
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Tim Robertson
>            Assignee: Jean-Baptiste Onofré
>
> Currently an overloaded ES server will result in clients failing fast.
> I suggest implementing backoff pauses.  Perhaps something like this:
> {code}
>     ElasticsearchIO.ConnectionConfiguration conn =
>       ElasticsearchIO.ConnectionConfiguration
>         .create(new String[]{"http://...:9200"}, "test", "test")
>         .retryWithWaitStrategy(
>           WaitStrategies.exponentialBackoff(1000, TimeUnit.MILLISECONDS))
>         .retryWithStopStrategy(StopStrategies.stopAfterAttempt(10));
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
