Hi - we're having problems with one of our map-reduce jobs that writes to 
Elasticsearch. Lots of map tasks are failing due to ES being "unavailable", 
with logs like this:

https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt

It seems that elasticsearch-hadoop tries to talk to an ES node, the request 
times out, it tries the next node, that one times out too, and so on until 
all nodes in the cluster are exhausted, at which point it gives up.
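
To make the behaviour I'm describing concrete, here is a rough sketch of what I think is happening. This is only my mental model, not elasticsearch-hadoop's actual code: the class name, the method name, and the `respondsInTime` check (a stand-in for a real HTTP call with a timeout) are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the failover pattern seen in the logs.
// None of these names come from elasticsearch-hadoop itself.
public class NodeFailoverSketch {

    /** Thrown once every known node has been tried and none responded. */
    static class AllNodesFailedException extends RuntimeException {
        AllNodesFailedException(String message) {
            super(message);
        }
    }

    /**
     * Tries each node in order and returns the first one that answers.
     * respondsInTime stands in for a real request with a timeout:
     * it returns true if the node answered before the timeout expired.
     */
    static String firstResponsiveNode(List<String> nodes,
                                      Predicate<String> respondsInTime) {
        List<String> failed = new ArrayList<>();
        for (String node : nodes) {
            if (respondsInTime.test(node)) {
                return node;
            }
            // Timed out: remember the node and move on to the next one.
            failed.add(node);
        }
        // Every node timed out, so the task attempt gives up.
        throw new AllNodesFailedException("all nodes failed: " + failed);
    }

    public static void main(String[] args) {
        // Placeholder host names, not our real cluster.
        List<String> nodes = Arrays.asList("es1:9200", "es2:9200", "es3:9200");

        // Case 1: only the last node answers, so two timeouts are paid first.
        System.out.println(firstResponsiveNode(nodes, n -> n.equals("es3:9200")));

        // Case 2: no node answers, which matches the failed map attempts.
        try {
            firstResponsiveNode(nodes, n -> false);
        } catch (AllNodesFailedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this model is right, a failing attempt pays one full timeout per node before it dies, which would explain why the killed tasks waste so much time.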

As far as I can tell, the ES cluster is healthy while this is occurring. 
Many map tasks are succeeding - probably about 10% of the attempts are 
killed due to this issue. The main problem is that these killed tasks waste 
a lot of time, and slow down the overall job execution.

I'm not sure where to look next. Does anyone have any idea what would 
cause all of these timeouts and failures?

I'm also curious about the lines like this:

2014-09-30 12:49:20,469 WARN 
org.apache.commons.httpclient.SimpleHttpConnectionManager: 
SimpleHttpConnectionManager being used incorrectly.  Be sure that 
HttpMethod.releaseConnection() is always called and that only one thread and/or 
method is using this connection manager at a time.


Would that be related to the timeout problem we're seeing?

Thanks,
Zach
