Hi - we're having problems with one of our map-reduce jobs that writes to Elasticsearch. Lots of map tasks are failing due to ES being "unavailable", with logs like this:
https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt

It seems that elasticsearch-hadoop tries talking to an ES node, the request times out, it tries the next node, that times out too, and so on until all nodes in the cluster are exhausted, at which point it gives up. As far as I can tell, the ES cluster is healthy while this is occurring. Many map tasks are succeeding; probably about 10% of the attempts are killed due to this issue. The main problem is that these killed tasks waste a lot of time and slow down the overall job execution.

I'm not sure where to troubleshoot this next. Does anyone have any idea what would cause all of these timeouts and failures?

I'm also curious about lines like this:

2014-09-30 12:49:20,469 WARN org.apache.commons.httpclient.SimpleHttpConnectionManager: SimpleHttpConnectionManager being used incorrectly. Be sure that HttpMethod.releaseConnection() is always called and that only one thread and/or method is using this connection manager at a time.

Would that be related to the timeout problem we're seeing?

Thanks,
Zach
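
P.S. To make the failure pattern concrete, here is a minimal sketch of the behavior as I read it from the logs: each node is tried in turn, and the task only fails once every node has timed out. This is not the actual elasticsearch-hadoop code. The names `send_with_failover`, `NodeTimeout`, and `ClusterUnavailable` are hypothetical and exist only for illustration.

```python
class NodeTimeout(Exception):
    """Hypothetical error: a single node did not answer in time."""


class ClusterUnavailable(Exception):
    """Hypothetical error: every known node was tried and timed out."""


def send_with_failover(nodes, send):
    """Try each node in order; give up only after all of them have timed out.

    `nodes` is a list of node addresses and `send` is any callable that
    takes one address and either returns a response or raises NodeTimeout.
    """
    failed = []
    for node in nodes:
        try:
            return send(node)
        except NodeTimeout:
            # Remember the node that timed out and move on to the next one.
            failed.append(node)
    # Reached only when no node answered, which is what kills the map task.
    raise ClusterUnavailable(f"all nodes failed: {failed}")
```

If this reading is right, a single slow node only costs one timeout, but a task attempt is killed only when every node times out in a row, which is why it seems odd for that to happen while the cluster looks healthy.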