Re: elasticsearch-hadoop sporadic timeouts

2014-10-03 Thread Zach Cox
Is there anything else we could try here to debug elasticsearch-hadoop being unable to write to Elasticsearch? We're still seeing the same number of these failures during the nightly batch runs even after switching to 2.0.2.BUILD-SNAPSHOT, and I don't see any additional lines from

Re: elasticsearch-hadoop sporadic timeouts

2014-10-03 Thread Costin Leau
You can always enable TRACE, though that is likely to create far too much information in production and slow things down considerably. The first thing you can do is give ES more breathing space by reducing the batch size (say to 512KB) or the number of entries (500
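
A minimal sketch of the batch-size suggestion above, assuming the two settings involved are es.batch.size.bytes and es.batch.size.entries (the names mentioned elsewhere in this thread) and that the job accepts Hadoop-style -D options, which is only true when it goes through Hadoop's generic option parsing. The helper name batch_flags is hypothetical, and the "512kb" spelling of the size value is an assumption about the accepted format.

```python
# Hypothetical helper: renders reduced es-hadoop batch settings as -D flags.
# Setting names are taken from this thread; the value format is an assumption.
def batch_flags(size_bytes="512kb", entries=500):
    settings = {
        "es.batch.size.bytes": size_bytes,      # max bytes per bulk request, per task
        "es.batch.size.entries": str(entries),  # max documents per bulk request, per task
    }
    # Dicts preserve insertion order, so the flags come out in the order above.
    return [f"-D{key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    print(" ".join(batch_flags()))
```

The same two key/value pairs could instead be set on the job configuration in code; the flags are just one way to try smaller values without rebuilding the job.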

Re: elasticsearch-hadoop sporadic timeouts

2014-10-03 Thread Zach Cox
Our Hadoop and Elasticsearch are all on AWS. We have 2 MR jobs that write to ES - 1 of them works fine, and one of them takes forever due to 10-20% of tasks failing in the way I've described. So I don't think it's any kind of network/firewall issue. There are no nightly backups related to ES or

Re: elasticsearch-hadoop sporadic timeouts

2014-10-03 Thread Costin Leau
What type of AWS instances are you using? Virtualization tends to interfere in various ways with a running system - sometimes for better, sometimes for worse. The number of tasks is useful for computing the total amount of data and number of entries you are throwing at ES at one time. You are looking at a
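
The calculation described above can be roughly sketched as follows. The figures of 25 concurrent map tasks and 4 ES nodes come from this thread; the 1 MB / 1000-entry batch limits are assumed to be the es-hadoop defaults, and the function name in_flight_load is hypothetical. It gives a worst-case upper bound, assuming every running task has one full bulk request in flight at the same moment and that requests spread evenly across nodes.

```python
# Rough upper bound on what concurrent tasks can throw at ES at one time.
# Assumes one full bulk request in flight per task and an even spread over nodes.
def in_flight_load(concurrent_tasks, batch_bytes, batch_entries, es_nodes):
    total_mb = concurrent_tasks * batch_bytes / (1024 * 1024)
    total_entries = concurrent_tasks * batch_entries
    return {
        "total_mb": total_mb,
        "total_entries": total_entries,
        "mb_per_node": total_mb / es_nodes,
        "entries_per_node": total_entries / es_nodes,
    }


if __name__ == "__main__":
    # 25 concurrent map tasks and 4 ES nodes are from the thread;
    # 1 MB and 1000 entries are assumed default batch limits.
    print(in_flight_load(25, 1024 * 1024, 1000, 4))
    # The reduced limits suggested in the thread (512KB, 500 entries).
    print(in_flight_load(25, 512 * 1024, 500, 4))
```

Under these assumptions, halving both limits halves the peak load per node, which is the "breathing space" the smaller batch size is meant to buy.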

Re: elasticsearch-hadoop sporadic timeouts

2014-10-03 Thread Zach Cox
Our 4 ES nodes are all m1.large (http://www.ec2instances.info/?filter=m1.large) and our 5 Hadoop nodes are all m1.xlarge (http://www.ec2instances.info/?filter=m1.xlarge). Thanks for the troubleshooting pointers - we'll do some more research. On Fri, Oct 3, 2014 at 11:27 AM, Costin Leau

Re: elasticsearch-hadoop sporadic timeouts

2014-10-01 Thread Zach Cox
Hi Costin - we updated our dependencies to use elasticsearch-hadoop 2.0.2.BUILD-SNAPSHOT, but that didn't seem to change anything. We're still seeing the same task failures while trying to write to Elasticsearch. The only difference in the logs is that now I don't see the

Re: elasticsearch-hadoop sporadic timeouts

2014-10-01 Thread Costin Leau
The error indicates the ES nodes don't reply in a timely fashion and thus the connection drops. Based on your logs it seems to be either a GC or a network issue. You could try turning on logging in package 'org.elasticsearch.hadoop.rest' to DEBUG. How many tasks do you have and what's your bulk

Re: elasticsearch-hadoop sporadic timeouts

2014-10-01 Thread Zach Cox
This particular job has 1353 map tasks, Hadoop cluster has 5 nodes with total map task capacity of 25. Elasticsearch cluster has 4 nodes. Where can I find the bulk size/entries numbers? Thanks, Zach On Wed, Oct 1, 2014 at 7:19 AM, Costin Leau costin.l...@gmail.com wrote: The error indicates

Re: elasticsearch-hadoop sporadic timeouts

2014-10-01 Thread Zach Cox
Hi Costin - by bulk size/entries number are you referring to the es.batch.size.bytes and es.batch.size.entries config values described here? http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#configuration-serialization It looks like the only

elasticsearch-hadoop sporadic timeouts

2014-09-30 Thread Zach Cox
Hi - we're having problems with one of our map-reduce jobs that writes to Elasticsearch. Lots of map tasks are failing due to ES being unavailable, with logs like this: https://gist.githubusercontent.com/zcox/3d6cf4329d49ca03271b/raw/57c46a5e4c9ea04d5c4209414d6f847492d16c0d/gistfile1.txt Seems

Re: elasticsearch-hadoop sporadic timeouts

2014-09-30 Thread Costin Leau
What version of es-hadoop/es/cascading are you using? On 9/30/14 6:16 PM, Zach Cox wrote: Hi - we're having problems with one of our map-reduce jobs that writes to Elasticsearch. Lots of map tasks are failing due to ES being unavailable, with logs like this:

Re: elasticsearch-hadoop sporadic timeouts

2014-09-30 Thread Zach Cox
Hi Costin: elasticsearch-hadoop 2.0.0 cascading 2.5.4 scalding 0.10.0 Thanks, Zach On Tuesday, September 30, 2014 10:25:10 AM UTC-5, Costin Leau wrote: What version of es-hadoop/es/cascading are you using? On 9/30/14 6:16 PM, Zach Cox wrote: Hi - we're having problems with one of our

Re: elasticsearch-hadoop sporadic timeouts

2014-09-30 Thread Costin Leau
Can you please try the 2.0.2.BUILD-SNAPSHOT? I think you might be running into issue #256 which was fixed some time ago and will be part of the upcoming 2.0.2, 2.1 Beta2. Cheers, On 9/30/14 6:43 PM, Zach Cox wrote: Hi Costin: elasticsearch-hadoop 2.0.0 cascading 2.5.4 scalding 0.10.0