Re: Accessing ES in Hadoop

2015-02-05 Thread Costin Leau
There's no easy solution in this case since there's no 'constant'. Basically the environment, in this case YARN, doesn't offer any facilities or guarantees for long-lived processes. Pushing that towards the clients (the user, Kibana, Logstash) is not a solution since these care about the service

Re: Accessing ES in Hadoop

2015-02-05 Thread Douglas Moore
With YARN, the ES nodes can move from server to server upon restarts. How do Logstash and Kibana discover the right server IP and port for the ES listener(s) while ES is running under YARN? What do those config fragments look like? We could not find anything online about that. Thanks in advance.

Re: Accessing ES in Hadoop

2015-02-03 Thread Costin Leau
Hi, Whether ES is running on YARN, Linux, Windows, Docker or AWS doesn't matter to the clients as long as they have access to the instance. In other words, Logstash doesn't see any difference in ES if it's running on Linux vs YARN. However, one has to take into account the difference in the

Re: ES with Hadoop

2015-01-29 Thread Costin Leau
wrote: Hi, I have one question related to the performance of ES with Hadoop. Our architecture: 1) Use Hadoop to store big data, as we have millions of records. 2) Feed ES from Hadoop via the API. 3) Search will work through ES. Will this architecture have performance issues? Or should we simply use ES

ES with Hadoop

2015-01-29 Thread Manoj Singh
Hi, I have one question related to the performance of ES with Hadoop. Our architecture: 1) Use Hadoop to store big data, as we have millions of records. 2) Feed ES from Hadoop via the API. 3) Search will work through ES. Will this architecture have performance issues? Or should we simply use ES

Re: [HADOOP] Anyone used TransportClient for writing to ES from Hadoop mappers?

2014-04-22 Thread Igor Romanov
Thank you for your answer. I did some tests by writing something simple that writes bulks to the ES server using TransportClient in parallel mode (something like BulkProcessor...), and it looks like the Hadoop job runs in pretty much the same time, so there is no big difference here compared with using es-hadoop

[HADOOP] Anyone used TransportClient for writing to ES from Hadoop mappers?

2014-04-17 Thread Igor Romanov
Hi all, Currently I am working with the elasticsearch-hadoop library, using EsOutputFormat to write to Elasticsearch, but it looks to me like the writing is slow (elasticsearch-hadoop works with HTTP bulks on port 9200). So my question is: is it worth trying to write something of my own that will

Re: [HADOOP] Anyone used TransportClient for writing to ES from Hadoop mappers?

2014-04-17 Thread Costin Leau
of remarks: - es-hadoop does not open a write http socket on each bulk but rather for an entire write task (which implies multiple bulk requests). If that's not the case, it's a bug - not sure what you mean by `Hadoop Context writable` objects - can you provide some context? When one reads data
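
A minimal sketch of the pattern described in the reply above: one sender kept for the whole write task, with documents buffered and sent as several bulk requests. This is not es-hadoop's code. The class name `BulkWriter`, the `send` callable, the index name `logs` and the `batch_size` value are all hypothetical, chosen only for illustration. The body layout (an action line followed by a source line, each newline-terminated) follows the Elasticsearch bulk API format.

```python
import json
from typing import Callable, Dict, List


class BulkWriter:
    """Buffers documents and sends them as bulk bodies through one shared sender.

    Hypothetical illustration only; it does not reproduce es-hadoop internals.
    """

    def __init__(self, send: Callable[[str], None], index: str, batch_size: int = 1000):
        # `send` stands in for a single connection reused for the whole write task.
        self._send = send
        self._index = index
        self._batch_size = batch_size
        self._lines: List[str] = []
        self._pending = 0

    def write(self, doc: Dict) -> None:
        # Each document contributes an action line and a source line.
        self._lines.append(json.dumps({"index": {"_index": self._index}}))
        self._lines.append(json.dumps(doc))
        self._pending += 1
        if self._pending >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Send whatever is buffered as one bulk body; do nothing if empty.
        if not self._lines:
            return
        self._send("\n".join(self._lines) + "\n")
        self._lines = []
        self._pending = 0

    def close(self) -> None:
        # Flush the final partial batch at the end of the task.
        self.flush()


# Usage: collect the bulk bodies in a list instead of sending them over the network.
sent: List[str] = []
writer = BulkWriter(sent.append, index="logs", batch_size=2)
for i in range(5):
    writer.write({"n": i})
writer.close()
```

With five documents and a batch size of two, the same sender is called three times (two full batches and one final partial batch), which is the "one connection, multiple bulk requests" shape the reply refers to.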