There's no easy solution in this case since there's no 'constant': the environment, in this case YARN, doesn't
offer any facilities or guarantees for long-lived processes. Pushing the problem onto the clients (the user, Kibana,
Logstash) is not a solution either, since they only care about the service.
With YARN, the ES nodes can move from server to server upon restarts.
How do Logstash and Kibana discover the right server IP and port for the ES
listener(s) while ES is running under YARN?
What do those config fragments look like? We could not find anything online
about that.
Thanks in advance.
Hi,
Whether ES is running on YARN, Linux, Windows, Docker or AWS doesn't matter to the clients as long as they have access
to the instance.
In other words, Logstash doesn't see any difference in ES whether it's running on
Linux or YARN.
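To illustrate, a Logstash output pointing at ES looks the same regardless of where ES runs; a minimal sketch (the host name is a placeholder for wherever the YARN-managed node currently listens, and the exact setting name may differ between Logstash versions):

```
output {
  elasticsearch {
    hosts => ["es-node.example.com:9200"]
  }
}
```

The catch under YARN, as noted above, is that this address can change across restarts, and nothing in YARN itself keeps it stable.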
However one has to take into account the difference in the
Hi,
I have one question related to the performance of ES with Hadoop.
Our architecture:
1) Use Hadoop to store big data, as we have millions of records.
2) Feed ES from Hadoop via the API.
3) Search will work through ES.
Will this architecture have performance issues?
Or should we simply use ES?
Thank you for your answer.
I did some tests, writing something simple that writes bulks using
TransportClient to the ES server in parallel (something like
BulkProcessor...),
and it looks like the Hadoop job runs in roughly the same time, so there is no big
difference here compared with using es-hadoop.
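For reference, the parallel pattern described above can be sketched as follows; `sendBulk` here is a stand-in that only records the call, where the real test would invoke TransportClient (or hand documents to a BulkProcessor):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of writing bulks to ES from several threads in parallel.
public class ParallelBulkWriter {
    static final AtomicInteger bulksSent = new AtomicInteger();

    // Stand-in for the real bulk call, e.g. client.bulk(request).actionGet().
    static void sendBulk(List<String> docs) {
        bulksSent.incrementAndGet();
    }

    // Submits each bulk to a fixed thread pool and waits for completion.
    public static int writeInParallel(List<List<String>> bulks, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (List<String> bulk : bulks) {
            pool.submit(() -> sendBulk(bulk));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return bulksSent.get();
    }
}
```

Whether this beats es-hadoop depends mostly on bulk size and thread count, which is consistent with the observation above that the timings came out about the same.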
Hi all,
Currently I am working with the elasticsearch-hadoop library, with EsOutputFormat
writing to Elasticsearch.
But it looks to me like the writing is slow (elasticsearch-hadoop works
with HTTP bulks on port 9200).
So my question is: is it worth trying to write something of my own that will
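For context, the HTTP bulks mentioned above are newline-delimited JSON: an action line followed by the document source, repeated per document. A minimal sketch of building such a payload (the index and type names are made up for illustration):

```java
import java.util.List;

// Builds the newline-delimited JSON body of an ES _bulk request.
public class BulkBody {
    public static String build(String index, String type, List<String> docs) {
        StringBuilder sb = new StringBuilder();
        for (String doc : docs) {
            // Action line first, then the document source, each on its own line.
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_type\":\"").append(type).append("\"}}\n");
            sb.append(doc).append("\n");
        }
        return sb.toString();
    }
}
```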
A couple of remarks:
- es-hadoop does not open a write HTTP socket on each bulk but rather one for an entire write task (which spans multiple
bulk requests). If that's not the case, it's a bug.
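To illustrate that first remark, the structure is one connection per write task, reused across all of that task's bulk requests; a sketch of the shape (the connection and send methods are stand-ins, not the actual es-hadoop code):

```java
import java.util.List;

// Illustrates connection reuse: one connection per write task,
// many bulk requests over that single connection.
public class WriteTask {
    int connectionsOpened = 0;
    int bulksSent = 0;

    // Stand-in for setting up the HTTP client once per task.
    void openConnection() { connectionsOpened++; }

    // Stand-in for a POST of one bulk payload to /_bulk.
    void sendBulk(List<String> docs) { bulksSent++; }

    // Connection is released only when the whole task finishes.
    void close() { }

    // Process all of a task's bulks over one connection.
    public void run(List<List<String>> bulks) {
        openConnection();
        for (List<String> bulk : bulks) {
            sendBulk(bulk);
        }
        close();
    }
}
```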
- Not sure what you mean by `Hadoop Context writable` objects - can you provide
some context?
When one reads data