Hi,
I am sure Hadoop can help you calculate this, but you may also be able to
go about this more efficiently in Elasticsearch. If you, as you mentioned,
were to create a user-centric index in addition to the event-centric one
you already have, you could store a list of all the events
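As a sketch of that entity-centric approach (the field names here are hypothetical, not from the thread), the event stream can be folded into one document per user before indexing:

```python
from collections import defaultdict

# Hypothetical event-centric documents, as in the existing index
events = [
    {"user": "alice", "event_id": "e1", "type": "login"},
    {"user": "bob",   "event_id": "e2", "type": "click"},
    {"user": "alice", "event_id": "e3", "type": "logout"},
]

# Fold the event stream into one user-centric document per user,
# each holding the list of that user's events
user_docs = defaultdict(lambda: {"events": []})
for event in events:
    user_docs[event["user"]]["events"].append(event["event_id"])

print(dict(user_docs))
```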
Hi,
I think there is some confusion about the port number used. Kibana 4 by
default listens on port 5601, which, based on the output sample you
provided, does not appear to have been changed. In all your examples,
however, you are looking for port 5061, not 5601. Can you check if you are able to connect
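A quick way to verify which port is actually reachable (assuming Kibana runs on localhost; adjust the hostname if not) is a plain TCP connection check:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Kibana 4 defaults to 5601; 5061 looks like a transposed port number
print("5601 open:", port_open("localhost", 5601))
print("5061 open:", port_open("localhost", 5061))
```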
Hi,
As explained in the blog post, increasing the queue size will not improve
performance; it will just make the cluster store more data in memory while
it awaits processing. It could actually end up reducing performance instead.
It looks like you are hitting the limit of your cluster and that the
Hi Eran,
Which version of Elasticsearch are you using?
Are you assigning your own document IDs or letting Elasticsearch assign
them automatically?
Best regards,
Christian
On Friday, April 24, 2015 at 7:49:56 AM UTC+1, Eran wrote:
Hello,
I've created an index I use for logging.
This
Hi Eran,
If you are assigning your own IDs, Elasticsearch needs to check whether
the document already exists before writing it. This could explain why the
bulk insert performance goes down as the size of the index grows. If you
are not going to update the documents, I would therefore
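For illustration, a minimal sketch of a _bulk request body that omits _id, so Elasticsearch auto-generates the IDs (the index and type names here are hypothetical):

```python
import json

def bulk_body(docs, index="logs", doc_type="event"):
    """Build an Elasticsearch _bulk request body. No _id is set on the
    action line, so Elasticsearch auto-generates document IDs and can
    append new documents without first checking whether the ID exists."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline

body = bulk_body([{"message": "first"}, {"message": "second"}])
print(body)
```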
Hi,
If I have calculated correctly, that corresponds to about 238TB of raw
data. If this is the size of JSON documents being indexed in Elasticsearch,
you will definitely need more than 2 nodes.
The good thing about using aliases the way David describes is that you will
not need to put all
Hi,
Merging of segments and the resulting removal of deleted documents is not
coordinated across nodes in Elasticsearch, meaning that the number of
deleted documents can differ between primary and replica shards. Optimising
an index down to a single segment does resolve this, but can as noted
The creation date is given with millisecond precision. Take away the last 3
digits and your converter gives Fri, 06 Mar 2015 08:44:57 GMT for 1425631497.
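The conversion can be checked directly; this is just epoch arithmetic, nothing Elasticsearch-specific:

```python
from datetime import datetime, timezone

creation_date_ms = 1425631497000  # index creation_date, in milliseconds
creation_date_s = creation_date_ms // 1000  # drop the last three digits

utc = datetime.fromtimestamp(creation_date_s, tz=timezone.utc)
print(utc.strftime("%a, %d %b %Y %H:%M:%S GMT"))
# → Fri, 06 Mar 2015 08:44:57 GMT
```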
Christian
On Monday, April 20, 2015 at 5:06:40 AM UTC+1, tao hiko wrote:
I queried the settings of the index and found that it has
Hi,
That sounds like a very large amount of shards for a node that size, and
this is most likely the source of your problems. Each shard in
Elasticsearch corresponds to a Lucene instance and carries with it a
certain amount of overhead. You therefore do not want your shards to be too
small.
Hi,
Having read through the thread it sounds like your configuration has been
working in the past. Is that correct?
If this is the case, I would reiterate David's initial questions about your
node's RAM and heap size, as the number of shards looks quite large for a
single node. Could you please
Hi,
You seem to have quite a large number of shards (1180) for a single node
with only 7GB heap. As the total data volume is a bit over 600GB, the
average shard size is only a bit over 500MB, which is not very large. As
each shard is a separate Lucene index and carries some overhead, you would
Hi,
How much space the data takes up on disk in Elasticsearch depends a lot on
your mappings. In addition to storing the source in the _source field, all
fields are by default also copied over to the _all field to allow free text
search across all fields. In addition to this Elasticsearch also
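If free-text search across all fields is not needed, the _all field can be disabled in the mapping to save disk space; a minimal sketch in the 1.x mapping format (the type name here is hypothetical), sent as the body of the index-creation request:

```json
{
  "mappings": {
    "event": {
      "_all": { "enabled": false }
    }
  }
}
```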
Hi,
You could get around this by using routing based on customer ID when
indexing and searching. This will ensure that all documents belonging to a
single customer will be located in the same shard, which means that each
search for a specific customer can hit a single shard instead of all 9,
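The idea can be sketched as follows. Note this uses Python's hashlib as a stand-in, not Elasticsearch's actual murmur3-based routing function, so the shard numbers are only illustrative; the principle, hash(routing) modulo the number of primary shards, is the same:

```python
import hashlib

NUM_PRIMARY_SHARDS = 9  # from the thread: searches currently fan out to all 9

def shard_for(routing_value: str) -> int:
    # Simplified stand-in for Elasticsearch's routing formula:
    # shard = hash(routing) % number_of_primary_shards
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PRIMARY_SHARDS

# All documents routed by the same customer ID land on the same shard,
# so a search with ?routing=<customer ID> only needs to query that shard
print(shard_for("customer-42"))
```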
Hi,
Can you please share you logstash configuration, some sample data as well
as your mappings?
Best regards,
Christian
On Friday, March 6, 2015 at 11:30:45 AM UTC-8, Econgineer wrote:
I'm testing out the ELK stack on my desktop (ie 1 node) and thought I'd
start by pulling a flat file,
Hi,
You always want an odd number of master nodes (often 3), so I would
therefore recommend setting three of the four nodes to be master eligible
and leave the fourth as a pure data node. This will prevent the cluster
from being partitioned into two halves with an equal number of master nodes on both
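A minimal sketch of the corresponding settings (pre-2.0 zen discovery; the node layout is assumed, adjust to your own):

```yaml
# elasticsearch.yml on the three master-eligible nodes
node.master: true
node.data: true

# elasticsearch.yml on the fourth, data-only node:
#   node.master: false
#   node.data: true

# With three master-eligible nodes, a majority is two
discovery.zen.minimum_master_nodes: 2
```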
As Elasticsearch requires indexed documents to be in JSON format, you will
need to base64 encode any binary blobs in order to store them. This will
increase the size on disk significantly and have an impact on performance.
Unless you plan to utilise the search features in Elasticsearch at a
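To illustrate the size overhead (the payload here is made up): base64 expands binary data by roughly one third before it is even indexed:

```python
import base64
import json

blob = bytes(range(256)) * 4  # 1024 bytes of hypothetical binary data
encoded = base64.b64encode(blob).decode("ascii")

# The JSON document that would actually be sent to Elasticsearch
doc = json.dumps({"filename": "example.bin", "content": encoded})
print(len(blob), "raw bytes ->", len(encoded), "base64 characters")
```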
How many replicas do you have configured for the index?
Christian
On Thursday, February 12, 2015 at 8:32:28 PM UTC, Jay Danielian wrote:
I know this is difficult to answer, the real answer is always It Depends
:) But I am going to go ahead and hope I get some feedback here.
We are mainly
Hi,
A common approach for replicating changes across multiple geographically
distributed clusters is to put a message queue in front of Elasticsearch
and feed all data modifications through this so that they can be applied to
the clusters independently. This allows issues with unreliable
What does your mapping for the index look like? Is there any possibility
there could be a mapping conflict?
Christian
On Friday, January 9, 2015 at 10:48:52 PM UTC, Stefanie wrote:
I am having an issue with searching results if the type is not specified.
The following search request works