Re: Indexing is becoming slow, what to look for?

2014-10-02 Thread Michael McCandless
On Tue, Sep 9, 2014 at 6:55 AM, Thomas wrote:

> Setting this parameter has raised some additional questions:
>
> If I set indices.memory.index_buffer_size on one specific node and not on
> all nodes of the cluster, will this configuration be taken into account
> by all nodes? Is it cluster wide, or does it apply only to index operations
> on that specific node? Do I need to set it on every node one by one,
> restart, and then see the effects?
>

How did you set it for one node?  I believe (not certain!) this is supposed
to be a global (cluster wide) setting...


> Finally, if we index data into an index of 10 shards and I have 5 nodes,
> each node will index into 2 shards. Does indices.memory.index_buffer_size
> then refer to those two specific shards?
>

The setting is divided up among all active shards that node is hosting.
Once a shard goes "idle" (hasn't seen an indexing operation for N minutes)
then its RAM buffer is dropped to something tiny, and this returns that RAM
to be divided up among the remaining active shards.
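
To make the division described above concrete, here is a minimal Python
sketch. It is only an illustration of the arithmetic, not Elasticsearch's
actual code: it assumes the node's budget is split evenly among active
shards, and the function name, shard names, 512 MB budget, and the
idle_bytes placeholder for "something tiny" are all hypothetical.

```python
def split_index_buffer(total_bytes, shards, idle_bytes=0):
    """Divide a node's indexing buffer evenly among its active shards.

    shards maps a shard name to True if the shard is active (currently
    indexing) and False if it is idle. idle_bytes is a hypothetical
    stand-in for the tiny buffer an idle shard keeps; for simplicity it
    is not subtracted from the budget shared by the active shards.
    """
    active = [name for name, is_active in shards.items() if is_active]
    # Start every shard at the idle amount, then raise the active ones.
    shares = {name: idle_bytes for name in shards}
    if active:
        per_active = total_bytes // len(active)
        for name in active:
            shares[name] = per_active
    return shares


MB = 1024 * 1024

# Hypothetical node with a 512 MB budget hosting two shards, both indexing:
# each active shard gets half of the budget.
print(split_index_buffer(512 * MB, {"logs[0]": True, "logs[1]": True}))

# The same node after logs[1] goes idle: logs[0] gets the whole budget.
print(split_index_buffer(512 * MB, {"logs[0]": True, "logs[1]": False}))
```

With 10 shards spread over 5 nodes, each node would run this split over the
2 shards it hosts, so the setting is effectively a per-node budget.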

As of 1.4.0 the stats API will now tell you how much RAM buffer was allocated
for each shard: https://github.com/elasticsearch/elasticsearch/pull/7440

Mike McCandless

http://blog.mikemccandless.com

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAD7smRfYJv5FGxSNbgReUFD%2Bdgz4LOBLkLw0_XmvqnGX5vnNVg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Indexing is becoming slow, what to look for?

2014-09-09 Thread Thomas
Setting this parameter has raised some additional questions:

If I set indices.memory.index_buffer_size on one specific node and not on
all nodes of the cluster, will this configuration be taken into account
by all nodes? Is it cluster wide, or does it apply only to index operations
on that specific node? Do I need to set it on every node one by one,
restart, and then see the effects?

Finally, if we index data into an index of 10 shards and I have 5 nodes,
each node will index into 2 shards. Does indices.memory.index_buffer_size
then refer to those two specific shards?

Thank you very much

Thomas




Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Got it thanks




Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Nikolas Everett
Active in this context means currently indexing documents.
On Sep 5, 2014 8:17 AM, "Thomas" wrote:

> Hi,
>
> I wanted to clarify something from the blog post you mentioned. You
> specify that based on calculations we should "give at most ~512 MB
> indexing buffer per active shard...". What I wanted to ask is what the
> term "active" means. Do you mean primary shards only, or not?
>
> Thank you again
>



Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Hi,

I wanted to clarify something from the blog post you mentioned. You specify
that based on calculations we should "give at most ~512 MB indexing
buffer per active shard...". What I wanted to ask is what the term "active"
means. Do you mean primary shards only, or not?

Thank you again




Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Thx Michael,

I will read the post in detail and let you know of any findings.

Thomas.




Re: Indexing is becoming slow, what to look for?

2014-09-05 Thread Michael McCandless
Maybe index throttling is happening (ES would say so in the logs) because
your merging is falling behind?  Do you throttle IO for merges (it's
throttled at a paltry 20 MB/sec by default)?  What does hot threads report?
How about top/iostat?
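
As a rough sketch of how two of these checks could be scripted, the Python
below builds a request for the nodes hot threads API and a request that
raises the store (merge IO) throttle through the cluster settings API. The
base URL, the helper names, and the "100mb" value are assumptions made for
illustration only; pick a throttle that matches what your disks can
sustain, and check the setting name against the docs for your ES version.

```python
import json
from urllib.request import Request, urlopen

# Assumption: an Elasticsearch node is reachable at this address.
ES = "http://localhost:9200"


def hot_threads_request(base_url=ES):
    """Build a GET request for the nodes hot threads API."""
    return Request(base_url + "/_nodes/hot_threads", method="GET")


def merge_throttle_request(max_bytes_per_sec, base_url=ES):
    """Build a PUT request that changes the store throttle (default 20mb).

    The change is sent as a transient cluster setting, so it does not
    survive a full cluster restart.
    """
    body = {
        "transient": {
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec
        }
    }
    return Request(
        base_url + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request, timeout=10):
    """Send a prepared request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, print(send(hot_threads_request())) dumps the busiest threads on
each node, and send(merge_throttle_request("100mb")) applies the
hypothetical higher limit. Building the requests separately from sending
them makes it easy to inspect exactly what would go to the cluster first.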

We just got a blog post out about improving indexing throughput:
http://www.elasticsearch.org/blog/performance-considerations-elasticsearch-indexing

Also, you should upgrade to 1.3.2, which fixes a rare but nasty corruption
issue in the upstream compression library ES embeds.  And make sure you use
at least JDK 1.7.0_55.

Mike McCandless

http://blog.mikemccandless.com





Indexing is becoming slow, what to look for?

2014-09-05 Thread Thomas
Hi,

I have been performing indexing operations in my elasticsearch cluster for 
some time now. Suddenly, I have been facing some latency while indexing and 
I'm trying to find the reason for it. 

Details:

I have a custom process which uploads a number of logs at every interval
with the bulk API. This process was taking about 5-7 minutes every time. For
some reason, over the last few days I noticed that the exact same procedure,
with the same volumes, takes about 15-20 minutes. While manipulating the data
I run update operations through scripting (groovy). My cluster is a set of 5
nodes. My first impression was that I needed to scale, so I added an
extra node. The problem seemed to be solved, but after a day I faced the
same issue again.

Could you give me some ideas about what to check, or what the issue might
be? How can I check whether a background process (expunge etc.) is running
or causing problems? Has anyone had similar problems?

Any help appreciated, let me know what info to share

ES version is 1.3.1
JDK is 1.7

Thanks
