Hello

I have been struggling off and on for months now to run a heavily-updated 
ES cluster on as little hardware as possible.  There's more history here 
<http://elasticsearch-users.115913.n3.nabble.com/Sustainable-way-to-regularly-purge-deleted-docs-td4062259.html>, 
but the short version is that my use case involves lots of frequent 
updates, which leads to a growing percentage of deleted docs and thus more 
RAM needed to handle the same dataset.  I'd like to run a possible change 
in index structure by the group for recommendations.

Ultimately the growing delete percentage seems to be tied to both the size 
of the shards in question and a hard-coded ratio in Lucene's 
TieredMergePolicy.  As far as I can tell, the merge code will not consider 
a segment for merging until the non-deleted portion of the segment is 
smaller than half of the max_merged_segment parameter (someone please 
correct me if I'm wrong here; I wish this had been easier to discover). 
This ratio does not appear to be configurable in any version of Lucene 
I've looked at.  It explains why bumping max_merged_segment stupidly high 
can keep the deleted percentage low: with a higher ceiling, segments 
become merge candidates at a lower deleted percentage.  It also explains 
why my shards grew to ~45% deleted docs.
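If I have the math right, that ratio makes the steady state easy to see: a 
segment that has already grown to the max_merged_segment ceiling (whatever 
it is set to) is not reconsidered for merging until its live data drops 
below half the ceiling, i.e. until roughly 50% of it is deleted docs.  A 
shard made up largely of max-sized segments therefore parks just under 50% 
deletes, which lines up with the ~45% I was seeing.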

To work around this, I'm considering reindexing with 3 or 4 times the 
number of shards on the same number of nodes, shrinking each shard to a 
more manageable size, but I want to make sure that the overhead from the 
additional shards won't outweigh the benefit of smaller shards.  Note that 
I AM using routing, which I assume should absorb much of the over-sharding 
overhead.  A rough sketch of what I mean is below.
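For what it's worth, here's roughly the shape of what I'm picturing.  This 
is just a sketch using the Python client; the index name, shard count, and 
routing value are placeholders, and the actual reindex would be done by my 
existing loaders:

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])

# Hypothetical replacement index: 4x the current shard count on the same
# nodes, aiming for 5-10 GB per shard instead of 20-30 GB.
es.indices.create(
    index="myindex_v2",
    body={
        "settings": {
            "number_of_shards": 24,
            "number_of_replicas": 1,
        }
    },
)

# Docs keep the same routing values they have today, so each one still
# lands on a single, predictable shard and queries stay narrow.
es.index(
    index="myindex_v2",
    doc_type="doc",
    id="12345",
    routing="some-routing-key",
    body={"field": "value"},
)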

The index in question currently has six shards with one replica each. 
Shard sizes range from 20-30 GB.  This is running on six r3.xlarge 
instances (30.5 GB RAM, 20 GB allocated to ES).  Ideally I'd be running on 
no more than four, and with a low enough delete percentage I think I could 
even run on three.

My theory is that if I cap shard size at 5-10 GB, I can set 
max_merged_segment high enough relative to the shard that deleted docs get 
merged out quickly, while any individual merge stays small enough to 
handle without issue.  With the existing setup, even on six nodes, I 
occasionally run into merge memory problems at large max_merged_segment 
sizes (13 GB currently, I think).
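Concretely, the knob I'd be turning is just this (again a sketch; I 
believe the merge policy settings are dynamic, and the exact value is a 
placeholder that would need tuning):

# Raise the merge ceiling on the (hypothetical) new, smaller-sharded index.
# The idea: with shards capped around 5-10 GB, a ceiling in the same range
# keeps nearly every segment's live data under half of max_merged_segment,
# so segments carrying deletes stay merge candidates, while the largest
# possible merge is bounded by the shard itself rather than by a 13 GB
# segment.
es.indices.put_settings(
    index="myindex_v2",
    body={"index.merge.policy.max_merged_segment": "10gb"},
)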

Does this seem reasonable?  I'd be looking at 36-48 shards, counting 
replicas, spread across 3-4 nodes, so 12-18 per node.  Thoughts?

