@mauri, thank you for such an interesting analysis.
On 21/03/2014 1:01 PM, "Mauri" <ma...@proactive-edge.com.au> wrote:

> Hi Brad
>
> I agree with what Mark and Zachary have said and will expand on these.
>
> Firstly, shard and index level operations in ElasticSearch are
> peer-to-peer. Single-shard operations will affect at most two nodes: the node
> receiving the request and the node hosting an instance of the shard
> (primary or replica, depending on the operation). Multi-shard operations
> (such as searches) will affect from one to (N + 1) nodes, where N is the
> number of shards in the index.
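>
> A trivial back-of-envelope helper for those node counts (illustrative only):

```python
# Upper bound on nodes touched by one request, per the rule above: a
# single-shard operation hits at most the coordinating node plus one
# shard copy; a multi-shard operation hits at most one node per shard
# plus the coordinator.
def max_nodes_affected(num_shards, single_shard=False):
    return 2 if single_shard else num_shards + 1

print(max_nodes_affected(6))                     # search on a 6-shard index -> 7
print(max_nodes_affected(6, single_shard=True))  # get by id -> 2
```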
>
> So from an index/shard operation perspective there is no reason to split
> into two clusters. The key issue with index/shard operations is whether the
> cluster is able to handle the traffic volume. So if you do decide to split
> into two clusters you will need to look at the load profile for each
> of your client types to determine how much raw processing power you need in
> each cluster. It may be that a 10:20 split is more optimal than a 15:15
> split between clusters for balancing request traffic, and therefore CPU
> utilisation, across all nodes. If you go with one cluster this is not an
> issue, because you can move shards between nodes to balance the request
> traffic.
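>
> A minimal sketch of that sizing exercise (the 1:2 load ratio is made up, and real sizing should also weigh storage and heap, not just traffic):

```python
# Split a fixed node budget between two clusters in proportion to the
# request load each client type generates (illustrative only).
def split_nodes(total_nodes, load_a, load_b):
    nodes_a = round(total_nodes * load_a / (load_a + load_b))
    return nodes_a, total_nodes - nodes_a

print(split_nodes(30, 1, 2))  # client A drives 1/3 of traffic -> (10, 20)
print(split_nodes(30, 1, 1))  # equal load -> (15, 15)
```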
>
> Larger clusters also imply more work for the cluster master in managing
> the cluster. This comes down to the number of nodes that the master has to
> communicate with and manage, and the size of the cluster state. A cluster
> with 30 nodes is not too large for a master to keep track of. There will be
> an increase in network traffic associated with the increase in volume of
> master-to-worker and worker-to-master pings used to detect the
> presence/absence of nodes. This can be offset by increasing the respective
> ping intervals, i.e. pinging less frequently.
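>
> For reference, the zen fault-detection ping settings look something like the following in elasticsearch.yml (setting names as of the 1.x discovery module; the values here are illustrative, so verify both against your version's documentation):

```yaml
# Fewer pings means less background traffic, at the cost of slower
# failure detection:
discovery.zen.fd.ping_interval: 5s   # how often nodes are pinged (default 1s)
discovery.zen.fd.ping_timeout: 30s   # how long to wait for a response
discovery.zen.fd.ping_retries: 3     # failures before a node is considered gone
```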
>
> In a large cluster it is good practice to have a group of dedicated master
> nodes, say 3, from which the master is elected. These nodes do not host any
> user data meaning that cluster management is not compromised by high user
> request traffic.
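>
> On the three dedicated master nodes that would mean something like the following in elasticsearch.yml (1.x setting names), plus a quorum setting to avoid split-brain:

```yaml
node.master: true    # eligible to be elected master
node.data: false     # hosts no user data
# with 3 master-eligible nodes, require a quorum of 2 to elect a master
discovery.zen.minimum_master_nodes: 2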
>
> The size of the cluster state may be more of an issue. The cluster state
> comprises all of the information about the cluster configuration. The
> cluster state has records for each node, index, document mapping, shard,
> etc. Whenever there is a change to the cluster state it is first made by
> the master which then sends the updated cluster state to each worker node.
> Note that the entire cluster state is sent, not just the changes! It is
> therefore highly desirable to limit the frequency of changes to the
> cluster state (primarily by minimizing dynamic field mapping updates) and
> the overall size of the cluster state (primarily by minimizing the number
> of indices).
>
> In your proposed model the size of the cluster state associated with the
> set of 60 shared month indices will be larger than that of one set of 60
> dedicated month indices, by virtue of having 100 shards rather than 6.
> However, it may not be much bigger, because there will be much more metadata
> associated with defining the index structure, notably the field mappings for
> all document types in the index, than metadata defining the shards of the
> index. So it may well be that the size of the cluster state associated with
> 60 "shared" month indices plus N sets of 60 "dedicated" indices is not much
> more than that of (N + 1) sets of 60 "dedicated" indices, in which case there
> may not be much point in splitting into two clusters. A quick way to test
> this for your actual data model is to:
>   1. Set up an index in ES with mappings for all document types and 6
> shards and 0 replicas,
>   2. Retrieve the index metadata JSON using ES admin API,
>   3. Increase the number of replicas to 16 (102 shards total),
>   4. Retrieve the index metadata JSON using ES admin API,
>   5. Compare the two JSON documents from 2 and 4.
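>
> The effect you should expect from that comparison can be sketched with a toy model (the JSON shapes below are hypothetical, not the real admin API output; only the relative sizes matter):

```python
import json

# Cluster state carries one routing record per shard copy, plus the
# field mappings, which do not grow with the replica count.
def routing_table(num_shards, num_replicas):
    return [{"shard": s, "replica": r, "node": None}
            for s in range(num_shards)
            for r in range(num_replicas + 1)]

def state_size(num_shards, num_replicas, mappings):
    state = {"mappings": mappings,
             "routing": routing_table(num_shards, num_replicas)}
    return len(json.dumps(state))

# Pretend mappings for 20 document types with 50 fields each:
mappings = {"type_%d" % i: {"properties": {"f%d" % j: {"type": "string"}
                                           for j in range(50)}}
            for i in range(20)}

small = state_size(6, 0, mappings)    # 6 shard copies in total
big = state_size(6, 16, mappings)     # 102 shard copies in total
print(big / small)  # ratio stays close to 1: the mappings dominate
```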
>
> As stated above, it is desirable to minimize the number of indices. Each
> shard is a Lucene index which consumes memory and requires open file
> descriptors from the OS for segment data files and Lucene index level
> files. You may find yourself running out of memory and/or file descriptors
> if you are not careful.
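>
> A rough per-node file-descriptor budget can be sketched as follows (all figures are illustrative; the per-segment file count depends on the Lucene codec and compound-file setting, so measure on a real node rather than trusting this):

```python
# Estimate open file descriptors per node from shard and segment counts,
# assuming shard copies are spread evenly across the nodes.
def fd_estimate(indices, shards_per_index, replicas, nodes,
                segments_per_shard, files_per_segment=10):
    total_shard_copies = indices * shards_per_index * (replicas + 1)
    shards_on_node = total_shard_copies // nodes
    return shards_on_node * segments_per_shard * files_per_segment

# 60 monthly indices x 100 shards x 1 replica over 30 nodes, with
# ~30 segments per shard:
print(fd_estimate(60, 100, 1, 30, 30))  # -> 120000 open files per node
```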
>
> I understand you are looking for a design that will cater for your on-disc
> data volume. Given that your data is split into monthly indices it may well
> be that no one index, either "shared" or "dedicated", will reach that volume
> in one month. There may also be seasonal factors to consider, whereby one or
> two months have much higher volumes than others. I have read/heard about
> cases where a monthly index architecture was implemented but later scrapped
> for a single-index approach because the month-to-month variation in volume
> was detrimental to overall system resource utilisation and performance.
>
> In your case, think about whether monthly indices are really appropriate.
> An alternative model is to partition one year's worth of data into a set of
> indices bounded by size rather than time. In this model a new index is
> started on Jan 01 and data is added to it until it reaches some predefined
> size limit, at which point a new index is created to accept new data from
> that point on. This is repeated until year end, at which point you might end
> up with data for Jan/Feb/Mar and half of Apr in index 2014-01, the rest of
> Apr plus May-Oct in index 2014-02, and Nov/Dec in index 2014-03. This way
> you end up using the smallest number of indices within the constraints of
> manageable shard size and overall user data volume. This may also be a
> better approach than indices with 100 shards. It does, however, come at
> the cost of more complexity when it comes to accessing data across multiple
> indices, but that is a one-off development cost rather than an ongoing
> maintenance cost.
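>
> The rollover scheme can be sketched like this (month granularity for simplicity; the index names and the 100 GB limit are hypothetical):

```python
# Size-bounded rollover: a new index is started within the year whenever
# the current one reaches the size limit.
def assign_indices(monthly_sizes_gb, limit_gb, year=2014):
    assignments, seq, current = [], 1, 0.0
    for size in monthly_sizes_gb:
        current += size
        assignments.append("%d-%02d" % (year, seq))
        if current >= limit_gb:          # roll over to a fresh index
            seq, current = seq + 1, 0.0
    return assignments

# Uneven monthly volumes against a 100 GB per-index limit:
sizes = [40, 30, 20, 20, 15, 15, 15, 15, 15, 15, 30, 30]
print(assign_indices(sizes, 100))
# Jan-Apr land in 2014-01, May-Nov in 2014-02, Dec starts 2014-03
```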
>
> Also, rather than focusing so much on shard size look at the number and
> size of the segment files comprising a shard. Remember that Lucene segments
> are essentially immutable (except for marking deletes) after they are
> written to disc. This means that some index management operations may
> reduce down to simply copying/moving one or two segment files across the
> network, rather than all segment files for a given shard. For example, when
> allocating shards to nodes ElasticSearch tries to allocate shards to nodes
> that already have some of that shard's segment data. The new incremental
> backup/restore in ES V1 also takes advantage of this segment immutability.
> With this in mind you might be able to support shards > 5G consisting of
> more segment files of bounded size rather than fewer segment files of
> unbounded size (and ultimately a single 5G segment file).
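>
> If you want to experiment with bounding segment size, the tiered merge policy exposes a cap (setting name as of ES 1.x; the 1gb value is only an example, so check your version's documentation):

```yaml
# Segments are never merged beyond this size, so a large shard stays a
# collection of bounded, immutable files rather than one huge segment:
index.merge.policy.max_merged_segment: 1gb   # default is 5gb
```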
>
> Hope this helps,
> Regards
> Mauri
>
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/5072d2c9-a418-4afc-82e6-d2b8926d82c1%40googlegroups.com<https://groups.google.com/d/msgid/elasticsearch/5072d2c9-a418-4afc-82e6-d2b8926d82c1%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

