Time Date: Giant Index w/Shard Routing VS Small Indices w/Little Shards and Aliasing

2014-05-13 Thread webish
I am attempting to optimize time-based data such as that of a newsfeed. I've been running tests with data broken into indices by month, week, and day. I'm using aliases to query the entire set or smaller ranges such as last-month and last-quarter. I'm still trying to figure out what will be
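
For illustration, the month/week/day split described above could be expressed as a naming scheme that maps a document's date to the index holding it. This is only a sketch: the `newsfeed` prefix, the name patterns, and the `index_names` helper are assumptions made for the example, not names taken from the thread.

```python
from datetime import date


def index_names(day: date, prefix: str = "newsfeed") -> dict:
    """Return the monthly, weekly, and daily index names that would hold `day`.

    The prefix and the name patterns are hypothetical; any consistent,
    sortable pattern would work the same way.
    """
    # ISO week numbering keeps the week name stable across year boundaries.
    iso_year, iso_week, _ = day.isocalendar()
    return {
        "month": f"{prefix}-{day:%Y.%m}",
        "week": f"{prefix}-{iso_year}.w{iso_week:02d}",
        "day": f"{prefix}-{day:%Y.%m.%d}",
    }


names = index_names(date(2014, 5, 13))
print(names["month"])  # newsfeed-2014.05
print(names["week"])   # newsfeed-2014.w20
print(names["day"])    # newsfeed-2014.05.13
```

Whichever granularity is chosen, the index name is derived from the document's timestamp at write time, so no existing index has to be reshuffled when a new period starts.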

Re: Time Date: Giant Index w/Shard Routing VS Small Indices w/Little Shards and Aliasing

2014-05-13 Thread Mark Walkom
Sharding is good when you have multiple nodes: each node then holds a small number of shards that can be queried in parallel, rather than one (or a few) queried sequentially. However, you will get similar results by having many smaller indexes across multiple nodes. The key thing between the

Re: Time Date: Giant Index w/Shard Routing VS Small Indices w/Little Shards and Aliasing

2014-05-13 Thread webish
Ok. Makes sense. I'd like to set up an indexing strategy for time data that will hold for some time without needing to reshuffle everything. An advantage I've found with small indices and shards is that the total number of shards is not fixed up front. Aliasing strategies have more power than basic
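
As a rough sketch of the aliasing approach discussed in this thread, the snippet below builds the JSON body for an Elasticsearch `_aliases` request that points a range alias at the most recent daily indices. It only constructs the payload and sends nothing. The `newsfeed` prefix, the daily name pattern, the `last-week` alias name, and the `alias_actions` helper are all assumptions for the example.

```python
import json
from datetime import date, timedelta


def alias_actions(alias: str, today: date, days: int, prefix: str = "newsfeed") -> dict:
    """Build an `_aliases` request body that adds `alias` to the last `days` daily indices.

    Index names follow a hypothetical `<prefix>-YYYY.MM.DD` pattern.
    """
    actions = []
    for offset in range(days):
        d = today - timedelta(days=offset)
        actions.append({"add": {"index": f"{prefix}-{d:%Y.%m.%d}", "alias": alias}})
    return {"actions": actions}


body = alias_actions("last-week", date(2014, 5, 13), 7)
print(json.dumps(body["actions"][0]))
```

The resulting body would be sent as a POST to the cluster's `_aliases` endpoint. To keep the window rolling, a matching `remove` action for the index that just fell out of range would need to go into the same request, and every index named in an `add` action has to exist already.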