Depending on your use case, people also use collection aliasing for time-series data. See below:
https://blog.cloudera.com/blog/2013/10/collection-aliasing-near-real-time-search-for-really-big-data/

On Sat, Jul 1, 2017 at 7:13 PM, Susheel Kumar <susheel2...@gmail.com> wrote:

> As Erick said, 1M docs/month isn't a big deal. I have 45+ million docs in one
> shard, but YMMV depending on other factors.
>
> Also, there is a lot of confusion in the terminology. The default routing is
> compositeId routing. The implicit routing which Erick mentioned is the
> manual routing. https://issues.apache.org/jira/browse/SOLR-6630
>
> Which routing are you suggesting to use? Can you clarify again? Also,
> what's your exact use case? Do you query older documents, or do you not
> need to, with most or all of your queries going to the shard with the
> newer documents?
>
> Thanks,
> Susheel
>
> On Sat, Jul 1, 2017 at 12:14 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> 1M docs/month shouldn't make Solr break a sweat. If it really worries
>> you and you're indexing in a big batch, index during off hours. At
>> very worst, if you're ingesting them all at once you might have to
>> throttle the indexing a bit.
>>
>> Frankly, most of the time acquiring the documents from the system of
>> record is where the bottleneck is, and Solr easily handles the indexing
>> load.
>>
>> The other advantage is that if you use implicit routing rather than a
>> composite ID, you can add shards to your collection one at a time as
>> required. For time-series data that's an elegant way to "age out" old
>> documents.
>>
>> Best,
>> Erick
>>
>> On Sat, Jul 1, 2017 at 8:57 AM, mganeshs <mgane...@live.in> wrote:
>> > Hi Susheel,
>> >
>> > Currently we already have around 20M documents, and from now on we
>> > expect 1M documents every month.
>> > The reason we don't want to go for time-based implicit routing is that
>> > all new documents will end up in the most recent shard, so indexing
>> > will be heavy for the new shard, whereas older shards will be used
>> > just for queries.
>> > If we have default sharding, then the indexing load is distributed
>> > across all the shards. That's the reason we would like to stick to
>> > default sharding. But Join is the issue over here when default
>> > sharding is used :-(
>> >
>> > --
>> > View this message in context:
>> > http://lucene.472066.n3.nabble.com/Allow-Join-over-two-sharded-collection-tp4343443p4343803.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
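
For reference, here is a minimal sketch of how the two options discussed in this thread might map onto Solr Collections API requests: an alias spanning per-month collections (CREATEALIAS), and adding one shard at a time to an implicitly routed collection (CREATESHARD). It only builds the request URLs and sends nothing. The base URL, the alias name docs_recent, the collection names, the shard name, and the helper functions themselves are hypothetical placeholders, not anything taken from the posters' setups.

```python
from urllib.parse import urlencode

# Assumption: a Solr node reachable at this address; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collections_api_url(action, **params):
    """Build a Collections API URL. Nothing is sent over the network."""
    # safe="," keeps comma-separated lists readable in the query string.
    query = urlencode({"action": action, **params}, safe=",")
    return f"{SOLR_BASE}/admin/collections?{query}"


def monthly_alias_url(alias, collections):
    """CREATEALIAS: point one alias at a list of per-month collections."""
    return collections_api_url(
        "CREATEALIAS", name=alias, collections=",".join(collections)
    )


def add_shard_url(collection, shard):
    """CREATESHARD: add a named shard to an existing collection.

    This only applies to collections created with router.name=implicit.
    """
    return collections_api_url("CREATESHARD", collection=collection, shard=shard)


if __name__ == "__main__":
    # Hypothetical names: two monthly collections behind one query alias.
    print(monthly_alias_url("docs_recent", ["docs_2017_06", "docs_2017_07"]))
    # Hypothetical names: a new monthly shard in an implicitly routed collection.
    print(add_shard_url("docs", "2017_08"))
```

With aliasing, queries go to the alias while each month is indexed into its own collection, and re-issuing CREATEALIAS with a new list moves the query window. With implicit routing, old data can be aged out by dropping the oldest shard.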