You do not want to add a new shard. First, you want your docs evenly spread. Second, they are spread using hash ranges, so to add more capacity you subdivide those hash ranges using shard splitting. "Adding" a new shard doesn't really make sense here, unless you go for implicit routing, where you decide for yourself which shard a doc goes into, but it seems too late to make that decision in your case.
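To illustrate why splitting rather than "adding" is the natural operation, here is a minimal sketch of dividing one shard's hash range into contiguous sub-ranges. It assumes a signed 32-bit hash space divided between two shards; the function name split_hash_range and the example ranges are made up for illustration, and Solr's own choice of boundaries may differ.

```python
def split_hash_range(lo, hi, parts=2):
    """Divide the inclusive range [lo, hi] into `parts` contiguous sub-ranges.

    Illustrative only: not Solr's implementation.
    """
    size = hi - lo + 1
    if parts < 1 or parts > size:
        raise ValueError("parts must be between 1 and the size of the range")
    step, extra = divmod(size, parts)
    ranges = []
    start = lo
    for i in range(parts):
        # The first `extra` sub-ranges absorb one leftover value each.
        end = start + step - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges


# Assumed layout: two shards, each owning half of a signed 32-bit hash space.
shards = {"shard1": (-2**31, -1), "shard2": (0, 2**31 - 1)}

# Splitting shard1 leaves shard2's range untouched.
shard1_0, shard1_1 = split_hash_range(*shards["shard1"])
```

Every hash value still maps to exactly one shard after the split, which is why existing docs can be redistributed without changing how new docs are routed.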
Upayavira

On Sun, Aug 2, 2015, at 12:40 AM, Nagasharath wrote:
> Yes, shard splitting will only help in managing large clusters and to
> improve query performance. In my case as index size is fully grown (no
> capacity to hold in the existing shards) across the collection adding a
> new shard will help and for which I have to re index.
>
> > On 01-Aug-2015, at 6:34 pm, Upayavira <u...@odoko.co.uk> wrote:
> >
> > Erm, that doesn't seem to make sense. Seems like you are talking about
> > *merging* shards.
> >
> > Say you had two shards, 3m docs each:
> >
> > shard1: 3m docs
> > shard2: 3m docs
> >
> > If you split shard1, you would have:
> >
> > shard1_0: 1.5m docs
> > shard1_1: 1.5m docs
> > shard2: 3m docs
> >
> > You could, of course, then split shard2. You could also split shard1
> > into three parts instead, if you preferred:
> >
> > shard1_0: 1m docs
> > shard1_1: 1m docs
> > shard1_2: 1m docs
> > shard2: 3m docs
> >
> > Upayavira
> >
> >> On Sun, Aug 2, 2015, at 12:25 AM, Nagasharath wrote:
> >> If my current shard is holding 3 million documents will the new subshard
> >> after splitting also be able to hold 3 million documents?
> >> If that is the case After shard splitting the sub shards should hold 6
> >> million documents if a shard is split in to two. Am I right?
> >>
> >>> On 01-Aug-2015, at 5:43 pm, Upayavira <u...@odoko.co.uk> wrote:
> >>>
> >>>> On Sat, Aug 1, 2015, at 11:29 PM, naga sharathrayapati wrote:
> >>>> I am using solrj to index documents
> >>>>
> >>>> i agree with you regarding the index update but i should not see any
> >>>> deleted documents as it is a fresh index. Can we actually identify
> >>>> what are those deleted documents?
> >>>
> >>> If you post doc 1234, then you post doc 1234 a second time, you will see
> >>> a deletion in your index. If you don't want deletions to show in your
> >>> index, be sure NEVER to update a document, only add new ones with
> >>> absolutely distinct document IDs.
> >>>
> >>> You cannot see (via Solr) which docs are deleted. You could, I suppose,
> >>> introspect the Lucene index, but that would most definitely be an expert
> >>> task.
> >>>
> >>>> if there is no option of adding shards to existing collection i do not
> >>>> like the idea of re indexing the whole data (worth hours) and we have
> >>>> gone with good number of shards but there is a rapid increase of size
> >>>> in data over the past few days, do you think is it worth logging a
> >>>> ticket?
> >>>
> >>> You can split a shard. See the collections API:
> >>>
> >>> https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3
> >>>
> >>> What would you want to log a ticket for? I'm not sure that there's
> >>> anything that would require that.
> >>>
> >>> Upayavira
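
For reference, the shard split mentioned above is requested through the Collections API with action=SPLITSHARD. The following is a minimal sketch that only builds the request URL and does not send it; the host, port, and collection name are placeholder assumptions, and split_shard_url is a made-up helper name.

```python
from urllib.parse import urlencode


def split_shard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD request URL (does not send it)."""
    params = {
        "action": "SPLITSHARD",
        "collection": collection,  # name of the collection to operate on
        "shard": shard,            # name of the shard to split, e.g. "shard1"
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Placeholder values: adjust the base URL and collection name for your cluster.
url = split_shard_url("http://localhost:8983/solr", "mycollection", "shard1")
```

Issuing a GET against the resulting URL (from a browser, curl, or an HTTP client) would ask Solr to split shard1 into sub-shards such as shard1_0 and shard1_1, as in the example earlier in the thread.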