Right, but even if that worked, you'd then get docs being assigned to the wrong shard. The shard assignment would be something like (hash(id)/3), so it depends on the number of shards. Change that number and the same ID can map to a different shard. A document currently on shard 0 might be indexed next time on shard 2, leaving two "live" docs in your system with the same ID. Bad Things would happen then...
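To make that concrete, here is a rough sketch of how many documents change shards when the shard count goes from 3 to 4. It assumes plain modulo assignment over 12 sequential integer IDs, which is a simplification for illustration and not necessarily how Solr routes documents. The helper names (shard_for, layout, unmoved) are made up for this example.

```python
from collections import defaultdict


def shard_for(doc_id: int, num_shards: int) -> int:
    """Return a 1-based shard number using plain modulo assignment (an assumption)."""
    return (doc_id - 1) % num_shards + 1


def layout(doc_ids, num_shards):
    """Group document IDs by the shard they would land on."""
    shards = defaultdict(list)
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return dict(shards)


def unmoved(doc_ids, old_count, new_count):
    """IDs whose shard is the same before and after the shard count changes."""
    return [d for d in doc_ids if shard_for(d, old_count) == shard_for(d, new_count)]


docs = list(range(1, 13))   # 12 documents with IDs 1..12
before = layout(docs, 3)    # layout with 3 shards
after = layout(docs, 4)     # layout after adding a fourth shard
stayed = unmoved(docs, 3, 4)

print(before)
print(after)
print(f"{len(stayed)} of {len(docs)} documents stay put: {stayed}")
```

Under this assumption only IDs 1, 2 and 3 keep their shard. Any document that is re-indexed after the change without its old copy being removed would end up live on two shards.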
I believe that currently your only real option is to re-index from scratch when you add more shards.

I was thinking about this at one point. Unless the guys work some magic, it will be an expensive process. Not as expensive as re-indexing for sure, but consider 12 documents in 3 shards:

shard1 - 1, 4, 7, 10
shard2 - 2, 5, 8, 11
shard3 - 3, 6, 9, 12

Now you add a shard and the docs are re-distributed:

shard1 - 1, 5, 9
shard2 - 2, 6, 10
shard3 - 3, 7, 11
shard4 - 4, 8, 12

In this simple case, only 3 out of your 12 documents stayed on the same shard! All the rest had to be moved. Then the indexes have to be distributed across all replicas, then....

Now, there won't have to be any analysis done. You won't have to reconstruct all of the documents from your system-of-record. You won't have to do a _ton_ of work that you originally had to do. This should be enormously faster than re-indexing. But it still won't be something to casually do on a live system under load <G>.....

Disclaimer: I really may be talking through my hat here, but this _sounds_ right.

FWIW
Erick

On Mon, Oct 8, 2012 at 4:33 AM, Upayavira <u...@odoko.co.uk> wrote:
> Given that Solr does not support distributed IDF, adding a shard without
> balancing the number of documents could seriously skew your scoring. If
> you are okay with that, then the next question is: what happens if you
> download the clusterstate.json from ZooKeeper, add another entry
> along the lines of "shard3":{}, and then upload it again?
>
> My theory is that the next host you start up would become the first node
> of shard3. Worth a try (unless someone more knowledgeable tells us
> otherwise!)
>
> Upayavira
>
> On Mon, Oct 8, 2012, at 01:35 AM, Radim Kolar wrote:
>> I am reading this: http://wiki.apache.org/solr/SolrCloud section
>> Re-sizing a Cluster
>>
>> Is it possible to add a shard to an existing index? I do not need to get
>> the data redistributed; it can stay where it is. It's enough for me if
>> new entries are distributed across the new number of shards. Restarting
>> Solr is fine.