I won't be able to recreate the correct mapping as I did not store the
mapping info anywhere. I don't know whether core_node1 was mapped to
shard1_replica1 or shard2_replica1 in my old collection. But I am not
worried about that, as I am not going to update any existing documents.
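For next time, I plan to dump that mapping before touching anything. A
CLUSTERSTATUS call against any node should capture it; the host below is
just a placeholder for one of my cluster nodes, so treat this as a sketch
rather than the exact command I ran:

  curl 'http://solr-host:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=collection_old&wt=json' > clusterstatus_backup.json

The response lists, per shard, which core each replica maps to, which is
exactly the info I was missing here.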
This is what I did. I created a new collection with the same schema and
the same config. Shut the SolrCloud down. Then I copied the data
directory:

hadoop fs -cp hdfs://prod/solr53/collection_old/* hdfs://prod/solr53/collection_new/

Restarted the SolrCloud and I could see documents in the Solr UI when I
queried using the "/select" handler.

Thanks!

On Mon, Nov 7, 2016 at 2:59 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> You've got it. You should be quite safe if you
> 1> create the same number of shards as you used to have
> 2> match the shard bits. I.e. collection1_shard1_replica1 as long as
> the collection1_shard# parts match you should be fine. If this isn't
> done correctly, the symptom will be that when you update an existing
> document, you may have two copies returned eventually.
>
> Best,
> Erick
>
> On Mon, Nov 7, 2016 at 1:47 PM, Chetas Joshi <chetas.jo...@gmail.com> wrote:
> > Thanks Erick.
> >
> > I had replicationFactor=1 in my old collection and am going to have
> > the same config for the new collection.
> > When I create a new collection with number of shards = 20 and max
> > shards per node = 1, the shards are going to start on 20 hosts out of
> > my 25-host Solr cluster. When you say "get each shard's index to the
> > corresponding shard on your new collection", do you mean the
> > following?
> >
> > shard1_replica1 -> core_node1 (old collection)
> > shard1_replica1 -> has to be core_node1 (new collection) (I don't
> > have this mapping for the old collection as the collection no longer
> > exists!!)
> >
> > Thanks,
> > Chetas.
> >
> > On Mon, Nov 7, 2016 at 1:03 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> That should work. The caveat here is that you need to get each
> >> shard's index to the corresponding shard on your new collection.
> >>
> >> Of course I'd back up _all_ of these indexes before even starting.
> >>
> >> And one other trick. First create your collection with 1 replica per
> >> shard (leader-only). Then copy the indexes (and, btw, I'd have the
> >> associated Solr nodes down during the copy) and verify the
> >> collection is as you'd expect.
> >>
> >> Now use ADDREPLICA to expand your collection, that'll handle the
> >> copying from the leader correctly.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 7, 2016 at 12:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
> >> wrote:
> >> > I have a SolrCloud deployed on top of HDFS.
> >> >
> >> > I accidentally deleted a collection using the Collections API. So,
> >> > the ZooKeeper cluster has lost all the info related to that
> >> > collection. I don't have a backup that I can restore from.
> >> > However, I have indices and transaction logs on HDFS.
> >> >
> >> > If I create a new collection and copy the existing data directory
> >> > to the data directory path of the new collection I have created,
> >> > will I be able to go back to the state where I was? Is there
> >> > anything else I would have to do?
> >> >
> >> > Thanks,
> >> >
> >> > Chetas.
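P.S. For anyone who finds this thread later, here is the rough
end-to-end sequence as shell commands. The host name, config name, and
HDFS paths are from my setup and only illustrative, and how you stop and
start Solr depends on how your nodes are managed, so take this as a
sketch rather than a recipe:

  # 1. Create the new collection with the same config and the same shard
  #    count as the old one, leader-only (replicationFactor=1)
  curl 'http://solr-host:8983/solr/admin/collections?action=CREATE&name=collection_new&numShards=20&replicationFactor=1&maxShardsPerNode=1&collection.configName=my_config'

  # 2. Stop the Solr nodes, then copy the old index data on HDFS
  hadoop fs -cp hdfs://prod/solr53/collection_old/* hdfs://prod/solr53/collection_new/

  # 3. Restart the Solr nodes and spot-check with a /select query

  # 4. Once the data looks right, add replicas one shard at a time so
  #    the copies are pulled from the leaders
  curl 'http://solr-host:8983/solr/admin/collections?action=ADDREPLICA&collection=collection_new&shard=shard1'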