I won't be able to reconstruct the correct mapping, as I did not store the
mapping info anywhere. I don't know whether core_node1 was mapped to
shard1_replica1 or shard2_replica1 in my old collection. But I am not
worried about that, as I am not going to update any existing documents.
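
For what it's worth, the core_node-to-shard mapping of a collection that
still exists can be inspected with the Collections API CLUSTERSTATUS call.
A minimal sketch, with a placeholder host and the new collection's name:

curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=collection_new"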

This is what I did:

I created a new collection with the same schema and the same config.
Shut the SolrCloud down.
Then I copied the data directory:


hadoop fs -cp hdfs://prod/solr53/collection_old/* \
    hdfs://prod/solr53/collection_new/
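
For reference, the new collection was created with a Collections API call
along these lines; the host and config name here are placeholders, and the
shard settings match the numbers from my earlier message below:

curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection_new&numShards=20&replicationFactor=1&maxShardsPerNode=1&collection.configName=myconf"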


Restarted the SolrCloud, and I could see documents in the Solr UI when I
queried using the "/select" handler.
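
A quick sanity check is a match-all query that just returns the document
count; a sketch with a placeholder host:

curl "http://localhost:8983/solr/collection_new/select?q=*:*&rows=0"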


Thanks!



On Mon, Nov 7, 2016 at 2:59 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> You've got it. You should be quite safe if you
> 1> create the same number of shards as you used to have
> 2> match the shard bits, i.e. collection1_shard1_replica1; as long as
> the collection1_shard# parts match, you should be fine. If this isn't
> done correctly, the symptom will be that when you update an existing
> document, you may eventually have two copies returned.
>
> Best,
> Erick
>
> On Mon, Nov 7, 2016 at 1:47 PM, Chetas Joshi <chetas.jo...@gmail.com>
> wrote:
> > Thanks Erick.
> >
> > I had replicationFactor=1 in my old collection and am going to have the
> > same config for the new collection.
> > When I create a new collection with number of shards = 20 and max shards
> > per node = 1, the shards are going to start on 20 hosts out of my 25-host
> > Solr cluster. When you say "get each shard's index to the corresponding
> > shard on your new collection", do you mean the following?
> >
> > shard1_replica1 -> core_node1 (old collection)
> > shard1_replica1 -> has to be core_node1 (new collection) (I don't have
> > this mapping for the old collection as the collection no longer exists!!)
> >
> > Thanks,
> > Chetas.
> >
> > On Mon, Nov 7, 2016 at 1:03 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> That should work. The caveat here is that you need to get each
> >> shard's index to the corresponding shard on your new collection.
> >>
> >> Of course I'd back up _all_ of these indexes before even starting.
> >>
> >> And one other trick. First create your collection with 1 replica per
> >> shard (leader-only). Then copy the indexes (and, btw, I'd have the
> >> associated Solr nodes down during the copy) and verify the collection
> >> is as you'd expect.
> >>
> >> Now use ADDREPLICA to expand your collection; that'll handle the
> >> copying from the leader correctly.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Nov 7, 2016 at 12:49 PM, Chetas Joshi <chetas.jo...@gmail.com>
> >> wrote:
> >> > I have a SolrCloud cluster deployed on top of HDFS.
> >> >
> >> > I accidentally deleted a collection using the collection API, so the
> >> > ZooKeeper cluster has lost all the info related to that collection. I
> >> > don't have a backup that I can restore from. However, I have indices
> >> > and transaction logs on HDFS.
> >> >
> >> > If I create a new collection and copy the existing data directory to
> >> > the data directory path of the new collection I have created, will I
> >> > be able to go back to the state where I was? Is there anything else I
> >> > would have to do?
> >> >
> >> > Thanks,
> >> >
> >> > Chetas.
> >>
>
