Re: Copying data

Erick Erickson Mon, 16 Mar 2020 05:24:21 -0700

It’s not at all clear what the problem is. If you have a single-shard 
collection, just 
1> create the stand-alone core 
2> shut down the Solr instance
3> replace the stand-alone core's data dir with one from any of your prod 
machines. 
4> start Solr
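
Step 3 can be scripted if you do this regularly. Below is a minimal Python sketch, assuming Solr is already stopped and that a prod data dir has been copied onto the dev box. The function name and both paths are hypothetical and need to match your own layout.

```python
import shutil
from pathlib import Path


def replace_data_dir(core_data_dir, prod_data_copy):
    """Swap a stopped core's data dir for a copy taken from a prod replica.

    Both arguments are hypothetical paths; Solr must not be running.
    Returns the path the old data dir was moved to, or None if there was none.
    """
    core_data = Path(core_data_dir)
    backup = None
    if core_data.exists():
        # Keep the old index around instead of deleting it outright.
        backup = core_data.with_name(core_data.name + ".bak")
        if backup.exists():
            shutil.rmtree(backup)
        core_data.rename(backup)
    # Copy the whole data dir (index/, and tlog/ if present) into place.
    shutil.copytree(prod_data_copy, core_data)
    return backup
```

Once Solr is back up and the core looks right, the ".bak" directory can be removed.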

An alternative is to use the replication API to replace the index on your 
stand-alone core with one from one of the prod machines, see: 
https://lucene.apache.org/solr/guide/7_7/index-replication.html. You have to 
specify the masterUrl parameter and shouldn’t need to change the configuration.
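
As a rough illustration, the request could be built like this in Python. The hosts, port, and core names are placeholders, and the sketch assumes the fetchindex command with a masterUrl parameter as described in the 7.x guide page above; check it against your exact version.

```python
from urllib.parse import urlencode


def fetchindex_url(dev_solr, dev_core, prod_solr, prod_core):
    """Build a one-off fetchindex request URL for the dev core."""
    # The replication handler of the prod core acts as the source ("master").
    master_url = f"{prod_solr}/{prod_core}/replication"
    query = urlencode({"command": "fetchindex", "masterUrl": master_url})
    return f"{dev_solr}/{dev_core}/replication?{query}"


# Placeholder hosts and core names, not taken from the original question.
url = fetchindex_url(
    "http://localhost:8983/solr", "devcore",
    "http://prod1:8983/solr", "mycoll_shard1_replica_n1",
)
print(url)
```

Requesting that URL (with curl or a browser) asks the dev core to pull the index from the prod core.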

But assuming you have 3 shards: 

First, it’s easy enough to create a three-shard collection on your dev machine, 
either using embedded ZK or a separate ZK instance on the dev machine, so 
that’s one option. The advantage there is it’s the same environment. To do 
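that, just create the 3-shard collection and copy each shard's data dir over 
as above.

Alternatively, to end up with a single stand-alone core, you can use the core 
admin API MERGEINDEXES command. What you’ll do is: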

1> create your core on your dev machine
2> copy one of the data dirs from one of the prod machines to the data dir of 
your new core.
3> copy the other two data dirs somewhere on the dev machine
4> use MERGEINDEXES, see: 
https://lucene.apache.org/solr/guide/7_4/coreadmin-api.html
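
A minimal Python sketch of the MERGEINDEXES call in step 4 follows. The host, core name, and index paths are placeholders (the paths are assumed to be where the two extra data dirs were copied), and the indexDir parameter is repeated once per index to merge in, as the CoreAdmin guide describes.

```python
from urllib.parse import urlencode


def mergeindexes_url(solr, target_core, index_dirs):
    """Build a CoreAdmin MERGEINDEXES URL that merges index_dirs into target_core."""
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    # indexDir may be repeated, once per index directory to merge in.
    params += [("indexDir", d) for d in index_dirs]
    return f"{solr}/admin/cores?{urlencode(params)}"


# Placeholder host, core name, and paths; adjust to your own setup.
merge_url = mergeindexes_url(
    "http://localhost:8983/solr",
    "devcore",
    ["/tmp/shard2/data/index", "/tmp/shard3/data/index"],
)
print(merge_url)
```

A commit on the target core is typically needed afterwards before the merged documents show up in searches.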

Best,
Erick

> On Mar 16, 2020, at 12:32 AM, Jayadevan Maymala <jayade...@ftltechsys.com> 
> wrote:
> 
> Hi all,
> 
> I have a 3 node Solr cluster in production (with zoo keeper). In dev, I
> have one node Solr instance, no zoo keeper. Which is the best way to copy
> over the production solr data to dev?
> Operating system is CentOS 7.7, Solr Version 7.3
> Collection size is in the 40-50 GB range.
> 
> Regards,
> Jayadevan
