Questions regarding re-index when using Solr as a data source

Hui Liu Thu, 09 Jun 2016 08:51:42 -0700

Hi,

We are porting an application currently hosted on Oracle 11g to SolrCloud 6.x,
i.e., we plan to migrate all Oracle tables to Solr as collections, index them,
and build search tools on top. The goal is to stop using Oracle entirely once
this is implemented. Every field in Solr will have 'stored=true', and a
selected subset of searchable fields will also have 'indexed=true'.

The question is what steps we should follow when we need to re-index a
collection after a schema change. Mostly we only add new stored fields or
change a non-indexed field to indexed; we normally do not delete or rename
existing fields.

According to https://wiki.apache.org/solr/HowToReindex, it seems we need to
set up an 'intermediate' Solr1 that only stores the data without any indexing,
and a second Solr2 that holds the indexed data. To re-index, we would delete
all documents in the Solr2 collection and re-import the data from Solr1 into
Solr2 using SolrEntityProcessor (from the DataImportHandler). Is this still
the recommended approach?

The downside I can see is that for a very large collection (some of ours could
have several billion documents), re-importing from Solr1 to Solr2 may take
hours or even days, and users cannot query the data during that time. Is there
a better way to do this that avoids this kind of downtime? Any feedback is
appreciated!
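
For concreteness, the Solr1-to-Solr2 copy step as I understand it could be
sketched in Python roughly as below. This is not SolrEntityProcessor itself
but a plain-HTTP stand-in that pages through the source with cursorMark and
posts each page to the target's /update handler. Everything named here is an
assumption for illustration only: the helper names (fetch_page, post_docs,
copy_collection), a unique key field called 'id', the base URL, and the
collection names. It also assumes every field is stored, as described above.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(base_url, collection, cursor, rows=1000):
    """Fetch one page of stored documents, sorted on the unique key.

    Assumes the unique key field is named 'id' (hypothetical).
    Returns (docs, next_cursor).
    """
    params = urllib.parse.urlencode({
        "q": "*:*",
        "sort": "id asc",      # cursorMark requires a sort on the unique key
        "rows": rows,
        "cursorMark": cursor,
        "wt": "json",
    })
    url = f"{base_url}/{collection}/select?{params}"
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    return body["response"]["docs"], body["nextCursorMark"]


def post_docs(base_url, collection, docs):
    """Send a batch of documents to the target collection as a JSON array.

    No commit is issued here; a commit is still needed afterwards.
    """
    req = urllib.request.Request(
        f"{base_url}/{collection}/update",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).close()


def copy_collection(fetch, post):
    """Copy all documents from a source to a target, page by page.

    'fetch' takes a cursor and returns (docs, next_cursor); 'post' takes a
    list of documents. Returns the number of documents copied.
    """
    cursor, total = "*", 0
    while True:
        docs, next_cursor = fetch(cursor)
        if docs:
            # Drop _version_ so the target assigns its own versions.
            post([{k: v for k, v in d.items() if k != "_version_"}
                  for d in docs])
            total += len(docs)
        if next_cursor == cursor:   # unchanged cursor means no more results
            break
        cursor = next_cursor
    return total


# Hypothetical wiring (URL and collection names are placeholders):
# solr = "http://localhost:8983/solr"
# copy_collection(lambda c: fetch_page(solr, "solr1_coll", c),
#                 lambda d: post_docs(solr, "solr2_coll", d))
```

The fetch and post steps are passed in as functions so the paging loop can be
exercised without a running Solr instance. Even in this form the copy runs
one page at a time, so it has the same long-running-import concern described
above.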


Regards,
Hui Liu
Opentext, Inc.
