Hi,
We have a Solr index of size 626 MB containing 141,810 documents. We have
configured an index-based spellchecker with the buildOnCommit option set to
true. The spellcheck index is 8.67 MB.
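
For reference, an index-based spellchecker with buildOnCommit is typically declared in solrconfig.xml along the lines of the sketch below. The field name "spell" and the directory "./spellchecker" are placeholders for illustration, not necessarily our actual values.

```xml
<!-- Sketch only: "spell" and "./spellchecker" are assumed placeholder values -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <!-- source field the spellcheck index is built from -->
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- rebuilds the spellcheck index on every commit, including each delta import -->
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```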

We use the DataImportHandler to build the index from scratch and to update
it periodically. A scheduled job runs a full import once a week and a delta
import every 20 minutes. The full import takes about 38 minutes and the
delta import about 12 minutes. The index also serves search queries, even
while a delta import is running. On average, only 25 to 30 documents change
between delta imports.

Is there a way to reduce the time the delta import takes to update the
index?

The system specs are:
MS Windows Server 2003 R2
Standard x64 Edition
8 GB RAM.
Solr is set up on Tomcat 6.0

The CPU utilization of tomcat.exe during a delta import is 60%.

In the data-config.xml file there are 6 root entities, one for each of 6
database tables, under the <document> element. The first root entity gets
its rows from table1, the second from table2, and so on. Each root entity
has several child entities that fetch fields from associated tables.
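
To illustrate that layout, here is a minimal sketch of one root entity with a single child entity. All table, column, and entity names, and the last_modified column, are hypothetical placeholders rather than our real schema, and the dataSource definition is omitted.

```xml
<!-- Sketch only: table, column, and entity names are placeholders -->
<dataConfig>
  <document>
    <!-- root entity 1 of 6 -->
    <entity name="table1" pk="id"
            query="SELECT id, title FROM table1"
            deltaQuery="SELECT id FROM table1
                        WHERE last_modified &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="SELECT id, title FROM table1
                              WHERE id = '${dataimporter.delta.id}'">
      <!-- child entity: its query runs once for every parent row -->
      <entity name="table1_detail"
              query="SELECT detail FROM table1_detail
                     WHERE parent_id = '${table1.id}'"/>
    </entity>
    <!-- root entities for table2 through table6 follow the same pattern -->
  </document>
</dataConfig>
```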

The mergeFactor is set to 10 and ramBufferSizeMB is set to 32. The cache
settings are as follows:

<filterCache class="solr.LRUCache" size="16384" initialSize="4096"
autowarmCount="4096"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096"
autowarmCount="4096"/>
<documentCache class="solr.LRUCache" size="16384" initialSize="16384"
autowarmCount="0"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>

Is it advisable to move to a master-slave configuration? Does an index size
of 626 MB justify changing from the existing single Solr core (which runs a
delta import every 20 minutes and also serves search queries) to a
master-slave configuration, bearing in mind that the index will keep growing
over time?
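
For context, the master-slave setup I have in mind is the standard ReplicationHandler arrangement, roughly as sketched below. The host name, port, confFiles list, and poll interval are assumed placeholders, and the two handlers would go in the solrconfig.xml of the master and the slave respectively.

```xml
<!-- Sketch only: host, port, confFiles, and pollInterval are placeholders -->

<!-- On the master (runs the imports): -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- publish a new index version after each commit -->
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- On the slave (serves the search queries): -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8080/solr/replication</str>
    <!-- how often to poll the master, as HH:mm:ss -->
    <str name="pollInterval">00:20:00</str>
  </lst>
</requestHandler>
```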

Is there any other way to improve the indexing time?

Thanks,
Gurjot


