Hi,

You don't need to optimize just based on segment counts. Solr doesn't
optimize automatically because often it doesn't improve things enough to
justify the computational cost of optimizing. You shouldn't optimize unless
you do a benchmark and discover that optimizing improves performance.

If you're just worried about the segment count, you can tune the merge
settings (such as mergeFactor) in solrconfig.xml, and Solr will merge
segments down on the fly as it indexes.
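
As a rough sketch of what that tuning could look like in a Solr 4.x
solrconfig.xml (the value 5 is only an illustrative assumption, not a
recommendation; measure before and after changing it):

```xml
<indexConfig>
  <!-- Assumed example value: lower than the default of 10, so fewer
       segments accumulate before a merge is triggered, at the cost of
       more merge work while indexing. -->
  <mergeFactor>5</mergeFactor>
</indexConfig>
```

A lower value generally means fewer segments but more merge I/O during
indexing, so it is worth benchmarking both indexing and query times.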

Michael Della Bitta

Applications Developer

o: +1 646 532 3062

appinions inc.


On Tue, Jun 24, 2014 at 8:32 AM, RadhaJayalakshmi <
rlakshminaraya...@inautix.co.in> wrote:

> I am using Solr 4.5.1. I have two collections:
>   Collection 1 - 2 shards, 3 replicas (Shard 1: 115 MB, Shard 2: 55 MB)
>   Collection 2 - 2 shards, 3 replicas (Shard 1: 3.5 GB, Shard 2: 1 GB)
>
> I have a batch process that performs indexing (full refresh) - once a week
> on the same index.
>
> Here is some information on how I index:
> a) I use SolrJ's bulk ADD API for indexing -
> CloudSolrServer.add(Collection<SolrInputDocument> docs).
> b) I have the following autoCommit (hard commit) setting for both my
> collections (solrconfig.xml):
>   <autoCommit>
>     <maxDocs>100000</maxDocs>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
> c) I do a programmatic hard commit at the end of the indexing cycle, with
> openSearcher set to "true", so that the documents show up in the search
> results.
> d) I neither soft commit programmatically nor have any autoSoftCommit
> settings during the batch indexing process.
> e) When I re-index all my data again (the following week) into the same
> index - I don't delete existing docs. Rather, I just re-index into the same
> Collection.
> f) I am using the default mergeFactor of 10 in my solrconfig.xml:
>   <mergeFactor>10</mergeFactor>
>
> Here is what I am observing:
> 1) After a batch indexing cycle, the segment counts for each shard / core
> are pretty high. The Solr Dashboard reports between 8 and 30 segments on
> the various cores.
> 2) Sometimes the Solr Dashboard shows the status of my core as NOT
> OPTIMIZED. I find this unusual, since I have just finished a batch
> indexing cycle and would assume that the index should already be
> optimized. Is this happening because I don't delete my docs before
> re-indexing all my data?
> 3) After I run an optimize on my collections, the segment count drops
> significantly, to 1 segment.
>
> Am I indexing the right way? Is there a better strategy?
>
> Is it necessary to perform an optimize after every batch indexing cycle?
>
> The outcome I am looking for is that I need an optimized index after every
> major Batch Indexing cycle.
>
> Thanks!!
