But again, with a master/slave setup, merging should
be relatively benign. And at 200M docs, an M/S
setup is probably warranted.

Here's a good writeup on merge policy internals:
http://juanggrande.wordpress.com/2011/02/07/merge-policy-internals/

If you're indexing and searching on a single machine, merging
is much less important than how often you commit. In an M/S
situation, your polling interval on the slave is what matters.

I'd look at commit frequency long before I worried about merging.
That's usually where people shoot themselves in the foot: by
committing too often.
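
To put rough numbers on that, here is a small arithmetic sketch using the
300k docs/day figure from the original question quoted below. The commit
intervals are arbitrary values picked for illustration, not recommendations,
and the function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def commit_profile(docs_per_day, commit_interval_s):
    """Return (commits per day, average docs per commit) for a fixed
    commit interval, assuming documents arrive evenly through the day."""
    commits_per_day = SECONDS_PER_DAY // commit_interval_s
    avg_docs_per_commit = docs_per_day / commits_per_day
    return commits_per_day, avg_docs_per_commit


if __name__ == "__main__":
    # Illustrative intervals only: 1 second, 1 minute, 5 minutes.
    for interval_s in (1, 60, 300):
        commits, docs = commit_profile(300_000, interval_s)
        print(f"every {interval_s:>3}s: {commits:>6} commits/day, "
              f"~{docs:.1f} docs per commit")
```

The point is only that very short intervals mean a very large number of
commits, each covering a handful of documents.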

Overall, your mergeFactor is probably less important than other
parts of how you perform indexing/searching, but it does have
some effect for sure...
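
If you want a feel for what changing mergeFactor does before testing on
real hardware, a toy model can help. The sketch below is NOT Lucene's
actual merge policy; it assumes a simplified log-style scheme in which
every flush produces one equal-sized segment, and whenever mergeFactor
segments accumulate at a level they are merged into one segment at the
next level. All names here are hypothetical.

```python
def simulate_merges(num_flushes, merge_factor):
    """Toy log-style merge model (an assumption, not Lucene's real policy).

    Returns (merges performed, flush-sized units rewritten by merges,
    segments remaining at the end).
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    levels = [0]    # levels[i] = segments currently sitting at level i
    merges = 0
    rewritten = 0   # total data rewritten, in units of one flushed segment
    for _ in range(num_flushes):
        levels[0] += 1
        level = 0
        # Cascade: a full level merges into one segment one level up.
        while levels[level] == merge_factor:
            levels[level] = 0
            merges += 1
            rewritten += merge_factor ** (level + 1)
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return merges, rewritten, sum(levels)


if __name__ == "__main__":
    for factor in (5, 10):
        print(factor, simulate_merges(1000, factor))
```

Comparing the output for 5 and 10 shows the kind of trade-off to look for
in a real test: how much data gets rewritten by merges versus how many
segments are left for searches to visit.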

Best
Erick

On Wed, May 2, 2012 at 7:54 AM, Prakashganesh, Prabhu
<prabhu.prakashgan...@dowjones.com> wrote:
> We have a fairly large scale system - about 200 million docs and fairly high 
> indexing activity - about 300k docs per day with peak ingestion rates of 
> about 20 docs per sec. I want to work out what a good mergeFactor setting 
> would be by testing with different mergeFactor settings. I think the default 
> of 10 might be high, I want to try with 5 and compare. Unless I know when a 
> merge starts and finishes, it would be quite difficult to work out the impact 
> of changing mergeFactor. I want to be able to measure how long merges take, 
> run queries during the merge activity and see what the response times are,
> etc.
>
> Thanks
> Prabhu
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 02 May 2012 12:40
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Merge during off peak times
>
> Why do you care? Merging is generally a background process, or are
> you doing heavy indexing? In a master/slave setup,
> it's usually not really relevant except that (with 3.x), massive merges
> may temporarily stop indexing. Is that the problem?
>
> Look at the merge policies; there are configurations that make
> this less painful.
>
> In trunk, DocumentsWriterPerThread makes merges happen in the
> background, which helps the long-pause-while-indexing problem.
>
> Best
> Erick
>
> On Wed, May 2, 2012 at 7:22 AM, Prakashganesh, Prabhu
> <prabhu.prakashgan...@dowjones.com> wrote:
>> Ok, thanks Otis
>> Another question on merging:
>> What is the best way to monitor merging?
>> Is there something in the log file that I can look for?
>> It seems like I have to monitor the system resources (read/write IOPS etc.)
>> and work out when a merge happened.
>> It would be great if I could do it by looking at log files or in the admin UI.
>> Do you know if this can be done, or if there is some tool for this?
>>
>> Thanks
>> Prabhu
>>
>> -----Original Message-----
>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
>> Sent: 01 May 2012 15:12
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr Merge during off peak times
>>
>> Hi Prabhu,
>>
>> I don't think such a merge policy exists, but it would be nice to have this 
>> option and I imagine it wouldn't be hard to write if you really just base 
>> the merge or no merge decision on the time of day (and maybe day of the 
>> week).
>>
>> Note that this should go into Lucene, not Solr, so if you decide to 
>> contribute your work, please 
>> see http://wiki.apache.org/lucene-java/HowToContribute
>>
>> Otis
>> ----
>> Performance Monitoring for Solr - http://sematext.com/spm
>>
>>
>>
>>
>>>________________________________
>>> From: "Prakashganesh, Prabhu" <prabhu.prakashgan...@dowjones.com>
>>>To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
>>>Sent: Tuesday, May 1, 2012 8:45 AM
>>>Subject: Solr Merge during off peak times
>>>
>>>Hi,
>>>  I would like to know if there is a way to configure index merge policy in 
>>>solr so that the merging happens during off peak hours. Can you please let 
>>>me know if such a merge policy configuration exists?
>>>
>>>Thanks
>>>Prabhu
>>>
>>>
>>>
