My first question is: why does this matter? Is this curiosity, or is there a
real performance issue you're tracking down?

I don't quite understand when you say "machine A forwards...to machine B".
Are you talking about replication here? Or SolrCloud? Details matter, a lot....
DIH has nothing that I know of that forwards anything anywhere, so there
must be something you're not telling us about the setup....

But the first thing I'd check is what the solrconfig.xml values are for
committing on both machines. Are they identical?
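
For reference, here's a minimal sketch of the sections I'd compare, assuming a
Solr 4.x-style solrconfig.xml; the autoCommit numbers below are purely
illustrative, not a recommendation:

<indexConfig>
  <!-- the RAM budget is shared across all active indexing threads (DWPTs);
       when it fills, the largest writer is flushed to a segment.
       maxBufferedDocs, by contrast, applies per writer thread. -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
  <maxBufferedDocs>100000</maxBufferedDocs>
</indexConfig>

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>  <!-- illustrative -->
    <maxTime>60000</maxTime>  <!-- illustrative, in ms -->
  </autoCommit>
</updateHandler>

And a quick sanity check on the numbers in your infostream: if the 64MB RAM
buffer is what triggered both flushes, ~71,000 docs per flush on machine A
works out to roughly 1KB of buffered RAM per doc, while ~4,300 docs per flush
on machine B is closer to 15KB per doc. But note the thread names in your
logs: machine A flushes from a single DIH thread (Thread-39), while machine B
flushes from HTTP worker threads, so several concurrent writers on B splitting
the same 64MB budget could also explain the smaller per-segment flushes.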

Best
Erick

On Fri, Oct 19, 2012 at 12:53 AM, Jun Wang <wangjun...@gmail.com> wrote:
> Hi
>
> I have 2 machines for a collection, and I'm using DIH to import data. DIH
> is triggered via a URL request on one machine, let's call it A, and A
> forwards some of the index to machine B. Recently I have found that segment
> flushes happen more often on machine B. Here is part of INFOSTREAM.txt.
>
> Machine A:
> ----------------------------
> DWPT 0 [Thu Oct 18 20:06:20 PDT 2012; Thread-39]: flush postings as segment
> _4r3 numDocs=71616
> DWPT 0 [Thu Oct 18 20:06:21 PDT 2012; Thread-39]: new segment has 0 deleted
> docs
> DWPT 0 [Thu Oct 18 20:06:21 PDT 2012; Thread-39]: new segment has no
> vectors; no norms; no docValues; prox; freqs
> DWPT 0 [Thu Oct 18 20:06:21 PDT 2012; Thread-39]:
> flushedFiles=[_4r3_Lucene40_0.prx, _4r3.fdt, _4r3.fdx, _4r3.fnm,
> _4r3_Lucene40_0.tip, _4r3_Lucene40_0.tim, _4r3_Lucene40_0.frq]
> DWPT 0 [Thu Oct 18 20:06:21 PDT 2012; Thread-39]: flushed codec=Lucene40
> D
>
> Machine B
> ----------------------------------
> DWPT 0 [Thu Oct 18 21:41:22 PDT 2012; http-0.0.0.0-8080-3]: flush postings
> as segment _zi0 numDocs=4302
> DWPT 0 [Thu Oct 18 21:41:22 PDT 2012; http-0.0.0.0-8080-3]: new segment has
> 0 deleted docs
> DWPT 0 [Thu Oct 18 21:41:22 PDT 2012; http-0.0.0.0-8080-3]: new segment has
> no vectors; no norms; no docValues; prox; freqs
> DWPT 0 [Thu Oct 18 21:41:22 PDT 2012; http-0.0.0.0-8080-3]:
> flushedFiles=[_zi0_Lucene40_0.prx, _zi0.fdx, _zi0_Lucene40_0.tim, _zi0.fdt,
> _zi0.fnm, _zi0_Lucene40_0.frq, _zi0_Lucene40_0.tip]
> DWPT 0 [Thu Oct 18 21:41:22 PDT 2012; http-0.0.0.0-8080-3]: flushed
> codec=Lucene40
> D
>
> I have found that a flush occurred when the number of docs in RAM reached
> 70000~90000 on machine A, but the number on machine B is very different,
> almost always around 4000. It seems that every doc in the buffer used more
> RAM on machine B than on machine A, resulting in more flushes. Does anyone
> know why this happened?
>
> My conf is here.
>
> <ramBufferSizeMB>64</ramBufferSizeMB>
> <maxBufferedDocs>100000</maxBufferedDocs>
>
> --
> from Jun Wang
