Re: how soft-commit works

2013-09-17 Thread Erick Erickson
Here's a rather long blog post I wrote up that might help:

http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Best,
Erick


On Mon, Sep 16, 2013 at 1:43 PM, Shawn Heisey s...@elyograg.org wrote:

 On 9/16/2013 7:01 AM, Matteo Grolla wrote:
  Can anyone explain to me the following things about soft-commit?
  -For searches to access new documents I think a new searcher is opened
 after a soft commit.
How does the near realtime requirement for soft commit match with
 the potentially long time taken to warm up caches for the new searcher?
  -Is it a good idea to set
openSearcher=false in auto commit
and rely on soft auto commit to see new data in searches?

 That is a very common way to configure installs that require NRT
 updates.

 NRTCachingDirectoryFactory, which is the directory class used in the
 example since 4.0, is a wrapper around MMapDirectoryFactory, which is
 the old default in 3.x.

 For soft commits, the NRT directory keeps small commits in RAM rather
 than writing them to disk, which makes the process of opening a new
 searcher happen a lot faster.


 http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/store/NRTCachingDirectory.html

 If your index rate is very fast or you index large amounts of data, the
 NRT directory doesn't gain you much over MMap, but because we made it
 the default in the example, it probably doesn't have any performance
 detriment.

 Thanks,
 Shawn




Re: how soft-commit works

2013-09-16 Thread Shawn Heisey
On 9/16/2013 7:01 AM, Matteo Grolla wrote:
 Can anyone explain to me the following things about soft-commit?
 -For searches to access new documents I think a new searcher is opened after a 
 soft commit.
   How does the near realtime requirement for soft commit match with the 
 potentially long time taken to warm up caches for the new searcher?
 -Is it a good idea to set 
   openSearcher=false in auto commit 
   and rely on soft auto commit to see new data in searches?

That is a very common way to configure installs that require NRT
updates.
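
A minimal sketch of that setup in solrconfig.xml follows. The element
names are the standard Solr 4.x update handler settings; the interval
values are illustrative assumptions, not recommendations, and should be
tuned to your indexing load.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes index changes to stable storage,
       but does not open a new searcher. -->
  <autoCommit>
    <maxTime>15000</maxTime>          <!-- assumed value: 15 seconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: opens a new searcher so that recently indexed
       documents become visible to searches. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>           <!-- assumed value: 1 second -->
  </autoSoftCommit>
</updateHandler>
```

With this split, how quickly new documents show up in searches depends
on the soft commit interval, while durability depends on the hard
commit interval.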

NRTCachingDirectoryFactory, which is the directory class used in the
example since 4.0, is a wrapper around MMapDirectoryFactory, which is
the old default in 3.x.

For soft commits, the NRT directory keeps small commits in RAM rather
than writing them to disk, which makes the process of opening a new
searcher happen a lot faster.
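
For reference, the directory implementation is selected by the
directoryFactory element in solrconfig.xml. The fragment below is a
sketch based on the stock 4.x example config; the solr.directoryFactory
system property is an optional override, so check this against your own
solrconfig.xml.

```xml
<!-- Uses NRTCachingDirectoryFactory unless the solr.directoryFactory
     system property names a different factory. -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
```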

http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/store/NRTCachingDirectory.html

If your index rate is very fast or you index large amounts of data, the
NRT directory doesn't gain you much over MMap, but because we made it
the default in the example, it probably doesn't have any performance
detriment.

Thanks,
Shawn