Hi Erick,

<deletionPolicy class="solr.SolrDeletionPolicy">
<str name="maxCommitsToKeep">1</str>
<str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>

Because optimization takes 44 minutes, we optimize only once a day,
during the night.

I will try with a smaller index on my development system.

Best regards,
Bernd


On 20.04.2011 17:50, Erick Erickson wrote:
It looks OK, but that still doesn't explain why the old files are kept
around. What does the <deletionPolicy> in your solrconfig.xml look like?
It's possible that you're seeing Solr attempt to keep around several
optimized copies of the index, but that still doesn't explain why
restarting Solr removes them, unless the deletionPolicy gets invoked
at some point and your index files are aging out (I don't know the
internals of deletion well enough to say).

About optimization. It's become less important with recent code. Once
upon a time, it made a substantial difference in search speed. More
recently, it has very little impact on search speed, and is used
much more sparingly. Its greatest benefit is reclaiming unused resources
left over from deleted documents. So you might want to avoid the pain
of optimizing (44 minutes!) and only optimize rarely or if you have
deleted a lot of documents.

It might be worthwhile to try (with a smaller index!) a bunch of optimize
cycles and see if the <deletionPolicy> idea has any merit. I'd expect
your index to reach a maximum size and stay there once the maximum
number of saved copies of the index has been reached...

But otherwise I'm puzzled...

Erick

On Wed, Apr 20, 2011 at 10:30 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de>  wrote:
Hi Erick,

On 20.04.2011 15:42, Erick Erickson wrote:

Hmmmm, this isn't right. You've pretty much eliminated the obvious
things. What does lsof show? I'm assuming it shows the files are
being held open by your Solr instance, but it's worth checking.

I just committed new content 3 times and finally optimized.
Again, old index files are left over.

Then I checked on my master: only the newest version of the index files is
listed by lsof. There are no file handles to the old index files, but the
old index files remain in data/index/.
That's strange.
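
A minimal sketch of how the leftover files in data/index/ could be listed, assuming the file names of the newest commit are already known (for example from the SolrDeletionPolicy log output). The function name `stale_index_files` and the ignore list are hypothetical helpers for illustration, not part of Solr; the ignore list assumes that Lucene keeps `segments.gen` and `write.lock` in the directory without listing them in a commit.

```python
import os

# Files assumed to live in the index directory without being listed in a
# commit's "filenames=[...]" entry (assumption: adjust for your setup).
DEFAULT_IGNORED = frozenset({"segments.gen", "write.lock"})


def stale_index_files(index_dir, referenced, ignored=DEFAULT_IGNORED):
    """Return sorted names of files in index_dir that the commit does not reference."""
    on_disk = {
        name
        for name in os.listdir(index_dir)
        if os.path.isfile(os.path.join(index_dir, name))
    }
    # Whatever is on disk but neither referenced nor ignored is a leftover.
    return sorted(on_disk - set(referenced) - set(ignored))
```

Called with the index directory and the file list of the newest generation, it returns the names that only older commits still account for; comparing that list with the lsof output shows whether any process still holds them open.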

This time replication worked fine and cleaned up the old index on the slaves.


I'm not getting the same behavior, admittedly on a Windows box.
The only other thing I can think of is that you have a query that's
somehow never ending, but that's grasping at straws.

Do your log files show anything interesting?

Let's see:
- it has the old generation (generation=12) and its files
- and recognizes that there have been several commits (generation=18)

20.04.2011 14:05:26 org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true,expungeDeletes=false)
20.04.2011 14:05:26 org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=2
  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_c,version=1302159868435,generation=12,filenames=[_3xm.nrm, _3xm.fdx, segments_c, _3xm.fnm, _3xm.fdt, _3xm.tis, _3xm.tii, _3xm.prx, _3xm.frq]
  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_i,version=1302159868447,generation=18,filenames=[_3xm.nrm, _3xo.tis, _3xp.prx, _3xo.fnm, _3xp.fdx, _3xs.frq, _3xo.tii, _3xp.fdt, _3xn.tii, _3xm.fdx, _3xn.nrm, _3xm.fdt, _3xs.prx, _3xn.tis, _3xn.fdx, _3xr.nrm, _3xm.prx, _3xn.fdt, _3xp.tii, _3xs.nrm, _3xp.tis, _3xo.prx, segments_i, _3xm.tii, _3xq.tii, _3xs.fdx, _3xs.fdt, _3xo.frq, _3xn.prx, _3xm.tis, _3xr.prx, _3xq.tis, _3xo.fdt, _3xp.frq, _3xq.fnm, _3xo.fdx, _3xp.fnm, _3xr.tis, _3xr.fnm, _3xq.frq, _3xr.tii, _3xr.frq, _3xo.nrm, _3xs.tii, _3xq.fdx, _3xq.fdt, _3xp.nrm, _3xq.prx, _3xs.tis, _3xm.frq, _3xr.fdx, _3xm.fnm, _3xn.frq, _3xq.nrm, _3xs.fnm, _3xn.fnm, _3xr.fdt]
20.04.2011 14:05:26 org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1302159868447


- after 44 minutes of optimizing (over 140 GB and 27.8 million docs) it gets
  the SolrDeletionPolicy onCommit and has the new generation 19 listed.


20.04.2011 14:49:25 org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=3
  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_c,version=1302159868435,generation=12,filenames=[_3xm.nrm, _3xm.fdx, segments_c, _3xm.fnm, _3xm.fdt, _3xm.tis, _3xm.tii, _3xm.prx, _3xm.frq]
  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_i,version=1302159868447,generation=18,filenames=[_3xm.nrm, _3xo.tis, _3xp.prx, _3xo.fnm, _3xp.fdx, _3xs.frq, _3xo.tii, _3xp.fdt, _3xn.tii, _3xm.fdx, _3xn.nrm, _3xm.fdt, _3xs.prx, _3xn.tis, _3xn.fdx, _3xr.nrm, _3xm.prx, _3xn.fdt, _3xp.tii, _3xs.nrm, _3xp.tis, _3xo.prx, segments_i, _3xm.tii, _3xq.tii, _3xs.fdx, _3xs.fdt, _3xo.frq, _3xn.prx, _3xm.tis, _3xr.prx, _3xq.tis, _3xo.fdt, _3xp.frq, _3xq.fnm, _3xo.fdx, _3xp.fnm, _3xr.tis, _3xr.fnm, _3xq.frq, _3xr.tii, _3xr.frq, _3xo.nrm, _3xs.tii, _3xq.fdx, _3xq.fdt, _3xp.nrm, _3xq.prx, _3xs.tis, _3xm.frq, _3xr.fdx, _3xm.fnm, _3xn.frq, _3xq.nrm, _3xs.fnm, _3xn.fnm, _3xr.fdt]
  commit{dir=/srv/www/solr/solr/solrserver/solr/data/index,segFN=segments_j,version=1302159868449,generation=19,filenames=[_3xt.fnm, _3xt.nrm, _3xt.frq, _3xt.fdt, _3xt.tis, _3xt.fdx, segments_j, _3xt.prx, _3xt.tii]
20.04.2011 14:49:25 org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1302159868449
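
The log entries above can also be compared mechanically. The following is a rough sketch that assumes the `commit{...,generation=N,filenames=[...]` layout seen in this log; the names `parse_commits` and `files_only_in_older_commits` are made up for illustration and are not Solr APIs. It extracts the file list per generation and reports which files are referenced only by generations older than the newest one.

```python
import re

# Matches the "generation=N,filenames=[...]" part of one commit entry as it
# appears in the SolrDeletionPolicy log output (assumed layout).
COMMIT_RE = re.compile(r"generation=(\d+),filenames=\[([^\]]*)\]")


def parse_commits(log_text):
    """Map each generation number to the set of file names it references."""
    commits = {}
    for generation, names in COMMIT_RE.findall(log_text):
        # File names contain no whitespace, so removing it also undoes
        # line wrapping introduced by a mail client.
        cleaned = re.sub(r"\s+", "", names)
        commits[int(generation)] = set(cleaned.split(",")) if cleaned else set()
    return commits


def files_only_in_older_commits(commits):
    """Return files referenced by older generations but not by the newest one."""
    if not commits:
        return set()
    newest = max(commits)
    older = set()
    for generation, names in commits.items():
        if generation != newest:
            older |= names
    return older - commits[newest]
```

Fed with the onCommit output, this would report every file of generations 12 and 18, since generation 19 consists only of `_3xt.*` files and `segments_j`.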


- it starts a new searcher and warms it up
- it closes the old SolrIndexSearcher


20.04.2011 14:49:29 org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@2c37425f main
20.04.2011 14:49:29 org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
20.04.2011 14:49:29 org.apache.solr.search.SolrIndexSearcher warm
...
20.04.2011 14:49:29 org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@2c37425f main
20.04.2011 14:49:29 org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={facet=true&start=0&event=newSearcher&q=solr&facet.limit=100&facet.field=f_dcyear&rows=10} hits=96 status=0 QTime=816
20.04.2011 14:49:30 org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={facet=true&start=0&event=newSearcher&q=*:*&facet.limit=100&facet.field=f_dcyear&rows=10} hits=27826100 status=0 QTime=633
20.04.2011 14:49:30 org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
20.04.2011 14:49:30 org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher Searcher@2c37425f main
20.04.2011 14:49:30 org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@10faa79a main
  fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
  filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=2,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
  queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=0,cumulative_lookups=6,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=6,cumulative_evictions=0}
  documentCache{lookups=20,hits=10,hitratio=0.50,inserts=30,evictions=0,size=30,warmupTime=0,cumulative_lookups=120,cumulative_hits=60,cumulative_hitratio=0.50,cumulative_inserts=60,cumulative_evictions=0}
20.04.2011 14:49:30 org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {optimize=} 0 2644098


Actually, I can't see anything to worry about.

What is your opinion?


Best regards
Bernd


Best
Erick@NotMuchHelpIKnow

On Wed, Apr 20, 2011 at 8:37 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de>    wrote:

Hi Erick,

On 20.04.2011 13:56, Erick Erickson wrote:

Does this persist? In other words, if you just watch it for
some time, does the disk usage go back to normal?

Disk usage goes back to normal only after restarting Solr entirely.


Because it's typical that your index size will temporarily
spike after the operations you describe as new searchers
are warmed up. During that interval, both the old and new
searchers are open.

Temporarily, yes, but still a couple of hours after an optimize
or replication?


Look particularly at your warmup time in the Solr admin page,
that should give you an indication of how long it takes your
warmup to happen and give you a clue about when you should
expect the index sizes to drop again.

We have newSearcher and firstSearcher listeners (both with 2 simple queries) and
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>
The QTime is less than 500 ms (0.5 seconds).

warmupTime=0 for all autowarming searchers.


How often do you optimize on the master and replicate on the
slave? Because you may be getting into the runaway warmup
problem where a new searcher is opened before the last one
is autowarmed and spiraling out of control.

We commit new content about every hour and optimize once a day.
So replication also happens once a day, after the optimize has finished
and the system has settled down.
There are no commits during optimize and replication.


Any further hints?



Hope that helps
Erick

On Wed, Apr 20, 2011 at 2:36 AM, Bernd Fehling
<bernd.fehl...@uni-bielefeld.de>      wrote:

Hello list,

we have the problem that old searchers often are not closed
after an optimize (on the master) or replication (on the slaves),
and we therefore end up with huge index volumes.
The only solution so far is to stop and start Solr, which cleans
up everything successfully, but this can only be a workaround.

Is the parameter "waitSearcher=false" an option to solve this?

Any hints on what to check or debug?

We use Apache Solr 3.1.0 on Linux.

Regards
Bernd



--
*************************************************************
Bernd Fehling                Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)                        Universitätsstr. 25
Tel. +49 521 106-4060                   Fax. +49 521 106-4052
bernd.fehl...@uni-bielefeld.de                33615 Bielefeld

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************

