RE: *Very* slow Commit after upgrading to solr 1.3

2008-10-08 Thread Ben Shlomo, Yatir
So other than me doing trial and error, do you have any guidance on how to
configure the merge factor (and ramBufferSizeMB)?
Is there any formula that gives the optimal value?
Thanks,
Yatir

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
Seeley
Sent: Tuesday, October 07, 2008 1:10 PM
To: solr-user@lucene.apache.org
Subject: Re: *Very* slow Commit after upgrading to solr 1.3

On Tue, Oct 7, 2008 at 6:32 AM, Ben Shlomo, Yatir
[EMAIL PROTECTED] wrote:
> The problem is solved, see below.
> Since the performance is so sensitive to configuration - do you have a
> tip on how to determine the optimal configuration for
> mergeFactor, ramBufferSizeMB and other properties ?

The issue might have been your high merge factor coupled with changes
in how Lucene closes an index.  To prevent possible corruption on a
crash, Lucene now does an fsync on the index files before it writes
the new segment descriptor that references those files.  A high merge
factor means more segments, hence more segment files to sync on a
close.

-Yonik
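
To illustrate the point about merge factor and segment count, here is a rough
sketch in Python. It is not Lucene's actual merge policy: it assumes a
simplified model in which every flush creates one small segment and mergeFactor
segments at the same level are merged into one segment at the next level. The
function name is hypothetical, and the figure of 350 flushes assumes roughly
350K documents flushed every 1000 documents (maxBufferedDocs=1000), numbers
taken from the original report in this thread.

```python
def segments_after_flushes(num_flushes, merge_factor):
    """Count live segments under a simplified level-based merge model.

    Assumption: each flush adds one level-0 segment, and as soon as
    merge_factor segments exist at one level they are merged into a
    single segment at the next level. merge_factor must be >= 2.
    """
    levels = []  # levels[i] = number of segments currently at level i
    for _ in range(num_flushes):
        if not levels:
            levels.append(0)
        levels[0] += 1
        level = 0
        # Cascade merges upward while a level is full.
        while levels[level] == merge_factor:
            levels[level] = 0
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return sum(levels)


for mf in (10, 1000):
    print(f"mergeFactor={mf}: {segments_after_flushes(350, mf)} segments")
```

Under this model a merge factor of 1000 never triggers a merge within 350
flushes, so every flushed segment (and each of its files) is still present
when the index is closed, while a merge factor of 10 leaves only a handful.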



Re: *Very* slow Commit after upgrading to solr 1.3

2008-10-07 Thread Yonik Seeley







RE: *Very* slow Commit after upgrading to solr 1.3

2008-10-07 Thread Ben Shlomo, Yatir
Thanks Yonik,

The problem is solved; see below.
Since performance is so sensitive to configuration, do you have any tips on
how to determine the optimal values for mergeFactor, ramBufferSizeMB, and
other properties?

My original problem occurred even on a fresh rebuild of the index with
Solr 1.3.
To solve it, I used the entire set of IndexWriter settings from the Solr
1.3 example file.
This had a dramatic impact:
I indexed 20 GB of data (52M docs).
The total indexing time was 13 hours.
The index size was 30 GB.
The total commit time was less than 2 minutes.

Tomcat Log for reference

Oct 5, 2008 9:43:24 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening [EMAIL PROTECTED] main
Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for [EMAIL PROTECTED] main
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
    queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for [EMAIL PROTECTED] main
    queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming result for [EMAIL PROTECTED] main
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [] Registered new searcher [EMAIL PROTECTED] main
Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing [EMAIL PROTECTED] main
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
Oct 5, 2008 9:43:43 PM org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {commit=} 0 18406
Oct 5, 2008 9:43:43 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/dss1 path=/update params={} status=0 QTime=18406
Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Oct 5, 2008 9:45:07 PM org.apache.solr.search.SolrIndexSearcher init
INFO: Opening [EMAIL PROTECTED] main
Oct 5, 2008 9:45:07 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
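
The commit and optimize durations can be read off the timestamps in the log
above. The following is a small Python sketch of that calculation, not part of
Solr: the helper name is hypothetical, the sample text is limited to the
"start commit" and "end_commit_flush" entries from the log, and the timestamp
format is assumed to be the default java.util.logging layout shown there,
parsed in an English locale.

```python
import re
from datetime import datetime

# Only the commit start/end entries from the Tomcat log above.
LOG = """\
Oct 5, 2008 9:43:24 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
Oct 5, 2008 9:45:07 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
"""

# Matches the timestamp at the start of a log header line.
STAMP = re.compile(r"^(\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) ")


def commit_durations(log_text):
    """Return (optimize, seconds) for each start commit / end_commit_flush pair."""
    results = []
    stamp = None    # timestamp of the most recent header line
    started = None  # (timestamp, optimize flag) of the commit in progress
    for line in log_text.splitlines():
        match = STAMP.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%b %d, %Y %I:%M:%S %p")
        elif line.startswith("INFO: start commit(") and stamp is not None:
            started = (stamp, "optimize=true" in line)
        elif line.startswith("INFO: end_commit_flush") and started is not None:
            begin, optimize = started
            results.append((optimize, (stamp - begin).total_seconds()))
            started = None
    return results


durations = commit_durations(LOG)
for optimize, seconds in durations:
    print(f"optimize={optimize}: {seconds:.0f} s")
```

The timestamps have one-second resolution, so the first figure is only
approximately equal to the QTime=18406 (milliseconds) reported in the log.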



Re: *Very* slow Commit after upgrading to solr 1.3

2008-10-04 Thread Yonik Seeley
Ben, see also

http://www.nabble.com/Commit-in-solr-1.3-can-take-up-to-5-minutes-td19802781.html#a19802781

What type of physical drive is this and what interface is used (SATA, etc)?
What is the filesystem (NTFS)?

Did you add to an existing index from an older version of Solr, or
start from scratch?

If you add a single document to the index and commit, does it take a long time?

I notice your merge factor is 1000... this will create many files that
need to be sync'd
It may help to try the IndexWriter settings from the 1.3 example
setup... the important changes being:

<mergeFactor>10</mergeFactor>
<!--<maxBufferedDocs>1000</maxBufferedDocs>-->
<ramBufferSizeMB>32</ramBufferSizeMB>

-Yonik
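
As a quick way to spot this kind of setting, here is a minimal Python sketch
that looks up mergeFactor in the indexDefaults and mainIndex sections of a
solrconfig.xml fragment and reports values above a threshold. The function
name and the threshold of 10 (the value from the 1.3 example setup) are
assumptions for illustration, not a Solr rule, and the embedded fragment is a
cut-down version of the configuration posted in this thread.

```python
import xml.etree.ElementTree as ET

# Cut-down fragment of the configuration from the original report.
CONFIG = """\
<config>
  <indexDefaults>
    <mergeFactor>1000</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
  </indexDefaults>
  <mainIndex>
    <mergeFactor>1000</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
  </mainIndex>
</config>
"""


def high_merge_factors(xml_text, threshold=10):
    """Return (section, value) for each mergeFactor above the threshold."""
    root = ET.fromstring(xml_text)
    flagged = []
    for section in ("indexDefaults", "mainIndex"):
        node = root.find(f"{section}/mergeFactor")
        if node is None or not node.text:
            continue  # setting not present in this section
        value = int(node.text.strip())
        if value > threshold:
            flagged.append((section, value))
    return flagged


flagged = high_merge_factors(CONFIG)
for section, value in flagged:
    print(f"{section}: mergeFactor={value}")
```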







*Very* slow Commit after upgrading to solr 1.3

2008-09-29 Thread Ben Shlomo, Yatir
Hi!

 

I am running on Windows 64-bit ...
I have upgraded to solr 1.3 in order to use the distributed search.

I haven't changed the solrConfig and the schema xml files during the
upgrade.

I am indexing ~ 350K documents (each one is about 0.5 KB in size)

The indexing takes a reasonable amount of time (350 seconds)

See tomcat log:

INFO: {add=[8x-wbTscWftuu1sVWpdnGw==, VOu1eSv0obBl1xkj2jGjIA==,
YkOm-nKPrTVVVyeCZM4-4A==, rvaq_TyYsqt3aBc0KKDVbQ==,
9NdzWXsErbF_5btyT1JUjw==, ...(398728 more)]} 0 349875

 

But when I commit, it takes more than an hour (5000 seconds!); the
optimize after the commit took 14 seconds.

INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)

 

P.S. It's not a machine problem: I moved to another machine and the same
thing happened.


I noticed something very strange while waiting for the commit:

While the Solr index is 210 MB in size, the Windows Task Manager shows
that the java process is making a HUGE amount of I/O reads:

It reads more than 350 GB! (which takes a lot of time)

The process constantly takes 25% of the CPU.

None of the autowarmCount values in my solrconfig file exceed 256...
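
To put the reported figures in proportion, here is a short back-of-the-envelope
calculation in Python. All inputs are the numbers quoted above; the variable
names are illustrative, and 1 GB is assumed to mean 1024 MB.

```python
index_size_mb = 210          # reported index size
io_read_gb = 350             # reads shown in Task Manager during the commit
indexing_s = 349875 / 1000   # indexing time from the Tomcat log (milliseconds)
commit_s = 5000              # reported commit time in seconds

# How many times the whole index would have to be read to reach 350 GB.
read_amplification = io_read_gb * 1024 / index_size_mb

# How much longer the commit took than indexing the documents.
commit_vs_indexing = commit_s / indexing_s

print(f"data read is about {read_amplification:.0f}x the index size")
print(f"commit took about {commit_vs_indexing:.1f}x as long as indexing")
```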

 

Any more ideas to check?

Thanks.

 

 

 

Here is part of my solrConfig file (C:\dss1\SolrHome\conf\solrconfig.xml):

<indexDefaults>
  <!-- Values here affect all index writers and act as a default unless
       overridden. -->
  <useCompoundFile>false</useCompoundFile>
  <mergeFactor>1000</mergeFactor>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <maxMergeDocs>2147483647</maxMergeDocs>
  <maxFieldLength>1</maxFieldLength>
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>1</commitLockTimeout>
</indexDefaults>

<mainIndex>
  <!-- options specific to the main on-disk lucene index -->
  <useCompoundFile>false</useCompoundFile>
  <mergeFactor>1000</mergeFactor>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <maxMergeDocs>2147483647</maxMergeDocs>
  <maxFieldLength>1</maxFieldLength>
  <!-- If true, unlock any held write or commit locks on startup.
       This defeats the locking mechanism that allows multiple
       processes to safely access a lucene index, and should be
       used with care. -->
  <unlockOnStartup>true</unlockOnStartup>
</mainIndex>

 

 

 

 

 

Yatir Ben-shlomo | eBay, Inc. | Classification Track, Shopping.com
(Israel) | w: +972-9-892-1373 |  email: [EMAIL PROTECTED] |