On Tue, Oct 7, 2008 at 6:32 AM, Ben Shlomo, Yatir <[EMAIL PROTECTED]> wrote:
> The problem is solved, see below.
> Since the performance is so sensitive to configuration - do you have a
> tip on how to determine the optimal configuration for
> mergeFactor, ramBufferSizeMB and other properties ?

The issue might have been your high merge factor coupled with changes
in how Lucene closes an index. To prevent possible corruption on a
crash, Lucene now does an fsync on the index files before it writes
the new segment descriptor that references those files. A high merge
factor means more segments, hence more segment files to sync on a
close.

-Yonik

> My original problem occurred even on a fresh rebuild of the index with
> solr 1.3
> To solve it I used the entire IndexWriter section settings from the solr
> 1.3 example file
> This had a dramatic impact:
> I indexed 20 GB of data (52M docs)
> The total indexing time was 13 hours
> The index size was 30 GB
> The total commit time was less than 2 minutes
>
> Tomcat Log for reference
>
> Oct 5, 2008 9:43:24 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher <init>
> INFO: Opening [EMAIL PROTECTED] main
> Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
>
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for [EMAIL PROTECTED] main
>
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
>
> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for [EMAIL PROTECTED] main
>
> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main
>
> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for [EMAIL PROTECTED] main
>
> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.core.SolrCore registerSearcher
> INFO: [] Registered new searcher [EMAIL PROTECTED] main
> Oct 5, 2008 9:43:43 PM org.apache.solr.search.SolrIndexSearcher close
> INFO: Closing [EMAIL PROTECTED] main
>
> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>
> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>
> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Oct 5, 2008 9:43:43 PM org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: {commit=} 0 18406
> Oct 5, 2008 9:43:43 PM org.apache.solr.core.SolrCore execute
> INFO: [] webapp=/dss1 path=/update params={} status=0 QTime=18406
> Oct 5, 2008 9:43:43 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=true,waitFlush=false,waitSearcher=true)
> Oct 5, 2008 9:45:07 PM org.apache.solr.search.SolrIndexSearcher <init>
> INFO: Opening [EMAIL PROTECTED] main
> Oct 5, 2008 9:45:07 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Yonik
> Seeley
> Sent: Saturday, October 04, 2008 6:07 PM
> To: solr-user@lucene.apache.org
> Subject: Re: *Very* slow Commit after upgrading to solr 1.3
>
> Ben, see also
>
> http://www.nabble.com/Commit-in-solr-1.3-can-take-up-to-5-minutes-td19802781.html#a19802781
>
> What type of physical drive is this and what interface is used (SATA,
> etc)?
> What is the filesystem (NTFS)?
>
> Did you add to an existing index from an older version of Solr, or
> start from scratch?
>
> If you add a single document to the index and commit, does it take a
> long time?
>
> I notice your merge factor is 1000... this will create many files that
> need to be sync'd
> It may help to try the IndexWriter settings from the 1.3 example
> setup... the important changes being:
>
> <mergeFactor>10</mergeFactor>
> <!--<maxBufferedDocs>1000</maxBufferedDocs>-->
> <ramBufferSizeMB>32</ramBufferSizeMB>
>
> -Yonik
>
> On Mon, Sep 29, 2008 at 5:33 AM, Ben Shlomo, Yatir
> <[EMAIL PROTECTED]> wrote:
>> Hi!
>>
>> I am running on Windows 64 bit ...
>> I have upgraded to solr 1.3 in order to use the distributed search.
>>
>> I haven't changed the solrConfig and the schema xml files during the
>> upgrade.
>>
>> I am indexing ~ 350K documents (each one is about 0.5 KB in size)
>>
>> The indexing takes a reasonable amount of time (350 seconds)
>>
>> See tomcat log:
>>
>> INFO: {add=[8x-wbTscWftuu1sVWpdnGw==, VOu1eSv0obBl1xkj2jGjIA==,
>> YkOm-nKPrTVVVyeCZM4-4A==, rvaq_TyYsqt3aBc0KKDVbQ==,
>> 9NdzWXsErbF_5btyT1JUjw==, ...(398728 more)]} 0 349875
>>
>> But when I commit it takes more than an hour ! (5000 seconds!, the
>> optimize after the commit took 14 seconds)
>>
>> INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
>>
>> p.s. it's not a machine problem: I moved to another machine and the same
>> thing happened
>>
>> I noticed something very strange during the time I wait for the
>> commit:
>>
>> While the solr index is 210MB in size
>>
>> In the windows task manager I noticed that the java process is making a
>> HUGE amount of IO reads:
>>
>> It reads more than 350 GB ! (- which takes a lot of time.)
>>
>> The process is constantly taking 25% of the cpu resources.
>>
>> All my autowarmCount in Solrconfig file do not exceed 256...
>>
>> Any more ideas to check?
>>
>> Thanks.
>>
>> Here is part of my solrConfig file:
>>
>> <indexDefaults>
>>   <!-- Values here affect all index writers and act as a default unless
>>        overridden.
>>   -->
>>   <useCompoundFile>false</useCompoundFile>
>>   <mergeFactor>1000</mergeFactor>
>>   <maxBufferedDocs>1000</maxBufferedDocs>
>>   <maxMergeDocs>2147483647</maxMergeDocs>
>>   <maxFieldLength>10000</maxFieldLength>
>>   <writeLockTimeout>1000</writeLockTimeout>
>>   <commitLockTimeout>10000</commitLockTimeout>
>> </indexDefaults>
>>
>> <mainIndex>
>>   <!-- options specific to the main on-disk lucene index
>>   -->
>>   <useCompoundFile>false</useCompoundFile>
>>   <mergeFactor>1000</mergeFactor>
>>   <maxBufferedDocs>1000</maxBufferedDocs>
>>   <maxMergeDocs>2147483647</maxMergeDocs>
>>   <maxFieldLength>10000</maxFieldLength>
>>   <!-- If true, unlock any held write or commit locks on startup.
>>        This defeats the locking mechanism that allows multiple
>>        processes to safely access a lucene index, and should be
>>        used with care.
>>   -->
>>   <unlockOnStartup>true</unlockOnStartup>
>> </mainIndex>
>>
>> Yatir Ben-shlomo | eBay, Inc. | Classification Track, Shopping.com
>> (Israel) | w: +972-9-892-1373 | email: [EMAIL PROTECTED] |
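
To put rough numbers on the merge-factor point made in this thread, here is a minimal sketch. It assumes an idealized log merge policy, in which every mergeFactor segments at one level are merged into a single segment at the next level, and a flush every maxBufferedDocs documents. Under those assumptions the number of live segments after F flushes is the digit sum of F written in base mergeFactor. The function name and the files-per-segment figure are hypothetical, chosen only for illustration; real Lucene merging also depends on segment sizes, deletes and maxMergeDocs, so treat the output as an order-of-magnitude estimate.

```python
def segment_count(flushes: int, merge_factor: int) -> int:
    """Live segments after `flushes` flushes under an idealized log merge policy.

    Each level holds at most merge_factor - 1 segments; the count per level
    is the corresponding digit of `flushes` in base `merge_factor`.
    """
    if flushes < 0 or merge_factor < 2:
        raise ValueError("flushes must be >= 0 and merge_factor >= 2")
    count = 0
    while flushes > 0:
        flushes, digit = divmod(flushes, merge_factor)
        count += digit
    return count


# Figures taken from the thread: ~350K documents, maxBufferedDocs=1000.
DOCS = 350_000
MAX_BUFFERED_DOCS = 1000
# Assumed value: with useCompoundFile=false each segment is several files.
FILES_PER_SEGMENT = 8

flushes = DOCS // MAX_BUFFERED_DOCS
for merge_factor in (1000, 10):
    segments = segment_count(flushes, merge_factor)
    print(f"mergeFactor={merge_factor}: ~{segments} segments, "
          f"~{segments * FILES_PER_SEGMENT} files to fsync")
```

With the figures from the original configuration, this estimate gives 350 segments for mergeFactor 1000 against 8 for mergeFactor 10, which is in line with the remark that a high merge factor leaves many more files to sync.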