Hi,
Yes, assuming you didn't change the index files, say by optimizing the index,
the hot portions of the index should remain in the OS cache unless something
else kicked them out.

Re other thread - I don't think I have those messages any more.

Otis
---
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Salman Akram <salman.ak...@northbaysolutions.net>
> To: solr-user@lucene.apache.org
> Sent: Mon, February 7, 2011 2:49:44 AM
> Subject: Re: Performance optimization of Proximity/Wildcard searches
>
> Only a couple of thousand documents are added daily so the old OS cache
> should still be useful since old documents remain the same, right?
>
> Also can you please comment on my other thread related to Term Vectors?
> Thanks!
>
> On Sat, Feb 5, 2011 at 8:40 PM, Otis Gospodnetic
> <otis_gospodne...@yahoo.com> wrote:
>
> > Yes, OS cache mostly remains (obviously index files that are no longer
> > around are going to remain in the OS cache for a while, but will be
> > useless and gradually replaced by new index files).
> > How long warmup takes is not relevant here, but what queries you use to
> > warm up the index and how much you auto-warm the caches.
> >
> > Otis
> >
> > ----- Original Message ----
> > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > To: solr-user@lucene.apache.org
> > > Sent: Sat, February 5, 2011 4:06:54 AM
> > > Subject: Re: Performance optimization of Proximity/Wildcard searches
> > >
> > > Correct me if I am wrong.
> > >
> > > Commit in index flushes SOLR cache but of course OS cache would still be
> > > useful? If an index is updated every hour then a warm up that takes less
> > > than 5 mins should be more than enough, right?
> > >
> > > On Sat, Feb 5, 2011 at 7:42 AM, Otis Gospodnetic
> > > <otis_gospodne...@yahoo.com> wrote:
> > >
> > > > Salman,
> > > >
> > > > Warming up may be useful if your caches are getting decent hit ratios.
> > > > Plus, you are warming up the OS cache when you warm up.
> > > >
> > > > Otis
> > > >
> > > > ----- Original Message ----
> > > > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Fri, February 4, 2011 3:33:41 PM
> > > > > Subject: Re: Performance optimization of Proximity/Wildcard searches
> > > > >
> > > > > I know so we are not really using it for regular warm-ups (in any case
> > > > > index is updated on hourly basis). Just tried few times to compare
> > > > > results. The issue is I am not even sure if warming up is useful for
> > > > > such regular updates.
> > > > >
> > > > > On Fri, Feb 4, 2011 at 5:16 PM, Otis Gospodnetic
> > > > > <otis_gospodne...@yahoo.com> wrote:
> > > > >
> > > > > > Salman,
> > > > > >
> > > > > > I only skimmed your email, but wanted to say that this part sounds a
> > > > > > little suspicious:
> > > > > >
> > > > > > > Our warm up script currently executes all distinct queries in our
> > > > > > > logs having count > 5. It was run yesterday (with all the indexing
> > > > > > > update every
> > > > > >
> > > > > > It sounds like this will make warmup take a looooong time, assuming
> > > > > > you have more than a handful distinct queries in your logs.
> > > > > >
> > > > > > Otis
> > > > > >
> > > > > > ----- Original Message ----
> > > > > > > From: Salman Akram <salman.ak...@northbaysolutions.net>
> > > > > > > To: solr-user@lucene.apache.org; t...@statsbiblioteket.dk
> > > > > > > Sent: Tue, January 25, 2011 6:32:48 AM
> > > > > > > Subject: Re: Performance optimization of Proximity/Wildcard searches
> > > > > > >
> > > > > > > By warmed index you only mean warming the SOLR cache or OS cache? As I
> > > > > > > said our index is updated every hour so I am not sure how much SOLR
> > > > > > > cache would be helpful but OS cache should still be helpful, right?
> > > > > > >
> > > > > > > I haven't compared the results with a proper script but from manual
> > > > > > > testing here are some of the observations.
> > > > > > >
> > > > > > > 'Recent' queries which are in cache of course return immediately (only
> > > > > > > if they are exactly same - even if they took 3-4 mins first time). I
> > > > > > > will need to test how many recent queries stay in cache but still this
> > > > > > > would work only for very common queries. User can run different queries
> > > > > > > and I want at least them to be at 'acceptable' level (5-10 secs) even
> > > > > > > if not very fast.
> > > > > > >
> > > > > > > Our warm up script currently executes all distinct queries in our logs
> > > > > > > having count > 5. It was run yesterday (with all the indexing update
> > > > > > > every hour after that) and today when I executed some of the same
> > > > > > > queries again their time seemed a little less (around 15-20%), I am not
> > > > > > > sure if this means anything. However, still their time is not
> > > > > > > acceptable.
> > > > > > >
> > > > > > > What do you think is the best way to compare results? First run all the
> > > > > > > warm up queries and then execute same randomly and compare?
> > > > > > >
> > > > > > > We are using Windows server, would it make a big difference if we move
> > > > > > > to Linux? Our load is not high but some queries are really complex.
> > > > > > >
> > > > > > > Also I was hoping to move to SSD in last after trying out all software
> > > > > > > options. Is that an agreed fact that on large indexes (which don't fit
> > > > > > > in RAM) proximity/wildcard/phrase queries (on common words) would be
> > > > > > > slow and it can be only improved by cache warm up and better hardware?
> > > > > > > Otherwise with an index of around 150GB such queries will take more
> > > > > > > than a min?
> > > > > > >
> > > > > > > If that's the case I know this question is very subjective but if a
> > > > > > > single query takes 2 min on SAS 10K RPM what would its approx time be
> > > > > > > on a good SSD (everything else same)?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > On Tue, Jan 25, 2011 at 3:44 PM, Toke Eskildsen
> > > > > > > <t...@statsbiblioteket.dk> wrote:
> > > > > > >
> > > > > > > > On Tue, 2011-01-25 at 10:20 +0100, Salman Akram wrote:
> > > > > > > > > Cache warming is a good option too but the index get updated every
> > > > > > > > > hour so not sure how much would that help.
> > > > > > > >
> > > > > > > > What is the time difference between queries with a warmed index and
> > > > > > > > a cold one? If the warmed index performs satisfactory, then one
> > > > > > > > answer is to upgrade your underlying storage. As always for
> > > > > > > > IO-caused performance problem in Lucene/Solr-land, SSD is the
> > > > > > > > answer.
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > >
> > > > > > > Salman Akram
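
For the warm-up query selection discussed in the thread (all distinct logged
queries with count > 5), here is a minimal Python sketch. It assumes the logged
queries are already available as a list of strings. The function name
select_warmup_queries and the max_queries cap are hypothetical, added only to
keep the warm-up run short when the log holds many distinct queries. It does
not talk to Solr.

```python
from collections import Counter


def select_warmup_queries(logged_queries, min_count=5, max_queries=100):
    """Return the most frequent distinct queries seen more than min_count times.

    max_queries is a hypothetical cap (not from the thread) that bounds how
    long the warm-up run can take.
    """
    # Count each distinct, non-empty query string from the log.
    counts = Counter(q.strip() for q in logged_queries if q.strip())
    # most_common() sorts by frequency, highest first.
    frequent = [q for q, n in counts.most_common() if n > min_count]
    return frequent[:max_queries]
```

The returned list would then be replayed against the index after each update.
How many queries to keep is a tuning decision that depends on how long the
warm-up is allowed to run.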