Re: latest lucene update
On Thu, Jul 16, 2009 at 10:06 AM, Uwe Schindler wrote:
> OK. At least I have seen a speed up during my tests :). I have the logs
> somewhere. Which tests were affected negative, then I can look into the
> before/after logs?

Sorry, not sure - it wasn't the slower tests that tipped me off, but the "speed of BooleanQueries on 2.9" thread that prompted me to look at the default implementations of the new methods.

-Yonik
http://www.lucidimagination.com

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
RE: latest lucene update
OK. At least I have seen a speed-up during my tests :). I have the logs somewhere. Which tests were negatively affected? Then I can look into the before/after logs.

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Re: latest lucene update
On Thu, Jul 16, 2009 at 2:11 AM, Uwe Schindler wrote:
> Did you also test, that the speed was going back to normal with the latest
> fix in trunk (without modifying Solr code)?

I didn't - I was already part way through implementing advance() in Solr. I'm sure the advance() fix in Lucene would have worked too though.

-Yonik
http://www.lucidimagination.com
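For readers following the advance() point, here is a minimal, self-contained sketch of why a specialised advance() can matter. It is not Lucene or Solr code: the two iterator classes are hypothetical stand-ins, and only the method names nextDoc()/advance() and the NO_MORE_DOCS sentinel are borrowed from the iterator API discussed in this thread. The base class falls back to stepping through nextDoc(), while the subclass assumes a sorted array of unique doc IDs and jumps with a binary search.

```java
import java.util.Arrays;

public class AdvanceSketch {
    /** Sentinel meaning "iterator exhausted". */
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical iterator over a sorted array of unique doc IDs. */
    static class LinearIterator {
        protected final int[] docs;
        protected int idx = -1;
        int steps = 0; // number of nextDoc() calls, kept only for illustration

        LinearIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        int nextDoc() {
            steps++;
            idx++;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Fallback: walk forward one doc at a time until doc >= target. */
        int advance(int target) {
            int doc;
            while ((doc = nextDoc()) < target) {
                // keep stepping
            }
            return doc;
        }
    }

    /** Same data, but advance() jumps directly using binary search. */
    static class BinarySearchIterator extends LinearIterator {
        BinarySearchIterator(int[] sortedDocs) {
            super(sortedDocs);
        }

        @Override
        int advance(int target) {
            int from = Math.min(idx + 1, docs.length);
            int pos = Arrays.binarySearch(docs, from, docs.length, target);
            if (pos < 0) {
                pos = -pos - 1; // insertion point: first doc greater than target
            }
            idx = pos;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }
    }

    public static void main(String[] args) {
        int[] docs = new int[1000];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i * 2; // even doc IDs 0..1998
        }
        LinearIterator slow = new LinearIterator(docs);
        LinearIterator fast = new BinarySearchIterator(docs);
        System.out.println(slow.advance(1501) + " after " + slow.steps + " nextDoc() calls");
        System.out.println(fast.advance(1501) + " after " + fast.steps + " nextDoc() calls");
    }
}
```

Both iterators land on the same document; the difference is only in how much work is done to get there, which adds up when advance() is called many times per query.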
RE: latest lucene update
Did you also test whether the speed went back to normal with the latest fix in trunk (without modifying Solr code)? I ran the Solr tests with the updated lucene-core-2.9.jar here, but I was not able to find out which of the tests had the big slowdown. I only noticed some speedup in some tests related to search.

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Re: latest lucene update
Thanks guys, I had actually meant this message to go to solr-dev... hence the "but I think we should implement the new methods anyway". I've implemented them, and the performance has returned to normal.

-Yonik
http://www.lucidimagination.com

On Wed, Jul 15, 2009 at 4:00 PM, Yonik Seeley wrote:
> Running solr unit tests seems a fair bit slower now. I think the root
> cause may be this:
> http://search.lucidimagination.com/search/document/a8bd12c3b87e98a3/speed_of_booleanqueries_on_2_9
> That may be fixed, but I think we should implement the new methods anyway.
>
> I'm also surprised that more changes weren't necessary to get the
> latest Lucene to work... one thing in particular is docs out of order
> - Solr currently requires them in-order to correctly create DocSet
> instances, and I'm not sure this is the case any more. I'll look into
> it.
>
> -Yonik
> http://www.lucidimagination.com
RE: latest lucene update
Hi Yonik,

Mike committed my patch, so the fix should be in trunk. I tested Solr after that with a new Lucene JAR. Some tests run faster now! But Solr should update its DocIdSetIterators soon...

UWE SCHINDLER
Webserver/Middleware Development
PANGAEA - Publishing Network for Geoscientific and Environmental Data
MARUM - University of Bremen
Room 2500, Leobener Str., D-28359 Bremen
Tel.: +49 421 218 65595
Fax: +49 421 218 65505
http://www.pangaea.de/
E-mail: uschind...@pangaea.de
RE: latest lucene update
> On Wed, Jul 15, 2009 at 4:12 PM, Uwe Schindler wrote:
>
> > As far as I know, Shalin implemented the Collectors in Solr with the method
> > allowDocsOutOfOrder() returning false. So the collectors should create
> > DocIdSet with correct order.
>
> I think you meant "so the Scorers will be created so that they provide
> docIDs in order".

See the latest patch in https://issues.apache.org/jira/browse/SOLR-940, where Shalin implemented allowDocsOutOfOrder as false in some cases. From the discussion in other issues, I think this was because the order of doc ids was needed to create DocIdSets out of the collected results. I have something similar in my code here, too: I use a Collector, but need the doc ids in order.

> Ie, Lucene first asks the Collector if it'll accept docs out of order.
> If it returns false, then it asks the Weight for a Scorer that always
> returns docs in order.
>
> But in the case of BooleanQuery this is a sizable performance hit, if
> it's a query (only OR'd terms and at most 32 MUST_NOT terms) that can
> use BooleanScorer not BooleanScorer2.

Sometimes ordered doc ids are more important, because ordering them later using a quicksort would be more costly than a slower query. Because of this it is good to have this method allowDocsOutOfOrder().

Uwe
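The trade-off described above can be illustrated with a small, self-contained sketch. All names here are hypothetical and none of this is Solr or Lucene code: a consumer that needs sorted doc ids either receives them in order and can append directly, or accepts any order and pays for a sort afterwards. Arrays.sort stands in for the quicksort mentioned in the message.

```java
import java.util.Arrays;

public class OrderedCollectSketch {
    /** Hypothetical collector that gathers doc ids into a growable int array. */
    static class DocIdCollector {
        private int[] docs = new int[16];
        private int size = 0;
        private boolean inOrder = true;

        void collect(int doc) {
            if (size == docs.length) {
                docs = Arrays.copyOf(docs, size * 2);
            }
            if (size > 0 && doc < docs[size - 1]) {
                inOrder = false; // an out-of-order doc id was delivered
            }
            docs[size++] = doc;
        }

        /** True if the ids arrived out of order, so sortedDocs() must sort. */
        boolean neededSort() {
            return !inOrder;
        }

        /** Returns the ids in increasing order, sorting only when required. */
        int[] sortedDocs() {
            int[] out = Arrays.copyOf(docs, size);
            if (!inOrder) {
                Arrays.sort(out); // extra O(n log n) step for out-of-order input
            }
            return out;
        }
    }

    public static void main(String[] args) {
        DocIdCollector ordered = new DocIdCollector();
        for (int doc : new int[] {1, 5, 9}) {
            ordered.collect(doc);
        }
        DocIdCollector unordered = new DocIdCollector();
        for (int doc : new int[] {9, 1, 5}) {
            unordered.collect(doc);
        }
        System.out.println(Arrays.toString(ordered.sortedDocs()) + " sorted afterwards: " + ordered.neededSort());
        System.out.println(Arrays.toString(unordered.sortedDocs()) + " sorted afterwards: " + unordered.neededSort());
    }
}
```

Whether the in-order scorer or the later sort is cheaper depends on the query and the result size; the sketch only shows where the extra cost appears.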
Re: latest lucene update
On Wed, Jul 15, 2009 at 4:12 PM, Uwe Schindler wrote:
> As far as I know, Shalin implemented the Collectors in Solr with the method
> allowDocsOutOfOrder() returning false. So the collectors should create
> DocIdSet with correct order.

I think you meant "so the Scorers will be created so that they provide docIDs in order".

I.e., Lucene first asks the Collector if it'll accept docs out of order. If it returns false, then it asks the Weight for a Scorer that always returns docs in order.

But in the case of BooleanQuery this is a sizable performance hit if it's a query (only OR'd terms and at most 32 MUST_NOT terms) that can use BooleanScorer rather than BooleanScorer2.

Mike
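The negotiation described in this message can be sketched roughly as follows. This is a simplified, self-contained model, not Lucene code: the Collector interface, the QueryShape class, and chooseScorer() are illustrative assumptions, and the scorers are represented only by their names. The method name allowDocsOutOfOrder() and the eligibility rule (only OR'd terms, at most 32 MUST_NOT terms) are taken from the thread.

```java
public class ScorerChoiceSketch {
    /** Hypothetical stand-in for a collector; not the Lucene class. */
    interface Collector {
        boolean allowDocsOutOfOrder();
    }

    /** Simplified description of a boolean query's clause counts. */
    static class QueryShape {
        final int shouldClauses;
        final int mustClauses;
        final int mustNotClauses;

        QueryShape(int shouldClauses, int mustClauses, int mustNotClauses) {
            this.shouldClauses = shouldClauses;
            this.mustClauses = mustClauses;
            this.mustNotClauses = mustNotClauses;
        }

        /** Rule quoted in the thread: only OR'd terms, at most 32 MUST_NOT terms. */
        boolean eligibleForOutOfOrderScoring() {
            return mustClauses == 0 && mustNotClauses <= 32;
        }
    }

    /** Picks a scorer name based on what the collector says it can accept. */
    static String chooseScorer(Collector collector, QueryShape query) {
        if (collector.allowDocsOutOfOrder() && query.eligibleForOutOfOrderScoring()) {
            return "BooleanScorer"; // may deliver docs out of order
        }
        return "BooleanScorer2"; // always delivers docs in increasing order
    }

    public static void main(String[] args) {
        Collector needsOrder = () -> false; // e.g. a collector building an ordered doc set
        Collector anyOrder = () -> true;
        QueryShape orOnly = new QueryShape(3, 0, 1);
        QueryShape withMust = new QueryShape(1, 2, 0);

        System.out.println(chooseScorer(needsOrder, orOnly));
        System.out.println(chooseScorer(anyOrder, orOnly));
        System.out.println(chooseScorer(anyOrder, withMust));
    }
}
```

In this model a collector that returns false always gets the in-order scorer, which is the case where the performance hit mentioned above applies.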
RE: latest lucene update
> I'm also surprised that more changes weren't necessary to get the
> latest Lucene to work... one thing in particular is docs out of order
> - Solr currently requires them in-order to correctly create DocSet
> instances, and I'm not sure this is the case any more. I'll look into
> it.

As far as I know, Shalin implemented the Collectors in Solr with the method allowDocsOutOfOrder() returning false. So the collectors should create DocIdSet with correct order.

Uwe
RE: latest lucene update
Hi Yonik,

could you try out whether my patch in https://issues.apache.org/jira/browse/LUCENE-1614 makes this better? Currently I am not sure if the first is enough, or the second one is needed. Maybe Mike and Shai should answer why advance may be called with NO_MORE_DOCS.

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: Wednesday, July 15, 2009 10:00 PM
> To: java-dev@lucene.apache.org
> Subject: latest lucene update
>
> Running solr unit tests seems a fair bit slower now. I think the root
> cause may be this:
> http://search.lucidimagination.com/search/document/a8bd12c3b87e98a3/speed_of_booleanqueries_on_2_9
> That may be fixed, but I think we should implement the new methods anyway.
>
> I'm also surprised that more changes weren't necessary to get the
> latest Lucene to work... one thing in particular is docs out of order
> - Solr currently requires them in-order to correctly create DocSet
> instances, and I'm not sure this is the case any more. I'll look into
> it.
>
> -Yonik
> http://www.lucidimagination.com