I've been testing 2.9 RC2 lately and comparing query performance to 2.3.2.
I'm seeing a huge increase in throughput (2x-10x) on an index that was built
with 2.3.2. The queries have a lot of BoostingTermQuerys and boolean clauses
containing a custom scorer. Using JProfiler, I observe that the improvement
is due to a huge reduction in the number of calls to TermDocs.next and
TermDocs.skipTo (about 65% fewer calls). Since the custom scorer is somewhat
expensive and frequent in these queries, the improvement is quite noticeable.

Have there been a lot of optimizations in the IndexSearcher/skipTo/next
logic since 2.3.2?
Should I expect to see even more query improvement by building the index
with 2.9?  (pinch me!)
Good stuff!

Thanks,
Peter

On Fri, Aug 28, 2009 at 3:02 PM, Mark Miller <markrmil...@gmail.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hello Lucene users,
>
> On behalf of the Lucene dev community (a growing community far larger
> than just the committers) I would like to announce the second release
> candidate for Lucene 2.9.
>
> Please download and check it out – take it for a spin and kick the
> tires. If all goes well, we hope to release the final version of
> Lucene 2.9 in a little over a week.
>
> The following changes have been applied since the first release candidate:
> LUCENE-1867: replace collation/lib/icu4j.jar with a smaller icu jar
> LUCENE-1868: add Arabic stemmer to notice.txt
> LUCENE-1870: fix contrib dist - missing analyzers/db binaries - extra
> miscellaneous folder with misc readme in it
> LUCENE-1869: include 'file exists?' when we throw RuntimeException
> from fdx or tvx size mismatches during flush or merge
> LUCENE-1871: Add option to avoid needlessly wrapping TokenStream with
> CachingTokenFilter in Highlighter
>
> While we generally try and maintain full backwards compatibility
> between major versions, Lucene 2.9 has a variety of breaks that are
> spelled out in the 'Changes in backwards compatibility policy' section
> of CHANGES.txt.
>
> We recommend that you recompile your application with Lucene 2.9
> rather than attempting to “drop” it in. This will alert you to any
> issues you may have to fix if you are affected by one of the backward
> compatibility breaks. As always, it's a really good idea to thoroughly
> read CHANGES.txt before upgrading. Also, remember that this is a
> release candidate, and not the final Lucene 2.9 release.
>
> Lucene 2.9 comes with a bevy of new features, including:
>
> Per segment searching and caching (can lead to much faster reopen
> among other things)
>
> Near real-time search capabilities added to IndexWriter
>
> New queries, including NumericRangeQuery and NumericRangeFilter –
> fast, highly scalable alternatives to RangeQuery/RangeFilter for
> numeric searches.
>
> Smarter, more scalable multi-term queries (wildcard, range, etc)
>
> A freshly optimized Collector/Scorer API
>
> Improved Unicode support
>
> A new Attribute based TokenStream API
>
> A new QueryParser framework in contrib with a core QueryParser
> replacement impl included.
>
>  ….
>
> And many, many more features, bug fixes, optimizations, and various
> improvements. You can find the full list of changes here:
>
> HTML version:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/Changes.html
>
> Text version:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/CHANGES.txt
>
> Many changes have also occurred in Lucene's contrib area:
>
> http://people.apache.org/~markrmiller/staging-area/lucene2.9changes/contrib/CHANGES.txt
>
> Download release candidate 2 here:
> http://people.apache.org/~markrmiller/staging-area/lucene2.9rc2/
>
> Be sure to report back with any issues you find!
>
> Thanks,
>
> Mark Miller
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAkqYKcIACgkQ0DU3IV7ywDkZHACfbzHKac0sVjNkdSi+79WTYWme
> JR4An18v1SJ6HN8mkCYHF0ybqUSypOG/
> =qS9t
> -----END PGP SIGNATURE-----
>
>