Re: [ANNOUNCE] Apache Lucene 5.0.0 released

Dawid Weiss Fri, 20 Feb 2015 13:25:27 -0800

Thanks for contributing time to the release, Anshum.

Dawid


On Fri, Feb 20, 2015 at 10:16 PM, Anshum Gupta <[email protected]> wrote:
> Sure, I'll fix that on the wiki. Thanks for pointing that out Uwe.
>
> On Fri, Feb 20, 2015 at 1:10 PM, Uwe Schindler <[email protected]> wrote:
>
>> Many thanks! :-) Nice work!
>>
>> I found a small typo in the announcement text on the mail and web page: "
>> Those indexes can then be read (see next section) with Lucene 5..."
>> The "see next section" should not be there, it's only relevant in the
>> migration guide (because there is a section following). Maybe fix this on
>> the web page, for the mail it's too late.
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: [email protected]
>>
>>
>> > -----Original Message-----
>> > From: Anshum Gupta [mailto:[email protected]]
>> > Sent: Friday, February 20, 2015 9:55 PM
>> > To: [email protected]; [email protected]; java-
>> > [email protected]
>> > Subject: [ANNOUNCE] Apache Lucene 5.0.0 released
>> >
>> > 20 February 2015, Apache Lucene™ 5.0.0 available
>> >
>> > The Lucene PMC is pleased to announce the release of Apache Lucene 5.0.
>> >
>> > Apache Lucene is a high-performance, full-featured text search engine
>> > library written entirely in Java. It is a technology suitable for nearly
>> > any application that requires full-text search, especially cross-platform.
>> >
>> > This release contains numerous bug fixes, optimizations, and
>> > improvements, some of which are highlighted below. The release is
>> > available for immediate download at:
>> >   http://lucene.apache.org/core/mirrors-core-latest-redir.html
>> >
>> > See the CHANGES.txt file included with the release for a full list of
>> > details.
>> >
>> > Lucene 5.0 Release Highlights:
>> >
>> > Stronger index safety
>> >
>> >  * All file access now uses Java’s NIO.2 APIs which give Lucene stronger
>> > index safety in terms of better error handling and safer commits.
>> >
>> >  * Every Lucene segment now stores a unique id per-segment and
>> > per-commit to aid in accurate replication of index files.
>> >
>> >  * During merging, IndexWriter now always checks the incoming segments
>> > for corruption before merging. This can mean, on upgrading to 5.0.0, that
>> > merging may uncover long-standing latent corruption in an older 4.x
>> > index.
>> >
>> > Reduced heap usage
>> >
>> >  * Lucene now supports random-writable and advance-able sparse bitsets
>> > (RoaringDocIdSet and SparseFixedBitSet), so the heap required is in
>> > proportion to how many bits are set, not how many total documents exist
>> > in the index.
>> >
>> >  * Heap usage during IndexWriter merging is also much lower with the new
>> > Lucene50Codec, since doc values and norms for the segments being merged
>> > are no longer fully loaded into heap for all fields; now they are loaded
>> > for the one field currently being merged, and then dropped.
>> >
>> >  * The default norms format now uses sparse encoding when appropriate, so
>> > indices that enable norms for many sparse fields will see a large
>> > reduction in required heap at search time.
>> >
>> >  * 5.0 has a new API to print a tree structure showing a recursive
>> > breakdown of which parts are using how much heap.
>> >
>> > Other features
>> >
>> >  * FieldCache is gone (moved to a dedicated UninvertingReader in the misc
>> > module). This means when you intend to sort on a field, you should index
>> > that field using doc values, which is much faster and less heap consuming
>> > than FieldCache.
>> >
>> >  * Tokenizers and Analyzers no longer require Reader on init.
>> >
>> >  * NormsFormat now gets its own dedicated NormsConsumer/Producer
>> >
>> >  * SortedSetSortField, used to sort on a multi-valued field, is promoted
>> > from sandbox to Lucene's core.
>> >
>> >  * PostingsFormat now uses a "pull" API when writing postings, just like
>> > doc values. This is powerful because you can do things in your postings
>> > format that require making more than one pass through the postings, such
>> > as iterating over all postings for each term to decide which compression
>> > format it should use.
>> >
>> >  * New DateRangeField type enables indexing and searching of date ranges,
>> > particularly multi-valued ones.
>> >
>> >  * A new ExitableDirectoryReader extends FilterDirectoryReader and
>> > enables exiting requests that take too long to enumerate over terms.
>> >
>> >  * Suggesters can now be built from multi-valued fields, since
>> > DocumentDictionary now enumerates each value separately in a
>> > multi-valued field.
>> >
>> >  * ConcurrentMergeScheduler detects whether the index is on SSD or not
>> > and does a better job defaulting its settings. This only works on Linux
>> > for now; other OS's will continue to use the previous defaults (tuned for
>> > spinning disks).
>> >
>> >  * Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate
>> > limit IO writes for each merge depending on incoming merge rate.
>> >
>> >  * CustomAnalyzer has been added that allows configuring analyzers as
>> > you do in Solr's index schema. This class has a builder API to configure
>> > Tokenizers, TokenFilters, and CharFilters based on their SPI names and
>> > parameters as documented by the corresponding factories.
>> >
>> >  * Memory index now supports payloads.
>> >
>> >  * Added a filter cache with a usage tracking policy that caches filters
>> > based on frequency of use.
>> >
>> >  * The default codec has an option to control BEST_SPEED or
>> > BEST_COMPRESSION for stored fields.
>> >
>> >  * Stored fields are merged more efficiently, especially when upgrading
>> > from previous versions or using SortingMergePolicy
>> >
>> > NOTE: Lucene 5 no longer supports the Lucene 3.x index format. Opening
>> > indexes will result in IndexFormatTooOldException. It is recommended to
>> > either reindex all your data, or upgrade the old indexes with the
>> > IndexUpgrader tool of latest Lucene 4 version (4.10.x). Those indexes can
>> > then be read (see next section) with Lucene 5.
>> >
>> > To read more about the changes, also see:
>> > http://blog.mikemccandless.com/2014/11/apache-lucene-500-is-coming.html
>> >
>> > Please read CHANGES.txt (
>> > https://lucene.apache.org/core/5_0_0/changes/Changes.html) and
>> > MIGRATE.txt for a full list of new features and notes on upgrading.
>> >
>> > Please report any feedback to the mailing lists (
>> > http://lucene.apache.org/core/discussion.html)
>> >
>> > --
>> > Anshum Gupta
>> > http://about.me/anshumgupta
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>>
>
>
> --
> Anshum Gupta
> http://about.me/anshumgupta
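
A note on the "Reduced heap usage" item in the quoted announcement, which says
sparse bitsets need heap in proportion to how many bits are set rather than to
how many documents exist. The sketch below illustrates only that general idea
in plain Java. It is NOT Lucene's SparseFixedBitSet or RoaringDocIdSet: the
class name ToySparseBitSet, its methods, and the HashMap-of-words layout are
all assumptions made up for illustration, and Lucene's real implementations
are more compact and also support advancing to the next set bit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy class, not part of Lucene. It stores one 64-bit word
// only for regions that contain at least one set bit, so memory grows with
// the number of touched words instead of with the total length.
public class ToySparseBitSet {
    private final int length;
    // Key: word index (bit index / 64). Value: the 64 bits of that word.
    private final Map<Integer, Long> words = new HashMap<>();

    public ToySparseBitSet(int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        this.length = length;
    }

    // Sets the bit at the given index, allocating its word on first use.
    public void set(int index) {
        checkIndex(index);
        words.merge(index >>> 6, 1L << (index & 63), (a, b) -> a | b);
    }

    // Returns true if the bit is set; untouched words read as all zeros.
    public boolean get(int index) {
        checkIndex(index);
        Long word = words.get(index >>> 6);
        return word != null && (word & (1L << (index & 63))) != 0;
    }

    // Number of 64-bit words actually stored.
    public int allocatedWords() {
        return words.size();
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException(
                "index " + index + " out of range [0, " + length + ")");
        }
    }

    public static void main(String[] args) {
        // 100 million possible "documents", but only three bits set.
        ToySparseBitSet bits = new ToySparseBitSet(100_000_000);
        bits.set(5);
        bits.set(6);
        bits.set(99_999_999);
        System.out.println(bits.get(5));           // true
        System.out.println(bits.get(7));           // false
        System.out.println(bits.allocatedWords()); // 2
    }
}
```

A dense bitset of the same length would allocate about 1.5 million words up
front; here bits 5 and 6 share one word and bit 99,999,999 uses a second, so
only two words are stored.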
