I think we should upgrade Lucene again since the index file format has changed:
https://issues.apache.org/jira/browse/LUCENE-1654
The upgrade also picks up the fix that unifies the FieldCache and
ExtendedFieldCache instances.
$ svn diff -r r776177 CHANGES.txt
Index: CHANGES.txt
===================================================================
--- CHANGES.txt (revision 776177)
+++ CHANGES.txt (working copy)
@@ -27,7 +27,11 @@
implement Searchable or extend Searcher, you should change you
code to implement this method. If you already extend
IndexSearcher, no further changes are needed to use Collector.
- (Shai Erera via Mike McCandless)
+
+ Finally, the values Float.Nan, Float.NEGATIVE_INFINITY and
+ Float.POSITIVE_INFINITY are not valid scores. Lucene uses these
+ values internally in certain places, so if you have hits with such
+ scores it will cause problems. (Shai Erera via Mike McCandless)
Changes in runtime behavior
@@ -107,10 +111,10 @@
that's visited. All core collectors now use this API. (Mark
Miller, Mike McCandless)
-8. LUCENE-1546: Add IndexReader.flush(String commitUserData), allowing
- you to record an opaque commitUserData into the commit written by
- IndexReader. This matches IndexWriter's commit methods. (Jason
- Rutherglen via Mike McCandless)
+8. LUCENE-1546: Add IndexReader.flush(Map commitUserData), allowing
+ you to record an opaque commitUserData (maps String -> String) into
+ the commit written by IndexReader. This matches IndexWriter's
+ commit methods. (Jason Rutherglen via Mike McCandless)
9. LUCENE-652: Added org.apache.lucene.document.CompressionTools, to
enable compressing & decompressing binary content, external to
@@ -135,6 +139,9 @@
not make sense for all subclasses of MultiTermQuery. Check individual
subclasses to see if they support #getTerm(). (Mark Miller)
+14. LUCENE-1636: Make TokenFilter.input final so it's set only
+ once. (Wouter Heijke, Uwe Schindler via Mike McCandless).
+
Bug fixes
1. LUCENE-1415: MultiPhraseQuery has incorrect hashCode() and equals()
@@ -176,6 +183,9 @@
sort) by doc Id in a consistent manner (i.e., if Sort.FIELD_DOC
was used vs. when it wasn't). (Shai Erera via Michael McCandless)
+10. LUCENE-1647: Fix case where IndexReader.undeleteAll would cause
+ the segment's deletion count to be incorrect. (Mike McCandless)
+
New features
1. LUCENE-1411: Added expert API to open an IndexWriter on a prior
@@ -186,10 +196,11 @@
when building transactional support on top of Lucene. (Mike
McCandless)
- 2. LUCENE-1382: Add an optional arbitrary String "commitUserData" to
- IndexWriter.commit(), which is stored in the segments file and is
- then retrievable via IndexReader.getCommitUserData instance and
- static methods. (Shalin Shekhar Mangar via Mike McCandless)
+ 2. LUCENE-1382: Add an optional arbitrary Map (String -> String)
+ "commitUserData" to IndexWriter.commit(), which is stored in the
+ segments file and is then retrievable via
+ IndexReader.getCommitUserData instance and static methods.
+ (Shalin Shekhar Mangar via Mike McCandless)
3. LUCENE-1406: Added Arabic analyzer. (Robert Muir via Grant Ingersoll)
@@ -311,6 +322,10 @@
25. LUCENE-1634: Add calibrateSizeByDeletes to LogMergePolicy, to take
deletions into account when considering merges. (Yasuhiro Matsuda
via Mike McCandless)
+
+26. LUCENE-1550: Added new n-gram based String distance measure for spell checking.
+ See the Javadocs for NGramDistance.java for a reference paper on why this is helpful (Tom Morton via Grant Ingersoll)
+
Optimizations
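
One note on the new CHANGES.txt entry about invalid scores: anything on our side that produces custom scores may want a guard against NaN and the two infinities. A minimal sketch of such a check follows. The class and method names (ScoreCheck, isValidScore) are made up for illustration and are not part of Lucene; only the standard java.lang.Float helpers are used.

```java
// Hypothetical helper, not a Lucene API: rejects the score values that
// CHANGES.txt lists as invalid (NaN, NEGATIVE_INFINITY, POSITIVE_INFINITY).
public final class ScoreCheck {
    private ScoreCheck() {}

    /** Returns true if the score is neither NaN nor +/- infinity. */
    public static boolean isValidScore(float score) {
        return !Float.isNaN(score) && !Float.isInfinite(score);
    }

    public static void main(String[] args) {
        float[] samples = {
            1.5f, 0.0f, Float.NaN, Float.NEGATIVE_INFINITY, Float.POSITIVE_INFINITY
        };
        // Print each sample score alongside whether it passes the check.
        for (float s : samples) {
            System.out.println(s + " -> " + isValidScore(s));
        }
    }
}
```

Where such a check would go (a custom scorer, a collector, or just a test) depends on our code; the sketch only shows the condition itself.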
-Yonik
http://www.lucidimagination.com