[
https://issues.apache.org/jira/browse/SOLR-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800484#action_12800484
]
Marvin Humphrey commented on SOLR-1677:
---------------------------------------
> I'd still like to clarify this whole issue of wether "Lucene-Java", as a
> project,
> has an expectation that client applications will always use a consistent value
> for Version when constructing objects that interact with an index
Yes. The whole point is to avoid Analyzer mismatches.
Say a stoplist was modified between Lucene versions. Sure, you can hack it
and ask for an old match version, so you get a stoplist other than the one that
was used to build the index... but why would you want to?
> Are there any threads/docs about the expecations of Version
> homo/hetero-genousness in Lucene-Java?
The original thread from last May, I guess... which culminated in LUCENE-1684:
http://markmail.org/thread/egqe6rm4c4om7swv
It's very long, though.
> Add support for o.a.lucene.util.Version for BaseTokenizerFactory and
> BaseTokenFilterFactory
> -------------------------------------------------------------------------------------------
>
> Key: SOLR-1677
> URL: https://issues.apache.org/jira/browse/SOLR-1677
> Project: Solr
> Issue Type: Sub-task
> Components: Schema and Analysis
> Reporter: Uwe Schindler
> Attachments: SOLR-1677.patch, SOLR-1677.patch, SOLR-1677.patch,
> SOLR-1677.patch
>
>
> Since Lucene 2.9, a lot of analyzers use a Version constant to keep backwards
> compatibility with old indexes created using older versions of Lucene. The
> most important example is StandardTokenizer, which changed its behaviour with
> posIncr and incorrect host token types in 2.4 and also in 2.9.
> In Lucene 3.0 this matchVersion ctor parameter is mandatory and in 3.1, with
> much more Unicode support, almost every Tokenizer/TokenFilter needs this
> Version parameter. In 2.9, the deprecated old ctors without Version take
> LUCENE_24 as default to mimic the old behaviour, e.g. in StandardTokenizer.
> This patch adds basic support for the Lucene Version property to the base
> factories. Subclasses then can use the luceneMatchVersion decoded enum (in
> 3.0) / Parameter (in 2.9) for constructing Tokenstreams. The code currently
> contains a helper map to decode the version strings, but in 3.0 is can be
> replaced by Version.valueOf(String), as the Version is a subclass of Java5
> enums. The default value is Version.LUCENE_24 (as this is the default for the
> no-version ctors in Lucene).
> This patch also removes unneeded conversions to CharArraySet from
> StopFilterFactory (now done by Lucene since 2.9). The generics are also fixed
> to match Lucene 3.0.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.