[jira] [Commented] (LUCENE-3366) StandardFilter only works with ClassicTokenizer and only when version < 3.1

David Smiley (JIRA) Mon, 08 Aug 2011 19:55:00 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081385#comment-13081385 ]


David Smiley commented on LUCENE-3366:
--------------------------------------

OK. (I've been in no hurry.)

I updated the http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters page 
to fix references to StandardFilter that should have been to ClassicFilter, and 
I removed some uses of StandardFilter altogether because, with a match version 
of 3.1 or later, it passes tokens through unchanged. I'm disinclined to mention 
this filter in the upcoming revision of my book, but I'll be sure to mention 
the Classic* variants.

Feel free to close this issue if you feel it is appropriate. I created it as an 
"improvement" because StandardFilter seems unfinished, and you've acknowledged 
it is. So perhaps it should stay open until it actually does something some 
day. 

> StandardFilter only works with ClassicTokenizer and only when version < 3.1
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-3366
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3366
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.3
>            Reporter: David Smiley
>
> The StandardFilter used to remove periods from acronyms and trailing 
> apostrophe-S ('s) suffixes where they occurred, and it used to work in 
> conjunction with the StandardTokenizer. Presently, it only does this with 
> ClassicTokenizer and when the Lucene match version is before 3.1. Here is an 
> excerpt from the code:
> {code:lang=java}
>   public final boolean incrementToken() throws IOException {
>     if (matchVersion.onOrAfter(Version.LUCENE_31))
>       return input.incrementToken(); // TODO: add some niceties for the new grammar
>     else
>       return incrementTokenClassic();
>   }
> {code}
> It seems to me that in the great refactor of the standard tokenizer, 
> LUCENE-2167, something was forgotten here. I think that if someone uses the 
> ClassicTokenizer then no matter what the version is, this filter should do 
> what it used to do. And the TODO suggests someone forgot to make this filter 
> do something useful for the StandardTokenizer.  Or perhaps that idea should 
> be discarded and this class should be named ClassicTokenFilter.
> In any event, the javadocs for this class appear out of date as there is no 
> mention of ClassicTokenizer, and the wiki is out of date too.
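
To illustrate the classic behavior described in the quoted issue (removing 
periods from acronyms and stripping a trailing 's), here is a minimal 
standalone Java sketch. It does not use Lucene at all: the class name 
ClassicNormalizeSketch, the TokenType enum, and the normalize method are 
hypothetical names made up for this example, and the real filter works on 
token attributes inside a TokenStream rather than on plain strings. The token 
type is assumed to be supplied by the tokenizer.

```java
public class ClassicNormalizeSketch {

    // Hypothetical stand-in for the token types a tokenizer would assign.
    enum TokenType { APOSTROPHE, ACRONYM, OTHER }

    // Applies the two classic normalizations, keyed on the token type
    // (an assumption of this sketch, not a Lucene API).
    static String normalize(String term, TokenType type) {
        switch (type) {
            case APOSTROPHE:
                // Strip a trailing 's or 'S, e.g. "Smiley's" becomes "Smiley".
                if (term.endsWith("'s") || term.endsWith("'S")) {
                    return term.substring(0, term.length() - 2);
                }
                return term;
            case ACRONYM:
                // Remove every period, e.g. "I.B.M." becomes "IBM".
                return term.replace(".", "");
            default:
                // Any other token passes through unchanged.
                return term;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("I.B.M.", TokenType.ACRONYM));      // IBM
        System.out.println(normalize("Smiley's", TokenType.APOSTROPHE)); // Smiley
        System.out.println(normalize("example.com", TokenType.OTHER));   // example.com
    }
}
```

Keying the behavior on the token type rather than on a version constant is 
one way to read the suggestion in the issue that the filter should keep doing 
its old job whenever it is fed ClassicTokenizer output.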
