[ 
https://issues.apache.org/jira/browse/LUCENE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481816#comment-15481816
 ] 

Michael McCandless commented on LUCENE-7318:
--------------------------------------------

bq. Or deprecate what was done in this issue instead

I don't think we should do that: {{StandardAnalyzer}} now, finally, makes a 
great default.  It's based on a Unicode standard (UAX #29, Unicode Text 
Segmentation).

bq. Example: LowerCaseFilter and UpperCaseFilter are now in different packages 
and different jars?!

But I think this reflects typical usage?  {{UpperCaseFilter}} is rarely used.

bq. Steering people toward using StopFilter by default isn't necessarily a good 
idea either.

Yes, whether or not to filter stop words involves difficult tradeoffs, but 
for better or worse, many apps do in fact need to filter them.  I think it's 
important that we make it easy for apps to use the default analyzer 
({{StandardAnalyzer}}) with stop words, so we should not remove 
{{StopFilter}} from core.
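
To make the stop-word tradeoff concrete, here is a minimal plain-Java sketch. 
It is not Lucene's API: the class name, the {{analyze}} helper and the tiny 
stop set are all hypothetical, chosen only for illustration.  It shows 
filtering helping one input (noise words dropped) and hurting another (a 
phrase made up entirely of stop words vanishes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {
    // Hypothetical stop set for illustration only; not Lucene's default list.
    static final Set<String> STOP_WORDS =
        Set.of("a", "an", "the", "to", "be", "or", "not", "of");

    // Lowercase the text, split on runs of non-letter/non-digit characters,
    // and optionally drop any token found in STOP_WORDS.
    static List<String> analyze(String text, boolean filterStopWords) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            if (filterStopWords && STOP_WORDS.contains(raw)) {
                continue;
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Filtering keeps only the informative terms here.
        System.out.println(analyze("The history of Lucene", true));
        // Every token is a stop word, so nothing is left to index.
        System.out.println(analyze("To be or not to be", true));
        // Without filtering, the phrase stays searchable.
        System.out.println(analyze("To be or not to be", false));
    }
}
```

Which behavior is right depends on the app, which is why it should be easy 
to turn stop words on or off.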

> Graduate StandardAnalyzer out of analyzers module into core
> -----------------------------------------------------------
>
>                 Key: LUCENE-7318
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7318
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Blocker
>             Fix For: master (7.0), 6.2, 6.2.1
>
>         Attachments: LUCENE-7318.patch
>
>
> Spinoff from LUCENE-7314:
> {{StandardAnalyzer}} has progressed substantially since we broke out the 
> analyzers module ... it now follows a real Unicode standard (UAX #29 Unicode 
> Text Segmentation).  It's also much faster than it used to be, since it 
> switched to JFlex a while back.  Many bug fixes, etc.
> I think it would make a good default for most Lucene users, and we should 
> graduate it from the analyzers module into core, and make it the default for 
> {{IndexWriter}}.
> It's really quite crazy that users must go digging in the analyzers module to 
> get started with Lucene ... we don't make them dig through the codecs module 
> to find a good default codec ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

