[ https://issues.apache.org/jira/browse/LUCENE-7318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-7318:
---------------------------------------
    Attachment: LUCENE-7318.patch

Rote patch, moving {{StandardAnalyzer/Tokenizer}}, and the utility classes it uses, to core's oal.analysis module. "ant test" passes, but precommit is still angry about some javadocs ... I'll iterate.

The one non-rote change I made was to move {{ENGLISH_STOP_WORDS_SET}} from {{StopAnalyzer}} (still in the analyzers module) to {{StandardAnalyzer}}. I also added a "jflex" target to core's build.xml, to regenerate the tokenizer.

I left {{ClassicAnalyzer}}, {{UAX29URLEmailTokenizer}}, and the factories in the analysis/common module.

> Graduate StandardAnalyzer out of analyzers module into core
> -----------------------------------------------------------
>
>                 Key: LUCENE-7318
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7318
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: master (7.0), 6.2
>
>         Attachments: LUCENE-7318.patch
>
>
> Spinoff from LUCENE-7314:
> {{StandardAnalyzer}} has progressed substantially since we broke out the
> analyzers module ... it now follows a real Unicode standard (UAX #29 Unicode
> Text Segmentation). It's also much faster than it used to be, since it
> switched to JFlex a while back. Many bug fixes, etc.
> I think it would make a good default for most Lucene users, and we should
> graduate it from the analyzers module into core, and make it the default for
> {{IndexWriter}}.
> It's really quite crazy that users must go digging in the analyzers module to
> get started with Lucene ... we don't make them dig through the codecs module
> to find a good default codec ...