[ https://issues.apache.org/jira/browse/LUCENE-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062375#comment-13062375 ]
Michael McCandless commented on LUCENE-3233:
--------------------------------------------

I think this is ready to commit, but I'd like to rename existing syn filter to SlowSynonymFilter and rename the new one to SynonymFilter.

Because there are some minor diffs (deduping rules, lowercasing), for Solr to cutover I think we need some back compat logic; I'll open a separate issue for this.

> HuperDuperSynonymsFilter™
> -------------------------
>
>                 Key: LUCENE-3233
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3233
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>         Attachments: LUCENE-3223.patch, LUCENE-3233.patch, LUCENE-3233.patch,
> LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch,
> LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch,
> LUCENE-3233.patch, LUCENE-3233.patch, LUCENE-3233.patch, synonyms.zip
>
>
> The current synonymsfilter uses a lot of ram and cpu, especially at build
> time.
> I think yesterday I heard about "huge synonyms files" three times.
> So, I think we should use an FST-based structure, sharing the inputs and
> outputs.
> And we should be more efficient with the tokenStream api, e.g. using
> save/restoreState instead of cloneAttributes()
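
For readers unfamiliar with the "sharing the inputs" idea in the issue description, here is a rough sketch in plain Java. It is not the code from the patch and it is not an FST: it is a simple token trie that shares only common input prefixes, whereas an FST would also share suffixes and outputs. The class name SynonymTrie, its methods, and the choice to drop duplicate outputs are all hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a plain token trie, not Lucene's FST
// and not the structure used in the LUCENE-3233 patch.
final class SynonymTrie {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        final List<String> outputs = new ArrayList<>();
    }

    private final Node root = new Node();
    private int nodeCount = 1; // counts the root

    // Adds a rule mapping a whitespace-separated input phrase to one output.
    void add(String input, String output) {
        Node node = root;
        for (String token : input.split(" ")) {
            Node next = node.children.get(token);
            if (next == null) {
                // Only tokens not already on this path allocate a new node,
                // so rules with a common prefix share storage.
                next = new Node();
                node.children.put(token, next);
                nodeCount++;
            }
            node = next;
        }
        // Assumption for this sketch: duplicate outputs are dropped.
        if (!node.outputs.contains(output)) {
            node.outputs.add(output);
        }
    }

    // Returns the outputs for an exact input phrase, or an empty list.
    List<String> lookup(String input) {
        Node node = root;
        for (String token : input.split(" ")) {
            node = node.children.get(token);
            if (node == null) {
                return List.of();
            }
        }
        return List.copyOf(node.outputs);
    }

    int nodeCount() {
        return nodeCount;
    }
}
```

With the rules "united states", "united states of america" and "united kingdom", the token "united" is stored once and "states" is stored once, which is the kind of saving that matters when the synonyms file is huge.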