Robert Muir created LUCENE-5227:
-----------------------------------

             Summary: consider unifying stopwords formats for 5.0
                 Key: LUCENE-5227
                 URL: https://issues.apache.org/jira/browse/LUCENE-5227
             Project: Lucene - Core
          Issue Type: Task
    Affects Versions: 5.0
            Reporter: Robert Muir

Hossman has background on LUCENE-5211.

The story is that we added these files to Lucene (it used to be an 'svn export' from the snowball tree!) and I had several reasons for supporting the snowball format:

1. svn export made it easier to maintain diffs.
2. In Lucene, from the Analyzer APIs, as a default "Set", the format didn't much matter.
3. The snowball format is nice in the way it presents, e.g., inflection tables for some languages that inflect pronouns.

But the reality is:

1. People try to use these files from e.g. Solr and hit traps.
2. These files are hardly changing at all in the snowball repository.
3. We don't do svn export anymore.
4. The "tables" could just be preserved inside # comments and still explain why the words are in the file.

We could convert our files for 5.0 and update our stuff appropriately, and of course still support parsing the old format. It wouldn't break anything, just reduce traps, I think.
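
For illustration, here is a minimal sketch of what such a conversion could look like. It is not code from Lucene. It assumes the snowball format allows several whitespace-separated words per line with `|` starting a comment, and that the target format is one word per line with `#` comments. The class name `StopwordConverter`, the method `convert`, and the sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: rewrites snowball-style
// stopword lines as one word per line, keeping comments as '#' lines.
class StopwordConverter {

    static List<String> convert(List<String> snowballLines) {
        List<String> out = new ArrayList<>();
        for (String line : snowballLines) {
            // Assumption: '|' starts a comment that runs to end of line.
            int bar = line.indexOf('|');
            String words = (bar >= 0 ? line.substring(0, bar) : line).trim();
            String comment = bar >= 0 ? line.substring(bar + 1).trim() : null;

            if (comment != null) {
                // Preserve the explanation (e.g. an inflection table row)
                // as a '#' comment placed before the words it annotates.
                out.add(comment.isEmpty() ? "#" : "# " + comment);
            }
            if (!words.isEmpty()) {
                // Assumption: words on a line are separated by whitespace.
                out.addAll(Arrays.asList(words.split("\\s+")));
            } else if (comment == null) {
                // Keep blank lines so the file layout stays readable.
                out.add("");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative input only, not copied from an actual stopword file.
        List<String> input = Arrays.asList(
            " | personal pronouns",
            "ich mich mir | I, me",
            "",
            "und");
        convert(input).forEach(System.out::println);
    }
}
```

With this approach the explanatory text survives as comments, and a reader that only understands one word per line with `#` comments no longer sees `|` annotations or multi-word lines.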