Robert Muir created LUCENE-5227:
-----------------------------------

             Summary: consider unifying stopwords formats for 5.0
                 Key: LUCENE-5227
                 URL: https://issues.apache.org/jira/browse/LUCENE-5227
             Project: Lucene - Core
          Issue Type: Task
    Affects Versions: 5.0
            Reporter: Robert Muir


Hossman has background on LUCENE-5211.

The story is that when we added these to Lucene (it used to be an 'svn export' 
from the snowball tree!), I had several reasons for supporting the snowball 
format:
1. With svn export, it was easier to maintain diffs.
2. In Lucene, where the Analyzer APIs expose these as a default "Set", the 
format didn't much matter.
3. The snowball format is nice in the way it presents e.g. inflection tables 
for some languages that inflect pronouns.

But the reality is:
1. People try to use these from e.g. Solr and hit traps.
2. These files hardly change at all in the snowball repository.
3. We don't do svn export anymore.
4. The "tables" could just be preserved inside # comments and still explain why 
the words are in the file.
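
To make the kind of trap in point 1 concrete, here is a minimal sketch (not 
Lucene's actual loader code) of what plausibly goes wrong when a snowball-format 
file is read by a parser that expects one word per line with # comments. The 
class and method names (StopwordFormats, parsePlain, parseSnowball) are 
hypothetical and exist only for illustration. The snowball rules assumed here 
are: | starts a comment, and a line may hold several whitespace-separated words.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StopwordFormats {

    /** Plain style (assumed): one entry per line, lines starting with '#' are comments. */
    static Set<String> parsePlain(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("#")) {
                continue;
            }
            words.add(t); // the whole trimmed line becomes a single entry
        }
        return words;
    }

    /** Snowball style (assumed): '|' starts a comment, several words allowed per line. */
    static Set<String> parseSnowball(List<String> lines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String t = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (t.isEmpty()) {
                continue;
            }
            words.addAll(Arrays.asList(t.split("\\s+")));
        }
        return words;
    }
}
```

Fed a line such as "ich  | I", the plain parser stores the entire line, comment 
included, as one "word", so "ich" itself never ends up in the set and no error 
is reported. The snowball-aware parser yields just "ich".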

We could convert our files for 5.0 and update our own code accordingly, while 
of course still supporting parsing of the old format. It wouldn't break 
anything; I think it would just reduce traps.
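
A rough sketch of what such a conversion could look like, under the assumptions 
that snowball files use | for comments and allow several words per line, and 
that the target format is one word per line with # comments. SnowballConverter 
and convert are hypothetical names, not existing Lucene APIs. A trailing comment 
is moved onto its own # line above the words it annotated, so the explanatory 
"tables" survive.

```java
import java.util.ArrayList;
import java.util.List;

public class SnowballConverter {

    /** Converts snowball-style lines to one-word-per-line with '#' comments. */
    static List<String> convert(List<String> snowballLines) {
        List<String> out = new ArrayList<>();
        for (String line : snowballLines) {
            int bar = line.indexOf('|');
            String words = (bar >= 0 ? line.substring(0, bar) : line).trim();
            // Keep the comment's inner spacing so table layouts stay readable;
            // only trailing whitespace is dropped.
            String comment = bar >= 0
                    ? line.substring(bar + 1).replaceAll("\\s+$", "")
                    : null;

            if (words.isEmpty()) {
                // Blank line or comment-only line: carry it over as-is.
                out.add(comment == null ? "" : "#" + comment);
                continue;
            }
            if (comment != null && !comment.isEmpty()) {
                out.add("#" + comment);
            }
            for (String w : words.split("\\s+")) {
                out.add(w); // one word per output line
            }
        }
        return out;
    }
}
```

For example, the input line "ich mich | I me" becomes three output lines: 
"# I me", "ich", and "mich".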


