[ https://issues.apache.org/jira/browse/LUCENE-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luca Cavanna updated LUCENE-4019:
---------------------------------
    Attachment: LUCENE-4019.patch

Small patch: affix rules with fewer than 5 elements are now ignored. I added a specific test with a new affix file containing an example of a rule that is shorter than it should be.
Let me know if you would prefer a warning to be logged when a rule is skipped. Hunspell does that only with a specific command line option.

> Parsing Hunspell affix rules without regexp condition
> -----------------------------------------------------
>
>                 Key: LUCENE-4019
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4019
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.6
>            Reporter: Luca Cavanna
>         Attachments: LUCENE-4019.patch
>
>
> We found out that some recent Dutch hunspell dictionaries contain suffix or
> prefix rules like the following:
> {code}
> SFX Na N 1
> SFX Na 0 ste
> {code}
> The rule on the second line doesn't contain the 5th parameter, which should
> be the condition (usually a regexp). The condition is usually '.', meaning
> always (for every character). As explained in LUCENE-3976, the readAffix
> method throws an error. I wonder if we should treat the missing value as a
> kind of default value, like '.'. On the other hand, I haven't found any
> information about this within the spec. Any thoughts?
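
For illustration only, the skipping behaviour described in the comment above could look roughly like the sketch below. This is not the code from the attached patch: the class `AffixRuleSketch`, the `Rule` holder and the method `parseRules` are hypothetical names, and the sketch assumes the header line (e.g. `SFX Na N 1`) has already been consumed, so only rule lines are passed in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not a Lucene class.
final class AffixRuleSketch {

    /** Hypothetical holder for one parsed affix rule. */
    static final class Rule {
        final String flag;
        final String strip;
        final String affix;
        final String condition;

        Rule(String flag, String strip, String affix, String condition) {
            this.flag = flag;
            this.strip = strip;
            this.affix = affix;
            this.condition = condition;
        }
    }

    /**
     * Parses affix rule lines (header line assumed to be handled elsewhere).
     * Lines with fewer than 5 whitespace-separated elements are skipped,
     * because the 5th element (the condition) is missing.
     */
    static List<Rule> parseRules(List<String> ruleLines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : ruleLines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length < 5) {
                // Rule is too short: ignore it instead of throwing.
                continue;
            }
            rules.add(new Rule(parts[1], parts[2], parts[3], parts[4]));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "SFX Na 0 ste",   // rule from the issue: no condition, skipped
            "SFX Na 0 e ."    // made-up complete rule, kept
        );
        System.out.println(parseRules(lines).size()); // prints 1
    }
}
```

The alternative raised in the issue description, treating a missing condition as '.', would replace the `continue` with a default value for the fifth element; which of the two is preferable is exactly the open question here.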