[ https://issues.apache.org/jira/browse/LUCENE-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260612#comment-13260612 ]
Luca Cavanna commented on LUCENE-4019:
--------------------------------------

Robert, with "spec" I meant exactly your links :)

Actually it's clear that the affix header has 4 elements while each rule has at least 5 elements. I don't really know what hunspell does with that kind of malformed rule. Lucene just throws an error while loading the dictionary. Looking at the hunspell source code, I might be wrong but I suspect it just skips that specific rule with some warning.

But honestly it's hard to believe that at least 4 dictionaries I tried contain mistaken rules, isn't it? I'll investigate more, thanks!

> Parsing Hunspell affix rules without regexp condition
> -----------------------------------------------------
>
>                 Key: LUCENE-4019
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4019
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 3.6
>            Reporter: Luca Cavanna
>
> We found out that some recent Dutch hunspell dictionaries contain suffix or
> prefix rules like the following:
> {code}
> SFX Na N 1
> SFX Na 0 ste
> {code}
> The rule on the second line doesn't contain the 5th parameter, which should
> be the condition (usually a regexp). You can usually see a '.' as condition,
> meaning always (for every character). As explained in LUCENE-3976 the
> readAffix method throws an error. I wonder if we should treat the missing
> value as a kind of default value, like '.'. On the other hand I haven't
> found any information about this within the spec. Any thoughts?

--
This message is automatically generated by JIRA.
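To make the proposal in the issue description concrete, here is a minimal sketch of treating a missing condition as '.'. It is not Lucene's readAffix code: the class AffixRuleSketch, the nested AffixRule holder, and the methods parseRule and appliesTo are all hypothetical names invented for this illustration. It also assumes the caller has already consumed the affix header line (such as "SFX Na N 1"), and that hunspell itself accepts such rules, which is still unverified.

```java
import java.util.regex.Pattern;

// Hypothetical sketch, not part of Lucene: parses one affix rule line and
// falls back to the condition "." when the 5th field is missing.
class AffixRuleSketch {

    /** Hypothetical holder for one parsed rule line. */
    static final class AffixRule {
        final String type;      // "SFX" or "PFX"
        final String flag;      // rule flag, e.g. "Na"
        final String strip;     // characters to strip; "" when the file says "0"
        final String affix;     // characters to add; "" when the file says "0"
        final String condition; // condition; "." means "always"
        private final Pattern pattern;

        AffixRule(String type, String flag, String strip, String affix, String condition) {
            this.type = type;
            this.flag = flag;
            this.strip = strip;
            this.affix = affix;
            this.condition = condition;
            // A suffix condition is checked at the end of the word,
            // a prefix condition at the start.
            String regex = "SFX".equals(type) ? ".*" + condition : condition + ".*";
            this.pattern = Pattern.compile(regex);
        }

        /** Returns true when the word satisfies this rule's condition. */
        boolean appliesTo(String word) {
            return pattern.matcher(word).matches();
        }
    }

    /** Parses a rule line; the header line must be handled by the caller. */
    static AffixRule parseRule(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length < 4) {
            throw new IllegalArgumentException("Too few fields in affix rule: " + line);
        }
        String strip = "0".equals(parts[2]) ? "" : parts[2];
        String affix = "0".equals(parts[3]) ? "" : parts[3];
        // Assumption under discussion in this issue: a missing condition
        // is treated as ".", i.e. the rule always applies.
        String condition = parts.length > 4 ? parts[4] : ".";
        return new AffixRule(parts[0], parts[1], strip, affix, condition);
    }

    public static void main(String[] args) {
        AffixRule rule = parseRule("SFX Na 0 ste");
        System.out.println("condition = " + rule.condition);
    }
}
```

Defaulting to '.' keeps the rest of the dictionary loadable. Skipping the rule with a warning, as hunspell is suspected to do, would be the more conservative alternative until the actual hunspell behaviour is confirmed.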