[ 
https://issues.apache.org/jira/browse/SOLR-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-2764:
------------------------------

    Attachment: SOLR-2764.patch

Thanks Christian. I further refined stuff:

- For LightStemmer, we now do two-pass removal for the -dom and -het endings. 
This means that the word kristendom will first be stemmed to kristen, and then 
all the general rules apply, so it will be further stemmed to krist. The effect 
of this is that "kristen, kristendom, kristendommen, kristendommens" will all 
be stemmed to "krist" (due to the, in this case incorrect, interpretation of 
-en as a plural ending), whereas if we stopped at -dom removal, kristendom 
would not match inflections of kristen.

What do you think, is this a reasonable improvement or could there be side 
effects? I've not added these rules to the MinimalStemmer, to keep it simpler.
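
To make the two-pass idea concrete, here is a minimal sketch. It is not the 
code in the attached patch: the class name, the suffix lists and the 
minimum-stem length are assumptions made up for illustration. It only shows 
why continuing with the general rules after -dom/-het removal makes 
kristendom and kristen meet at the same stem.

```java
// Illustrative sketch only; NOT the stemmer from SOLR-2764.patch.
// Suffix lists and length limits below are hypothetical.
class TwoPassStemSketch {
    // Hypothetical derivational endings, longest first.
    private static final String[] DERIVATIONAL = {"dommen", "heten", "dom", "het"};
    // Hypothetical general inflectional endings, longest first.
    private static final String[] GENERAL = {"ene", "en", "er", "et", "e"};
    // Assumed minimum number of characters that must remain after stripping.
    private static final int MIN_STEM = 3;

    // Removes the first listed suffix that matches, if enough of the word remains.
    static String stripFirstMatch(String word, String[] suffixes) {
        for (String suffix : suffixes) {
            int remaining = word.length() - suffix.length();
            if (word.endsWith(suffix) && remaining >= MIN_STEM) {
                return word.substring(0, remaining);
            }
        }
        return word;
    }

    // Removes a trailing possessive -s from words longer than four characters.
    static String stripPossessive(String word) {
        if (word.length() > 4 && word.endsWith("s")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    // Two-pass: strip -dom/-het first, then always apply the general rules.
    static String stem(String word) {
        String w = stripPossessive(word);
        w = stripFirstMatch(w, DERIVATIONAL);
        return stripFirstMatch(w, GENERAL);
    }

    // Single-pass variant for comparison: stops once -dom/-het was removed.
    static String stemSinglePass(String word) {
        String w = stripPossessive(word);
        String stripped = stripFirstMatch(w, DERIVATIONAL);
        if (!stripped.equals(w)) {
            return stripped;
        }
        return stripFirstMatch(w, GENERAL);
    }

    public static void main(String[] args) {
        String[] words = {"kristen", "kristendom", "kristendommen", "kristendommens"};
        for (String word : words) {
            System.out.println(word + " -> two-pass: " + stem(word)
                    + ", single-pass: " + stemSinglePass(word));
        }
    }
}
```

With these assumed rules, the two-pass variant maps all four words to the 
same stem, while the single-pass variant leaves kristendom at kristen and 
kristen at krist, so the two would not match each other.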
                
> Create a NorwegianLightStemmer and NorwegianMinimalStemmer
> ----------------------------------------------------------
>
>                 Key: SOLR-2764
>                 URL: https://issues.apache.org/jira/browse/SOLR-2764
>             Project: Solr
>          Issue Type: New Feature
>          Components: Schema and Analysis
>            Reporter: Jan Høydahl
>             Fix For: 3.6, 4.0
>
>         Attachments: SOLR-2764.patch, SOLR-2764.patch, SOLR-2764.patch, 
> SOLR-2764.patch
>
>
> We need a simple light-weight stemmer and a minimal stemmer for 
> plural/singular only in Norwegian

