[ https://issues.apache.org/jira/browse/LUCENE-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15348927#comment-15348927 ]
Andriy Rysin commented on LUCENE-7287:
--------------------------------------

OK, I was able to run Solr with the Ukrainian analyzer, and I can confirm that it generates unique lemmas. I've created a pull request:
https://github.com/apache/lucene-solr/pull/45

I've also added mapping_uk.txt so that we can use the mapping filter in Solr. Once it's merged, we can add this line:

    <charFilter class="solr.MappingCharFilterFactory" mapping="org/apache/lucene/analysis/uk/mapping_uk.txt"/>

We could potentially change UkrainianMorfologikAnalyzer to use MappingCharFilterFactory to read from the same file (so that we don't have the mapping in both the code and the file), but I'm not sure how appropriate it is to use factories in Lucene.

Many thanks to Ahmet, who helped with the Solr integration and found the duplicate tokens!

> New lemma-tizer plugin for ukrainian language.
> ----------------------------------------------
>
>          Key: LUCENE-7287
>          URL: https://issues.apache.org/jira/browse/LUCENE-7287
>      Project: Lucene - Core
>   Issue Type: New Feature
>   Components: modules/analysis
>     Reporter: Dmytro Hambal
>     Priority: Minor
>       Labels: analysis, language, plugin
>      Fix For: master (7.0), 6.2
>
>  Attachments: LUCENE-7287.patch, Screen Shot 2016-06-23 at 8.23.01 PM.png, Screen Shot 2016-06-23 at 8.41.28 PM.png
>
>
> Hi all,
> I wonder whether you are interested in supporting a plugin which provides a
> mapping between Ukrainian word forms and their lemmas. Some tests and docs
> come out of the box =) .
> https://github.com/mrgambal/elasticsearch-ukrainian-lemmatizer
> It's really simple but still works and generates some value for its users.
> More: https://github.com/elastic/elasticsearch/issues/18303