[ 
https://issues.apache.org/jira/browse/LUCENE-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15969215#comment-15969215
 ] 

ASF GitHub Bot commented on LUCENE-7785:
----------------------------------------

Github user arysin commented on a diff in the pull request:

    https://github.com/apache/lucene-solr/pull/187#discussion_r111595706
  
    --- Diff: lucene/analysis/morfologik/src/java/org/apache/lucene/analysis/uk/UkrainianMorfologikAnalyzer.java ---
    @@ -107,11 +107,18 @@ public UkrainianMorfologikAnalyzer(CharArraySet stopwords, CharArraySet stemExcl
       @Override
       protected Reader initReader(String fieldName, Reader reader) {
         NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
    +    // different apostrophes
         builder.add("\u2019", "'");
    +    builder.add("\u0218", "'");
         builder.add("\u02BC", "'");
    +    builder.add("`", "'");
    +    builder.add("´", "'");
    +    // ignored characters
         builder.add("\u0301", "");
    -    NormalizeCharMap normMap = builder.build();
    +    builder.add("\u00AD", "");
    +    builder.add("\uFEFF", "");
    --- End diff --
    
    That came from the note [the Wikimedia guys suggested](https://www.mediawiki.org/wiki/User:TJones_(WMF)/Notes/Ukrainian_Morfologik_Analysis#Recommendations_.26_Plan),
    but I agree it does not make sense here, so I'll remove it.
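
    To make the intent of the mappings in the diff concrete, here is a minimal
    sketch of the same normalization in plain Java, without Lucene. The class
    name `CharNormalizationSketch` and its `normalize` method are hypothetical
    and only approximate what a char filter built from the map would do to
    input text. Two assumptions: the sketch uses U+2018 (left single quotation
    mark) where the diff shows `\u0218`, on the guess that the digits were
    transposed, and it leaves out the `\uFEFF` mapping that is to be removed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the char map built in initReader; not Lucene code.
public class CharNormalizationSketch {
    private static final Map<Character, String> MAPPINGS = new LinkedHashMap<>();

    static {
        // different apostrophes, all folded to the ASCII apostrophe
        MAPPINGS.put('\u2019', "'"); // right single quotation mark
        MAPPINGS.put('\u2018', "'"); // left single quotation mark (assumed, see lead-in)
        MAPPINGS.put('\u02BC', "'"); // modifier letter apostrophe
        MAPPINGS.put('`', "'");      // grave accent
        MAPPINGS.put('\u00B4', "'"); // acute accent
        // ignored characters, mapped to the empty string
        MAPPINGS.put('\u0301', "");  // combining acute accent (stress mark)
        MAPPINGS.put('\u00AD', "");  // soft hyphen
    }

    // Replaces every mapped character and copies all other characters unchanged.
    public static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = MAPPINGS.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Ukrainian word for "mint", typed with U+2019 instead of an ASCII apostrophe
        System.out.println(normalize("\u043C\u2019\u044F\u0442\u0430"));
        // Ukrainian word for "castle", carrying a stress mark (U+0301) on the first vowel
        System.out.println(normalize("\u0437\u0430\u0301\u043C\u043E\u043A"));
    }
}
```

    With mappings like these, every apostrophe variant reaches the tokenizer as
    the same character, and stress marks or soft hyphens no longer split or
    alter a word before the dictionary lookup.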


> Move dictionary for Ukrainian analyzer to external dependency
> -------------------------------------------------------------
>
>                 Key: LUCENE-7785
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7785
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Andriy Rysin
>            Assignee: Dawid Weiss
>
> Currently the dictionary for the Ukrainian analyzer is a blob in the source tree. 
> We should move it out to an external dependency. This would allow us to:
> * have fewer binaries in the source
> * update the dictionary and track updates more easily



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
