[jira] [Commented] (LUCENE-6053) Serbian Analyzer

Robert Muir (JIRA) Fri, 07 Nov 2014 04:35:28 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-6053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201993#comment-14201993
 ]


Robert Muir commented on LUCENE-6053:
-------------------------------------

Looks good (caveat: I am not intimately familiar with the normalizations of 
diacritics here).

Should we add a note to SerbianNormalizationFilter that it expects lowercase 
input?
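
To illustrate why such a note might matter, here is a minimal, self-contained sketch. It is not the code from the attached patch: the class name SerbianNormalizeSketch, the normalize helper, and the four-entry folding table are all hypothetical, chosen only to show the effect. It assumes a normalizer whose table covers lowercase letters only, so uppercase input passes through untouched unless it is lowercased first.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, NOT the patch's SerbianNormalizationFilter.
final class SerbianNormalizeSketch {

    // Illustrative folding table (an assumption): lowercase letters only.
    private static final Map<Character, String> FOLD = new HashMap<>();
    static {
        FOLD.put('\u0452', "dj"); // CYRILLIC SMALL LETTER DJE
        FOLD.put('\u0430', "a");  // CYRILLIC SMALL LETTER A
        FOLD.put('\u043A', "k");  // CYRILLIC SMALL LETTER KA
        FOLD.put('\u0111', "dj"); // LATIN SMALL LETTER D WITH STROKE
    }

    // Replace each character found in the table; copy everything else as-is.
    static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String folded = FOLD.get(c);
            if (folded != null) {
                out.append(folded);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String cyrillicLower = "\u0452\u0430\u043A"; // lowercase Cyrillic word
        String latinLower = "\u0111ak";              // same word, Latin script
        String cyrillicUpper = "\u0402\u0410\u041A"; // same word, uppercase Cyrillic

        // Lowercase input in either script folds to the same form.
        System.out.println(normalize(cyrillicLower).equals(normalize(latinLower)));

        // Uppercase input is not in the table, so it comes back unchanged.
        System.out.println(normalize(cyrillicUpper).equals(cyrillicUpper));

        // Lowercasing first gives the expected folded form.
        String lowered = cyrillicUpper.toLowerCase(Locale.ROOT);
        System.out.println(normalize(lowered).equals(normalize(cyrillicLower)));
    }
}
```

In a real analysis chain the lowercasing step would presumably be a lowercase filter placed ahead of the normalizer rather than a call to String.toLowerCase.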

> Serbian Analyzer
> ----------------
>
>                 Key: LUCENE-6053
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6053
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Nikola Smolenski
>         Attachments: LUCENE-Serbian.patch
>
>
> This is an analyzer for the Serbian language, so far consisting only of a 
> normalizer. Serbian is written in both the Cyrillic and Latin alphabets, so 
> the normalizer works with both.
> In the future, I plan to add stopwords, a stemmer, and so on.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

