[ 
https://issues.apache.org/jira/browse/LUCENENET-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16067035#comment-16067035
 ] 

Shad Storhaug commented on LUCENENET-551:
-----------------------------------------

I am curious, do you still think this would be useful? If so, would you be 
interested in taking on this project? It doesn't look like it would be too 
difficult. Or, have you already done it?

Now that we are on Lucene 4.8.0, the contrib project is gone and the Snowball 
analyzers are part of the Lucene.Net.Analysis.Common project. They originated 
from http://snowball.tartarus.org/, which has since moved to 
https://github.com/snowballstem/snowball. There was no Latin in the original, 
but I don't think it would be very difficult to port from Ruby.

That said, this is something that would put us out of sync with Lucene, since 
they don't have a Latin Snowball analyzer. So it feels like it doesn't belong 
here (instead it should be in its own repo). On the other side of that 
argument, it would be a lot easier to keep in version sync with Lucene.Net if 
it were in our repo. And if it were contributed directly to Lucene, it would 
take many months/years to trickle down to Lucene.Net. Itamar, what are your 
thoughts on this?

> Latin language Stemmer (feature request)
> ----------------------------------------
>
>                 Key: LUCENENET-551
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-551
>             Project: Lucene.Net
>          Issue Type: Improvement
>          Components: Lucene.Net Contrib, Lucene.Net.Analysis.Common
>    Affects Versions: Lucene.Net 3.0.3, Lucene.Net 4.8.0
>            Reporter: Peter Halasz
>
> I would find a Latin language stemmer very helpful. The Schinke Latin 
> stemming algorithm has been converted to Snowball here: 
> http://snowball.tartarus.org/otherapps/schinke/intro.html . I have not worked 
> out how to compile Snowball into .cs to try it.
> There are currently 5 Romance languages supported (French, Spanish, 
> Portuguese, Italian, Romanian), so if the above doesn't work, I imagine one 
> of these could be modified to support Latin.
> I realise SF.Snowball is considered a contrib package rather than core, but 
> Lucene.Net seems to be the main place where Snowball stemmers are provided 
> and maintained for C# / .Net.
> Note, other language ports of Snowball support Latin (using the Schinke 
> contribution), such as Ruby: https://github.com/aurelian/ruby-stemmer



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
