Hi all, 

Thanks a lot for all the replies. I need to look into what Lucene provides and 
see how far I'll get. 
@Jörn, I will make sure to log the JIRA tickets and think about making a 
contribution. I am not sure whether my programming skills are sufficient, and I'd 
need to look into the source code first, but I'll definitely check it out when / if 
time allows.

Cheers, 

Martin
 

On 23.02.2014 at 15:24, Jörn Kottmann <[email protected]> wrote:

> Hello,
> 
> the current trunk version includes the Porter and Snowball stemmers. We 
> didn't develop them ourselves
> but redistribute them as part of OpenNLP.
> It would be nice to add more stemmers. In case you need a certain one, it 
> would be nice if you could
> point it out, and we might be able to redistribute it as well. Or maybe just 
> implement it.
> 
> We don't have stoplists, but I think it will be easy to change that. We could 
> probably use the ones from snowball.
> 
> There is no language modeling; it would be nice to get a contribution there. 
> Maybe you are interested in implementing it?
> 
> Anyway, it would be nice if you could open two JIRA issues to request stopword 
> lists and the language model.
> 
> Jörn
> 
> On 02/23/2014 02:35 PM, Martin Wunderlich wrote:
>> Hi all,
>> 
>> I recently started working with OpenNLP for a project in the area of text 
>> classification with neural networks. So far, OpenNLP is a great library and 
>> very useful.
>> There are just three things that I haven't been able to find, but maybe they 
>> do exist:
>> - language models: e.g. to create a bigram language model with relative and 
>> absolute frequencies from several texts
>> - stemming: to reduce different word forms in inflected languages to a 
>> canonical root form
>> - stoplist: to remove certain words (e.g. from the language model) that are 
>> deemed irrelevant
>> 
>> Do these functions exist in OpenNLP? If not, can you recommend another 
>> library to complement these functions?
>> 
>> Kind regards,
>> 
>> Martin
>> 
>> 
> 
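
For reference, the bigram counts and stop-list filtering asked about in the original message can be sketched in plain Java without any library support. This is only a minimal sketch under stated assumptions, not an OpenNLP API: the class BigramSketch and its methods filter, absolute, and relative are hypothetical names, the tokens are assumed to be split already, and "relative frequency" is taken to mean a bigram's count divided by the total number of bigrams.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of OpenNLP.
final class BigramSketch {

    // Lower-cases each token and drops those found in the stop list.
    // The stop list is assumed to contain lower-case entries.
    static List<String> filter(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!stopwords.contains(lower)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    // Absolute frequency: how often each pair of adjacent tokens occurs.
    static Map<String, Integer> absolute(List<String> tokens) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            String bigram = tokens.get(i) + " " + tokens.get(i + 1);
            counts.merge(bigram, 1, Integer::sum);
        }
        return counts;
    }

    // Relative frequency: each count divided by the total number of bigrams.
    static Map<String, Double> relative(Map<String, Integer> counts) {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            double share = total == 0 ? 0.0 : (double) entry.getValue() / total;
            result.put(entry.getKey(), share);
        }
        return result;
    }
}
```

To combine several texts, one could call absolute on each text's filtered tokens and sum the resulting maps before calling relative. Stemming is left out here, since the stemmers mentioned above would be applied to the tokens before counting.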
