Re: What should we do with the SF models?

2014-10-28 Thread Gustavo Knuppe
I believe that models are important for users, since not every user has
access to appropriate data files to train basic models.

My suggestion is to use an alternative service to host these models,
such as GitHub, a torrent, or another file-sharing service.

GitHub is a good option since they don't have any quota or bandwidth
limitations.

Gustavo K.

2014-10-28 15:19 GMT-02:00 Joern Kottmann kottm...@gmail.com:

 Hi all,

 OpenNLP always came with a couple of trained models which were ready to
 use for a few languages. The performance a user encounters with those
 models heavily depends on their input text.

 Especially the English name finder models which were trained on MUC 6/7
 data perform very poorly these days if run on current news articles and
 even worse on data which is not in the news domain.

 Anyway, we often get judged on how well OpenNLP works just based on the
 performance of those models (or maybe people who compare their NLP
 systems against OpenNLP just love to have OpenNLP perform badly).

 I think we are now at a point with those models where it is questionable
 if having them is still an advantage for OpenNLP. The SourceForge page
 is often blocked due to traffic limitations. We definitely have to act
 somehow.

 The old models definitely have some historic value and are used for
 testing the release.

 What should we do?

 We could take them offline and advise our users to train their own
 models on one of the various corpora we support. We could also do both:
 place a prominent link to our corpora documentation on the download
 page and, in a less visible place, a link to the historic SF models.

 Jörn




Re: OpenNLP Port

2014-10-08 Thread Gustavo Knuppe
Jörn,

thanks for the reply. I added the NOTICE; it was the only thing that was not
in compliance.

The project is under AL 2.0 too, and I really don't have any intention to
use the lib in any proprietary software, at least for now :)


*Tom*,
Yes, you are correct; however, I'm not even looking at the code of the other
projects, since they are very outdated and have not been maintained for years.

I'm actually building from scratch and my goal is to be 100% compatible
with OpenNLP.

I believe I will have everything working in a few weeks.

Thanks

2014-10-08 6:58 GMT-03:00 Tom Morton tsmor...@gmail.com:

 Didn't someone do a C# port already?  Not sure if the project is active but
 there might be some learnings there.

 https://sharpnlp.codeplex.com/

 On Wed, Oct 8, 2014 at 3:44 AM, Jörn Kottmann kottm...@gmail.com wrote:

  Hello,
 
  if you copy our source code you need to respect the AL 2.0.
 
  The license itself can be found here:
  https://www.apache.org/licenses/LICENSE-2.0.html
 
  And here is an overview of what you have to do to distribute
  it as part of another project:
  http://www.inteist.com/2010/05/how-to-use-apache-2-0-in-commercial-products-explained-in-simple-terms/
 
  The attribution should be copied from our NOTICE file.
 
  Jörn
 
 
 
 
 
  On 10/07/2014 06:02 AM, Gustavo Knuppe wrote:
 
  Hi,
 
  first I would like to thank the team that developed this library. I have
  to say it has lots of cool stuff :)
 
  I'm working on an independent port of OpenNLP to C#. I wonder if I need
  to add anything else to the readme or license, or if the information is
  enough.
 
  Project:
  https://github.com/knuppe/SharpNL
 
  Any comments and suggestions are welcome.
 
 
  Thanks
 
  Gustavo K.