Re: Getting our first release out

Jörn Kottmann Tue, 01 Feb 2011 14:06:34 -0800

On 2/1/11 10:45 PM, Grant Ingersoll wrote:

Yes, we should start assembling a list of corpora, if only so we at least have 
it for others that come later and want to reproduce them.  In the meantime, I 
would agree that we can just keep the models elsewhere.  We don't have to 
provide models.  They are a convenience for all involved, but not a requirement 
in order to run.  I wonder how many people actually train their own.  (BTW, we 
should update our website to point to older models, too.  They are really hard 
to find unless you do some URL rewriting.)

OK, then let's get the release out as quickly as possible without depending
on the legal issues for the models. And let's do as much as possible to
resolve these issues alongside the release work. I might have a few spare
cycles here and there to work on that.

To get started with the legal stuff we need to compile a list with all the
necessary information; that list will make a nice corpora page in our wiki.

Our documentation already contains instructions on how to train on some
freely available data.

In the end, I believe we are all best served with a wikinews corpus which
can be labeled by our community.


Jörn
