Re: Training NEs?

Jörn Kottmann Tue, 15 Oct 2013 02:03:47 -0700

You can also use a tool like the Apache UIMA Cas Editor, Brat, WebAnno, etc. Usually the annotation speed is much higher if you don't need to edit a text file yourself.

The Tagging Server in the sandbox can be used to pre-label data for Brat or the Apache UIMA Cas Editor.

Another tool you should try is word2vec. It can create word clusters which can be used as part of the feature generation. In my tests that increased the recall by a few percent, but it is still work in progress; it will take a few days until that works with the TokenNameFinderTrainer command line tool.
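
To illustrate the idea of word clusters as features, here is a minimal Python sketch. It is not the OpenNLP implementation. It assumes a cluster file with one "word cluster_id" pair per line (roughly what word2vec writes when asked for classes instead of vectors); the function names load_clusters and cluster_features and the feature string format are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, List


def load_clusters(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'word cluster_id' lines into a word -> cluster id mapping.

    Assumption: one whitespace-separated pair per line; other lines are skipped.
    """
    clusters: Dict[str, str] = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        word, cluster_id = parts
        clusters[word] = cluster_id
    return clusters


def cluster_features(tokens: List[str], index: int,
                     clusters: Dict[str, str], window: int = 1) -> List[str]:
    """Return cluster-based features for the token at `index`.

    For each position within `window` tokens of `index`, emit a feature
    naming the cluster of that token, if the token has a known cluster.
    """
    features: List[str] = []
    for offset in range(-window, window + 1):
        pos = index + offset
        if 0 <= pos < len(tokens):
            cluster_id = clusters.get(tokens[pos])
            if cluster_id is not None:
                features.append(f"cluster[{offset}]={cluster_id}")
    return features


if __name__ == "__main__":
    # Made-up cluster ids, for illustration only.
    demo = load_clusters(["Hamburg 12", "Berlin 12", "in 3"])
    print(cluster_features(["wohnt", "in", "Hamburg"], 2, demo))
```

The point of such features is that a city name missing from the training data can still share a cluster id with city names that were seen, which is one plausible reason for a gain in recall.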


HTH,
Jörn

On 10/14/2013 09:27 PM, Thomas Zastrow wrote:

Hello,

I have a question: when creating training material, does it make a
difference if there are " " (blanks) around the NE? In other words, is
it the same to have:

<START:loc>Hamburg<END>

or:

<START:loc> Hamburg <END>

The example in the documentation shows up with the " " ... ?

Best,

Tom

P.S.: ca. 1300 sentences for a free German NE model are done :-)
