Dear Jörn,

Thanks for your answer. I know these tools, but I'm happy (and
effective) with my small, self-written tool. Once it is stable
enough, I will publish it. word2vec sounds interesting; I will
take a look.

Best,

Tom


On 15.10.2013 11:02, Jörn Kottmann wrote:
> You can also use tools like the Apache UIMA Cas Editor, Brat, WebAnno,
> etc.
> Annotation is usually much faster if you don't need to edit a text
> file yourself.
> 
> The Tagging Server in the sandbox can be used to pre-label data for brat
> or the Apache UIMA Cas Editor.
> 
> Another tool you should try is word2vec. It can create word clusters
> that can be used as part of the feature generation. In my tests that
> increased recall by a few percent, but it is still a work in progress;
> it will take a few days until it works with the TokenNameFinderTrainer
> command line tool.
> 
> HTH,
> Jörn
> 
> On 10/14/2013 09:27 PM, Thomas Zastrow wrote:
>> Hello,
>>
>> I have a question: when creating training material, does it make a
>> difference if there are " " (blanks) around the NE? In other words, is
>> it the same to have:
>>
>> <START:loc>Hamburg<END>
>>
>> or:
>>
>> <START:loc> Hamburg <END>
>>
>> The example in the documentation is shown with the " " ...?
>>
>> Best,
>>
>> Tom
>>
>> P.S.: ca. 1300 sentences for a free German NE model are done :-)
> 
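
On the original question about blanks around the NE tags: the thread leaves it unanswered. To my understanding the training data is expected to be whitespace-tokenized, with each tag standing as its own token, as in the documentation example. Under that assumption, the following is a minimal sketch of a normalizer that pads every tag with blanks. The helper name pad_tags is hypothetical and is not part of OpenNLP.

```python
import re

# Matches an opening tag such as <START:loc> (the type is optional)
# or the closing tag <END>.
TAG = re.compile(r"<START(?::[^>\s]+)?>|<END>")


def pad_tags(line: str) -> str:
    """Return the sentence with one blank on each side of every tag.

    Assumption: tags must be separate whitespace-delimited tokens.
    """
    # Surround each tag with blanks, then collapse repeated whitespace.
    spaced = TAG.sub(lambda m: f" {m.group(0)} ", line)
    return " ".join(spaced.split())


if __name__ == "__main__":
    print(pad_tags("Ich wohne in <START:loc>Hamburg<END> ."))
```

Lines that already have blanks around the tags pass through unchanged, so the function can be run safely over a whole training file.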
