Hello,
There seems to be no free German NE model available, so I have started
thinking about creating one, using only free resources such as Wikipedia.
I still have some questions:
Somewhere in the documentation, I read about a dictionary-driven NE
recognizer in OpenNLP, but I didn't find any further information about
it. In any case, would it be possible to combine the statistical approach
with dictionaries? For example, a list of country names would be useful.
As far as I understand, the name finder is currently only stable for
one property, such as person names. I would like to have the traditional
division into persons, locations, organizations and misc. When creating
the training data manually, would it be OK to annotate all four kinds in
the text right away and then perhaps create four models later, one for
each property?
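To make the idea concrete, here is a rough sketch in Python of how one
combined annotation could later be split into separate training texts,
one per property. The inline `<START:type> ... <END>` markup, the type
labels, the example sentence and the helper name keep_only are only my
assumptions for illustration, not something I have confirmed the name
finder accepts.

```python
import re

# Assumed inline markup: <START:type> tokens <END>. The labels are illustrative.
SPAN = re.compile(r"<START:(\w+)> (.*?) <END>")


def keep_only(sentence, wanted):
    """Keep the markup for one entity type and strip it from all other spans."""
    def repl(match):
        if match.group(1) == wanted:
            return match.group(0)   # leave this annotation untouched
        return match.group(2)       # drop the markup, keep the tokens
    return SPAN.sub(repl, sentence)


# Made-up example sentence, tokenized with spaces.
sentence = ("<START:person> Angela Merkel <END> besuchte "
            "<START:organization> Siemens <END> in "
            "<START:location> München <END> .")

for entity_type in ("person", "location", "organization", "misc"):
    print(entity_type, "->", keep_only(sentence, entity_type))
```

With something like this, the text would only have to be annotated once,
and each of the four models could be trained on its own filtered copy.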
The name finder takes sentences and tokens as input. Would it be OK to
also have POS tags assigned to the training data? That would make it
much easier to annotate the data manually when, e.g., NEs are already
marked by the POS tagger.
That's it for the moment. I'm quite sure I will come back later with more
questions :-)
Best,
Tom
--
Dr. Thomas Zastrow
Riemerfeldring 7a
85748 Garching
Tel.: 0162 422 8029
www.thomas-zastrow.de