Hello,

There seems to be no free German NE model available, so I have started thinking about creating one, using only free resources such as Wikipedia.

I still have some questions:

Somewhere in the documentation, I read about a dictionary-driven NE recognizer in OpenNLP, but I couldn't find any further information about it. In any case, would it be possible to combine the statistical approach with dictionaries? For example, a list of country names would be useful.

As far as I understand, the name finder is currently only stable for a single type of entity, such as person names. I would like to have the traditional division into persons, locations, organizations and misc. When manually creating the training data, would it be OK to annotate all four types in the text right away and then perhaps create four separate models for the different types later?
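
To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes an inline annotation style such as `<START:person> ... <END>`; the tag syntax, the type names and the helper `keep_only` are my own assumptions for illustration, not something I have confirmed in the OpenNLP documentation. The sketch takes one sentence annotated with several types and derives a single-type version for each of the four categories:

```python
import re

# Assumed annotation style: <START:type> tokens <END> (hypothetical, see above)
TAG = re.compile(r"<START:(\w+)> (.*?) <END>")

def keep_only(line: str, wanted: str) -> str:
    """Keep annotations of type `wanted` and unwrap all other annotations."""
    def repl(match: re.Match) -> str:
        # Leave the wanted type untouched, reduce other spans to their plain tokens
        return match.group(0) if match.group(1) == wanted else match.group(2)
    return TAG.sub(repl, line)

# Made-up example sentence with three of the four types annotated
sample = (
    "<START:person> Angela Merkel <END> besuchte "
    "<START:organization> Siemens <END> in "
    "<START:location> München <END> ."
)

# One derived training line per type, e.g. to train four separate models later
per_type = {
    t: keep_only(sample, t)
    for t in ("person", "location", "organization", "misc")
}
```

This way the manual annotation would only have to be done once, and the four single-type training files could be generated from it afterwards.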

The name finder takes sentences and tokens as input. Would it be OK to also have POS tags in the training data? That would make manual annotation much easier, e.g. when NEs are already marked by the POS tagger.

That's it for the moment. I'm quite sure I will come back later with more questions :-)

Best,

Tom

--
Dr. Thomas Zastrow
Riemerfeldring 7a

85748 Garching
Tel.: 0162 422 8029
www.thomas-zastrow.de
