On 08.08.2013 09:15, Jörn Kottmann wrote:
> The learnable NER component could probably detect the names if you
> would have
> much more training data. I suggest to use some rule based extractor,
> maybe have a look
> at UIMA Ruca.
UIMA Ruta: http://uima.apache.org/ruta.html ;-)

Peter

>
> Jörn
>
> On 08/06/2013 03:22 PM, Markus Marks wrote:
>> Hi all,
>>
>> I'm a German computer science student currently writing my
>> bachelor thesis. I'm writing to you because I'm quite desperate. I
>> have to solve an information extraction task and I'm not sure how
>> to approach it. I was hoping you could help me, or tell me whether
>> OpenNLP would work for this.
>>
>> OK, here it comes:
>> Let's assume I have a sender's address from a letter, and I have a
>> few annotated examples.
>>
>> new document example with annotation
>> Mr. XYZ Enterprise Something
>> Example Company John Doe
>> Sample road 12514 somewhere else
>> somewhere another road
>> something
>> something something else
>>
>> The problem is how to build a matching or learning algorithm that
>> can extract, for example, the name of the company or the name of a
>> new sender, given the annotated examples I can provide. The
>> difficulty is that not every sender address uses the same order or
>> the same expressions.
>>
>> The catch is that I only have very few examples, fewer than 10.
>> Do you have any suggestions on how to solve this? I would be really
>> thankful, since I'm very disappointed at not having found a
>> solution.
>>
>> Yours thankfully,
>>
>> Markus
>>
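To make the rule-based suggestion a bit more concrete, here is a minimal sketch of a hand-written extractor for a sender address block. It is plain Python, not UIMA Ruta syntax and not an OpenNLP API. The keyword lists, the label names, the 5-digit postal code pattern, and the functions `label_line` and `extract_sender` are all assumptions made up for illustration; real rules would have to be derived from the actual letters.

```python
import re

# Hypothetical hand-written rules. The keyword lists are illustrative
# assumptions, not taken from the thread or from any library.
POSTAL_CITY = re.compile(r"^\d{5}\s+\S+")  # assumes 5-digit postal codes
STREET_HINT = re.compile(r"\b(road|street|avenue|lane)\b", re.IGNORECASE)
COMPANY_HINT = re.compile(r"\b(GmbH|AG|Inc|Ltd|Company|Enterprise)\b", re.IGNORECASE)
PERSON_TITLE = re.compile(r"^(Mr|Mrs|Ms|Dr|Herr|Frau)\.?\s+", re.IGNORECASE)


def label_line(line):
    """Assign one coarse label to a single line of an address block."""
    text = line.strip()
    if not text:
        return "EMPTY"
    # Rules are checked from most to least specific; the first match wins.
    if POSTAL_CITY.match(text):
        return "POSTAL_CITY"
    if STREET_HINT.search(text):
        return "STREET"
    if COMPANY_HINT.search(text):
        return "COMPANY"
    if PERSON_TITLE.match(text):
        return "PERSON"
    return "OTHER"


def extract_sender(block):
    """Return the first line found for each label, independent of line order."""
    result = {}
    for line in block.splitlines():
        label = label_line(line)
        if label not in ("EMPTY", "OTHER") and label not in result:
            result[label] = line.strip()
    return result


if __name__ == "__main__":
    sample = "Mr. John Doe\nExample Company\nSample road 5\n12514 Somewhere"
    print(extract_sender(sample))
```

Because each line is labelled on its own, the order of the lines in the address does not matter, which is the main difficulty described in the original question. Lines that match no rule come back as `OTHER`, and with fewer than 10 examples those are the cases worth inspecting by hand to refine the rules.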
