On 23/02/12 15:50, Jörn Kottmann wrote:
> On 02/23/2012 04:41 PM, Jim wrote:
>> If you could help me sort out the dictionary problem it would be
>> amazing!!!
> +1 to work on better gazetteer support! There are many people
> who need a feature like this.
> The stuff we have right now in OpenNLP was created to have a simple
> lookup dictionary for feature generation. I am sure there is a lot of room
> to improve it.
> Jörn
I'm not sure I understood... Can the Dictionary find multi-word entities?
I would say it can't, simply because it expects tokenized text and then
tries to find an exact match. So an entry like "Folic acid" is split
into 2 tokens just before the Dictionary sees it, which means it will try
to match "folic" and "acid" separately!
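To make it concrete, here is a rough plain-Java sketch of the kind of lookup I have in mind. It is NOT OpenNLP code: the class MultiTokenLookup and its add/find methods are names I made up, and I am assuming case-insensitive, longest-match-first behaviour. Entries are stored as token sequences, and at each position the longest window of tokens is tried first.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of OpenNLP.
class MultiTokenLookup {
    // Each entry is a lower-cased token sequence, e.g. ["folic", "acid"].
    private final Set<List<String>> entries = new HashSet<>();
    private int maxLen = 0; // length in tokens of the longest entry

    void add(String... tokens) {
        List<String> key = new ArrayList<>();
        for (String t : tokens) {
            key.add(t.toLowerCase(Locale.ROOT));
        }
        entries.add(key);
        maxLen = Math.max(maxLen, key.size());
    }

    // Returns non-overlapping {start, end} token spans (end exclusive),
    // preferring the longest entry that starts at each position.
    List<int[]> find(String[] tokens) {
        List<String> lower = new ArrayList<>();
        for (String t : tokens) {
            lower.add(t.toLowerCase(Locale.ROOT));
        }
        List<int[]> spans = new ArrayList<>();
        int i = 0;
        while (i < lower.size()) {
            int end = -1;
            // Try the widest window first, then shrink it one token at a time.
            for (int len = Math.min(maxLen, lower.size() - i); len >= 1; len--) {
                if (entries.contains(lower.subList(i, i + len))) {
                    end = i + len;
                    break;
                }
            }
            if (end > 0) {
                spans.add(new int[] {i, end});
                i = end; // continue after the match
            } else {
                i++;
            }
        }
        return spans;
    }
}
```

With the entries ("folic", "acid") and ("acid") added, calling find on the tokens "Take", "Folic", "acid", "daily" should give a single span covering tokens 1 to 3, rather than a lone match on "acid". Whether the OpenNLP Dictionary can be made to behave like this is exactly what I am asking.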
Any ideas? Apparently someone else had the same problem and solved it!!!
I'd donate my left hand to know how!!!!
Thanks,
Jim