There are a bunch of papers on this. Search for "named entity recognizer CRF"
on Google.
The basic idea is that an HMM or CRF has internal state that can be used to
mark named entities. We don't have to define what the hidden states mean,
just help the HMM or CRF find an internal representation that
] on behalf of Dhruv Kumar
[dku...@ecs.umass.edu]
Sent: Wednesday, January 25, 2012 1:17 AM
To: user@mahout.apache.org
Subject: Re: Suggestions Needed : Developing application using Mahout
HMMs seem to be a good fit for this problem. They are used ubiquitously for
pattern detection.
If you
The HMM implementations might be of help, but I think that a small CRF
implementation that is oriented around string transduction would be more
helpful.
The Stanford Named Entity Recognizer (NER) has such an implementation. I
think NLTK and GATE each have one as well.
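
To make "string transduction" a bit more concrete, here is a rough sketch of
the decoding half of a linear-chain tagger: it maps a sequence of words to a
sequence of entity tags with Viterbi. Everything in it is an assumption for
illustration. The tag names, the capitalization feature, and the hand-set
scores are made up; a real CRF would learn its weights from labeled data, and
this is not the API of Stanford NER, NLTK, GATE, or Mahout.

```python
import math

# Hypothetical tag set: outside, begin-entity, inside-entity.
TAGS = ["O", "B-ENT", "I-ENT"]


def emission_score(word, tag):
    """Hand-set stand-in for learned weights: capitalized words look like entities."""
    capitalized = word[:1].isupper()
    if tag == "O":
        return 0.0 if capitalized else 1.0
    return 1.0 if capitalized else -1.0


def transition_score(prev_tag, tag):
    """I-ENT may only continue an entity; continuing gets a small bonus."""
    if tag == "I-ENT":
        return 0.5 if prev_tag in ("B-ENT", "I-ENT") else -math.inf
    return 0.0


def viterbi(words):
    """Return the highest-scoring tag sequence for the given words."""
    if not words:
        return []
    # Treat the position before the first word as tagged "O".
    scores = {t: transition_score("O", t) + emission_score(words[0], t) for t in TAGS}
    backpointers = []
    for word in words[1:]:
        new_scores, pointers = {}, {}
        for t in TAGS:
            best_prev = max(TAGS, key=lambda p: scores[p] + transition_score(p, t))
            new_scores[t] = (scores[best_prev] + transition_score(best_prev, t)
                             + emission_score(word, t))
            pointers[t] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: scores[t])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


print(viterbi(["we", "met", "Dhruv", "Kumar", "yesterday"]))
```

With these toy scores the two capitalized words come out as B-ENT followed by
I-ENT and the rest as O. A capitalized sentence-initial word would also be
tagged as an entity, which is the kind of mistake trained features are meant
to fix.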
The basic