Re: ClearNLP POSTagger

Jörn Kottmann Tue, 09 Apr 2013 01:11:58 -0700

Would it be possible to run some benchmarks so we know the performance difference between the two?
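One way such a comparison could be set up is sketched below: token-level accuracy, reported both overall and on words unseen in training, computed for each tagger on the same held-out sentences. The function name, the data layout, and the toy sentences are illustrative assumptions and do not come from cTAKES, OpenNLP, or ClearNLP.

```python
def tagging_accuracy(gold, predicted, known_words):
    """Return (overall, unknown-word) token accuracy for one tagger.

    gold:        list of sentences, each a list of (word, gold_tag) pairs
    predicted:   list of sentences, each a list of predicted tags
    known_words: lowercased vocabulary seen in training (assumed available)
    """
    total = correct = unk_total = unk_correct = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        for (word, gold_tag), pred_tag in zip(gold_sent, pred_sent):
            hit = gold_tag == pred_tag
            total += 1
            correct += hit
            if word.lower() not in known_words:
                unk_total += 1
                unk_correct += hit
    overall = correct / total if total else 0.0
    unknown = unk_correct / unk_total if unk_total else 0.0
    return overall, unknown


# Toy data, invented for illustration only.
gold = [[("The", "DT"), ("patient", "NN"), ("denies", "VBZ"), ("dyspnea", "NN")]]
tagger_a = [["DT", "NN", "VBZ", "NN"]]
tagger_b = [["DT", "NN", "NNS", "JJ"]]
known = {"the", "patient", "denies"}

for name, output in (("tagger A", tagger_a), ("tagger B", tagger_b)):
    print(name, tagging_accuracy(gold, output, known))
```

Reporting the unknown-word figure separately matters here because robustness on unknown words is the main claim made for the ClearNLP tagger.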

The OpenNLP POS Tagger can be customized: it is currently possible to replace the feature generation, so it can probably be optimized for the medical domain. The default feature generation is tuned for the news domain. Replacing the learning algorithm is currently not possible, but we will work on that for the next release.

Do you use a tag dictionary? Maybe it is possible to generate something from the existing dictionaries already used by cTAKES.
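As a rough sketch of that idea: a tag dictionary maps each word to the set of tags it may take, and it could be derived from a lexicon that records a coarse category per entry. The entry format and the category-to-tag mapping below are hypothetical. They are not the actual cTAKES dictionary format or the OpenNLP tag dictionary file format.

```python
from collections import defaultdict

# Hypothetical mapping from a lexicon's coarse categories to Penn Treebank tags.
CATEGORY_TO_TAGS = {
    "noun": {"NN", "NNS"},
    "verb": {"VB", "VBP", "VBZ"},
    "adj": {"JJ"},
}


def build_tag_dictionary(lexicon_entries):
    """Collect the allowed tags per lowercased word from (word, category) pairs."""
    tag_dict = defaultdict(set)
    for word, category in lexicon_entries:
        tag_dict[word.lower()] |= CATEGORY_TO_TAGS.get(category, set())
    return dict(tag_dict)


def allowed_tags(tag_dict, word, all_tags):
    """Restrict the candidate tags for known words; fall back to all tags otherwise."""
    return tag_dict.get(word.lower()) or all_tags


# Invented lexicon entries, for illustration only.
entries = [
    ("dyspnea", "noun"),
    ("chronic", "adj"),
    ("discharge", "noun"),
    ("discharge", "verb"),
]
ALL_TAGS = {"NN", "NNS", "VB", "VBP", "VBZ", "JJ", "DT"}
tag_dict = build_tag_dictionary(entries)
print(sorted(allowed_tags(tag_dict, "Discharge", ALL_TAGS)))
```

A tagger that consults such a dictionary only has to choose among the listed tags for a known word, which can help most on domain terms that are rare in news text.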

Jörn

On 04/08/2013 06:15 PM, Chen, Pei wrote:

Hi,
While working on the Dependency Parser/SRL labeler, we also have a POSTagger 
from ClearNLP. It is fairly simple and I have the code ready to be checked in 
(it is also trained on the same data as the dep parser, MiPaq/SHARP). What do 
folks think?
We can include both Analysis Engines in the ctakes-pos-tagger project. But 
should we leave the current OpenNLP tagger in the default pipeline or default 
to the latest?

"The ClearNLP POS tagger shows more robust results on unknown words by 
generalizing lexical features.  You can find the reference from this paper.
Fast and Robust Part-of-Speech Tagging Using Dynamic Model Selection, Jinho D. Choi, 
Martha Palmer, Proceedings of the 50th Annual Meeting of the Association for 
Computational Linguistics (ACL'12), 363-367, Jeju, Korea, 2012. [1] It also uses 
AdaGrad for machine learning, which is a more advanced learning algorithm than 
maximum entropy used by OpenNLP."
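For readers unfamiliar with AdaGrad, the per-feature update it is generally described with can be sketched as follows. This is a generic illustration, not ClearNLP's implementation. The function name, the sparse dict representation, and the default learning rate are assumptions made for the example.

```python
import math


def adagrad_update(weights, grad_sq_sums, gradient, learning_rate=0.1, eps=1e-8):
    """Apply one AdaGrad step in place to sparse weights.

    weights:      dict feature -> weight
    grad_sq_sums: dict feature -> running sum of squared gradients
    gradient:     dict feature -> gradient of the loss for this example
    """
    for feature, g in gradient.items():
        # Accumulate the squared gradient seen so far for this feature.
        grad_sq_sums[feature] = grad_sq_sums.get(feature, 0.0) + g * g
        # Features with a large gradient history get a smaller step.
        step = learning_rate / (math.sqrt(grad_sq_sums[feature]) + eps)
        weights[feature] = weights.get(feature, 0.0) - step * g


# Hypothetical feature name, for illustration only.
weights, grad_sq_sums = {}, {}
adagrad_update(weights, grad_sq_sums, {"suffix=ea": 2.0})
first = weights["suffix=ea"]
adagrad_update(weights, grad_sq_sums, {"suffix=ea": 2.0})
second_step = first - weights["suffix=ea"]
print(first, second_step)
```

Because each feature keeps its own accumulated history, rarely seen features retain a comparatively large effective learning rate, which is one reason the method suits sparse lexical features.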

[1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
