beginner's question about Maximum Entropy for POS tagging

liyang yu Fri, 09 Mar 2018 17:09:06 -0800

Hi All,

Sorry for the naive questions...


I am new to NLP and have been reading about the maximum entropy (MaxEnt)
approach to POS tagging. My questions are:

   1. The maximum entropy method produces a "model" from the training
   data (an already tagged body of text). So is this model *the* probability
   distribution that satisfies all the feature constraints and, among those,
   has the maximum entropy?
   2. What does this "model/probability distribution function" look like? Is
   there an example, or a place where we can actually see what is inside the
   model? Does it look like a table? A collection of probabilities and features?
   3. How is this "model" actually used? Is there a simple example
   somewhere? For instance, if the sentence is "I am confused", is the
   model somehow checked/searched to tag each of the three words? And what
   happens if a new sentence that does NOT match any existing feature is
   submitted to the model?
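
To make the questions concrete, here is how I currently picture the training
side, as a small Python sketch. The toy corpus, the three feature templates,
and the names TRAIN, features, and empirical_counts are all made up by me for
illustration (the tags are Penn Treebank style); none of this comes from a
paper or a library. As far as I understand, these observed feature counts are
what the model is constrained to match in expectation. What I cannot picture
is what comes after this step.

```python
from collections import Counter

# Toy tagged corpus (made up): one list of (word, tag) pairs per sentence.
TRAIN = [
    [("I", "PRP"), ("am", "VBP"), ("confused", "JJ")],
    [("I", "PRP"), ("am", "VBP"), ("happy", "JJ")],
]


def features(words, i, tag):
    """Names of the (hypothetical) binary features that fire when
    words[i] is assigned `tag`. Each feature pairs a piece of context
    with the candidate tag."""
    prev_word = words[i - 1] if i > 0 else "<s>"  # "<s>" marks sentence start
    return [
        f"word={words[i].lower()}&tag={tag}",
        f"prev={prev_word.lower()}&tag={tag}",
        f"suffix={words[i][-2:].lower()}&tag={tag}",
    ]


def empirical_counts(corpus):
    """Count how often each feature fires in the tagged training data."""
    counts = Counter()
    for sentence in corpus:
        words = [word for word, _ in sentence]
        for i, (_, tag) in enumerate(sentence):
            counts.update(features(words, i, tag))
    return counts


counts = empirical_counts(TRAIN)
for name, n in sorted(counts.items()):
    print(n, name)
```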

I have been googling, hoping to find a simple Hello World kind of example
that goes from building the training data, to the model, to how the model is
actually used, but no luck. I also read the paper by Adam Berger, but I still
don't really understand what the model looks like or how it is actually used.
I appreciate any help!

thanks!
