> On Dec 19, 2017, at 8:27 AM, Manoj B. Narayanan 
> <[email protected]> wrote:
> 
> Hi all,
> 
> I tried varying the custom features we provide to the model. I have a few
> queries regarding it.
> 
> 1. Will the probability for a particular feature get affected if I add it
> multiple times?
>     Eg. If I add a feature 'pos=NN' multiple times, will it have an impact
> on model performance?

Yes, but not necessarily in the way you expect.  Remember that maxent tries to 
maximize the probability of the correct outcome given the training data.  The 
effect of repeating a feature such as pos=NN is much easier to see in a Naive 
Bayes model than in a maxent model.
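To make this concrete: in a log-linear model a feature contributes its weight once per occurrence to the score, so adding pos=NN twice doubles that feature's pull before training compensates.  A minimal sketch in plain Python (the feature names and weights are hypothetical, not from any trained model):

```python
def linear_score(feature_counts, weights):
    """Score = sum over active features of weight * count."""
    return sum(weights.get(f, 0.0) * c for f, c in feature_counts.items())

weights = {"pos=NN": 1.2, "word=cat": 0.5}   # hypothetical trained weights

single = {"pos=NN": 1, "word=cat": 1}
doubled = {"pos=NN": 2, "word=cat": 1}       # pos=NN added twice

# The duplicated feature contributes its weight twice to the linear score.
print(linear_score(single, weights))   # 1.7
print(linear_score(doubled, weights))  # 2.9
```

Training will partly compensate by shrinking the weight, but the duplicated feature still gets an outsized influence relative to features added once.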

> 
> 2. What if I add the same feature differently?
>     Eg. I add 'pos=NN' and 'partsOfSpeech=NN', what will be the impact.
> These 2 are always co-occurring too. So how will the model treat them.
> 

Maxent is not really affected by correlated terms.  It will split the weight 
among the terms.  Your two terms (pos=NN and partOfSpeech=NN) are completely 
correlated, so the model only learns their combined weight; how that weight is 
divided between them is arbitrary.
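In code: when two features always fire together, only the sum of their weights matters, so any split between them yields identical predictions.  A sketch under hypothetical weights:

```python
import math

def prob(weights, active, outcomes):
    """P(outcome | features) under a log-linear model."""
    scores = {o: sum(weights.get((f, o), 0.0) for f in active) for o in outcomes}
    z = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(scores[o]) / z for o in scores}

outcomes = ["noun", "other"]
active = ["pos=NN", "partOfSpeech=NN"]   # always co-occur

# Two ways of splitting the same total weight of 1.0 for outcome "noun":
w1 = {("pos=NN", "noun"): 1.0, ("partOfSpeech=NN", "noun"): 0.0}
w2 = {("pos=NN", "noun"): 0.5, ("partOfSpeech=NN", "noun"): 0.5}

# Identical predictions: only the sum of the correlated weights matters.
print(prob(w1, active, outcomes) == prob(w2, active, outcomes))  # True
```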

> 3. How does the model learn the features? Please give a small example.

There are 2 training methods in OpenNLP: GIS and L-BFGS.  Essentially you are 
trying to maximize P(correct outcome | features).  For outcome j the model 
computes

  P(outcome_j | x) = exp(sum_i W_ij * x_i) / sum_k exp(sum_i W_ik * x_i)

Both GIS and L-BFGS are iterative methods that find the weights that get the 
most training cases correct.
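The formula above can be fit by any iterative optimizer; this toy gradient-ascent loop (not OpenNLP's actual GIS or L-BFGS implementation, and with made-up events) shows the idea of repeatedly nudging weights toward the correct outcomes:

```python
import math

def probs(weights, context, outcomes):
    # P(o | context) = exp(sum_i w[f, o]) / sum_k exp(sum_i w[f, k])
    scores = {o: sum(weights.get((f, o), 0.0) for f in context) for o in outcomes}
    z = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(scores[o]) / z for o in scores}

# Toy training events: (context features, correct outcome)
events = [
    (["word=cat", "pos=NN"], "animal"),
    (["word=quartz", "pos=NN"], "mineral"),
]
outcomes = ["animal", "mineral"]
weights = {}

for _ in range(200):                      # iterative weight updates
    for context, gold in events:
        p = probs(weights, context, outcomes)
        for f in context:
            for o in outcomes:
                # gradient of the log-likelihood: observed minus expected
                grad = (1.0 if o == gold else 0.0) - p[o]
                weights[(f, o)] = weights.get((f, o), 0.0) + 0.5 * grad

p = probs(weights, ["word=cat", "pos=NN"], outcomes)
print(p["animal"])   # close to 1 after training
```

Note how pos=NN, which co-occurs with both outcomes, ends up carrying little discriminative weight, while word=cat and word=quartz do the work.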

> 
> 4. What if we can add classes to the features?
>     Eg. Certain features can have only a certain set of values.

This is actually an incorrect statement.  A feature is a function that 
indicates the PRESENCE of something WITH A PARTICULAR OUTCOME.  What does that 
mean?  If your “document” is “The cat” and you have two outcomes (“animal” and 
“mineral”), then there are 2 features associated with the word “cat”: 
F(“cat”, animal) and F(“cat”, mineral).  These features each have a different 
weight.
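That (predicate, outcome) pairing can be written out directly.  The weights below are hypothetical, just to show the two features for “cat” being scored separately:

```python
import math

# Each feature is a (predicate, outcome) pair with its own weight.
weights = {
    ("cat", "animal"): 2.0,
    ("cat", "mineral"): -1.0,
}

def p(outcome, predicates, outcomes=("animal", "mineral")):
    scores = {o: sum(weights.get((w, o), 0.0) for w in predicates) for o in outcomes}
    z = sum(math.exp(s) for s in scores.values())
    return math.exp(scores[outcome]) / z

# ("the", animal) and ("the", mineral) were never weighted, so they count as 0.
print(round(p("animal", ["the", "cat"]), 3))  # 0.953
```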

> If we are
> able to label them, can we make the model learn features according to the
> labels?

Given what I said about features earlier, yes you can (you actually always do 
this).  In OpenNLP, if you add a term (actually a predicate, but I won’t call 
it that) to an event context, you are associating that term with an outcome -- 
a feature.  (But understand that all potential features exist; they just have 
a weight of 0.)
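A minimal stand-in for that event structure (this is not the actual OpenNLP API, just an illustration of outcome-plus-context and the default weight of 0):

```python
# A stand-in for a training event: an outcome paired with predicate context.
event = ("animal", ["word=cat", "pos=NN"])

weights = {("word=cat", "animal"): 1.5}

# Every (predicate, outcome) feature "exists"; unseen ones default to 0.
outcome, context = event
score = sum(weights.get((f, outcome), 0.0) for f in context)
print(score)  # 1.5: ("pos=NN", "animal") contributes its default weight, 0.0
```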

>     Say, I have a) pos feature b) dictionary feature
>     If the probability is calculated with respect to the corresponding
> class (pos / dictionary) and then the overall probability is calculated how
> will the model behave?

Not sure I understand your question.  

>    Instead of giving a single string as a feature what if we give a key,
> value pair as feature?

If you give a=b, that is a single feature; a=c is another feature that has 
nothing to do with a=b.
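In other words, the whole key=value string is atomic to the model.  A tiny hypothetical sketch:

```python
# "a=b" and "a=c" are two unrelated predicate strings, each its own feature.
weights = {("a=b", "yes"): 0.7, ("a=c", "yes"): -0.2}

context = ["a=b"]                       # only a=b fires here
score = sum(weights.get((f, "yes"), 0.0) for f in context)
print(score)  # 0.7: a=c plays no role even though it shares the key "a"
```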

> 
> Awaiting discussion regarding these.
> 
> Thanks,
> Manoj.
