Thanks Dan.

On Tue, Dec 19, 2017 at 8:10 PM, Dan Russ <[email protected]> wrote:
> On Dec 19, 2017, at 8:27 AM, Manoj B. Narayanan <[email protected]> wrote:
> >
> > Hi all,
> >
> > I tried varying the custom features we provide to the model. I have a few
> > queries regarding it.
> >
> > 1. Will the probability for a particular feature get affected if I add it
> > multiple times?
> >    Eg. If I add a feature 'pos=NN' multiple times, will it have an impact
> > on model performance?
>
> Yes, but not necessarily in the way you expect. Remember that maxent tries
> to maximize the probability of being correct given the training data. In a
> naive Bayes model it is much easier to see the effect of multiple pos=NN
> features than in maxent.
>
> > 2. What if I add the same feature differently?
> >    Eg. I add 'pos=NN' and 'partsOfSpeech=NN', what will be the impact?
> >    These 2 are always co-occurring too. So how will the model treat them?
>
> Maxent is not really affected by correlated terms. It will split the
> weights among the terms. Your two terms (pos=NN and partsOfSpeech=NN) are
> completely correlated terms.
>
> > 3. How does the model learn the features? Please give a small example.
>
> There are 2 training methods in OpenNLP: GIS and L-BFGS. Essentially you
> are trying to maximize P(correct outcome | features), where the probability
> of outcome i is exp(sum_k(W_ik * X_k)) / sum_j(exp(sum_k(W_jk * X_k))).
> Both GIS and L-BFGS are iterative methods to find the weights that get the
> most training cases correct.
>
> > 4. What if we can add classes to the features?
> >    Eg. Certain features can have only a certain set of values.
>
> This is actually an incorrect statement. A feature is a function that
> indicates the PRESENCE of something WITH A PARTICULAR OUTCOME. What does
> that mean? If your "document" is "The cat" and you have two outcomes
> ("animal" and "mineral"), then there are 2 features associated with the
> word "cat": F("cat", animal) and F("cat", mineral). These features each
> have different weights.
> > If we are able to label them, can we make the model learn features
> > according to the labels?
>
> Given what I said about features earlier, yes you can (you actually always
> do this). In OpenNLP, if you add a term (actually a predicate, but I won't
> call it that) to an event context, you are associating a term with an
> outcome (a feature). But understand that all potential features exist;
> they just have a weight of 0.
>
> > Say, I have a) pos feature b) dictionary feature.
> > If the probability is calculated with respect to the corresponding
> > class (pos / dictionary) and then the overall probability is calculated,
> > how will the model behave?
>
> Not sure I understand your question.
>
> > Instead of giving a single string as a feature what if we give a key,
> > value pair as a feature?
>
> If you give a=b, that is a single feature; a=c is another feature that has
> nothing to do with a=b.
>
> > Awaiting discussion regarding these.
> >
> > Thanks,
> > Manoj.

-- 
Regards,
Manoj.
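The key=value point above can be made concrete. The context-builder below is a hypothetical sketch (not an OpenNLP API): each feature is an opaque string, so "a=b" and "a=c" (or "pos=NN" and "partsOfSpeech=NN") share no weight just because they share a key, and the key=value spelling is only a readability convention for humans:

```python
# Hypothetical feature-string builder; the names are made up for illustration.
def make_context(word, pos, in_dictionary):
    # The model never parses these strings; each full string is an
    # independent predicate with its own weights per outcome.
    feats = [f"w={word}", f"pos={pos}"]
    if in_dictionary:
        feats.append("dict=true")
    return feats

ctx = make_context("cat", "NN", True)
print(ctx)  # ['w=cat', 'pos=NN', 'dict=true']

# Two differently spelled but always co-occurring features are distinct
# predicates; training just splits the weight between them.
print("pos=NN" == "partsOfSpeech=NN")  # False
```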
