Re: newbie intro

Ted Dunning Fri, 25 Sep 2009 08:57:44 -0700

This may help at the margins, but it is surprising how well simpler methods
work.

tf-idf is, btw, an approximation of the LLR score.  There are some interesting
edge conditions where the approximation breaks, notably when there are
several occurrences in the text of interest.
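
To make the comparison concrete, here is a minimal sketch of both scores in
Python. It assumes the usual 2x2 contingency-table form of the log-likelihood
ratio (G^2) and the common tf * log(N / df) form of tf-idf. The function and
parameter names are illustrative, not taken from any particular library, and
the count layout (term vs. other tokens, text of interest vs. rest of corpus)
is one reasonable choice among several.

```python
import math


def _xlogx(x):
    # x * ln(x), with the convention 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(x)


def _entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return _xlogx(sum(counts)) - sum(_xlogx(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 contingency table.

    Assumed layout (illustrative):
      k11: occurrences of the term in the text of interest
      k12: occurrences of the term in the rest of the corpus
      k21: other tokens in the text of interest
      k22: other tokens in the rest of the corpus
    """
    row = _entropy(k11 + k12, k21 + k22)
    col = _entropy(k11 + k21, k12 + k22)
    mat = _entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb floating-point round-off.
    return max(0.0, 2.0 * (row + col - mat))


def tf_idf(tf, df, n_docs):
    # Plain tf * log(N / df); many smoothed variants exist.
    return tf * math.log(n_docs / df)
```

Scoring the same term with both functions while raising the in-text count
(k11 and tf) is one way to see where the two start to disagree.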

On Fri, Sep 25, 2009 at 5:37 AM, Isabel Drost <[email protected]> wrote:

> So I think, POS tags and TFIDF should be features determining whether
> a phrase should be considered as key phrase or not - maybe even key
> indicators to generate a key phrase candidate set. But there may be many
> more features. Lastly it might be easier to come up with a
> training set of good and bad phrases (plus their feature vectors) and
> let a classifier do the selection compared to manually hand coding the
> rules and feature weights for phrase selection.
>

-- 
Ted Dunning, CTO
DeepDyve
