2013/7/12 Olivier Grisel <[email protected]>:
> 2013/7/12 Lars Buitinck <[email protected]>:
>> 2013/7/12 Antonio Manuel Macías Ojeda <[email protected]>:
>> Pretty good results actually. I was clustering these words to get
>> extra features for a NER tagger, which immediately got a boost in F1
>> score.
>
> Interesting. Do you run a clustering algorithm for each individual
> word, or do you cluster POS tag contexts for all the center words at once?

All words, represented as typical NER feature vectors (previous word
is "mr.", capitalization, that kind of stuff, and conjunctions of
these). The trick to make this work is to train a feature selector on
the labeled set first, otherwise the centroids get huge.

(Also L2-normalization seems to help; I'm not really sure why yet.
Might have to do with the conjunctive features.)
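A minimal sketch of the pipeline described above (feature selection trained on the labeled set, then L2 normalization, then clustering). The data, the selector, k, and n_clusters are all placeholder assumptions, not the actual setup:

```python
# Hypothetical sketch; array shapes, k, and n_clusters are assumptions.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import normalize
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
# Stand-ins for sparse binary/count NER feature vectors.
X_labeled = rng.poisson(0.3, size=(200, 500)).astype(float)
y_labeled = rng.randint(0, 2, size=200)
X_all = rng.poisson(0.3, size=(1000, 500)).astype(float)

# Train the feature selector on the labeled set only, then apply it
# to all words before clustering, so the centroids stay small.
selector = SelectKBest(chi2, k=50).fit(X_labeled, y_labeled)
X_sel = selector.transform(X_all)

# L2-normalize each word vector, then cluster.
X_norm = normalize(X_sel, norm="l2")
km = MiniBatchKMeans(n_clusters=32, random_state=0, n_init=3).fit(X_norm)
cluster_ids = km.labels_  # one extra categorical feature per word
```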

> How many clusters do you extract? Have you tried any heuristics to find
> the "true" number of clusters, or do you just over-allocate n_clusters
> and let the supervised model that will use the cluster activation
> features deal with an overcomplete feature space?

So far, I've been following the advice in the recent literature, which
is "more clusters is always better" :)
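The over-allocation idea can be sketched like this: pick a deliberately large n_clusters, then hand the one-hot cluster ids to the supervised model and let its regularization prune the rest. The numbers here are illustrative assumptions:

```python
# Hypothetical sketch of over-allocating clusters; sizes are assumptions.
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.preprocessing import OneHotEncoder

rng = np.random.RandomState(0)
X = rng.rand(500, 20)  # stand-in for word feature vectors

# Over-allocate clusters rather than searching for the "true" number.
km = MiniBatchKMeans(n_clusters=100, random_state=0, n_init=3).fit(X)

# One-hot cluster-id features for the downstream supervised tagger,
# which is left to deal with the overcomplete feature space.
enc = OneHotEncoder(handle_unknown="ignore")
cluster_feats = enc.fit_transform(km.labels_.reshape(-1, 1))
```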

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general