2013/7/14 Lars Buitinck <[email protected]>:
> 2013/7/12 Olivier Grisel <[email protected]>:
>> 2013/7/12 Lars Buitinck <[email protected]>:
>>> 2013/7/12 Antonio Manuel Macías Ojeda <[email protected]>:
>>> Pretty good results actually. I was clustering these words to get
>>> extra features for a NER tagger, which immediately got a boost in F1
>>> score.
>>
>> Interesting. Do you run a clustering algorithm for each individual
>> word, or do you cluster the POS tag contexts for all the center words
>> at once?
>
> All words, represented as typical NER feature vectors (previous word
> is "mr.", capitalization, that kind of stuff, and conjunctions of
> these). The trick to make this work is to train a feature selector on
> the labeled set first; otherwise the centroids get huge.
>
> (Also, L2-normalization seems to help; I'm not really sure why yet.
> It might have to do with the conjunctive features.)
>
>> How many clusters do you extract? Have you tried any heuristics to find
>> the "true" number of clusters, or do you just over-allocate n_clusters
>> and let the supervised model that will use the cluster activation
>> features deal with an overcomplete feature space?
>
> So far, I've been following the advice in the recent literature, which
> is "more clusters is always better" :)
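[Editor's note: the pipeline Lars describes above (train a feature selector on the labeled set, L2-normalize, then over-allocate clusters) might be sketched roughly as follows. The data here is synthetic and all parameter values are illustrative assumptions, not taken from the thread.]

```python
# Hedged sketch of the described pipeline, on synthetic stand-in data.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import normalize
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)
n_words, n_features = 1000, 500
# Stand-in for sparse binary/count NER feature vectors (context word
# identity, capitalization, conjunctions of these, ...).
X = rng.poisson(0.1, size=(n_words, n_features)).astype(float)
y = rng.randint(0, 5, size=n_words)  # stand-in NER labels (labeled set)

# 1) Train a feature selector on the labeled set first,
#    otherwise the centroids get huge.
selector = SelectKBest(chi2, k=50).fit(X, y)
X_sel = selector.transform(X)

# 2) L2-normalize the reduced vectors.
X_norm = normalize(X_sel, norm="l2")

# 3) Over-allocate n_clusters ("more clusters is always better") and let
#    the downstream supervised model sort out the overcomplete features.
km = MiniBatchKMeans(n_clusters=100, random_state=0, n_init=3).fit(X_norm)
cluster_ids = km.labels_  # one extra categorical feature per word
```

The cluster id of each word would then be appended as an extra feature for the NER tagger.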
Thanks! Looking forward to reading the preprint or some code on this :)

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
