2013/7/11 Gad Abraham <[email protected]>:
> I'm very much a sklearn beginner, and I'd like to use FeatureHasher to
> reduce the dimensionality of a numeric matrix. Any hints on how to do this?
> I've seen the examples showing how to use it with text.
You mean the input is a NumPy array? There's no special support for
that, but the following should work (though it may be slow). Let X be
your array and d the desired dimensionality, then:
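    from sklearn.feature_extraction import FeatureHasher
    hasher = FeatureHasher(n_features=d, input_type="pair")
    # Use a list rather than a lazy map() so the names can be reused for every row.
    features = [str(j) for j in range(X.shape[1])]
    # Each row becomes an iterable of (feature_name, value) pairs.
    Xh = hasher.transform(zip(features, row) for row in X).toarray()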
hashes X into Xh of shape (X.shape[0], d).
You might want to look at the random projection module [1], which can
do somewhat similar transforms much more quickly.
[1] http://scikit-learn.org/stable/modules/random_projection.html#random-projection
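For comparison, here is a rough sketch of the random projection route. It assumes a dense NumPy array; the data, the shapes, and the choice of SparseRandomProjection (rather than GaussianRandomProjection) are made up for illustration, so treat it as a starting point, not a recommendation:

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection

# Made-up example data: 100 samples with 1000 numeric features.
rng = np.random.RandomState(0)
X = rng.rand(100, 1000)
d = 50  # desired dimensionality (illustrative value)

# Project onto d components using a sparse random matrix.
projector = SparseRandomProjection(n_components=d, random_state=0)
Xp = projector.fit_transform(X)

print(Xp.shape)
```

Unlike the hashing approach, this needs a fit step to draw the projection matrix, but the transform itself is a single matrix product.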
--
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general