2010/1/19 Ted Dunning <[email protected]>:
> Look at BinaryRandomizer (which implements TermRandomizer).
>
> On Tue, Jan 19, 2010 at 10:58 AM, Jeff Eastman
> <[email protected]>wrote:
>
>> Looking in MAHOUT-228-3.patch, I don't see any sparse vectorizer. Did you
>> have another patch in mind?

If you plan to apply the MAHOUT-228-3 patch, you can directly clone my git branch:

http://github.com/ogrisel/mahout/commits/MAHOUT-228

I have also just added a sample Hadoop driver that deterministically maps documents of unbounded dimensionality to a fixed-dimensional space, using the random projection induced by the BinaryRandomizer.

--
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name
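
For readers who have not looked at the branch, here is a rough sketch of the general idea behind a deterministic, hash-based binary random projection. It is not the code from the patch and does not reproduce the BinaryRandomizer or TermRandomizer interfaces: the class name HashedProjectionSketch, the project method and its dim and probes parameters are all made up for illustration. Each term is hashed to a few (index, sign) pairs in a fixed-size vector, so the same document always yields the same output regardless of how large the vocabulary grows.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Mahout BinaryRandomizer.
public class HashedProjectionSketch {

    // 32-bit finalizer from MurmurHash3, used here only to scramble bits.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Maps a bag of terms to a dense vector of length dim.
    // Each term contributes +1 or -1 to `probes` hashed positions.
    static double[] project(List<String> terms, int dim, int probes) {
        double[] out = new double[dim];
        for (String term : terms) {
            int base = term.hashCode(); // specified by the JDK, so stable across runs
            for (int p = 0; p < probes; p++) {
                int h = mix(base + 0x9e3779b9 * (p + 1));
                int index = Math.floorMod(h, dim);
                // Second scramble so the sign is not tied to the index parity.
                double sign = (mix(h ^ 0x5bd1e995) & 1) == 0 ? 1.0 : -1.0;
                out[index] += sign;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> doc = Arrays.asList("random", "projection", "of", "terms");
        System.out.println(Arrays.toString(project(doc, 8, 2)));
    }
}
```

Because no random number generator state is involved, independent mappers can project documents without sharing a dictionary, which is what makes this kind of scheme convenient in a Hadoop job.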
