Jeff, it's the MAHOUT-376 patch; I don't think it has been committed. The driver class there is SSVDCli. For your convenience, you can find it here: https://github.com/dlyubimov/ssvd-lsi/tree/givens-ssvd/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd
But like I said, I did not try to use it with the -D option, since I wanted to provide an explicit option (with help text) to increase the split size if needed. Another reason is that the solver runs a series of jobs, and only those that read the source matrix are affected by the split size.

-d

On Tue, Dec 28, 2010 at 4:39 PM, Jeff Eastman <[email protected]> wrote:

> What's the driver class? If the -D parameters are working for you I want to
> compare to the clustering drivers
>
> -----Original Message-----
> From: Dmitriy Lyubimov [mailto:[email protected]]
> Sent: Tuesday, December 28, 2010 4:37 PM
> To: [email protected]
> Subject: Re: where i can set -Dmapred.map.tasks=X
>
> As far as I understand, this option is not forced. I suspect it actually
> means 'minimum degree of parallelism'. So if you expect to use it to
> reduce the number of mappers, I don't think that is expected to work.
> The ones that do enforce anything are min split size and max split size in
> file input, so I guess you can try those. I rely on them (and open them up
> as a job-specific option) in stochastic svd.
>
> But usually forcing the split size to increase creates a 'supersplits'
> problem, where a lot of data is moved around just to supply data to
> mappers, which is perhaps why this option is meant to increase parallelism
> only, but probably not to decrease it.
>
> -d
>
> On Tue, Dec 28, 2010 at 4:05 PM, Jeff Eastman <[email protected]> wrote:
>
> > This is supposed to be a generic option. You should be able to specify
> > Hadoop options such as this on the command line invocation of your
> > favorite Mahout routine, but I'm having a similar problem setting
> > -Dmapred.reduce.tasks=10 with Canopy and k-Means. This is both with and
> > without a space after the -D.
> >
> > Can someone point me to a Mahout command where this does work? Both
> > drivers extend AbstractJob and do the usual option processing pushups.
> > I don't have Hadoop source locally so I can't debug the generic options
> > parsing.
> >
> > -----Original Message-----
> > From: beneo_7 [mailto:[email protected]]
> > Sent: Monday, December 27, 2010 10:45 PM
> > To: [email protected]
> > Subject: where i can set -Dmapred.map.tasks=X
> >
> > I read in Mahout in Action that I should set -Dmapred.map.tasks=X
> > but it did not work for hadoop
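
For illustration, here is a minimal sketch of the split-size arithmetic discussed in this thread. It assumes the rule used by the old-API (org.apache.hadoop.mapred) FileInputFormat: goalSize = totalSize / requested maps, and splitSize = max(minSize, min(goalSize, blockSize)). The class name, method names, and the 1 GB / 64 MB figures are hypothetical, and the model simplifies real Hadoop, which computes splits per file and allows some slop on the last split.

```java
public class SplitSizeSketch {

    // Models the old-API rule: the requested map count only sets a "goal" size,
    // which is then capped by the block size and floored by the min split size.
    public static long splitSize(long totalSize, int requestedMaps, long minSize, long blockSize) {
        long goalSize = totalSize / Math.max(1, requestedMaps);
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Approximate number of map tasks for a single input file (ceiling division;
    // ignores the slop factor real Hadoop applies to the last split).
    public static long numSplits(long totalSize, long splitSize) {
        return (totalSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long total = 1024 * mb;  // hypothetical 1 GB input file
        long block = 64 * mb;    // hypothetical 64 MB block size

        // Asking for fewer maps than there are blocks has no effect: still 16 splits.
        System.out.println(numSplits(total, splitSize(total, 4, 1, block)));

        // Asking for more maps than there are blocks does work: 64 splits.
        System.out.println(numSplits(total, splitSize(total, 64, 1, block)));

        // Raising the min split size above the block size reduces the count: 4 splits.
        System.out.println(numSplits(total, splitSize(total, 4, 256 * mb, block)));
    }
}
```

Under these assumptions, the requested map count can only shrink splits below the block size, which matches the 'minimum degree of parallelism' reading; only the min split size can push the split above the block size, at the cost of mappers reading non-local blocks.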
