Dear Mahout-Users,
I would like to use the ALS implementation available in Mahout as a
reference in a performance evaluation. As I have little knowledge of
the Mahout implementation, the challenge for me is to ensure that I am
running exactly the setup I intend to measure.
I want to obtain timings for the alternating least squares (ALS)
iterations, using a defined test matrix and a dimension of 'f' for the
small matrix used in the minimization process. I think it is called the
'feature space dimension' in the literature.
Assuming I want to run 10 iterations, use a feature space dimension of
50 and 8 threads, is the following command correct, or does this include
more than the ALS algorithm?
mahout parallelALS --input test.data --output output --lambda 0.1
--implicitFeedback true --alpha 0.8 --numFeatures 50 --numIterations 10
--numThreadsPerSolver 8 --tempDir tmp
I am asking because the runtime seems to be quite long. In case the
timing includes operations other than the ALS iterations, is there a way
to exclude them?
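To illustrate the kind of measurement I have in mind, here is a minimal
Python sketch. It is generic and not Mahout code: `setup` and `step` are
hypothetical placeholders for "everything before the first iteration"
(e.g. loading the data) and "one ALS iteration", and the names are only
assumptions for illustration.

```python
import time


def time_iterations(setup, step, num_iterations):
    """Time the setup phase and each iteration separately.

    setup() returns an initial state; step(state) returns the next state.
    Both are hypothetical callables standing in for the real work.
    """
    start = time.perf_counter()
    state = setup()
    setup_seconds = time.perf_counter() - start

    iteration_seconds = []
    for _ in range(num_iterations):
        start = time.perf_counter()
        state = step(state)
        iteration_seconds.append(time.perf_counter() - start)

    return setup_seconds, iteration_seconds, state


if __name__ == "__main__":
    # Dummy workload: the "state" is just a counter incremented per iteration.
    setup_s, iter_s, final_state = time_iterations(lambda: 0, lambda s: s + 1, 10)
    print("setup: %.6f s" % setup_s)
    print("iterations: %d, total %.6f s" % (len(iter_s), sum(iter_s)))
```

In other words, I would like the sum of the per-iteration times reported
separately from the setup time, rather than one number for the whole run.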
I appreciate any feedback! Thanks in advance, Hartwig