Hi,

I'm trying to apply PCA to reduce the dimensionality of a matrix with 1603
columns and between 100,000 and 30,000,000 rows, using ssvd with the pca
option, and I always get a StackOverflowError.

Here is my command line:
mahout ssvd -i /user/myUser/Echant100k -o /user/myUser/Echant/SVD100 -k 100
-pca "true" -U "false" -V "false" -t 3 -ow

I also tried adding "-us true", as mentioned in
https://cwiki.apache.org/confluence/download/attachments/27832158/SSVD-CLI.pdf?version=18&modificationDate=1381347063000&api=v2
but that option is no longer available.

The output of the command above is:
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH/lib/hadoop/bin/hadoop
and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.7-cdh4.5.0-job.jar
14/03/04 14:45:16 INFO common.AbstractJob: Command line arguments:
{--abtBlockHeight=[200000], --blockHeight=[10000], --broadcast=[true],
--computeU=[false], --computeV=[false], --endPhase=[2147483647],
--input=[/user/myUser/Echant100k], --minSplitSize=[-1],
--outerProdBlockHeight=[30000], --output=[/user/myUser/Echant/SVD100],
--oversampling=[15], --overwrite=null, --pca=[true], --powerIter=[0],
--rank=[100], --reduceTasks=[3], --startPhase=[0], --tempDir=[temp],
--uHalfSigma=[false], --vHalfSigma=[false]}
Exception in thread "main" java.lang.StackOverflowError
at
org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
 at
org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
at
org.apache.mahout.math.hadoop.MatrixColumnMeansJob.run(MatrixColumnMeansJob.java:55)
...
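
For context, my understanding (an assumption on my part, based on the class
name in the trace) is that the pca option first computes the column means of
the input and subtracts them before the decomposition, and that this is the
step that fails. On a toy scale, that step amounts to something like the plain
Python sketch below. It is not Mahout code, and `column_means` and `center`
are names I made up for illustration.

```python
def column_means(rows):
    """Return the mean of each column of a dense matrix given as a list of rows."""
    n = len(rows)
    sums = [0.0] * len(rows[0])
    for row in rows:
        for j, value in enumerate(row):
            sums[j] += value
    return [s / n for s in sums]


def center(rows):
    """Subtract each column's mean from every entry in that column."""
    means = column_means(rows)
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


# Tiny made-up matrix standing in for the 1603-column input.
sample = [[1.0, 2.0], [3.0, 6.0]]
centered = center(sample)
```

The decomposition to rank 100 would then run on the centered matrix.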

I searched online but didn't find a solution to my problem.

Can you help me?

Thanks in advance,

-- 
Kévin Moulart
