Hello,

I am trying to use the Mahout Java API to do PCA, but I am confused about
the right order in which to do things.  To start, I have a list of
DenseVectors that I read into the code and turn into a distributed matrix
in the following form.

 DistributedRowMatrix m = new DistributedRowMatrix(input_vec, matrix_path,
     num_rows, num_cols);

I would have thought that running this code writes the result to the path
called "matrix_path", so that I could then use something like
MatrixColumnMeansJob.run to get the mean.  However, when I run it I get no
output at all.  Is there something else I should do, or is there a better
way to calculate the mean for my file?


From what I understand about the SSVD CI code, you need to calculate the
column mean and then output it into a directory. Is there a good way to do
this if I am starting from a sequence file of DenseVectors?
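
To be concrete about what I mean by the column mean, here is a minimal
plain-Java sketch. It does not use Mahout or Hadoop, the class and method
names are made up for illustration, and the rows are plain double arrays
standing in for my DenseVectors:

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: in-memory column means.
class ColumnMeans {

    // Returns the average of each column of a row-major matrix.
    static double[] columnMeans(double[][] rows) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("need at least one row");
        }
        int numCols = rows[0].length;
        double[] means = new double[numCols];
        for (double[] row : rows) {
            if (row.length != numCols) {
                throw new IllegalArgumentException("rows differ in length");
            }
            // Accumulate the column sums first.
            for (int j = 0; j < numCols; j++) {
                means[j] += row[j];
            }
        }
        // Divide each sum by the number of rows to get the mean.
        for (int j = 0; j < numCols; j++) {
            means[j] /= rows.length;
        }
        return means;
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0, 2.0}, {3.0, 4.0}};
        System.out.println(Arrays.toString(columnMeans(rows)));
    }
}
```

What I am looking for is the Mahout way of getting this same vector for a
matrix that lives in a sequence file rather than in memory.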


-- 

*Chirag Lakhani*

Data Scientist

Zaloni, Inc. | www.zaloni.com

633 Davis Dr., Suite 200

Durham, NC 27713
e: clakh...@zaloni.com
p: 919.602.4965 x7020
