Hello, I am trying to use the Mahout/Java API to do PCA, but I am confused about the right order in which to do things. To start, I have a list of DenseVectors that I read into the code and turn into a distributed matrix in the following form.
DistributedRowMatrix m = new DistributedRowMatrix(input_vec, matrix_path, num_rows, num_cols);

I would have thought that running this code writes the result to the path called "matrix_path", so that I can then use something like MatrixColumnMeansJob.run to get the column means. When I run it, however, I get no output. Is there something else I should do, or is there a better way to calculate the column means for my file?

From what I understand about the SSVD CLI code, you need to calculate the column mean and then output it into a directory. Is there a good way to do this if I am starting from a file which is a sequence file of DenseVectors?

--
*Chirag Lakhani*
Data Scientist
Zaloni, Inc. | www.zaloni.com
633 Davis Dr., Suite 200
Durham, NC 27713
e: clakh...@zaloni.com
p: 919.602.4965 x7020
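To make clear which quantity I am after, here is a minimal plain-Java sketch of the per-column mean over all rows. It does not use Mahout or Hadoop; the class name ColumnMeans and its method are hypothetical, and it assumes the rows fit in memory as double arrays, which will not hold for a real distributed matrix.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout: computes per-column means in memory.
final class ColumnMeans {

    // Returns the mean of each column over all rows.
    // Every row must have the same number of entries.
    static double[] columnMeans(List<double[]> rows) {
        if (rows.isEmpty()) {
            throw new IllegalArgumentException("need at least one row");
        }
        int numCols = rows.get(0).length;
        double[] means = new double[numCols];
        // First accumulate the column sums.
        for (double[] row : rows) {
            if (row.length != numCols) {
                throw new IllegalArgumentException("rows differ in length");
            }
            for (int j = 0; j < numCols; j++) {
                means[j] += row[j];
            }
        }
        // Then divide each sum by the number of rows.
        for (int j = 0; j < numCols; j++) {
            means[j] /= rows.size();
        }
        return means;
    }

    public static void main(String[] args) {
        List<double[]> rows = Arrays.asList(
                new double[] {1.0, 2.0},
                new double[] {3.0, 6.0});
        // prints [2.0, 4.0]
        System.out.println(Arrays.toString(columnMeans(rows)));
    }
}
```

For PCA this vector of means is what gets subtracted from every row to center the data, so what I am looking for is the distributed equivalent of this, written out to a path.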