adam35413 wrote:
I did some testing and determined that both a local copy of the results files
AND an HDFS copy have to exist. From looking at the code, it appears that
ClusterDump checks if the files exist locally at Line 129, but then tries to
read the files from HDFS around Line 135. If the files
Hi Young,
The problem is one of documentation, and poor naming of the method:
DistributedRowMatrix.times(DistributedRowMatrix m)
should be called
DistributedRowMatrix.transposeTimes(DistributedRowMatrix m),
as it computes a.transpose().times(b), not a.times(b).
See the javadocs for the inte
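To make the naming issue concrete, here is a minimal plain-Java sketch of the two products on small dense arrays. It does not use Mahout at all; the class name TransposeTimesSketch and both helper methods are made up for illustration, and only the formula a.transpose().times(b) comes from the explanation above.

```java
import java.util.Arrays;

// Hypothetical illustration class, not part of Mahout.
public class TransposeTimesSketch {

    // Ordinary product a * b: c[i][j] = sum over k of a[i][k] * b[k][j].
    static double[][] times(double[][] a, double[][] b) {
        int rows = a.length, inner = b.length, cols = b[0].length;
        double[][] c = new double[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int k = 0; k < inner; k++)
                for (int j = 0; j < cols; j++)
                    c[i][j] += a[i][k] * b[k][j];
        return c;
    }

    // a^T * b without building a^T: c[i][j] = sum over k of a[k][i] * b[k][j].
    // Both inputs are read row by row (row k of a with row k of b),
    // so a and b must have the same number of rows.
    static double[][] transposeTimes(double[][] a, double[][] b) {
        int sharedRows = a.length, rows = a[0].length, cols = b[0].length;
        double[][] c = new double[rows][cols];
        for (int k = 0; k < sharedRows; k++)
            for (int i = 0; i < rows; i++)
                for (int j = 0; j < cols; j++)
                    c[i][j] += a[k][i] * b[k][j];
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // a * b    = [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(times(a, b)));
        // a^T * b  = [[26.0, 30.0], [38.0, 44.0]]
        System.out.println(Arrays.deepToString(transposeTimes(a, b)));
    }
}
```

If a result looks "wrong" when compared against a hand-computed a.times(b), comparing it against the second method's output is a quick way to check whether the transpose is the explanation. To get a plain a * b out of a method that really computes the transposed product, one would pass in the transpose of a instead.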
I did some testing and determined that both a local copy of the results files
AND an HDFS copy have to exist. From looking at the code, it appears that
ClusterDump checks if the files exist locally at Line 129, but then tries to
read the files from HDFS around Line 135. If the files don't exist at bot
I'm trying to test DistributedRowMatrix in Eclipse for matrix calculation in
Hadoop.
A =
[[85,68,30,15,50,34],
[53,38,19,70,90,29],
[20,83,19,38,82,34],
[67,50,68,86,64,53],
[84,71,30,85,82,73],
[2,43,54,50,66,31]]
DistributedRowMatrix m = new DistributedRowMatrix(path,...);
and check the values of m
In my opinion the most important take-away from MapReduce is that you
don't move the data, but move the "computation towards the data". So I
implemented a persistent storage, and moved most of the critical
computations directly into the storage (I did play around with Hama
before this).
Right now
The naive matrix-multiplication algorithm is highly parallelizable if
you have the data available locally at all the nodes. The persistent
storage issue was one of the first problems that I tried solving (HDFS
is just wrong for the access patterns in matrix algorithms).
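As a rough single-machine sketch of why the naive algorithm parallelizes well: every row of the result depends only on one row of the left matrix plus the whole right matrix, so rows can be computed independently. The class name ParallelNaiveMultiply is made up for illustration, and Java parallel streams stand in for cluster nodes here; this is not the storage-side implementation described above.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical illustration class: threads play the role of nodes.
public class ParallelNaiveMultiply {

    // Naive O(n^3) product; each result row i is an independent task.
    static double[][] multiply(double[][] a, double[][] b) {
        int rows = a.length;
        int inner = b.length;
        int cols = b[0].length;
        double[][] c = new double[rows][cols];
        // Each task writes only to c[i], so no locking is needed.
        IntStream.range(0, rows).parallel().forEach(i -> {
            for (int k = 0; k < inner; k++) {
                double aik = a[i][k];
                for (int j = 0; j < cols; j++) {
                    c[i][j] += aik * b[k][j];
                }
            }
        });
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        // Prints [[19.0, 22.0], [43.0, 50.0]]
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

The catch is visible in the loop body: every task reads all of b. On one machine that is free, but on a cluster it means b (or the relevant blocks of it) has to be available locally at each node, which is exactly the data-placement requirement mentioned above.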
I can't compete with Matlab