[ 
https://issues.apache.org/jira/browse/MAHOUT-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687584#comment-13687584
 ] 

Jake Mannix commented on MAHOUT-1266:
-------------------------------------

As mentioned in the javadocs for the method, it does *not* implement A * B. It 
implements A.transpose() * B, because that operation can be done in one 
map-reduce pass (with the SequenceFiles backing both A and B as inputs), while 
computing A * B takes two map-reduce passes.
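
To illustrate why one pass is enough, here is a minimal in-memory sketch. It 
is not the Mahout map-reduce job: the class and method names 
(TransposeTimesSketch, transposeTimes) are hypothetical and plain double[][] 
arrays stand in for the row-keyed SequenceFiles. It relies only on the 
identity that A.transpose() * B is the sum, over row index r, of the outer 
product of row r of A with row r of B, so each pair of matching rows is read 
once. It also shows the shape requirement for this product: A and B must have 
the same number of rows.

```java
import java.util.Arrays;

public class TransposeTimesSketch {

    // Computes A^T * B by visiting row r of A and row r of B together,
    // accumulating the outer product of the two rows into the result.
    // Assumption: dense, rectangular arrays stand in for distributed rows.
    public static double[][] transposeTimes(double[][] a, double[][] b) {
        if (a.length != b.length) {
            // (rows x m)^T * (rows x n) = (m x n), so row counts must match.
            throw new IllegalArgumentException(
                "A and B must have the same number of rows");
        }
        int rows = a.length;
        int m = rows == 0 ? 0 : a[0].length;
        int n = rows == 0 ? 0 : b[0].length;
        double[][] c = new double[m][n];
        for (int r = 0; r < rows; r++) {        // single pass over the rows
            for (int i = 0; i < m; i++) {
                double ari = a[r][i];
                if (ari == 0.0) {
                    continue;                   // nothing to add for this entry
                }
                for (int j = 0; j < n; j++) {
                    c[i][j] += ari * b[r][j];   // outer-product contribution
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}, {5, 6}}; // 3 x 2
        double[][] b = {{1, 0}, {0, 1}, {1, 1}}; // 3 x 2
        // prints [[6.0, 8.0], [8.0, 10.0]]
        System.out.println(Arrays.deepToString(transposeTimes(a, b)));
    }
}
```

Computing A * B the same way would need the columns of A lined up with the 
rows of B, which a row-oriented layout does not provide without transposing A 
first.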

Why try to speed up the process with a GPU, as in the code you linked to, if 
you still have to make two full passes (your call to .transpose()) over your 
distributed data set?  That will inevitably be far slower than anything (even 
unoptimized) that you can compute in one MR pass, simply because of all the 
disk IO.
                
> Two minor problems in DistributedRowMatrix using MatrixMultiplication
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-1266
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1266
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.7
>            Reporter: Martin Illecker
>            Priority: Trivial
>              Labels: newbie
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> Hello,
> I think I have found two minor problems in *DistributedRowMatrix*.
> In [1] the condition is wrong, because (l x m) * (m x n) = (l x n).
> The condition should be like in [2]. 
> And in *times*[3] the {{this.transpose()}} seems to be missing? (See [4])
> Do you have any benchmark results for Mahout MatrixMultiplication?
> Thanks!
> Martin
> [1] 
> [https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java#L191-193]
> [2] 
> [https://github.com/millecker/applications/blob/master/hadoop/rootbeer/matrixmultiplication/src/at/illecker/hadoop/rootbeer/examples/matrixmultiplication/DistributedRowMatrix.java#L221-225]
> [3] 
> [https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java#L190-206]
> [4] 
> [https://github.com/millecker/applications/blob/master/hadoop/rootbeer/matrixmultiplication/src/at/illecker/hadoop/rootbeer/examples/matrixmultiplication/DistributedRowMatrix.java#L230-231]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
