[ 
https://issues.apache.org/jira/browse/MAHOUT-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159075#comment-13159075
 ] 

Jake Mannix commented on MAHOUT-880:
------------------------------------

Many of these sound great, yes!

I'd have one suggestion, however: DistributedRowMatrix implements the interface 
VectorIterable, which the interface Matrix extends.  The methods you mention 
which are already in Matrix should just get pulled up into VectorIterable.
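
To make the "pull up" idea concrete, here is a minimal sketch.  The names 
below (RowIterable, RandomAccessMatrix, StreamedRows) are hypothetical 
stand-ins, not the real Mahout interfaces, and the default method assumes 
Java 8 or later.  The point is only that an operation needing a single pass 
over the rows can live on the row-iteration interface, so both in-memory and 
row-streamed matrices get it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-ins; NOT the real Mahout interfaces.
final class PullUpSketch {

  /** Row-streaming view: roughly what a row-oriented matrix on disk can offer. */
  interface RowIterable extends Iterable<double[]> {
    int numCols();

    /** A "pulled up" operation: it needs only one pass over the rows. */
    default double frobeniusNorm() {
      double sumOfSquares = 0.0;
      for (double[] row : this) {
        for (double v : row) {
          sumOfSquares += v * v;
        }
      }
      return Math.sqrt(sumOfSquares);
    }
  }

  /** In-memory matrices add random access on top of row iteration. */
  interface RandomAccessMatrix extends RowIterable {
    double get(int row, int col);
  }

  /** Stand-in for a row-streamed matrix: rows only, no random access. */
  static final class StreamedRows implements RowIterable {
    private final List<double[]> rows;

    StreamedRows(List<double[]> rows) {
      this.rows = new ArrayList<>(rows);
    }

    @Override
    public int numCols() {
      return rows.isEmpty() ? 0 : rows.get(0).length;
    }

    @Override
    public Iterator<double[]> iterator() {
      return rows.iterator();
    }
  }
}
```

Anything implementing RandomAccessMatrix would inherit frobeniusNorm() 
unchanged, which is the appeal of putting it on the parent interface.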

Of course, this requires some careful checking that a call such as 
DistributedRowMatrix.minus(DenseMatrix) behaves sensibly.  I would imagine this 
case rarely comes up in practice, since there is no sensible reason why you 
would have a DistributedRowMatrix and a DenseMatrix of the exact same 
cardinalities (one fits in RAM, but the other needs to live on HDFS?).
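
As a rough illustration of the kind of checking meant here, the sketch below 
subtracts two row streams and rejects mismatched cardinalities.  It is a 
hypothetical helper over plain double[] rows, not Mahout code, and it 
collects the result in memory purely to keep the example short; a 
distributed version would write rows out instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper; NOT part of Mahout.
final class CardinalityCheckSketch {

  /** Computes a - b row by row, failing fast if the shapes differ. */
  static List<double[]> minus(Iterable<double[]> a, Iterable<double[]> b) {
    List<double[]> result = new ArrayList<>();
    Iterator<double[]> rowsOfA = a.iterator();
    Iterator<double[]> rowsOfB = b.iterator();
    while (rowsOfA.hasNext() && rowsOfB.hasNext()) {
      double[] rowA = rowsOfA.next();
      double[] rowB = rowsOfB.next();
      if (rowA.length != rowB.length) {
        throw new IllegalArgumentException(
            "column cardinality mismatch: " + rowA.length + " vs " + rowB.length);
      }
      double[] diff = new double[rowA.length];
      for (int j = 0; j < diff.length; j++) {
        diff[j] = rowA[j] - rowB[j];
      }
      result.add(diff);
    }
    // One side still has rows left, so the row counts differ.
    if (rowsOfA.hasNext() || rowsOfB.hasNext()) {
      throw new IllegalArgumentException("row cardinality mismatch");
    }
    return result;
  }
}
```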

Regarding some of these methods: I'm not sure about 4) - do we have uses for 
these?  If you have a DistributedRowMatrix (a humongous HDFS SequenceFile of 
Vectors), what exactly are you going to do with the upper triangle of it?  
The diagonal I can see, I guess: extracting a vector of the diagonal from the 
whole distributed matrix, sure.
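
For what it's worth, extracting the diagonal needs only one pass over the 
rows.  The sketch below assumes each row arrives as a (row index, values) 
pair in arbitrary order; the class and method names are hypothetical and 
this is plain Java, not a map-reduce job.

```java
import java.util.Map;

// Hypothetical helper; NOT part of Mahout.
final class DiagonalSketch {

  /**
   * Collects entry (i, i) from each indexed row in a single pass.
   * Rows may arrive in any order; positions with no row stay 0.0.
   */
  static double[] diagonal(Iterable<Map.Entry<Integer, double[]>> indexedRows, int size) {
    double[] diag = new double[size];
    for (Map.Entry<Integer, double[]> entry : indexedRows) {
      int i = entry.getKey();
      double[] row = entry.getValue();
      // Skip rows that fall outside the requested diagonal length.
      if (i >= 0 && i < size && i < row.length) {
        diag[i] = row[i];
      }
    }
    return diag;
  }
}
```

Only the output vector is held in memory, which is why the diagonal is cheap 
in a way that an upper triangle of the same matrix is not.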

6) is actually being looked at in MAHOUT-884.

7) We like solvers, yes, but the methods don't go in our matrix classes; they 
go in separate solver classes, which take a matrix (or DistributedRowMatrix) 
as input.
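
A minimal sketch of that separation, with hypothetical names throughout 
(SquareMatrix, DenseSquare, JacobiSolver are not Mahout classes): the solver 
is its own class and only asks the matrix for a matrix-vector product and 
its diagonal.  Jacobi iteration is used here just because it is short; it is 
only guaranteed to converge for matrices such as strictly diagonally 
dominant ones.

```java
// Hypothetical names; NOT the Mahout solver API.
final class SolverSketch {

  /** The little a solver needs to know about a matrix. */
  interface SquareMatrix {
    int size();
    double diagonal(int i);
    double[] times(double[] x);
  }

  /** Small in-memory implementation, used only for illustration. */
  static final class DenseSquare implements SquareMatrix {
    private final double[][] a;

    DenseSquare(double[][] a) {
      this.a = a;
    }

    @Override public int size() { return a.length; }

    @Override public double diagonal(int i) { return a[i][i]; }

    @Override public double[] times(double[] x) {
      double[] y = new double[a.length];
      for (int i = 0; i < a.length; i++) {
        for (int j = 0; j < x.length; j++) {
          y[i] += a[i][j] * x[j];
        }
      }
      return y;
    }
  }

  /** Solver logic lives outside the matrix classes. */
  static final class JacobiSolver {
    private final int maxIterations;
    private final double tolerance;

    JacobiSolver(int maxIterations, double tolerance) {
      this.maxIterations = maxIterations;
      this.tolerance = tolerance;
    }

    /** Approximates x with A x = b, starting from the zero vector. */
    double[] solve(SquareMatrix a, double[] b) {
      double[] x = new double[a.size()];
      for (int iter = 0; iter < maxIterations; iter++) {
        double[] ax = a.times(x); // computed from the previous iterate only
        double maxResidual = 0.0;
        for (int i = 0; i < x.length; i++) {
          double residual = b[i] - ax[i];
          maxResidual = Math.max(maxResidual, Math.abs(residual));
          x[i] += residual / a.diagonal(i);
        }
        if (maxResidual < tolerance) {
          break;
        }
      }
      return x;
    }
  }
}
```

Because the solver depends only on the small interface, any matrix that can 
supply a product and a diagonal could be passed in without changing it.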

8) is also good, and we'd always like more I/O hooks, but again, these should 
be in other classes, and in some ways they already exist: VectorDumper allows 
the option of dumping a DistributedRowMatrix from SequenceFile to CSV, and I 
think we have some support for ARFF as well, somewhere.
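
To show what "in other classes" could look like for export, here is a small 
sketch of a standalone CSV writer over a stream of rows.  It is a 
hypothetical class in plain Java; it does not reproduce VectorDumper or 
read SequenceFiles.

```java
import java.io.PrintWriter;
import java.io.Writer;

// Hypothetical exporter; NOT VectorDumper and NOT part of Mahout.
final class CsvExportSketch {

  /** Writes one comma-separated line per row, one row at a time. */
  static void writeCsv(Iterable<double[]> rows, Writer out) {
    PrintWriter writer = new PrintWriter(out);
    for (double[] row : rows) {
      for (int j = 0; j < row.length; j++) {
        if (j > 0) {
          writer.print(',');
        }
        writer.print(row[j]);
      }
      writer.print('\n'); // fixed newline rather than the platform separator
    }
    writer.flush();
  }
}
```

Rows are written as they are read, so nothing beyond the current row needs 
to fit in memory.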
                
> Add some matrix method(like addition, subtraction, norm ... etc) to 
> DistributedRowMatrix
> ----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-880
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-880
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.6
>            Reporter: Wangda Tan
>            Priority: Minor
>              Labels: DistributedRowMatrix
>         Attachments: MAHOUT-880.patch
>
>
> I'm new to Mahout, and I couldn't find some basic matrix functions. Without 
> them, users cannot do many tasks through the CLI or API: if a user gets a 
> result from an existing map-reduce matrix operation (like SVD), he cannot 
> take further steps. Here is my list:
> 1) Addition, Subtraction 
> 2) Norm (like norm-1, norm-2, norm-frobenius)
> 3) Matrix compare
> 4) Get lower triangle, upper triangle and diagonal
> 5) Get identity and zero matrix
> 6) Put two or more matrices together: A = [A1, A2]
> 7) More linear equation solver methods, like Gaussian elimination (maybe it's 
> hard to implement)
> 8) Import and export CSV, ARFF ... (this will be very useful when a user 
> wants to reuse results from or in other applications like MATLAB)
> I want to know whether there is any plan to do this; if so, I can make some 
> effort to implement these.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
