[ https://issues.apache.org/jira/browse/MAHOUT-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13159075#comment-13159075 ]
Jake Mannix commented on MAHOUT-880:
------------------------------------

Many of these sound great, yes!

I'd have one suggestion, however: DistributedRowMatrix implements the interface VectorIterable, which the interface Matrix extends. The methods you mention which are already in Matrix should just get pulled up into VectorIterable. Of course, that requires some careful checking that a call like DistributedRowMatrix.minus(DenseMatrix) behaves sensibly. I would imagine this case is handled by the fact that there is no sensible reason why you would have a DistributedRowMatrix and a DenseMatrix of the exact same cardinalities (one fits in RAM, but the other needs to live on HDFS?).

Regarding some of these methods:

4) I'm not sure about. Do we have uses for these? If you have a DistributedRowMatrix (a humongous HDFS SequenceFile of Vectors), what exactly are you going to do with the upper triangle of it? Diagonal I can see, I guess: extract a vector of the diagonal from the whole distributed matrix, sure.

6) is actually being looked at in MAHOUT-884.

7) We like solvers, yes, but the methods don't go in our matrix classes; they go in separate solver classes and take a matrix (or DistributedRowMatrix) as input.

8) is also good, and we'd always like more I/O hooks, but again, these should be in other classes, and in some ways this already exists: VectorDumper allows the option of dumping a DistributedRowMatrix from SequenceFile to CSV, and I think we have some support for ARFF as well, somewhere.

> Add some matrix method(like addition, subtraction, norm ...
etc) to
> DistributedRowMatrix
> ----------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-880
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-880
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Math
>    Affects Versions: 0.6
>            Reporter: Wangda Tan
>            Priority: Minor
>              Labels: DistributedRowMatrix
>         Attachments: MAHOUT-880.patch
>
>
> I'm new to Mahout, and I couldn't find some basic matrix functions. This means users cannot do many tasks through the CLI or API: if a user gets a result from an existing map-reduce matrix operation (like SVD), he cannot take further steps.
> Here is my list:
> 1) Addition, subtraction
> 2) Norm (like norm-1, norm-2, norm-Frobenius)
> 3) Matrix compare
> 4) Get lower triangle, upper triangle and diagonal
> 5) Get identity and zero matrix
> 6) Put two or more matrices together: A = [A1, A2]
> 7) More linear equation solver methods, like Gaussian elimination (maybe it's hard to implement)
> 8) Import and export CSV, ARFF ... (this will be very useful when a user wants to exchange results with other applications like MATLAB)
> I want to know whether there is any plan to do this; if so, I can make some effort to implement these.
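
As a rough illustration of the design suggested in the comment (operations living in a separate class that takes the matrix as input, rather than in the matrix class itself), here is a minimal sketch of two of the requested norms computed by iterating over rows. It does not use the Mahout API: the class name RowNorms and its methods are hypothetical, and plain double[] rows stand in for Mahout Vectors. A real DistributedRowMatrix version would presumably run as a map-reduce job over the SequenceFile instead of a local loop.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Mahout: norms over any row-iterable matrix. */
final class RowNorms {
    private RowNorms() {}

    /** Frobenius norm: square root of the sum of all squared entries. */
    static double frobenius(Iterable<double[]> rows) {
        double sumOfSquares = 0.0;
        for (double[] row : rows) {
            for (double v : row) {
                sumOfSquares += v * v;
            }
        }
        return Math.sqrt(sumOfSquares);
    }

    /** Matrix 1-norm: the largest absolute column sum. */
    static double norm1(Iterable<double[]> rows, int numCols) {
        // Accumulate absolute column sums in a single pass over the rows.
        double[] colSums = new double[numCols];
        for (double[] row : rows) {
            for (int j = 0; j < numCols; j++) {
                colSums[j] += Math.abs(row[j]);
            }
        }
        double max = 0.0;
        for (double s : colSums) {
            max = Math.max(max, s);
        }
        return max;
    }

    public static void main(String[] args) {
        // Small in-memory stand-in for a matrix stored as a sequence of rows.
        List<double[]> rows = Arrays.asList(
            new double[] {3.0, -4.0},
            new double[] {0.0, 12.0});
        System.out.println(frobenius(rows)); // sqrt(9 + 16 + 144) = 13.0
        System.out.println(norm1(rows, 2));  // max(3, 16) = 16.0
    }
}
```

Because both methods only need to see each row once, the same shape of code works whether the rows come from memory or are streamed from disk, which is the property that makes these norms a natural fit for a row-oriented interface.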