[ https://issues.apache.org/jira/browse/MAHOUT-884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13149487#comment-13149487 ]
Jake Mannix commented on MAHOUT-884:
------------------------------------

Ah! So the point is to concatenate the *rows themselves*. This makes much more sense, yes, I can see wanting to do this.

Then this should be a map-reduce job, not a sequential process, as these matrices could be really large. Identity mapper + reduce-side join with concatenation would be the most straightforward scalable way to do it.

> Matrix Concatenate utility
> --------------------------
>
>                 Key: MAHOUT-884
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-884
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Integration
>            Reporter: Lance Norskog
>            Priority: Minor
>         Attachments: MAHOUT-884.patch
>
>
> Utility to concatenate matrices stored as SequenceFiles of vectors.
> Each pair in the SequenceFile is the IntWritable row number and a
> VectorWritable.
> The input and output files may skip rows.
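
For illustration only, the "identity mapper + reduce-side join with concatenation" idea from the comment can be sketched in plain Python, with no Hadoop or Mahout involved. This is not the attached patch. Every name here (`map_rows`, `shuffle`, `reduce_row`, `concatenate`) is hypothetical. Two things are assumptions not stated in the thread: the mapper tags each row with its source matrix so the reducer can keep left columns before right columns, and a row that one input skips is filled with zeros of that input's width.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, List[float]]           # (row number, dense row values)
TaggedValue = Tuple[int, List[float]]   # (source tag, row values)


def map_rows(rows: Iterable[Row], tag: int) -> Iterator[Tuple[int, TaggedValue]]:
    """Emit each row unchanged under its row number, tagged with its source.

    The tag is an assumption: a pure identity mapper would not add it, but the
    reducer needs some way to tell the left matrix from the right one.
    """
    for row_id, values in rows:
        yield row_id, (tag, values)


def shuffle(records: Iterable[Tuple[int, TaggedValue]]) -> Dict[int, List[TaggedValue]]:
    """Group values by row number, standing in for the framework's shuffle."""
    groups: Dict[int, List[TaggedValue]] = defaultdict(list)
    for row_id, tagged in records:
        groups[row_id].append(tagged)
    return groups


def reduce_row(tagged_values: List[TaggedValue], widths: List[int]) -> List[float]:
    """Concatenate one row's pieces in source order.

    A piece missing from one input is zero-filled (an assumption about how
    skipped rows should be treated).
    """
    pieces = {tag: values for tag, values in tagged_values}
    out: List[float] = []
    for tag, width in enumerate(widths):
        out.extend(pieces.get(tag, [0.0] * width))
    return out


def concatenate(left: Iterable[Row], right: Iterable[Row],
                left_cols: int, right_cols: int) -> List[Row]:
    """Join two row-keyed matrices column-wise, one output row per row number."""
    records = list(map_rows(left, 0)) + list(map_rows(right, 1))
    groups = shuffle(records)
    widths = [left_cols, right_cols]
    return [(row_id, reduce_row(groups[row_id], widths)) for row_id in sorted(groups)]


if __name__ == "__main__":
    left = [(0, [1.0, 2.0]), (2, [3.0, 4.0])]   # row 1 is skipped
    right = [(0, [5.0]), (1, [6.0])]            # row 2 is skipped
    for row in concatenate(left, right, left_cols=2, right_cols=1):
        print(row)
```

In a real job the grouping step would be done by the MapReduce shuffle on the IntWritable row number, and each reducer call would see only the pieces of one row, which is what would let the approach scale to matrices too large for a sequential pass.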