Hey all,

Our "flash-freeze" has thawed considerably, but I've been trying to be good and hold off on checking in the functionality improvements I've been wanting to commit.

What do you folks say about me checking in fixes for

  MAHOUT-312: DistributedRowMatrix iterateAll() and iterate() don't work on multi-part SequenceFiles
  MAHOUT-313: DistributedRowMatrix needs times(Vector) implementation as M/R job

which together basically solve

  MAHOUT-310: LanczosSolver and DistributedLanczosSolver always assume rectangular input, but should also handle symmetric eigensystems.

A nice freebie I've also got a patch for is

  MAHOUT-314: DistributedRowMatrix needs a sparse DistributedRowMatrix times(DistributedRowMatrix other) implementation

Unit tests are included, and all current tests continue to pass. These changes make DistributedRowMatrix a much more fully featured Matrix implementation, and will let our users come up with their own uses for a nice big HDFS-backed sparse matrix, instead of simply using it for SVD (which is basically all it's useful for without these patches).

Can I get these in for 0.3? I could commit today if it's OK with the team.

-jake