[ https://issues.apache.org/jira/browse/MAHOUT-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated MAHOUT-71:
----------------------------

    Fix Version/s: 0.3
         Assignee: Deneche A. Hakim

Deneche, do you think this issue is still live? Is it possible to read any input in general into matrix form?

> Dataset to Matrix Reader
> ------------------------
>
>                 Key: MAHOUT-71
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-71
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>            Priority: Minor
>             Fix For: 0.3
>
>
> This component should allow the input datasets to be read as Matrix Rows.
> A Map-Reduce Algorithm should handle any dataset in a matrix format, where
> the columns are the attributes (and one of them is the Label) and the rows
> are the data instances.
> Working with Hadoop, we'll need to pass the dataset in the mapper's input, so
> it must be a file (or many files). We'll then need a custom InputFormat to
> feed the mappers with the data, and this is where the lovely-named "row-wise
> splitting matrix input format" comes in.
> Now we want to be able to work with any given dataset file format (including
> ARFF and my custom format), and thus the InputFormat needs a decoder that
> converts the dataset lines into matrix rows.
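
To make the decoder idea in the description concrete, here is a minimal sketch in plain Java, with no Hadoop or Mahout dependency. Every name in it (RowDecoder, DelimitedRowDecoder, DatasetToMatrixSketch, readRows) is hypothetical and is not taken from the issue or from the Mahout code base. It assumes a purely numeric, delimiter-separated dataset, and it simply skips lines starting with '%' or '@' (ARFF comments and header declarations) instead of parsing them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical decoder contract: one dataset line in, one matrix row out. */
interface RowDecoder {
    double[] decode(String line);
}

/** Decodes delimiter-separated numeric lines such as "5.1,3.5,1.4,0". */
final class DelimitedRowDecoder implements RowDecoder {
    private final Pattern delimiter;

    DelimitedRowDecoder(String delimiter) {
        // Quote the delimiter so characters like '|' are matched literally.
        this.delimiter = Pattern.compile(Pattern.quote(delimiter));
    }

    @Override
    public double[] decode(String line) {
        String[] tokens = delimiter.split(line.trim());
        double[] row = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            row[i] = Double.parseDouble(tokens[i].trim());
        }
        return row;
    }
}

final class DatasetToMatrixSketch {
    /**
     * Turns dataset lines into matrix rows using the given decoder.
     * Assumption: blank lines and lines starting with '%' or '@' carry no
     * row data and are skipped; a real ARFF decoder would parse the header.
     */
    static double[][] readRows(List<String> lines, RowDecoder decoder) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("%") || trimmed.startsWith("@")) {
                continue;
            }
            rows.add(decoder.decode(trimmed));
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "% comment line",
                "5.1,3.5,1.4,0",
                "6.2,2.9,4.3,1");
        double[][] matrix = readRows(lines, new DelimitedRowDecoder(","));
        // The last column plays the role of the label in this toy dataset.
        for (double[] row : matrix) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In a Hadoop job, the same decoder would presumably be called from the RecordReader of the custom InputFormat, once per input line; that wiring is left out here because the issue does not specify it.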