[ 
https://issues.apache.org/jira/browse/MAHOUT-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated MAHOUT-71:
----------------------------

    Fix Version/s: 0.3
         Assignee: Deneche A. Hakim

Deneche, do you think this issue is still live? Is it possible, in general, to 
read any input into matrix form?

> Dataset to Matrix Reader
> ------------------------
>
>                 Key: MAHOUT-71
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-71
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>            Priority: Minor
>             Fix For: 0.3
>
>
> This component should allow the input datasets to be read as Matrix Rows.
> A Map-Reduce Algorithm should handle any dataset in a matrix format, where 
> the columns are the attributes (and one of them is the Label) and the rows 
> are the data instances.
> Working with Hadoop, we'll need to pass the dataset in the mapper's input, so 
> it must be a file (or many files). We'll then need a custom InputFormat to 
> feed the mappers with the data, and here comes the lovely-named "row-wise 
> splitting matrix input format".
> Now we want to be able to work with any given dataset file format (including 
> the ARFF and my custom format), and thus the InputFormat needs a decoder that 
> converts the dataset lines into matrix rows.
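
For illustration only, the decoder idea in the description could be sketched 
roughly as below. This is not Mahout code: RowDecoder, CommaSeparatedDecoder 
and readRows are hypothetical names, the Hadoop InputFormat wiring is left 
out, and the sketch assumes every attribute (including the label) is numeric 
and comma-separated. Lines starting with '%' or '@' are skipped, since ARFF 
uses them for comments and header declarations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowDecoderSketch {

    /** Hypothetical interface: turns one line of a dataset file into one matrix row. */
    interface RowDecoder {
        /** Returns the decoded row, or null if the line carries no data. */
        double[] decode(String line);
    }

    /** Assumed format: comma-separated numeric values, one instance per line. */
    static class CommaSeparatedDecoder implements RowDecoder {
        @Override
        public double[] decode(String line) {
            String trimmed = line.trim();
            // Skip blank lines, ARFF comments ('%') and ARFF header declarations ('@').
            if (trimmed.isEmpty() || trimmed.startsWith("%") || trimmed.startsWith("@")) {
                return null;
            }
            String[] tokens = trimmed.split(",");
            double[] row = new double[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                row[i] = Double.parseDouble(tokens[i].trim());
            }
            return row;
        }
    }

    /** Decodes every data line; in a real job a RecordReader would do this per input split. */
    static List<double[]> readRows(Iterable<String> lines, RowDecoder decoder) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            double[] row = decoder.decode(line);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "% a comment line",
            "@data",
            "5.1, 3.5, 0",
            "4.9, 3.0, 1");
        for (double[] row : readRows(lines, new CommaSeparatedDecoder())) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Supporting another file format would then mean supplying another RowDecoder 
implementation, while the row-wise splitting logic stays the same. Nominal 
ARFF attributes would need an extra mapping step that this sketch omits.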

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
