I think we should split out a separate issue for ARFF (didn't Karl start one already?) and tackle that too. It seems like reading ARFF should be generally useful.

On Jul 24, 2008, at 2:41 PM, Deneche A. Hakim (JIRA) wrote:


[ https://issues.apache.org/jira/browse/MAHOUT-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616573 #action_12616573 ]

Deneche A. Hakim commented on MAHOUT-56:
----------------------------------------

After many attempts to load all the information that you gave me into my brain-processing-cluster-that-doesn't-work-quite-well, let's see if I understand it correctly:

The algorithm handles any dataset in a matrix format, where (in my case) the columns are the attributes (one of them being the label) and the rows are the data instances.

Working with Hadoop, we'll need to pass the dataset as the mappers' input, so it must be a file (or many files). We'll then need a custom InputFormat to feed the mappers with the data, and here comes the lovely-named "row-wise splitting matrix input format".

Now we want to be able to work with any given dataset file format (including the ARFF and my custom format), and thus the InputFormat needs a decoder that converts the dataset lines into matrix rows.
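
To make the decoder idea concrete, here is a minimal sketch in plain Java, with no Hadoop dependency. All names in it (RowDecoder, DelimitedRowDecoder, decodeAll) are hypothetical and only illustrate the shape of the idea; they are not existing Mahout or Hadoop classes. It assumes a simple delimiter-separated numeric format rather than ARFF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class RowDecoderSketch {

    /** Hypothetical interface: turns one line of a dataset file into one matrix row. */
    public interface RowDecoder {
        double[] decode(String line);
    }

    /** Decoder for an assumed delimiter-separated numeric format (not ARFF). */
    public static class DelimitedRowDecoder implements RowDecoder {
        private final String delimiter;

        public DelimitedRowDecoder(String delimiter) {
            this.delimiter = delimiter;
        }

        @Override
        public double[] decode(String line) {
            // Quote the delimiter so characters like '|' are not treated as regex syntax.
            String[] tokens = line.trim().split(Pattern.quote(delimiter));
            double[] row = new double[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                row[i] = Double.parseDouble(tokens[i].trim());
            }
            return row;
        }
    }

    /** Decodes every non-empty line; an input format could do this for the lines of one split. */
    public static List<double[]> decodeAll(Iterable<String> lines, RowDecoder decoder) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                rows.add(decoder.decode(line));
            }
        }
        return rows;
    }
}
```

With this shape, supporting another file format (ARFF, or my custom one) would mean writing another RowDecoder, while the row-wise splitting logic stays the same.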

Watchmaker Integration
----------------------

               Key: MAHOUT-56
               URL: https://issues.apache.org/jira/browse/MAHOUT-56
           Project: Mahout
        Issue Type: Task
        Components: Genetic Algorithms
          Reporter: Deneche A. Hakim
          Assignee: Grant Ingersoll
          Priority: Minor
           Fix For: 0.1

Attachments: libs.zip, libs.zip, libs.zip, tsp-screenshot-1.jpg, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch, watchmaker-tsp.patch


The goal of this task is to allow Watchmaker-defined problems to be solved in Mahout.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

