[ https://issues.apache.org/jira/browse/MAHOUT-148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Anil updated MAHOUT-148:
------------------------------

    Attachment: MAHOUT-148.patch

Verified by running all combinations of the following, for both training
and testing:

Bayes|CBayes
hdfs|hbase
sequential|mapreduce

Noticed a slight improvement in the running time of the various map/reduce
jobs (a 20% decrease on the 20newsgroups dataset).



> Convert Classification Algs to use richer Writable syntax
> ---------------------------------------------------------
>
>                 Key: MAHOUT-148
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-148
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.1, 0.2
>            Reporter: Grant Ingersoll
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-148-Work-In-Progress.patch, MAHOUT-148.patch
>
>
> Much of the classification code relies on parsing values out of the 
> Text object just to determine what type of "thing" is being used.  We should 
> try to avoid string manipulation for this kind of thing and instead 
> encapsulate the type in Writable instances.  This should make things 
> perform faster and bring stronger typing to the problem, which should make 
> the code easier to understand and debug.
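
To illustrate the idea in the description above, here is a minimal sketch. It
is not the code from the attached patch. The class name FeatureKey, its Kind
enum, and its fields are hypothetical, and the class only mirrors the
write/readFields contract of Hadoop's Writable interface using plain java.io
types so that it compiles without Hadoop on the classpath. In a real job it
would presumably implement org.apache.hadoop.io.Writable (or
WritableComparable if used as a key).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/**
 * Hypothetical typed key. The kind of record is stored as an enum field
 * instead of being encoded as a string prefix that must be parsed back out.
 */
final class FeatureKey {

    /** Hypothetical record kinds; the real classifier may use different ones. */
    enum Kind { LABEL_WEIGHT, FEATURE_COUNT, FEATURE_TF }

    private Kind kind;
    private String label;
    private String feature;

    /** No-arg constructor so an instance can be filled in by readFields. */
    FeatureKey() {}

    FeatureKey(Kind kind, String label, String feature) {
        this.kind = kind;
        this.label = label;
        this.feature = feature;
    }

    /** Same signature as Writable.write: serialize the fields in a fixed order. */
    public void write(DataOutput out) throws IOException {
        out.writeByte(kind.ordinal());
        out.writeUTF(label);
        out.writeUTF(feature);
    }

    /** Same signature as Writable.readFields: read the fields in the same order. */
    public void readFields(DataInput in) throws IOException {
        kind = Kind.values()[in.readByte()];
        label = in.readUTF();
        feature = in.readUTF();
    }

    Kind getKind() { return kind; }
    String getLabel() { return label; }
    String getFeature() { return feature; }

    /** Serializes a key to bytes and reads it back, as the framework would. */
    static FeatureKey roundTrip(FeatureKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            FeatureKey copy = new FeatureKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a key like this, a mapper or reducer can branch on key.getKind() rather
than splitting a Text value and comparing substrings, which is the string
manipulation the issue proposes to avoid.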

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
