[ 
https://issues.apache.org/jira/browse/MAHOUT-384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859703#action_12859703
 ] 

Sean Owen commented on MAHOUT-384:
----------------------------------

Let's also think about where it fits into the project. This is not a 
collaborative filtering (CF) algorithm, is it? It looks more like 
classification, so I am not sure a "top-level" outlier package is the right 
place.

Yes, as Robin says, this ought to look a lot more like the other jobs in 
classification. More broadly, we should be moving all jobs to work more alike 
(e.g. around AbstractJob), but if it looks like its neighbors, that's good. 
Right now we are using the older Hadoop 0.19.x APIs (i.e. not Configuration) 
since, well, the new APIs don't quite work in all cases, and services like AWS 
don't support them yet.

> Implement of AVF algorithm
> --------------------------
>
>                 Key: MAHOUT-384
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-384
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Collaborative Filtering
>            Reporter: tony cui
>         Attachments: mahout-384.patch
>
>
> This program implements an outlier detection algorithm called AVF 
> (Attribute Value Frequency), a fast parallel outlier detection method for 
> categorical datasets using MapReduce, introduced in this paper: 
>     http://thepublicgrid.org/papers/koufakou_wcci_08.pdf
> The following is an example of how to run this program under Hadoop:
> $ hadoop jar programName.jar avfDriver inputData interTempData outputData
> The output data contains the ordered avfValue in the first column, followed 
> by the original input data. 
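
For readers unfamiliar with AVF, here is a minimal single-machine sketch of the scoring idea in the referenced paper. A record's AVF score is the average, over its attributes, of how often each of its attribute values occurs in the dataset, and the lowest-scoring records are the outlier candidates. This is not the code in mahout-384.patch and does not use Hadoop. The function names `avf_scores` and `rank_by_avf` and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def avf_scores(rows):
    """Return one AVF score per row: the mean frequency of the row's values."""
    if not rows:
        return []
    num_attrs = len(rows[0])
    # Pass 1: count how often each value occurs in each attribute (column).
    counts = [Counter() for _ in range(num_attrs)]
    for row in rows:
        for j, value in enumerate(row):
            counts[j][value] += 1
    # Pass 2: score each row by the average frequency of its own values.
    return [
        sum(counts[j][value] for j, value in enumerate(row)) / num_attrs
        for row in rows
    ]


def rank_by_avf(rows):
    """Pair each row with its score, lowest (most outlier-like) first."""
    scores = avf_scores(rows)
    return sorted(zip(scores, rows), key=lambda pair: pair[0])


# Hypothetical categorical dataset: three similar records and one odd one.
data = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]

ranked = rank_by_avf(data)
for score, row in ranked:
    print(f"{score:.3f}\t" + "\t".join(row))
```

The printed layout (score first, then the original record) mirrors the output format described in the issue. Whether the patch's two stages correspond to the counting and scoring passes above is an assumption, not something the issue text confirms.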

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
