[
https://issues.apache.org/jira/browse/MAHOUT-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176266#comment-13176266
]
Paritosh Ranjan commented on MAHOUT-931:
----------------------------------------
Ok.
Should I proceed like this:
Step 1) Encapsulate cluster-specific CLI arguments (ClusterConfig and its
cluster-specific implementations)
Step 2) Implement all clustering policies
Step 3) Implement outlier removal in policies.
Step 3a) First cut: use probability-threshold-based outlier removal (as
described in the previous comment)
Step 3b) Final cut: use cluster-specific arguments for outlier removal.
Step 4) Replace clustering algorithms with Classifier/Iterator (for algorithms
where this is possible)
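To make Step 3a concrete, here is a minimal sketch in Java of what a pluggable, probability-threshold-based outlier removal could look like. Every name in it (OutlierRemovalPolicy, ThresholdOutlierRemoval, mostLikelyCluster, assign) is hypothetical and not existing Mahout API, and the threshold of 0.6 is only an example value. It assumes each point already comes with a vector of per-cluster membership probabilities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OutlierRemovalSketch {

    /** Hypothetical plug-in point: decides whether a classified point is dropped. */
    interface OutlierRemovalPolicy {
        /** @param membership probability of the point belonging to each cluster */
        boolean isOutlier(double[] membership);
    }

    /** Step 3a (assumed form): outlier if the best membership probability is below a threshold. */
    static final class ThresholdOutlierRemoval implements OutlierRemovalPolicy {
        private final double threshold;

        ThresholdOutlierRemoval(double threshold) {
            if (threshold < 0.0 || threshold > 1.0) {
                throw new IllegalArgumentException("threshold must be in [0, 1]");
            }
            this.threshold = threshold;
        }

        @Override
        public boolean isOutlier(double[] membership) {
            double max = 0.0;
            for (double p : membership) {
                max = Math.max(max, p);
            }
            return max < threshold;
        }
    }

    /** Classification stays separate from outlier removal: index of the most probable cluster. */
    static int mostLikelyCluster(double[] membership) {
        int best = 0;
        for (int i = 1; i < membership.length; i++) {
            if (membership[i] > membership[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Returns one cluster index per point, or -1 for points the policy rejects. */
    static List<Integer> assign(List<double[]> memberships, OutlierRemovalPolicy policy) {
        List<Integer> result = new ArrayList<>();
        for (double[] m : memberships) {
            result.add(policy.isOutlier(m) ? -1 : mostLikelyCluster(m));
        }
        return result;
    }

    public static void main(String[] args) {
        List<double[]> points = Arrays.asList(
                new double[] {0.9, 0.1},
                new double[] {0.2, 0.8},
                new double[] {0.5, 0.5}); // no cluster is a confident match
        System.out.println(assign(points, new ThresholdOutlierRemoval(0.6))); // prints [0, 1, -1]
    }
}
```

Because the classifier only depends on the interface, a cluster-specific policy (Step 3b) could later replace ThresholdOutlierRemoval without touching the classification code.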
Regarding naming, I would say that readability should always be given
importance. I consider naming an important part of software development,
whether working alone or in a team. I prefer readable code to JavaDocs. The
current code does not have ample JavaDocs, so at least the naming should be
appropriate. I am not pushing for a name change, just expressing my thoughts.
If you agree with implementing things in the order (steps) I mentioned, then I
can start implementing them. If you have any suggestions to improve them,
please share them.
> Implement a pluggable outlier removal capability for cluster classifiers
> ------------------------------------------------------------------------
>
> Key: MAHOUT-931
> URL: https://issues.apache.org/jira/browse/MAHOUT-931
> Project: Mahout
> Issue Type: Improvement
> Components: Classification, Clustering
> Affects Versions: 0.6
> Reporter: Paritosh Ranjan
> Fix For: 0.7
>
> Attachments: MAHOUT-931
>
>
> A pluggable outlier removal capability is needed when classifying clusters.
> The classification and outlier removal implementations should be completely
> separate entities for better abstraction.