[ 
https://issues.apache.org/jira/browse/MAHOUT-1976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999581#comment-15999581
 ] 

ASF GitHub Bot commented on MAHOUT-1976:
----------------------------------------

GitHub user rawkintrevo opened a pull request:

    https://github.com/apache/mahout/pull/314

    MAHOUT-1976 Add CanopyClustering

    MAHOUT-1976 Add Canopy Clustering
    
    ### Purpose of PR:
    1. Primarily, this PR adds CanopyClustering to the Algorithms Framework.
    2. This PR introduces the "clustering" framework within the Algorithms
    Framework.
    3. This PR introduces distance metrics and ports two metrics from the old
    MR code base.
    
    ### Important ToDos
    Please mark each with an "x"
    - [x] Opening PR against `develop` NOT `master` (OR `feature-name` if this 
is part of an ongoing feature development). **need to delete this requirement, 
JIRA needed**
    - [x] A JIRA ticket exists (if not, please create this 
first)[https://issues.apache.org/jira/browse/ZEPPELIN/]
    - [x] Title of PR is "MAHOUT-XXXX Brief Description of Changes" where XXXX
    is the JIRA number.
    - [x] Created unit tests where appropriate
    - [x] Added correct licenses to newly added files
    - [x] Assigned JIRA to self
    - [x] Added documentation in Scaladocs/Javadocs (and on the website once
    that is merged to dev)
    - [x] Successfully built and ran all unit tests, verified that all tests
    pass locally.
    
    
    Oh by the way, does this change break earlier versions?
    No
    
    Is this the beginning of a larger project for which a feature branch should 
be made?
    No

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rawkintrevo/mahout mahout-1976

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/314.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #314
    
----
commit 7f18775afae639c1b291fb0273d92dc71de24884
Author: rawkintrevo <trevor.d.gr...@gmail.com>
Date:   2017-05-04T14:25:42Z

    MAHOUT-1976 Add CanopyClustering
    
    MAHOUT-1976 Add Canopy Clustering
    
    forgot unit tests

----


> Add Canopy Clustering Algorithm
> -------------------------------
>
>                 Key: MAHOUT-1976
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1976
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Algorithms
>    Affects Versions: 0.13.2
>            Reporter: Trevor Grant
>            Assignee: Trevor Grant
>
> Primarily, we need to lay out the clustering section of the Algorithms 
> Framework.
> The Canopy Clustering Algorithm is very simple and yet very useful as a 
> preprocessing step for more advanced clustering algorithms such as KMeans and 
> Hierarchical Clustering. 
> https://en.wikipedia.org/wiki/Canopy_clustering_algorithm
> The majority of the "work" on this PR will be creating the framework. 
> It is also one of the Legacy MR algorithms that would be nice to port.
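
For readers unfamiliar with the algorithm, here is a minimal sketch of canopy clustering as described on the Wikipedia page linked above, using a loose threshold T1 and a tight threshold T2 with T1 > T2. It is not the code from this PR and not Mahout's API: the names `canopy_clusters`, `euclidean`, `t1`, and `t2` are hypothetical and chosen for illustration only, and the sketch runs on plain in-memory Python tuples rather than distributed matrices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_clusters(points, t1, t2, distance=euclidean):
    """Group points into (possibly overlapping) canopies.

    t1 is the loose threshold and t2 the tight threshold (t1 > t2).
    Returns a list of (center, members) pairs. Hypothetical helper,
    not part of any Mahout API.
    """
    if t1 <= t2:
        raise ValueError("t1 (loose) must be greater than t2 (tight)")

    remaining = list(points)
    canopies = []
    while remaining:
        # Take the next unassigned point as the center of a new canopy.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = distance(center, p)
            if d < t1:
                # Within the loose threshold: p joins this canopy.
                members.append(p)
            if d >= t2:
                # Not within the tight threshold: p may still seed or
                # join other canopies later.
                still_remaining.append(p)
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (0.5, 0), (10, 10), (10.5, 10), (3, 0)]
    for center, members in canopy_clusters(pts, t1=4.0, t2=2.0):
        print(center, members)
```

The canopy centers found this way are commonly used as initial centroids for a more expensive algorithm such as KMeans, which is the preprocessing role mentioned in the issue description. Note that a point lying between T2 and T1 of a center can belong to more than one canopy.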



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
