[jira] [Commented] (SPARK-4936) Please support Named Vector so as to maintain the record ID in clustering etc.

Sean Owen (JIRA) Tue, 23 Dec 2014 06:08:50 -0800

    [ https://issues.apache.org/jira/browse/SPARK-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14256985#comment-14256985 ]


Sean Owen commented on SPARK-4936:
----------------------------------

Are you referring to the NamedVector idea from Mahout? I think that already 
exists in a different form here.

If you have an RDD of (<identifier>, Vector) pairs, then you can already apply a 
clustering model's predict() to each value with mapValues(), and end up with an 
RDD of (<identifier>, <cluster>) pairs.

If that's what you're looking for, then it does not require further work in 
Spark.
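
The mapValues() approach described above can be sketched roughly as follows. 
This is only an illustration, assuming MLlib's KMeans as the clustering model 
and String record IDs; the object name ClusterWithIds, the helper 
assignClusters, and the sample records are hypothetical and not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Pair-RDD implicits; needed on older releases such as 1.1.x, harmless later.
import org.apache.spark.SparkContext._
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

object ClusterWithIds {

  // Hypothetical helper: train k-means on the vectors alone, then label each
  // (id, vector) pair. mapValues() leaves the key untouched, so the record ID
  // travels alongside its predicted cluster index.
  def assignClusters(records: RDD[(String, Vector)],
                     k: Int,
                     maxIterations: Int): RDD[(String, Int)] = {
    val model = KMeans.train(records.values, k, maxIterations)
    records.mapValues(v => model.predict(v))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ClusterWithIds").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data: (record ID, feature vector).
      val records = sc.parallelize(Seq(
        ("rec-1", Vectors.dense(0.0, 0.0)),
        ("rec-2", Vectors.dense(0.1, 0.1)),
        ("rec-3", Vectors.dense(9.0, 9.0)),
        ("rec-4", Vectors.dense(9.1, 9.1))
      )).cache()

      assignClusters(records, 2, 20).collect().foreach {
        case (id, cluster) => println(s"$id -> cluster $cluster")
      }
    } finally {
      sc.stop()
    }
  }
}
```

The cluster indices themselves are arbitrary labels chosen by the model, so 
only the grouping of IDs is meaningful, not the particular numbers printed.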

> Please support Named Vector so as to maintain the record ID in clustering etc.
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-4936
>                 URL: https://issues.apache.org/jira/browse/SPARK-4936
>             Project: Spark
>          Issue Type: Improvement
>    Affects Versions: 1.1.1
>            Reporter: mahesh bhole
>            Priority: Minor
>
> Hi
> Please support Named Vector so as to maintain the record ID in clustering etc.
> Thanks,
> Mahesh



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
