[ 
https://issues.apache.org/jira/browse/SPARK-1357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964649#comment-13964649
 ] 

Sean Owen commented on SPARK-1357:
----------------------------------

Yeah I think it's reasonable to say that the core ALS API is only in terms of 
numeric IDs and leave a higher-level translation to the caller. Longs give that 
much more space to hash into.
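
As a rough illustration of that caller-side translation, here is a minimal sketch that hashes string IDs into the signed 64-bit range before handing them to a numeric-only API. The helper names `string_id_to_long` and `build_id_mapping` are hypothetical and not part of MLlib, and the choice of SHA-256 is an assumption; any well-distributed hash would do.

```python
import hashlib
import struct


def string_id_to_long(raw_id: str) -> int:
    """Hypothetical helper: map a string ID to a signed 64-bit integer."""
    digest = hashlib.sha256(raw_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a big-endian signed 64-bit value,
    # which is the same range as a JVM Long.
    return struct.unpack(">q", digest[:8])[0]


def build_id_mapping(raw_ids):
    """Hypothetical helper: build forward and reverse lookup tables.

    The reverse table lets the caller translate numeric results back to
    the original IDs. A collision between two distinct IDs raises an error.
    """
    forward, reverse = {}, {}
    for raw in raw_ids:
        key = string_id_to_long(raw)
        if key in reverse and reverse[key] != raw:
            raise ValueError(f"hash collision: {raw!r} vs {reverse[key]!r}")
        forward[raw] = key
        reverse[key] = raw
    return forward, reverse


if __name__ == "__main__":
    forward, reverse = build_id_mapping(["alice", "bob", "alice"])
    for raw, key in forward.items():
        print(raw, key, reverse[key])
```

The caller keeps the reverse table, so the numeric API never needs to know what the original IDs looked like. With 64 bits to hash into, collisions are far less likely than with 32, though the explicit check is still cheap insurance.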

The "cost" in terms of memory of something like a String is just a reference, 
so roughly the same as a Double anyway. I think the more important question is 
whether Double is too hacky API-wise as a representation of fundamentally 
non-numeric data. That's up for debate, but yeah, the question here is more 
about reserving the right to change the API later.

I'll submit a PR that marks the items I mention as experimental, for 
consideration. See if it seems reasonable.

> [MLLIB] Annotate developer and experimental API's
> -------------------------------------------------
>
>                 Key: SPARK-1357
>                 URL: https://issues.apache.org/jira/browse/SPARK-1357
>             Project: Spark
>          Issue Type: Sub-task
>          Components: MLlib
>    Affects Versions: 1.0.0
>            Reporter: Patrick Wendell
>            Assignee: Xiangrui Meng
>             Fix For: 1.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.2#6252)
