Github user mateiz commented on the pull request:

https://github.com/apache/spark/pull/1393#issuecomment-49069526

No, MLlib is not experimental, only the parts annotated with @Experimental are. The reason is that we felt we could continue supporting these low-level APIs indefinitely and add other ones later if we need to. Again, for real users, API stability matters *much* more than you'd think -- there's nothing more annoying than having to change your app just to take a software upgrade, and it causes fragmentation of the user base as users stick to an older version instead of upgrading.

In this particular case, there are a few things we can do. We can think of additions to the API here that preserve the old methods but add new versions of predict. We can add a new class called LongALS or something like that, and have the existing ones call it and get back a LongMatrixFactorizationModel. Or we can offer a utility to generate unique IDs.

The reason I was asking about hash collisions above is that even with 64-bit IDs, you're not guaranteed to be collision-free. With 2-3 billion users you actually have a good chance of a collision. So if applications care about that, they may not be okay with this solution either.
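To put a rough number on the 64-bit collision remark, here is a minimal sketch using the standard birthday-bound approximation. It assumes the IDs behave like independent, uniformly random 64-bit hashes; the function name `collision_probability` is made up for illustration and is not part of Spark or MLlib.

```python
import math


def collision_probability(n_ids: int, bits: int = 64) -> float:
    """Approximate P(at least one collision) when n_ids values are drawn
    uniformly at random from a space of 2**bits (birthday bound).

    Assumption: hashes are independent and uniform, which real hash
    functions only approximate.
    """
    space = 2.0 ** bits
    # p ~ 1 - exp(-n(n-1) / (2N)); expm1 keeps precision for tiny exponents.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))


if __name__ == "__main__":
    for n in (2_000_000_000, 3_000_000_000):
        p64 = collision_probability(n, bits=64)
        p32 = collision_probability(n, bits=32)
        print(f"{n:,} IDs: 64-bit p ~ {p64:.3f}, 32-bit p ~ {p32:.3f}")
```

Under these assumptions the estimate comes out at roughly 0.10 for 2 billion IDs and roughly 0.22 for 3 billion, so a collision is far from guaranteed but also far from negligible. With 32-bit hashes at the same scale a collision is effectively certain.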