There's no easy way to do this currently. The pieces are there in the PySpark code for regression, which should be adaptable.
But you'd have to roll your own solution. This is something I also want, so I intend to put together a pull request for this soon.

On Tue, Apr 29, 2014 at 4:28 PM, Laird, Benjamin <benjamin.la...@capitalone.com> wrote:
> Hi all -
>
> I’m using pySpark/MLLib ALS for user/item clustering and would like to
> directly access the user/product RDDs (called userFeatures/productFeatures in
> class MatrixFactorizationModel in
> mllib/recommendation/MatrixFactorizationModel.scala).
>
> This doesn’t seem too complex, but it doesn’t seem like the functionality is
> currently available. I think it requires accessing the underlying Java model
> like so:
>
> model = ALS.train(ratings,1,iterations=1,blocks=5)
> userFeatures = RDD(model.javamodel.userFeatures, sc, ???)
>
> However, I don’t know what to pass as the deserializer. I need these low
> dimensional vectors as an RDD to then use in Kmeans clustering. Has anyone
> done something similar?
>
> Ben
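
For what it's worth, once the factors are available as (user id, feature vector) pairs, the clustering step is ordinary k-means. Below is a minimal plain-Python sketch of that step only. It does not touch Spark or solve the deserializer question, the function names are my own rather than anything from MLlib, and the rank-2 factors are made up for illustration. In practice you would presumably hand the RDD to MLlib's KMeans instead.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest(centers, vec):
    """Index of the center nearest to vec."""
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i], vec))


def kmeans(user_features, k, iterations=10, seed=0):
    """Cluster (user_id, feature_vector) pairs; returns {user_id: cluster_index}."""
    vectors = [vec for _, vec in user_features]
    rng = random.Random(seed)
    # Start from k distinct input vectors chosen at random.
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for vec in vectors:
            groups[closest(centers, vec)].append(vec)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return {uid: closest(centers, vec) for uid, vec in user_features}


# Hypothetical rank-2 user factors, standing in for what userFeatures would hold.
user_features = [
    (1, [0.1, 0.2]),
    (2, [0.15, 0.25]),
    (3, [5.0, 5.1]),
    (4, [5.2, 4.9]),
]
assignments = kmeans(user_features, k=2)
```

Here users 1 and 2 end up in one cluster and users 3 and 4 in the other. Which cluster gets index 0 depends on the random starting centers.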