Re: User/Product Clustering with pySpark ALS

Nick Pentreath Tue, 29 Apr 2014 09:24:41 -0700

There's no easy way to do this currently. The pieces are there in the PySpark 
code for regression, which should be adaptable, but you'd have to roll your 
own solution.

This is something I also want, so I intend to put together a pull request for 
this soon.
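
To make the downstream step concrete: once the factor vectors are available as 
(id, vector) pairs, however they end up being extracted, the clustering itself 
is straightforward. The following is only a minimal plain-Python sketch of 
that step, not Spark code. The `user_features` list is made-up stand-in data, 
and `kmeans` and `closest` are hypothetical helpers written for illustration, 
not part of MLlib.

```python
import random


def closest(point, centers):
    """Return the index of the center nearest to point (squared Euclidean)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(point, center))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over a list of equal-length float vectors."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[closest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Hypothetical stand-in for the (user id, latent factor vector) pairs that
# userFeatures would hold; the values are invented for illustration.
user_features = [
    (1, [0.9, 0.1]),
    (2, [1.0, 0.2]),
    (3, [0.1, 0.9]),
    (4, [0.0, 1.0]),
]

vectors = [vec for _, vec in user_features]
centers = kmeans(vectors, k=2)
assignments = {uid: closest(vec, centers) for uid, vec in user_features}
```

Here `assignments` maps each user id to a cluster index. On a real dataset the 
same shape of computation would run over an RDD of vectors with MLlib's KMeans 
rather than over a local list.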

On Tue, Apr 29, 2014 at 4:28 PM, Laird, Benjamin
<benjamin.la...@capitalone.com> wrote:

> Hi all -
> I’m using pySpark/MLLib ALS for user/item clustering and would like to 
> directly access the user/product RDDs (called userFeatures/productFeatures in 
> class MatrixFactorizationModel in 
> mllib/recommendation/MatrixFactorizationModel.scala).
> This doesn’t seem too complex, but it doesn’t seem like the functionality is 
> currently available. I think it requires accessing the underlying Java model 
> like so:
> model = ALS.train(ratings,1,iterations=1,blocks=5)
> userFeatures = RDD(model.javamodel.userFeatures, sc, ???)
> However, I don’t know what to pass as the deserializer. I need these 
> low-dimensional vectors as an RDD to then use in KMeans clustering. Has anyone 
> done something similar?
> Ben
