Re: how to implement ALS with csv file? getting error while calling Rating class

2016-03-08 Thread Nick Pentreath
As I mentioned, that *train* method returns the user and item factor RDDs rather than an ALSModel instance, so you first need to construct a model yourself. This is exactly why it's marked as *DeveloperApi*: it is not user-friendly and not strictly part of the ML pipeline
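To illustrate what a hand-built model does with the two factor sets: in matrix-factorization ALS, a predicted rating is the dot product of the user's factor vector and the item's factor vector. Below is a minimal plain-Python sketch of that step only. It is not Spark's API, and the function name, ids, and factor values are all hypothetical.

```python
def predict(user_factors, item_factors, user, item):
    """Predicted rating = dot product of the user's and the item's factor vectors."""
    u = user_factors[user]
    v = item_factors[item]
    return sum(a * b for a, b in zip(u, v))


# Hypothetical rank-2 factors keyed by integer id (illustrative values only).
user_factors = {0: [1.0, 2.0]}
item_factors = {10: [0.5, 1.5]}

score = predict(user_factors, item_factors, 0, 10)  # 1.0*0.5 + 2.0*1.5
```

In Spark the factors would live in RDDs rather than dicts, so the lookup would become a join or a broadcast; the arithmetic is the same.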

Re: how to implement ALS with csv file? getting error while calling Rating class

2016-03-07 Thread Kevin Mellott
If you are using DataFrames, you can also specify the schema when loading, as an alternative solution. I've found Spark-CSV to be a very useful library when working with CSV data.

Re: how to implement ALS with csv file? getting error while calling Rating class

2016-03-06 Thread Nick Pentreath
As you've pointed out, Rating requires user and item ids in Int form. So you will need to map String user ids to integers. See this thread for example: https://mail-archives.apache.org/mod_mbox/spark-user/201501.mbox/%3CCAJgQjQ9GhGqpg1=hvxpfrs+59elfj9f7knhp8nyqnh1ut_6...@mail.gmail.com%3E .
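The mapping step can be sketched in plain Python, without Spark, roughly as follows. This sketch assumes the *Name,Value1,Value2* layout from the original question, and further assumes (not stated in the thread) that Value1 is an integer item id and Value2 a numeric rating. The sample rows and the helper name index_names are made up for illustration.

```python
import csv
import io


def index_names(rows):
    """Give each distinct name a consecutive integer id, in order of first appearance."""
    name_to_id = {}
    for name, _value1, _value2 in rows:
        if name not in name_to_id:
            name_to_id[name] = len(name_to_id)
    return name_to_id


# Hypothetical sample data in the Name,Value1,Value2 layout.
sample = "alice,10,4.0\nbob,11,3.5\nalice,12,5.0\n"
rows = list(csv.reader(io.StringIO(sample)))

name_to_id = index_names(rows)

# (user, item, rating) triples with integer ids, the shape Rating expects.
# Assumption: Value1 is an integer item id and Value2 is the rating.
ratings = [(name_to_id[name], int(v1), float(v2)) for name, v1, v2 in rows]
```

Keeping the name_to_id dictionary around lets you translate recommendations back to the original names afterwards. Assigning ids from a lookup table, rather than hashing the strings, avoids two names colliding on the same integer.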

how to implement ALS with csv file? getting error while calling Rating class

2016-03-06 Thread Shishir Anshuman
I am new to Apache Spark, and I want to implement the Alternating Least Squares algorithm. The data set is stored in a CSV file in the format *Name,Value1,Value2*. When I read the CSV file, I get a *java.lang.NumberFormatException.forInputString* error because the Rating class needs the parameters