Re: ALS.trainImplicit block sizes

2016-10-21 Thread Nick Pentreath
Oh, also: you mention 20 partitions. Is that how many you have? How many ratings? It may be worth trying to repartition to a larger number of partitions. On Fri, 21 Oct 2016 at 17:04, Nick Pentreath wrote: > I wonder if you can try with setting different blocks for user

Re: ALS.trainImplicit block sizes

2016-10-21 Thread Nick Pentreath
I wonder if you can try setting different blocks for user and item? Are you able to try 2.0, or to use Scala for setting it in 1.6? You want your item blocks to be far fewer than your user blocks: items maybe 5-10, users perhaps 250-500. Do you have many "power items" that are connected to almost
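The rule of thumb above (item blocks far fewer than user blocks; items roughly 5-10, users roughly 250-500) could be encoded as a small helper. This is purely illustrative: the function name and the scaling divisors are assumptions, not anything from Spark; only the clamping ranges come from the thread.

```python
# Hypothetical helper encoding the heuristic from this thread.
# The //-divisors below are arbitrary illustrative assumptions;
# only the clamp ranges (items 5-10, users 250-500) come from the email.
def suggest_blocks(num_users, num_items):
    # Scale block counts roughly with entity counts, then clamp to
    # the ranges suggested in the thread.
    item_blocks = min(10, max(5, num_items // 1_000_000))
    user_blocks = min(500, max(250, num_users // 100_000))
    return user_blocks, item_blocks

print(suggest_blocks(50_000_000, 2_000_000))  # → (500, 5)
```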

Re: ALS.trainImplicit block sizes

2016-10-21 Thread Nick Pentreath
How many nodes are you using in the cluster? On Fri, 21 Oct 2016 at 08:58 Nikhil Mishra wrote: > Thanks Nick. > > So we partition the U x I matrix into B x B blocks, each of size around U/B > by I/B. Is that correct? Do you know whether a single block of the matrix
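The blocking arithmetic in the quoted question can be sketched numerically: with B blocks on each axis, a U x I ratings matrix is divided into B*B blocks of roughly (U/B) x (I/B) each. A minimal sketch in plain Python (no Spark required; the function name and example sizes are illustrative):

```python
# Sketch of the blocking arithmetic discussed above: B blocks per axis
# split a U x I matrix into B*B blocks of about (U/B) x (I/B) each.
def block_shape(num_users, num_items, num_blocks):
    # Ceiling division so every user and item lands in some block.
    users_per_block = -(-num_users // num_blocks)
    items_per_block = -(-num_items // num_blocks)
    return users_per_block, items_per_block

print(block_shape(1_000_000, 50_000, 20))  # → (50000, 2500)
```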

Re: ALS.trainImplicit block sizes

2016-10-21 Thread Nick Pentreath
The blocks param sets both user and item blocks. Spark 2.0 supports separate user and item blocks for PySpark: http://spark.apache.org/docs/latest/api/python/pyspark.ml.html#module-pyspark.ml.recommendation On Fri, 21 Oct 2016 at 08:12 Nikhil Mishra wrote: > Hi, > > I
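Setting the two block counts separately in the Spark 2.0 Python API might look like the sketch below. The parameter names (numUserBlocks, numItemBlocks) are from the linked pyspark.ml docs; the block counts are the illustrative values from this thread, and the import is guarded so the sketch degrades gracefully where pyspark is not installed.

```python
# Sketch, assuming Spark 2.0+: pyspark.ml.recommendation.ALS exposes
# separate numUserBlocks / numItemBlocks params (unlike the single
# `blocks` arg in the 1.6 pyspark.mllib API). Import is guarded so this
# snippet is a no-op without a pyspark installation.
try:
    from pyspark.ml.recommendation import ALS

    als = ALS(
        implicitPrefs=True,
        numUserBlocks=250,  # illustrative value from this thread
        numItemBlocks=10,   # illustrative value from this thread
    )
    print(als.getNumUserBlocks(), als.getNumItemBlocks())
except ImportError:
    print("pyspark not installed; see the linked pyspark.ml API docs")
```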

ALS.trainImplicit block sizes

2016-10-21 Thread Nikhil Mishra
Hi, I have a question about the block size specified in ALS.trainImplicit() in PySpark (Spark 1.6.1). There is only one block-size parameter. I want to know whether it partitions both the user and the item axes. For example, I am using the following