Hello,

I am trying to gather performance figures for Spark versus implementations
in various other languages for an ALS-based recommender system. I am using
the MovieLens 20 million ratings dataset. The test environment is a single
30-core machine with 132 GB of memory. I am using the Scala version of the
example provided here:
http://spark.apache.org/docs/latest/mllib-collaborative-filtering.html
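
For reference, this is essentially what I am timing. It follows the docs
example (rank = 10, 10 iterations, lambda = 0.01); the file path, the CSV
parsing for the 20M dataset, and the object name are my own choices, so
treat them as placeholders:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}

object ALSBench {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ALSBench"))

    // MovieLens 20M ratings.csv: userId,movieId,rating,timestamp
    // (header line assumed to be stripped beforehand)
    val ratings = sc.textFile("ml-20m/ratings.csv").map { line =>
      val f = line.split(',')
      Rating(f(0).toInt, f(1).toInt, f(2).toDouble)
    }.cache()

    // Train the ALS model with the parameters from the docs example
    val rank = 10
    val numIterations = 10
    val model = ALS.train(ratings, rank, numIterations, 0.01)

    sc.stop()
  }
}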

I am not an expert in Spark, and I assume that varying n when invoking
Spark with the flag --master local[n] is supposed to provide ideal
scaling.
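
Concretely, I launch it along these lines, varying the n in local[n]
between runs (the class and jar names match my sketch above, and the
driver-memory value is only my guess at a sensible setting for this
machine):

spark-submit --master local[16] --driver-memory 100g \
  --class ALSBench als-bench.jar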

Initial observations did not favour Spark, though by small margins, but as
I said, since I am not a Spark expert, I would only comment after being
assured that this is the optimal way of running the ALS snippet.

Could the experts please advise on the optimal way to get the best timings
out of Spark's ALS example in the environment described above? Thanks.

-- 
Best regards,
Abhijith
