GitHub user jfeher commented on the issue: https://github.com/apache/flink/pull/2542

Hi, we have measured the training time of ALS and iALS with the given dataset. After filtering the data to unique item-user pairs we got approximately 64 million rankings.

We measured on a four-node cluster running on YARN. Each node had 16 GB of memory. The taskmanagers got 12 GB each and the jobmanager got 2 GB. We had four taskmanagers, one for each node.

After some testing, block numbers between 100 and 1500 turned out to work best, and between 100 and 300 the running times were consistently low.

**For iALS we got the following measurements:**

The average time for block numbers between 100 and 1500 and 1 iteration: 2000.33 s
The average time for block numbers between 100 and 300 and 1 iteration: 1729.44 s

More detailed results by block number are on the diagram: http://imgur.com/LjJavti

**For ALS with the same configurations we got the following measurements:**

The average time for block numbers between 100 and 1500 and 1 iteration: 1694.04 s
The average time for block numbers between 100 and 300 and 1 iteration: 1465.77 s

So the iALS version was about 300 s slower than ALS on this data. When we increased the number of iterations to 10, the time difference stayed under 1000 s, which is much less than ten times 300 s. This is because the fixed time cost of the whole training is large, so only part of the 300 s gap is paid again in each additional iteration.
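The arithmetic behind the comparison can be checked with a short Python sketch. The averages are the ones reported above; the linear cost model (gap = fixed part + iterations × per-iteration part) and all variable names are assumptions made for illustration, not something the measurements confirm. Because the 10-iteration gap is only known as "under 1000 s", the model yields bounds rather than exact values.

```python
# Average single-iteration training times in seconds, as reported above.
ials_avg = {"100-1500": 2000.33, "100-300": 1729.44}
als_avg = {"100-1500": 1694.04, "100-300": 1465.77}


def slowdown(block_range):
    """Return (absolute gap in seconds, relative gap) of iALS over ALS."""
    gap = ials_avg[block_range] - als_avg[block_range]
    return gap, gap / als_avg[block_range]


for block_range in ials_avg:
    gap, rel = slowdown(block_range)
    print(f"blocks {block_range}: iALS slower by {gap:.2f} s ({rel:.1%})")

# Assumed model (hypothetical): gap(n) = fixed + n * per_iter.
gap_1 = 300.0          # rounded single-iteration gap quoted in the comment
gap_10_upper = 1000.0  # reported only as "under 1000 s" for 10 iterations

# Nine extra iterations account for at most (gap_10_upper - gap_1) seconds.
per_iter_upper = (gap_10_upper - gap_1) / 9
# Whatever is not per-iteration cost in the first iteration is fixed cost.
fixed_lower = gap_1 - per_iter_upper

print(f"extra cost per iteration: under {per_iter_upper:.1f} s")
print(f"extra fixed cost: over {fixed_lower:.1f} s")
```

Under this assumed model, most of the single-iteration gap would be a one-time cost rather than a per-iteration one, which is consistent with the explanation given in the comment.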