Github user jfeher commented on the issue:

    https://github.com/apache/flink/pull/2542
  
    Hi, we have measured the training time of als and ials on the given dataset.
    After filtering the data to unique item-user pairs we got approximately 64 million rankings.
    
    We measured on a four-node cluster running on YARN. Each node had 16 GB of memory.
    The taskmanagers got 12 GB each and the jobmanager got 2 GB.
    We had four taskmanagers, one for each node.
    After some testing, a block number between 100 and 1500 appeared to work best,
    and between 100 and 300 the running times were consistently low.
    
    **For ials we got the following measurements:**
    
    Average time for 1 iteration with block numbers between 100 and 1500: 2000.33 s
    
    Average time for 1 iteration with block numbers between 100 and 300: 1729.44 s
    
    More detailed results by block size are shown in this diagram: http://imgur.com/LjJavti
    
    **For als with the same configurations we got the following measurements:**
    
    Average time for 1 iteration with block numbers between 100 and 1500: 1694.04 s
    
    Average time for 1 iteration with block numbers between 100 and 300: 1465.77 s
    
    So on this data the ials version was roughly 300 s slower than als (about 306 s and 264 s for the two block ranges).
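
    For reference, the gap can be recomputed from the four averages reported above. This is only a small arithmetic sketch; the dictionary keys and variable names are made up for illustration and are not part of the benchmark code.

```python
# Average 1-iteration training times in seconds, copied from the numbers above.
ials = {"100-1500": 2000.33, "100-300": 1729.44}
als = {"100-1500": 1694.04, "100-300": 1465.77}

# Absolute gap (ials minus als) per block-number range.
gap = {k: round(ials[k] - als[k], 2) for k in ials}

# The same gap expressed as a percentage of the als time.
slowdown_pct = {k: round(100 * gap[k] / als[k], 1) for k in ials}

print(gap)
print(slowdown_pct)
```

    Both ranges give a gap close to 300 s, which is about 18% of the als time.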
    
    When we increased the number of iterations to 10, the time difference stayed under 1000 s, which is less than ten times 300 s.
    This is because the fixed time cost of the whole training is large.
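
    To make that reasoning concrete, here is a rough sketch that assumes the ials/als gap grows linearly with the iteration count, gap(k) = fixed + k * per_iter. The linear model and the function name `gap_bounds` are assumptions for illustration, not something we measured directly; the inputs are the rounded figures quoted above (about 300 s at 1 iteration, under 1000 s at 10).

```python
def gap_bounds(gap_1_iter, gap_n_upper, n):
    """Bounds under an assumed linear model gap(k) = fixed + k * per_iter.

    Given gap(1) and an upper bound on gap(n), return
    (upper bound on per_iter, lower bound on fixed).
    """
    per_iter_max = (gap_n_upper - gap_1_iter) / (n - 1)
    fixed_min = gap_1_iter - per_iter_max
    return per_iter_max, fixed_min


# Rounded figures from the text: ~300 s gap at 1 iteration, < 1000 s at 10.
per_iter_max, fixed_min = gap_bounds(300.0, 1000.0, 10)
print(round(per_iter_max, 1), round(fixed_min, 1))
```

    Under this assumed model, each extra iteration adds less than about 78 s to the gap, so more than about 222 s of the 1-iteration gap would be a one-time cost.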


