[ https://issues.apache.org/jira/browse/SPARK-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-21389.
----------------------------------
    Resolution: Incomplete

> ALS recommendForAll optimization uses Native BLAS
> -------------------------------------------------
>
>                 Key: SPARK-21389
>                 URL: https://issues.apache.org/jira/browse/SPARK-21389
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 2.3.0
>            Reporter: Peng Meng
>            Priority: Major
>              Labels: bulk-closed
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Spark 2.2 we optimized ALS recommendForAll with a hand-written matrix
> multiplication that also extracts the top-K items for each block. This
> method effectively reduces the GC problem. However, with native BLAS GEMM
> (e.g., Intel MKL or OpenBLAS), matrix multiplication is roughly 10x faster
> than the hand-written method.
> I have rewritten recommendForAll with GEMM and measured about a 50%
> improvement over the master recommendForAll method.
> The key points of this optimization:
> 1) Use GEMM to replace the hand-written matrix multiplication.
> 2) Use a matrix to hold temporary results, which largely reduces GC and
> computing time. The master method creates many small objects, which is why
> calling GEMM directly does not yield good performance.
> 3) Use sort and merge to get the top-K items, avoiding two passes through a
> priority queue.
> Test result:
> 479818 users, 13727 products, rank = 10, topK = 20.
> 3 workers, each with 35 cores. Native BLAS is Intel MKL.
> Block size:      1000    2000    4000    8000
> Master method:   40s     39.4s   39.5s   39.1s
> This method:     26.5s   25.9s   26s     27.1s
> Performance improvement: (OldTime - NewTime) / NewTime = about 50%

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
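The approach described in the issue can be sketched as follows. This is a minimal, hypothetical plain-Python stand-in (not the actual Spark MLlib code): `gemm` mimics one blocked GEMM call that Spark would delegate to native BLAS (e.g., Intel MKL dgemm), and `top_k` takes the top-K items per user with a single sort per row instead of repeated priority-queue passes.

```python
# Hypothetical sketch of the GEMM + sort-based top-K idea from the issue.
# gemm() is a plain-Python stand-in for a native BLAS dgemm call.

def gemm(user_block, item_block):
    """Dense scores = user_block (m x r) times item_block^T (n x r).

    In Spark this multiplication would be a single native BLAS GEMM call,
    which avoids creating many small intermediate objects.
    """
    return [[sum(u * v for u, v in zip(user, item)) for item in item_block]
            for user in user_block]

def top_k(scores, k):
    """Top-k (item_index, score) pairs per user via one sort per row,
    rather than maintaining a priority queue for each candidate."""
    return [sorted(enumerate(row), key=lambda p: -p[1])[:k] for row in scores]

if __name__ == "__main__":
    users = [[1.0, 0.0], [0.0, 1.0]]              # 2 users, rank 2
    items = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]  # 3 items, rank 2
    print(top_k(gemm(users, items), 2))
```

In the real optimization the score matrix for each (user block, item block) pair is produced by one GEMM call into a reusable buffer, and the per-block top-K lists are then merged across item blocks.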