[ https://issues.apache.org/jira/browse/SYSTEMML-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm resolved SYSTEMML-809.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: SystemML 0.11

> Performance issue multi-threaded wdivmm_left
> --------------------------------------------
>
>                 Key: SYSTEMML-809
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-809
>             Project: SystemML
>          Issue Type: Bug
>          Components: Runtime
>            Reporter: Matthias Boehm
>            Assignee: Matthias Boehm
>             Fix For: SystemML 0.11
>
>
> Recent experiments of wdivmm left (e.g., t(U) %*% (X/(U%*%t(V)+eps))) and
> right (e.g., (X/(U%*%t(V)+eps))%*%V) revealed severe performance issues of
> wdivmm left with increasing sparsity. While wdivmm right shows good
> multi-threading speedups of 9x-10x (with 12 physical cores), wdivmm left
> shows only a speedup of 3x.
> Cause: After detailed analysis, the cause was a high ratio of L3 cache misses
> on the output despite the existing cache-conscious blocking and
> column-block-wise parallelization. This is due to the row major sparse input
> X, where an input row Xi touches a single output row for wdivmm right, but
> potentially all output rows for wdivmm left.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
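
The access pattern named under "Cause" can be illustrated with a small sketch. This is not the SystemML kernel: it is a single-threaded, unblocked toy version in Python, and the function name `wdivmm`, the row-wise list-of-(column, value) layout for X, the sample data, and the choice to store the wdivmm left result as an n x r matrix (one output row per column of X) are all assumptions made for illustration. It records which output rows each sparse input row writes to.

```python
def dot(a, b):
    """Dot product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def wdivmm(X_rows, U, V, eps, left):
    """Toy wdivmm over a row-major sparse X.

    X_rows[i] is a list of (column, value) pairs for the nonzeros of row i.
    U is m x r, V is n x r (dense lists of lists).
    left=False computes (X/(U%*%t(V)+eps)) %*% V      -> m x r output.
    left=True  computes t(U) %*% (X/(U%*%t(V)+eps)),
               stored transposed here (assumption)    -> n x r output.
    Returns the output and, per input row, the set of output rows written.
    """
    m, n, r = len(U), len(V), len(U[0])
    out = [[0.0] * r for _ in range(n if left else m)]
    touched = []
    for i, row in enumerate(X_rows):
        rows_written = set()
        for j, x in row:
            # Only nonzeros of X contribute, since 0/(...) is 0.
            w = x / (dot(U[i], V[j]) + eps)
            if left:
                target, factor = j, U[i]   # output row follows the column index
            else:
                target, factor = i, V[j]   # output row follows the input row
            for k in range(r):
                out[target][k] += w * factor[k]
            rows_written.add(target)
        touched.append(rows_written)
    return out, touched


# Hypothetical 3 x 4 sparse X with rank-2 factors.
X_rows = [[(0, 1.0), (2, 2.0), (3, 3.0)], [(1, 4.0)], [(0, 5.0), (3, 6.0)]]
U = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
eps = 1e-6

right_out, right_touched = wdivmm(X_rows, U, V, eps, left=False)
left_out, left_touched = wdivmm(X_rows, U, V, eps, left=True)

print("right:", right_touched)
print("left: ", left_touched)
```

In the right variant every input row writes to exactly one output row, so consecutive updates stay in the same cache lines. In the left variant the written output rows follow the column indexes of the nonzeros, so a single input row can scatter updates across the whole output, which matches the L3 miss behaviour described in the issue.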