[ https://issues.apache.org/jira/browse/SPARK-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16845557#comment-16845557 ]
Ohad Raviv commented on SPARK-16820: ------------------------------------ this issue was resolved by SPARK-19368 and SPARK-16469. > Sparse - Sparse matrix multiplication > ------------------------------------- > > Key: SPARK-16820 > URL: https://issues.apache.org/jira/browse/SPARK-16820 > Project: Spark > Issue Type: New Feature > Components: ML > Affects Versions: 2.0.0 > Reporter: Ohad Raviv > Priority: Major > Labels: bulk-closed > > While working on MCL implementation on Spark we have encountered some > difficulties. > The main part of this process is distributed sparse matrix multiplication > that has two main steps: > 1. Simulate multiply – preparation before the real multiplication in order > to see which blocks should be multiplied. > 2. The actual blocks multiplication and summation. > In our case the sparse matrix has 50M rows and columns, and 2B non-zeros. > The current multiplication suffers from these issues: > 1. A relatively trivial bug already fixed in the first step the caused the > process to be very slow [SPARK-16469] > 2. Still after the bug fix, if we have too many blocks the Simulate > multiply will take very long time and will multiply the data many times. > (O(n^3) where n is the number of blocks) > 3. Spark supports only multiplication with Dense matrices. Thus, it > converts a Sparse matrix into a dense matrix before the multiplication. > 4. For summing the intermediate block results Spark uses Breeze’s CSC > matrix operations – here the problem is that it is very inefficient to update > a CSC matrix in a zero value. > That means that with many blocks (default block size is 1024) – in our case > 50M/1024 ~= 50K, the simulate multiply will effectively never finish or will > generate 50K*16GB ~= 1000TB of data. On the other hand, if we use bigger > block size e.g. 100k we get OutOfMemoryException in the “toDense” method of > the multiply. We have worked around that by implementing our-selves both the > Sparse multiplication and addition in a very naïve way – but at least it > works. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org