Imran Younus created SYSTEMML-1172:
--------------------------------------

             Summary: Matrix Multiply (cpmm) fails in Spark mode.
                 Key: SYSTEMML-1172
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1172
             Project: SystemML
          Issue Type: Bug
         Environment: Spark 1.6.1
Three-node cluster with 512 GB RAM and 48 cores per node.
            Reporter: Imran Younus
         Attachments: output_ATA_50k.log

I'm running this simple DML script on a {{50k x 50k}} matrix:

{code}
N = $N
X = Rand(rows=N, cols=N, max=1, min=-1, pdf="uniform")
A = t(X) %*% X
fn = sum(A * A)
print(fn)
{code}

I'm running this with Spark 1.6.1:

{code}
/opt/spark-1.6.2-bin-hadoop2.6/bin/spark-submit \
  --master=spark://rr-ram4.softlayer.com:7077 \
  --executor-memory=40g --driver-memory=40g \
  sysml/target/SystemML.jar -f genDataForCholeskey.dml \
  -explain -stats -nvargs N=50000 output=/user/iyounus/data/PDmatrix_50k.csv \
  >& output_ATA_50k_no_writing.log
{code}

When this code runs, the executors start dying with a Java heap 
OutOfMemoryError. After multiple retries, the job fails.
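
For a sense of scale, here is a rough size estimate. It assumes dense storage at 8 bytes per value (double precision), which is an assumption made for illustration and not something taken from the log; the helper name {{dense_matrix_bytes}} is hypothetical. Under that assumption, each of {{X}} and {{A}} is about 20 GB, half of the 40g executor heap configured above. This is only arithmetic and does not by itself identify the cause of the failure.

```python
def dense_matrix_bytes(rows, cols, bytes_per_value=8):
    """Size in bytes of a dense rows x cols matrix.

    bytes_per_value=8 assumes double-precision values (an assumption,
    not something confirmed by the bug report or its log).
    """
    return rows * cols * bytes_per_value


if __name__ == "__main__":
    n = 50_000  # matches N=50000 in the spark-submit command
    size_bytes = dense_matrix_bytes(n, n)
    # Report in decimal gigabytes and binary gibibytes.
    print(f"{size_bytes / 1e9:.1f} GB ({size_bytes / 2**30:.2f} GiB)")
```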

The exact same computation in Python using NumPy takes 7 minutes.
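
The NumPy script used for that comparison is not included in this report. A minimal sketch of what an equivalent of the DML script above might look like is given here; the function name, the seed parameter, and the use of {{numpy.random.default_rng}} are choices made for illustration, not the reporter's actual code.

```python
import numpy as np


def sum_of_squared_gram(n, seed=0):
    """Mirror the DML script: X ~ U(-1, 1), A = t(X) %*% X, fn = sum(A * A)."""
    rng = np.random.default_rng(seed)         # seed is a hypothetical parameter
    X = rng.uniform(-1.0, 1.0, size=(n, n))   # Rand(rows=N, cols=N, min=-1, max=1)
    A = X.T @ X                               # t(X) %*% X
    return float(np.sum(A * A))               # sum(A * A), element-wise product


if __name__ == "__main__":
    # Small n so the sketch runs quickly; the report uses N=50000.
    print(sum_of_squared_gram(1000))
```

Running it with a small {{n}} first is a cheap way to check the result before scaling up to the full 50k case.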

The log file is attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
