Imran Younus created SYSTEMML-1172:
--------------------------------------

             Summary: Matrix Multiply (cpmm) fails in spark mode.
                 Key: SYSTEMML-1172
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1172
             Project: SystemML
          Issue Type: Bug
         Environment: spark 1.6.1
Three node cluster with 512GB ram and 48 cores per node.
            Reporter: Imran Younus
         Attachments: output_ATA_50k.log

I'm running this simple DML code for a {{50k x 50k}} matrix:
{code}
N = $N
X = Rand(rows=N, cols=N, max=1, min=-1, pdf="uniform")
A = t(X) %*% X
fn = sum(A * A)
print(fn)
{code}

I'm running this with spark 1.6.1:
{code}
/opt/spark-1.6.2-bin-hadoop2.6/bin/spark-submit --master=spark://rr-ram4.softlayer.com:7077 --executor-memory=40g --driver-memory=40g sysml/target/SystemML.jar -f genDataForCholeskey.dml -explain -stats -nvargs N=50000 output=/user/iyounus/data/PDmatrix_50k.csv >& output_ATA_50k_no_writing.log
{code}

When this code runs, the executors start dying with a Java heap OutOfMemoryError. After multiple retries, the job fails.

The exact same computation in Python using NumPy takes 7 min!!

The log file is attached.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
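
For reference, a minimal sketch of what the NumPy comparison mentioned in the report might look like. This is not the reporter's actual script: the function names, the seed, and the small default size are assumptions made for illustration. It mirrors the DML above (uniform X in [-1, 1), A = t(X) %*% X, then sum(A * A)) and also prints the dense size of a single 50k x 50k double-precision matrix as a rough sanity check.

```python
import numpy as np


def ata_sum_of_squares(n, seed=0):
    """Return sum(A * A) for A = t(X) %*% X, with X an n x n uniform matrix.

    Mirrors the DML script in the report; `seed` is a hypothetical
    parameter added here only to make the sketch reproducible.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(low=-1.0, high=1.0, size=(n, n))  # Rand(min=-1, max=1, pdf="uniform")
    a = x.T @ x                                       # t(X) %*% X
    return float(np.sum(a * a))                       # sum(A * A)


def dense_size_gb(n, bytes_per_value=8):
    """Size in GB (10**9 bytes) of one dense n x n matrix of 8-byte doubles."""
    return n * n * bytes_per_value / 1e9


if __name__ == "__main__":
    # Dense footprint of one matrix at the size used in the report.
    print("one 50k x 50k dense matrix (GB):", dense_size_gb(50000))
    # Small default so the sketch runs on a laptop; the report used N=50000.
    print("sum(A * A) for N=1000:", ata_sum_of_squares(1000))
```

At N=50000, X, A, and the temporary A * A are each about 20 GB as dense doubles, so running the sketch at full size needs a machine with correspondingly large memory.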