[ https://issues.apache.org/jira/browse/SYSTEMML-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matthias Boehm updated SYSTEMML-1627:
-------------------------------------
    Description:
Scenario: MultiLogReg over MNIST480m (480M rows x 784, sparse) fails for certain memory configurations (where unary operations over 480Mx2 intermediates run in CP and binary operations in SPARK), with the following exception:

{code}
Caused by: org.apache.sysml.runtime.DMLRuntimeException: ERROR: Runtime error in program block generated from statement block between lines 261 and 273 -- Error evaluating instruction: SPARK°tak+*°Y·MATRIX·DOUBLE°_mVar432·MATRIX·DOUBLE°1·SCALAR·INT·true°_Var437·SCALAR·DOUBLE
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:322)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:221)
	at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:167)
	at org.apache.sysml.runtime.controlprogram.WhileProgramBlock.execute(WhileProgramBlock.java:165)
	... 14 more
Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://larry.almaden.ibm.com:8020/user/biuser/scratch_space/_p684936_9.1.44.28/_t0/temp154_56
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
	at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
{code}

The root cause is a missing export on guarded parallelize (as introduced in the 0.14 release) of cached matrices which have previously been collected from input RDDs. These matrix objects are not marked dirty and hence are not exported, although they do not have an associated HDFS file yet.
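The failure mode described in the root cause can be illustrated with a small, self-contained model. This is only a sketch and not SystemML code: the class `GuardedExportSketch`, the type `CachedMatrix`, the in-memory `FILES` set standing in for HDFS, and the methods `needsExportBuggy` / `needsExportFixed` are all hypothetical names. The "fixed" guard shown here (also export when no backing file exists) is an assumption about one plausible remedy, not necessarily the change made for this issue.

```java
import java.util.HashSet;
import java.util.Set;

public class GuardedExportSketch {

    /** Hypothetical stand-in for a cached matrix handle (not SystemML's real class). */
    static final class CachedMatrix {
        final String hdfsPath; // where the matrix would live on HDFS
        final boolean dirty;   // true if modified in memory since the last write

        CachedMatrix(String hdfsPath, boolean dirty) {
            this.hdfsPath = hdfsPath;
            this.dirty = dirty;
        }
    }

    /** Simulated HDFS: the set of paths that currently exist. */
    static final Set<String> FILES = new HashSet<>();

    /** Guard that only looks at the dirty flag (models the reported behavior). */
    static boolean needsExportBuggy(CachedMatrix m) {
        return m.dirty;
    }

    /** Assumed remedy: also export when there is no backing file yet. */
    static boolean needsExportFixed(CachedMatrix m) {
        return m.dirty || !FILES.contains(m.hdfsPath);
    }

    /** Writes the matrix to the simulated HDFS if the chosen guard asks for it. */
    static void exportIfNeeded(CachedMatrix m, boolean useFixedGuard) {
        boolean export = useFixedGuard ? needsExportFixed(m) : needsExportBuggy(m);
        if (export) {
            FILES.add(m.hdfsPath);
        }
    }

    /** A downstream reader fails if the path does not exist. */
    static boolean canReadFromHdfs(CachedMatrix m) {
        return FILES.contains(m.hdfsPath);
    }

    public static void main(String[] args) {
        // A matrix collected from an input RDD: clean in memory, but never written out.
        CachedMatrix collected = new CachedMatrix("scratch_space/example_temp", false);

        exportIfNeeded(collected, false);
        System.out.println("dirty-only guard, readable: " + canReadFromHdfs(collected));

        exportIfNeeded(collected, true);
        System.out.println("file-aware guard, readable: " + canReadFromHdfs(collected));
    }
}
```

With the dirty-only guard the clean-but-unwritten matrix is skipped, so a later file-based read finds no input path, which mirrors the `InvalidInputException` in the trace above.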
> Mlogreg fails with file not found on MNIST480m and certain mem configs
> ----------------------------------------------------------------------
>
>                 Key: SYSTEMML-1627
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1627
>             Project: SystemML
>          Issue Type: Bug
>    Affects Versions: SystemML 0.14
>            Reporter: Matthias Boehm