[ https://issues.apache.org/jira/browse/SYSTEMML-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564750#comment-16564750 ]
Matthias Boehm commented on SYSTEMML-2476:
------------------------------------------

Thanks for catching this [~Guobao]. Let me demystify this by explaining the three overlapping issues here:
* You see MR instead of SPARK jobs because the tests did not set SPARK hybrid mode and hence we're running in hybrid (i.e., CP and MR).
* These distributed operations are caused by a missing literal replacement for scalar lookups into lists, which makes C unknown. Because the output sizes of operations in the same DAG depend on C, we compile conservative distributed operations. I have an extension of the recompiler that fixes these unnecessary distributed operations.
* However, there is a remaining issue. Specifically, C comes out of the list with value type STRING. I made the runtime robust enough to handle this, but we should also fix the root cause.

I can have a look into this remaining issue tomorrow. Until then please leave the JIRA open.

> Unexpected mapreduce task
> -------------------------
>
>                 Key: SYSTEMML-2476
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2476
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: LI Guobao
>            Priority: Major
>
> When trying to use scalar casting to get element from a list, unexpected
> mapreduce tasks are launched instead of CP mode. The scenario is to replace
> *C = 1* with *C = as.scalar(hyperparams["C"])* inside the {{_gradient
> function_}} found in
> {{_src/test/scripts/functions/paramserv/mnist_lenet_paramserv.dml_}}.
> And then the problem could be reproduced by launching the method
> {{_testParamservBSPBatchDisjointContiguous_}} inside class
> _{{org.apache.sysml.test.integration.functions.paramserv.ParamservLocalNNTest}}_
> Here is the stack:
> {code:java}
> 18/07/31 22:10:27 INFO mapred.MapTask: numReduceTasks: 1
> 18/07/31 22:10:27 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
> 18/07/31 22:10:27 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
> 18/07/31 22:10:27 INFO mapred.MapTask: soft limit at 83886080
> 18/07/31 22:10:27 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
> 18/07/31 22:10:27 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
> 18/07/31 22:10:27 INFO mapreduce.Job: The url to track the job:
> http://localhost:8080/
> 18/07/31 22:10:27 INFO mapreduce.Job: Running job: job_local792652629_0008
> {code}
> [~mboehm7], if possible, could you take a look on this? And I've double
> checked the creation of execution context in
> {{ParamservBuiltinCPInstruction}}. But it is instance of ExecutionContext not
> SparkExecutionContext.
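
The second point in the comment (an unknown scalar C forcing conservative distributed operators until literal replacement makes it known) can be illustrated with a toy model. This is only a sketch and not SystemML code: the class `ToyRecompile`, its methods, the memory budget, and the dense 8-bytes-per-cell size estimate are all hypothetical names and assumptions chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Toy model (hypothetical, not the SystemML compiler): an operator runs
// single-node (CP) only if its output size is known and fits a budget.
public class ToyRecompile {
    enum ExecType { CP, DISTRIBUTED }

    // Assumed single-node memory budget in bytes (illustrative value).
    static final long BUDGET_BYTES = 1_000_000L;

    // Dense size estimate for a rows x cols matrix of doubles.
    // Returns empty when the column count is unknown at compile time.
    static OptionalLong estimateBytes(long rows, OptionalLong cols) {
        return cols.isPresent()
            ? OptionalLong.of(rows * cols.getAsLong() * 8L)
            : OptionalLong.empty();
    }

    // Unknown or oversized outputs conservatively become distributed.
    static ExecType choose(OptionalLong bytes) {
        if (!bytes.isPresent() || bytes.getAsLong() > BUDGET_BYTES) {
            return ExecType.DISTRIBUTED;
        }
        return ExecType.CP;
    }

    // "Literal replacement": if the list entry is available at recompile
    // time, substitute its value. Strings are tolerated as well, loosely
    // mirroring the STRING value type mentioned in the comment.
    static OptionalLong lookupLiteral(Map<String, Object> list, String key) {
        Object v = list.get(key);
        if (v instanceof Number) {
            return OptionalLong.of(((Number) v).longValue());
        }
        if (v instanceof String) {
            try {
                return OptionalLong.of(Long.parseLong(((String) v).trim()));
            } catch (NumberFormatException e) {
                return OptionalLong.empty();
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        Map<String, Object> hyperparams = new HashMap<>();
        hyperparams.put("C", "1"); // scalar stored as a string

        // Without literal replacement, C is unknown to the compiler.
        System.out.println(choose(estimateBytes(64, OptionalLong.empty())));

        // With literal replacement, C = 1 is known and the size is tiny.
        OptionalLong c = lookupLiteral(hyperparams, "C");
        System.out.println(choose(estimateBytes(64, c)));
    }
}
```

In this sketch the first call selects the distributed path purely because the size is unknown, while the second selects CP once the value of C has been substituted, which matches the behaviour described in the comment at a very coarse level.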