Imran Younus created SYSTEMML-1156:
--------------------------------------

             Summary: problem with MLContext and QR
                 Key: SYSTEMML-1156
                 URL: https://issues.apache.org/jira/browse/SYSTEMML-1156
             Project: SystemML
          Issue Type: Bug
          Components: Runtime
         Environment: spark 1.6.2 centOS7
            Reporter: Imran Younus


I'm trying to run this simple code to get QR:

{code}
X = rand(rows=4, cols=2)
[H, R] = qr(X)
print(toString(H))
print ("X is of size : " + nrow(X) + "," + ncol(X))
print ("H is of size : " + nrow(H) + "," + ncol(H))
print ("R is of size : " + nrow(R) + "," + ncol(R))
n = ncol(H)
for( j in n:1 ) {
  print(j);
  V = H[,j];
  print ("V is of size : " + nrow(V) + "," + ncol(V))
  VTV = t(V) %*% V
  print(toString(VTV))
}
{code}

I ran this in CP mode and in hybrid Spark mode. In CP mode this works perfectly fine, but when I run it with Spark the behavior is strange. The problem is that inside the for loop, when I assign {{H\[,j\]}} to {{V}}, it becomes {{H}} instead of just a column of {{H}}. So {{VTV}} becomes a matrix instead of the single number I want. This only happens inside the for loop; if I do this without the for loop, there is no problem. Also, this occurs only for matrix {{H}}. If I replace {{H}} with {{X}}, there is no problem.

Here is the output of the code when I run it with Spark:

{code}
16/12/16 11:53:27 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:53:27
16/12/16 11:53:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.526 0.000
0.459 1.905
0.280 -0.202
0.659 0.373
2
V is of size : 4,1
3.051 1.064
1.064 3.811
1
V is of size : 4,1
3.051 1.064
1.064 3.811
16/12/16 11:53:27 INFO api.DMLScript: SystemML Statistics:
Total execution time: 0.624 sec.
Number of executed Spark inst: 0.
16/12/16 11:53:27 INFO api.DMLScript: END DML run 12/16/2016 11:53:27
{code}

As you can see from the output, the reported size of {{V}} is correct: it is supposed to be a column vector. But {{VTV}} is a 2x2 matrix instead of a number, because {{V}} is just {{H}}. Printing {{V}} confirms this.

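As a quick cross-check of that claim, the numbers in the Spark-mode log above can be recomputed by hand. The sketch below is plain Python, not SystemML, and the helper names ({{gram}}, {{column}}, {{max_abs_diff}}) are made up for illustration. It copies {{H}} and the printed 2x2 result from the log, computes {{t(H) %*% H}}, and compares the two within a small tolerance, since the log rounds to three decimals.

```python
# Sanity check on the values printed in the Spark-mode log above.
# H and the 2x2 "VTV" are copied from that log (rounded to 3 decimals).
H = [
    [1.526, 0.000],
    [0.459, 1.905],
    [0.280, -0.202],
    [0.659, 0.373],
]
VTV_PRINTED = [
    [3.051, 1.064],
    [1.064, 3.811],
]


def gram(m):
    """Return t(m) %*% m for a matrix given as a list of rows."""
    cols = list(zip(*m))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def column(m, j):
    """Return column j (0-based) of m as an n x 1 matrix."""
    return [[row[j]] for row in m]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-shape matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# 2x2 Gram matrix of the full H: this is what the log would show if V were H.
full = gram(H)

# 1x1 results for the individual columns: this is what the script should print.
per_column = [gram(column(H, j))[0][0] for j in range(len(H[0]))]

print("t(H) %*% H             =", [[round(v, 3) for v in row] for row in full])
print("max diff to logged VTV =", round(max_abs_diff(full, VTV_PRINTED), 4))
print("expected scalars       =", [round(v, 3) for v in per_column])
```

If the maximum difference stays within rounding error, the logged {{VTV}} is consistent with {{V}} holding all of {{H}}, and the diagonal entries are the scalars the loop was expected to print.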
Here is the correct output from CP mode:

{code}
================================================================================
================================================================================
16/12/16 11:54:56 INFO api.DMLScript: BEGIN DML run 12/16/2016 11:54:56
16/12/16 11:54:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
X is of size : 4,2
H is of size : 4,2
R is of size : 4,2
1.575 0.000
0.476 1.591
0.296 -0.772
0.596 0.233
2
V is of size : 4,1
3.182
1
V is of size : 4,1
3.151
16/12/16 11:54:57 INFO api.DMLScript: SystemML Statistics:
Total execution time: 0.199 sec.
Number of executed MR Jobs: 0.
16/12/16 11:54:57 INFO api.DMLScript: END DML run 12/16/2016 11:54:57
{code}

[~mboehm7] [~niketanpansare]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)