Thank you Shirish and Deron for the suggestions. Looking forward to the fix from Matthias!
We are using the hadoop-common shipped with CDH4.2.1, and it's on the classpath. I'm a bit hesitant to alter our Hadoop configuration to include other versions, since other people are using it too.

I'm not sure if/how the following naive approach affects the program behavior, but I did try changing the scope of

    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>

in SystemML's pom.xml from 'provided' to 'compile' and rebuilt the jar (21MB), and it threw the same error.

By the way, this is in pom.xml lines 65 - 72:

    <properties>
      <hadoop.version>2.4.1</hadoop.version>
      <antlr.version>4.3</antlr.version>
      <spark.version>1.4.1</spark.version>
      <!-- OS-specific JVM arguments for running integration tests -->
      <integrationTestExtraJVMArgs />
    </properties>

Am I supposed to modify hadoop.version before the build?

Thanks again,

Ethan

On Fri, Feb 5, 2016 at 2:29 AM, Deron Eriksson <deroneriks...@gmail.com> wrote:

> Hi Matthias,
>
> Glad to hear the fix is simple. Mixing jar versions sometimes is not very
> fun.
>
> Deron
>
>
> On Thu, Feb 4, 2016 at 11:10 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:
>
> > well, let's not mix different hadoop versions in the class path or
> > client/server. If I'm not mistaken, cdh 4.x always shipped with MR v1. It's
> > a trivial fix for us and will be in the repo tomorrow morning anyway.
> > Thanks for catching this issue Ethan.
> >
> > Regards,
> > Matthias
> >
> > From: Deron Eriksson <deroneriks...@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 02/04/2016 11:04 PM
> > Subject: Re: Compatibility with MR1 Cloudera cdh4.2.1
> > ------------------------------
> >
> > Hi Ethan,
> >
> > Just FYI, I looked at hadoop-common-2.0.0-cdh4.2.1.jar (
> > https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-common/2.0.0-cdh4.2.1/
> > ), since I don't see a 2.0.0-mr1-cdh4.2.1 version, and the
> > org.apache.hadoop.conf.Configuration class in that jar doesn't appear to
> > have a getDouble method, so using that version of hadoop-common won't work.
> >
> > However, the hadoop-common-2.4.1.jar (
> > https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-common/2.4.1/
> > ) does appear to have the getDouble method. It's possible that adding that
> > jar to your classpath may fix your problem, as Shirish pointed out.
> >
> > It sounds like Matthias may have another fix.
> >
> > Deron
> >
> >
> > On Thu, Feb 4, 2016 at 6:40 PM, Matthias Boehm <mbo...@us.ibm.com> wrote:
> >
> > > well, we did indeed not run on MR v1 for a while now. However, I don't
> > > want to get that far and say we don't support it anymore. I'll fix this
> > > particular issue by tomorrow.
> > >
> > > In the next couple of weeks we should run our full performance testsuite
> > > (for broad coverage) over an MR v1 cluster and systematically remove
> > > unnecessary incompatibility like this instance. Any volunteers?
> > >
> > > Regards,
> > > Matthias
> > >
> > > From: Ethan Xu <ethan.yifa...@gmail.com>
> > > To: dev@systemml.incubator.apache.org
> > > Date: 02/04/2016 05:51 PM
> > > Subject: Compatibility with MR1 Cloudera cdh4.2.1
> > > ------------------------------
> > >
> > > Hello,
> > >
> > > I got an error when running the systemML/scripts/Univar-Stats.dml script on
> > > a hadoop cluster (Cloudera CDH4.2.1) on a 6GB data set. Error message is at
> > > the bottom of the email. The same script ran fine on a smaller sample
> > > (several MB) of the same data set, when MR was not invoked.
> > >
> > > The main error was java.lang.NoSuchMethodError:
> > > org.apache.hadoop.mapred.JobConf.getDouble()
> > > Digging deeper, it looks like the CDH4.2.1 version of MR indeed didn't have
> > > the JobConf.getDouble() method.
> > >
> > > The hadoop-core jar of CDH4.2.1 can be found here:
> > > https://repository.cloudera.com/artifactory/repo/org/apache/hadoop/hadoop-core/2.0.0-mr1-cdh4.2.1/
> > >
> > > The calling line of SystemML is line 1194 of
> > > https://github.com/apache/incubator-systemml/blob/master/src/main/java/org/apache/sysml/runtime/matrix/mapred/MRJobConfiguration.java
> > >
> > > I was wondering, if the finding is accurate, is there a potential fix, or
> > > does this mean the current version of SystemML is not compatible with
> > > CDH4.2.1?
> > >
> > > Thank you,
> > >
> > > Ethan
> > >
> > >
> > > hadoop jar $sysDir/target/SystemML.jar -f
> > > $sysDir/scripts/algorithms/Univar-Stats.dml -nvargs
> > > X=$baseDirHDFS/original-coded.csv
> > > TYPES=$baseDirHDFS/original-coded-type.csv
> > > STATS=$baseDirHDFS/univariate-summary.csv
> > >
> > > 16/02/04 20:35:03 INFO api.DMLScript: BEGIN DML run 02/04/2016 20:35:03
> > > 16/02/04 20:35:03 INFO api.DMLScript: HADOOP_HOME: null
> > > 16/02/04 20:35:03 WARN conf.DMLConfig: No default SystemML config file (./SystemML-config.xml) found
> > > 16/02/04 20:35:03 WARN conf.DMLConfig: Using default settings in DMLConfig
> > > 16/02/04 20:35:04 WARN hops.OptimizerUtils: Auto-disable multi-threaded text read for 'text' and 'csv' due to thread contention on JRE < 1.8 (java.version=1.7.0_71).
> > > SLF4J: Class path contains multiple SLF4J bindings.
> > > SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > SLF4J: Found binding in [jar:file:/usr/local/explorys/datagrid/lib/slf4j-jdk14-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > SLF4J: Found binding in [jar:file:/usr/local/explorys/datagrid/lib/logback-classic-1.0.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> > > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> > > 16/02/04 20:35:07 INFO api.DMLScript: SystemML Statistics:
> > > Total execution time: 0.880 sec.
> > > Number of executed MR Jobs: 0.
> > >
> > > 16/02/04 20:35:07 INFO api.DMLScript: END DML run 02/04/2016 20:35:07
> > > Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.mapred.JobConf.getDouble(Ljava/lang/String;D)D
> > >     at org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration.setUpMultipleInputs(MRJobConfiguration.java:1195)
> > >     at org.apache.sysml.runtime.matrix.mapred.MRJobConfiguration.setUpMultipleInputs(MRJobConfiguration.java:1129)
> > >     at org.apache.sysml.runtime.matrix.CSVReblockMR.runAssignRowIDMRJob(CSVReblockMR.java:307)
> > >     at org.apache.sysml.runtime.matrix.CSVReblockMR.runAssignRowIDMRJob(CSVReblockMR.java:289)
> > >     at org.apache.sysml.runtime.matrix.CSVReblockMR.runJob(CSVReblockMR.java:275)
> > >     at org.apache.sysml.lops.runtime.RunMRJobs.submitJob(RunMRJobs.java:257)
> > >     at org.apache.sysml.lops.runtime.RunMRJobs.prepareAndSubmitJob(RunMRJobs.java:143)
> > >     at org.apache.sysml.runtime.instructions.MRJobInstruction.processInstruction(MRJobInstruction.java:1500)
> > >     at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeSingleInstruction(ProgramBlock.java:309)
> > >     at org.apache.sysml.runtime.controlprogram.ProgramBlock.executeInstructions(ProgramBlock.java:227)
> > >     at org.apache.sysml.runtime.controlprogram.ProgramBlock.execute(ProgramBlock.java:169)
> > >     at org.apache.sysml.runtime.controlprogram.Program.execute(Program.java:146)
> > >     at org.apache.sysml.api.DMLScript.execute(DMLScript.java:676)
> > >     at org.apache.sysml.api.DMLScript.executeScript(DMLScript.java:338)
> > >     at org.apache.sysml.api.DMLScript.main(DMLScript.java:197)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:606)
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

--
Yifan "Ethan" Xu, PhD
Data Scientist / Statistician
Explorys, IBM Watson Health
Adjunct Faculty
Department of Epidemiology and Biostatistics
Case Western Reserve University
--------------
Email: ethan.yifa...@gmail.com
Phone: (607) 760-6817
--------------
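
For anyone hitting the same NoSuchMethodError, a quick way to confirm which API the Hadoop jars on the classpath actually expose is a small reflection check, run with the same classpath as the failing job. The sketch below is only an illustration: the class name MethodCheck and the helper hasMethod are hypothetical and not part of SystemML or Hadoop. The descriptor in the trace, getDouble(Ljava/lang/String;D)D, corresponds to getDouble(String, double), which is what main looks up.

```java
import java.lang.reflect.Method;

/** Hypothetical diagnostic helper; not part of SystemML or Hadoop. */
final class MethodCheck {

    /**
     * Returns true if the named class can be loaded from the current classpath
     * and exposes a public method with the given name and parameter types
     * (declared or inherited). Returns false if the class or method is missing.
     */
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            Method m = cls.getMethod(methodName, paramTypes);
            return m != null;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class not on the classpath, method absent, or class failed to link.
            return false;
        }
    }

    public static void main(String[] args) {
        // Reports false when Hadoop is not on the classpath at all, so run this
        // via the same launcher and classpath as the failing job.
        boolean ok = hasMethod("org.apache.hadoop.mapred.JobConf",
                               "getDouble", String.class, double.class);
        System.out.println("JobConf.getDouble(String, double) available: " + ok);
    }
}
```

If the check reports false, code that must run on such a cluster could read the setting as a string with Configuration.get(name) and parse it with Double.parseDouble. That is one possible workaround, and not necessarily the fix Matthias refers to above.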