[ https://issues.apache.org/jira/browse/HADOOP-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648921#comment-15648921 ]
Andrew Wang edited comment on HADOOP-11804 at 11/8/16 10:00 PM:
----------------------------------------------------------------

Thanks for the rev, Sean. I tried it with Avro and got a NoClassDefFoundError for Log4J:

{noformat}
testSort(org.apache.avro.mapred.TestAvroTextSort)  Time elapsed: 0.051 sec  <<< ERROR!
java.lang.NoClassDefFoundError: org/apache/log4j/Level
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at org.apache.hadoop.mapred.JobConf.<clinit>(JobConf.java:356)
	at org.apache.avro.mapred.TestAvroTextSort.testSort(TestAvroTextSort.java:37)
{noformat}

I think this is expected based on the contents of the hadoop-client-runtime pom.xml, which marks log4j as optional.

I manually added this dependency, and then hit this:

{noformat}
testReadAvro(org.apache.avro.hadoop.io.TestAvroSequenceFile)  Time elapsed: 0.016 sec  <<< ERROR!
java.lang.NullPointerException: null
	at org.apache.hadoop.io.serializer.SerializationFactory.<init>(SerializationFactory.java:58)
	at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:1248)
	at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1207)
	at org.apache.avro.hadoop.io.AvroSequenceFile$Writer.<init>(AvroSequenceFile.java:532)
	at org.apache.avro.hadoop.io.TestAvroSequenceFile.writeSequenceFile(TestAvroSequenceFile.java:200)
	at org.apache.avro.hadoop.io.TestAvroSequenceFile.testReadAvro(TestAvroSequenceFile.java:53)
{noformat}

I decompiled the SerializationFactory class and noticed that the shading had rewritten the config key string. I think we need to add some kind of exclusion for CommonConfigurationKeysPublic.
{code}
// before
if (conf.get(CommonConfigurationKeys.IO_SERIALIZATIONS_KEY).equals("")) {
// decompiled
if (conf.get("org.apache.hadoop.shaded.io.serializations").equals("")) {
{code}

Here's my Avro diff for master (without the log4j addition) if you want to try this yourself:

https://gist.github.com/anonymous/c064c283348a2d1bbec00845678339f9

> POC Hadoop Client w/o transitive dependencies
> ---------------------------------------------
>
>                 Key: HADOOP-11804
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11804
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: build
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>         Attachments: HADOOP-11804.1.patch, HADOOP-11804.2.patch,
> HADOOP-11804.3.patch, HADOOP-11804.4.patch, HADOOP-11804.5.patch,
> HADOOP-11804.6.patch, HADOOP-11804.7.patch
>
>
> make a hadoop-client-api and hadoop-client-runtime that e.g. HBase can use to
> talk with a Hadoop cluster without seeing any of the implementation
> dependencies.
> see proposal on parent for details.
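
For anyone following along, here is a minimal sketch of how a rewritten key could lead to the NullPointerException in the second trace. It is a toy model, not Hadoop or shading-tool code: the class {{ShadedKeyDemo}}, the {{relocate}} helper, and the {{"io."}} relocation pattern are all hypothetical, and a plain {{Map}} stands in for the configuration object. The comment above does not say which relocation rule actually caused the rewrite, so the pattern is an assumption made only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of package relocation rewriting a config-key string constant. */
final class ShadedKeyDemo {
    // Assumption: a relocation rule that matches anything starting with "io.".
    static final String PATTERN = "io.";
    static final String SHADED_PREFIX = "org.apache.hadoop.shaded.";

    /** Rewrites a string constant the way a pattern-based relocator might, unless excluded. */
    static String relocate(String constant, List<String> excludes) {
        if (excludes.contains(constant) || !constant.startsWith(PATTERN)) {
            return constant;
        }
        return SHADED_PREFIX + constant;
    }

    /** Same shape as the check quoted in the comment: get(key).equals(""). */
    static boolean isEmptySetting(Map<String, String> conf, String key) {
        // Map.get returns null for a missing key, so equals() throws NullPointerException.
        return conf.get(key).equals("");
    }

    public static void main(String[] args) {
        // Stand-in for a configuration whose default is registered under the real key.
        Map<String, String> conf = new HashMap<>();
        conf.put("io.serializations", "example.Serialization");

        // Without an exclusion the constant is rewritten and the lookup misses.
        String rewritten = relocate("io.serializations", Collections.emptyList());
        System.out.println("rewritten key: " + rewritten);
        try {
            isEmptySetting(conf, rewritten);
        } catch (NullPointerException e) {
            System.out.println("lookup with rewritten key threw NullPointerException");
        }

        // With the key excluded from relocation the lookup finds the default again.
        String kept = relocate("io.serializations",
                Collections.singletonList("io.serializations"));
        System.out.println("excluded key: " + kept);
        System.out.println("empty setting: " + isEmptySetting(conf, kept));
    }
}
```

In this sketch the exclusion list plays the role of the exclusion suggested for CommonConfigurationKeysPublic: the key string is left untouched, so it still matches the name under which the default value is stored.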