I am using HDP 2.6.4 and have followed
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/ch_oozie-spark-action.html
to make Oozie use Spark2.

After this, I found there were still a couple of issues:

1. Oozie and Spark try to add the same jars to the distributed cache multiple
times. I resolved this by removing the duplicate jars from the
/user/oozie/share/lib/lib_20180303065325/spark2/ folder (a sketch of the
cleanup follows the stack trace below).

2. A jar conflict, which is not yet resolved. The exception is below:

18/03/06 23:51:18 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchFieldError: USE_DEFAULTS
java.lang.NoSuchFieldError: USE_DEFAULTS
    at com.fasterxml.jackson.databind.introspect.JacksonAnnotationIntrospector.findSerializationInclusion(JacksonAnnotationIntrospector.java:498)
    at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findSerializationInclusion(AnnotationIntrospectorPair.java:332)
    at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findSerializationInclusion(AnnotationIntrospectorPair.java:332)
    at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.findSerializationInclusion(BasicBeanDescription.java:381)
    at com.fasterxml.jackson.databind.ser.PropertyBuilder.<init>(PropertyBuilder.java:41)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.constructPropertyBuilder(BeanSerializerFactory.java:507)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.findBeanProperties(BeanSerializerFactory.java:558)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.constructBeanSerializer(BeanSerializerFactory.java:361)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.findBeanSerializer(BeanSerializerFactory.java:272)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:225)
    at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:153)
    at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1203)
    at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1157)
    at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:481)
    at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:679)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:107)
    at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
    at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
    at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:145)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
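
Re issue 1 above, this is roughly the cleanup I did (a sketch only; the jar
name in the rm line is a made-up example, and oozie admin may also need
-oozie <url> if OOZIE_URL is not set):

hdfs dfs -ls /user/oozie/share/lib/lib_20180303065325/spark2/
# remove each jar that is staged twice
hdfs dfs -rm /user/oozie/share/lib/lib_20180303065325/spark2/some-duplicate.jar
# make Oozie reload the modified sharelib
oozie admin -sharelibupdate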
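
For what it's worth, USE_DEFAULTS here is JsonInclude.Include.USE_DEFAULTS,
which only exists since jackson-annotations 2.6, so something older must be
winning on the runtime classpath. A throwaway probe like the following
(JacksonProbe is just a made-up name) can be run as the Spark action's main
class to see which jar the field's class actually comes from:

import com.fasterxml.jackson.annotation.JsonInclude

object JacksonProbe {
  def main(args: Array[String]): Unit = {
    // Print the jar that jackson-annotations was loaded from
    // (getCodeSource can be null for bootstrap classes).
    println(classOf[JsonInclude.Include].getProtectionDomain.getCodeSource.getLocation)
    // Throws NoSuchFieldError if that jar predates jackson-annotations 2.6.
    println(JsonInclude.Include.USE_DEFAULTS)
  }
}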



My dependencies are:

libraryDependencies += "com.typesafe.scala-logging" %%
"scala-logging-api" % "2.1.2"
libraryDependencies += "com.typesafe.scala-logging" %%
"scala-logging-slf4j" % "2.1.2"
libraryDependencies += "ch.qos.logback" % "logback-classic" % "1.2.3"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.2.0"
libraryDependencies += "com.typesafe" % "config" % "1.3.2"
libraryDependencies += "org.scalactic" %% "scalactic" % "3.0.4"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.4" % "test"
libraryDependencies += "org.scalamock" %% "scalamock" % "4.1.0" % "test"
libraryDependencies += "com.jsuereth" %% "scala-arm" % "2.0"
libraryDependencies += "com.github.scopt" %% "scopt" % "3.7.0"
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.3.8"
libraryDependencies += "io.dropwizard.metrics" % "metrics-core" % "4.0.2"
libraryDependencies += "com.typesafe.slick" %% "slick" % "3.2.1"
libraryDependencies += "com.typesafe.slick" %% "slick-hikaricp" % "3.2.1"
libraryDependencies += "com.typesafe.slick" %% "slick-extensions" % "3.0.0"
libraryDependencies += "org.scalaz" %% "scalaz-core" % "7.2.19"
libraryDependencies += "org.json4s" %% "json4s-native" % "3.5.3"
libraryDependencies += "com.softwaremill.retry" %% "retry" % "0.3.0"
libraryDependencies += "org.apache.httpcomponents" % "httpclient" % "4.5.5"
libraryDependencies += "org.apache.httpcomponents" % "httpcore" % "4.4.9"


The sbt dependency tree shows that Jackson 2.6.5, pulled in by spark-core, is
the version my build resolves. But per
https://stackoverflow.com/questions/36982173/java-lang-nosuchfielderror-use-defaults-thrown-while-validating-json-schema-thr,
a "NoSuchFieldError: USE_DEFAULTS" means a Jackson version older than 2.6 is
actually being loaded at runtime.

I have already:

1. Successfully run the same application through spark-submit.

2. Made sure my Spark dependencies are 2.2.0, consistent with the component
versions listed in
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_release-notes/content/comp_versions.html.
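
If the conflicting jar cannot be removed from the cluster side, a fallback I
am considering is shading Jackson inside my fat jar so the sharelib's copy
can no longer shadow it. A sketch, assuming the sbt-assembly plugin (the
shaded.jackson prefix is arbitrary):

// build.sbt: relocate all Jackson packages inside the assembled jar,
// so my code links against the relocated copy regardless of the cluster classpath.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.jackson.@1").inAll
)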

What else have I missed? I would appreciate any help!
