Michael Armbrust created SPARK-8470:
---------------------------------------

Summary: MissingRequirementError for ScalaReflection on user classes
Key: SPARK-8470
URL: https://issues.apache.org/jira/browse/SPARK-8470
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.4.0
Reporter: Michael Armbrust
Priority: Blocker

From the mailing list:

{code}
Since upgrading to Spark 1.4, I'm getting a scala.reflect.internal.MissingRequirementError when creating a DataFrame from an RDD. The error references a case class in the application (the RDD's type parameter), which has been verified to be present.

Items of note:

1) This is running on AWS EMR (YARN). I do not get this error running locally (standalone).
2) Reverting to Spark 1.3.1 makes the problem go away
3) The jar file containing the referenced class (the app assembly jar) is not listed in the classpath expansion dumped in the error message.

I have seen SPARK-5281, and am guessing that this is the root cause, especially since the code added there is involved in the stacktrace. That said, my grasp on scala reflection isn't strong enough to make sense of the change to say for sure. It certainly looks, though, that in this scenario the current thread's context classloader may not be what we think it is (given #3 above).

Any ideas?

App code:

  def registerTable[A <: Product : TypeTag](name: String, rdd: RDD[A])(implicit hc: HiveContext) = {
    val df = hc.createDataFrame(rdd)
    df.registerTempTable(name)
  }

Stack trace:

scala.reflect.internal.MissingRequirementError: class com....MyClass in JavaMirror with sun.misc.Launcher$AppClassLoader@d16e5d6 of type class sun.misc.Launcher$AppClassLoader with classpath [ lots and lots of paths and jars, but not the app assembly jar] not found
    at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
    at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:119)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:21)
    at com.ipcoop.spark.sql.SqlEnv$$typecreator1$1.apply(SqlEnv.scala:87)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:71)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:59)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28)
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410)
{code}

Another report:

{code}
Hi,

I use spark 0.14. I tried to create dataframe from RDD below, but got scala.reflect.internal.MissingRequirementError

val partitionedTestDF2 = pairVarRDD.toDF("column1","column2","column3")
//pairVarRDD is RDD[Record4Dim_2], and Record4Dim_2 is a Case Class

How can I fix this?

Exception in thread "main" scala.reflect.internal.MissingRequirementError: class etl.Record4Dim_2 in JavaMirror with sun.misc.Launcher$AppClassLoader@30177039 of type class sun.misc.Launcher$AppClassLoader with classpath [file:/local/spark140/conf/,file:/local/spark140/lib/spark-assembly-1.4.0-SNAPSHOT-hadoop2.6.0.jar,file:/local/spark140/lib/datanucleus-core-3.2.10.jar,file:/local/spark140/lib/datanucleus-rdbms-3.2.9.jar,file:/local/spark140/lib/datanucleus-api-jdo-3.2.6.jar,file:/etc/hadoop/conf/] and parent being sun.misc.Launcher$ExtClassLoader@52c8c6d9 of type class sun.misc.Launcher$ExtClassLoader with classpath [file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunec.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunjce_provider.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/sunpkcs11.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/zipfs.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/localedata.jar,file:/usr/jdk64/jdk1.7.0_67/jre/lib/ext/dnsns.jar] and parent being primordial classloader with boot classpath [/usr/jdk64/jdk1.7.0_67/jre/lib/resources.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/rt.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/sunrsasign.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jsse.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jce.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/charsets.jar:/usr/jdk64/jdk1.7.0_67/jre/lib/jfr.jar:/usr/jdk64/jdk1.7.0_67/jre/classes] not found. 
    at scala.reflect.internal.MissingRequirementError$.signal(MissingRequirementError.scala:16)
    at scala.reflect.internal.MissingRequirementError$.notFound(MissingRequirementError.scala:17)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:48)
    at scala.reflect.internal.Mirrors$RootsBase.getModuleOrClass(Mirrors.scala:61)
    at scala.reflect.internal.Mirrors$RootsBase.staticModuleOrClass(Mirrors.scala:72)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:119)
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:21)
    at no.uni.computing.etl.LoadWrfV14$$typecreator1$1.apply(LoadWrfV14.scala:91)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:231)
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:231)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:71)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:59)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:28)
    at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:410)
    at org.apache.spark.sql.SQLContext$implicits$.rddToDataFrameHolder(SQLContext.scala:335)

BR,
Patcharee
{code}
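
Both reports show a JavaMirror built on sun.misc.Launcher$AppClassLoader whose classpath lacks the application jar. One way to check the classloader guess from the first report is to compare what the system classloader and the current thread's context classloader can each resolve. The following is a minimal diagnostic sketch, not a fix and not Spark code: the object name ClassLoaderDiagnostics and its methods are hypothetical, and it uses only JDK calls (Class.forName, ClassLoader.getParent, Thread.getContextClassLoader).

```scala
import scala.annotation.tailrec

// Hypothetical helper for illustration only; not part of Spark or Scala reflection.
object ClassLoaderDiagnostics {

  /** Lists `loader` and its parents, nearest first. A null loader gives an empty list. */
  def loaderChain(loader: ClassLoader): List[ClassLoader] = {
    @tailrec
    def go(cur: ClassLoader, acc: List[ClassLoader]): List[ClassLoader] =
      if (cur == null) acc.reverse else go(cur.getParent, cur :: acc)
    go(loader, Nil)
  }

  /**
   * True if `className` resolves through `loader` (including its parents).
   * The class is not initialized. A null loader means the bootstrap loader.
   */
  def canLoad(className: String, loader: ClassLoader): Boolean =
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  /** One line per candidate loader: its parent chain and whether it resolves the class. */
  def report(className: String): String = {
    val candidates = Seq(
      "system"  -> ClassLoader.getSystemClassLoader,
      "context" -> Thread.currentThread().getContextClassLoader
    )
    candidates.map { case (label, loader) =>
      val chain = loaderChain(loader).map(_.getClass.getName).mkString(" -> ")
      s"$label loader [$chain] can load $className: ${canLoad(className, loader)}"
    }.mkString("\n")
  }
}
```

Calling println(ClassLoaderDiagnostics.report("etl.Record4Dim_2")) in the driver just before createDataFrame or toDF would show whether the case class is visible to the context classloader but not to the system classloader. If so, that would be consistent with the mirror being created from the wrong loader, though it would not by itself confirm SPARK-5281 as the cause.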