Ok, Cheng. Thank you!
Best regards,

Jorge López-Malla Matute
Big Data Developer
Vía de las Dos Castillas, 33. Ática 4. 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: 91 828 64 73 // @stratiobd

2015-01-28 19:44 GMT+01:00 Cheng Lian <lian.cs....@gmail.com>:

> Hey Jorge,
>
> This is expected, because there isn't an obvious mapping from Set[T] to
> any SQL type. Currently we have complex types like array, map, and struct,
> which are inherited from Hive. In your case, I'd transform the Set[T]
> into a Seq[T] first; Spark SQL can then map it to an array.
>
> Cheng
>
> On 1/28/15 7:15 AM, Jorge Lopez-Malla wrote:
>
> Hello,
>
> We are trying to insert a case class into Parquet using Spark SQL. When I'm
> creating the SchemaRDD, which includes a Set, I get the following exception:
>
> sqc.createSchemaRDD(r)
> scala.MatchError: Set[(scala.Int, scala.Int)] (of class scala.reflect.internal.Types$TypeRef$anon$1)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:53)
>   at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:64)
>   at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:62)
>   at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.immutable.List.foreach(List.scala:318)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:62)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:50)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.attributesFor(ScalaReflection.scala:44)
>   at org.apache.spark.sql.execution.ExistingRdd$.fromProductRdd(basicOperators.scala:229)
>   at org.apache.spark.sql.SQLContext.createSchemaRDD(SQLContext.scala:94)
>   at $iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:49)
>   at $iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:54)
>   at $iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:56)
>   at $iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:58)
>   at $iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:60)
>   at $iwC$iwC$iwC$iwC$iwC.<init>(<console>:62)
>   at $iwC$iwC$iwC$iwC.<init>(<console>:64)
>   at $iwC$iwC$iwC.<init>(<console>:66)
>   at $iwC$iwC.<init>(<console>:68)
>   at $iwC.<init>(<console>:70)
>   at <init>(<console>:72)
>   at .<init>(<console>:76)
>   at .<clinit>(<console>)
>   at .<init>(<console>:7)
>   at .<clinit>(<console>)
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789)
>   at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062)
>   at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610)
>   at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:814)
>   at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:859)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:771)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:616)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:624)
>   at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:629)
>   at org.apache.spark.repl.SparkILoop$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:954)
>   at org.apache.spark.repl.SparkILoop$anonfun$process$1.apply(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop$anonfun$process$1.apply(SparkILoop.scala:902)
>   at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:997)
>   at org.apache.spark.repl.Main$.main(Main.scala:31)
>   at org.apache.spark.repl.Main.main(Main.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Here is a code snippet:
>
> scala> case class A(a: Set[(Int, Int)])
> defined class A
>
> scala> val a = A(Set((1, 2), (1, 3)))
> a: A = A(Set((1,2), (1,3)))
>
> scala> val r = sc.parallelize(Array(a))
> r: org.apache.spark.rdd.RDD[A] = ParallelCollectionRDD[17] at parallelize at <console>:44
>
> scala> sqc.createSchemaRDD(r)
> scala.MatchError: Set[(scala.Int, scala.Int)] (of class scala.reflect.internal.Types$TypeRef$anon$1)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:53)
>   at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:64)
>   at org.apache.spark.sql.catalyst.ScalaReflection$anonfun$schemaFor$1.apply(ScalaReflection.scala:62)
>   ....
>
> This code has been tested with Spark 1.1.0 and 1.2.0; is this the expected
> behaviour, or are we doing something wrong?
>
> Best regards,
>
> Jorge López-Malla Matute
> Big Data Developer
>
> Vía de las Dos Castillas, 33. Ática 4. 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: 91 828 64 73 // @stratiobd
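[Editor's note] Cheng's suggested workaround, converting the `Set[T]` field to a `Seq[T]` so Spark SQL can map it to an array type, might look like the sketch below. The mirror case class `B` is a hypothetical name, the `sc`/`sqc` calls reuse the names from the original snippet, and the Spark lines are left commented because they need a running SparkContext/SQLContext; this is a sketch, not tested against a cluster.

```scala
// Original case class: Spark SQL 1.1/1.2 has no schema mapping for Set.
case class A(a: Set[(Int, Int)])

// Hypothetical mirror case class: Seq maps to a SQL array type.
case class B(a: Seq[(Int, Int)])

val a = A(Set((1, 2), (1, 3)))
val b = B(a.a.toSeq)  // convert the Set field to a Seq before schema inference

// With the sc and sqc from the original snippet, this should now succeed:
// val r = sc.parallelize(Array(b))
// sqc.createSchemaRDD(r)
```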