[ https://issues.apache.org/jira/browse/SPARK-38042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17489167#comment-17489167 ]
Johan Nyström-Persson edited comment on SPARK-38042 at 2/8/22, 11:50 PM:
-------------------------------------------------------------------------

My initial idea above was wrong. I ended up changing
{code:java}
val TypeRef(_, _, Seq(elementType)) = tpe{code}
to
{code:java}
val TypeRef(_, _, Seq(elementType)) = tpe.dealias{code}
and this seems to work.

was (Author: JIRAUSER284274):
My initial idea above was wrong, and instead I fixed this by changing
{code:java}
val TypeRef(_, _, Seq(elementType)) = tpe{code}
to
{code:java}
val TypeRef(_, _, Seq(elementType)) = tpe.dealias{code}

> Encoder cannot be found when a tuple component is a type alias for an Array
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-38042
>                 URL: https://issues.apache.org/jira/browse/SPARK-38042
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.2, 3.2.0
>            Reporter: Johan Nyström-Persson
>            Priority: Major
>
> ScalaReflection.dataTypeFor fails when Array[T] has been aliased for some T,
> and then the alias is being used as a component of e.g. a product.
> Minimal example, tested in version 3.1.2:
> {code:java}
> type Data = Array[Long]
> val xs:List[(Data,Int)] = List((Array(1),1), (Array(2),2))
> sc.parallelize(xs).toDF("a", "b"){code}
> This gives the following exception:
> {code:java}
> scala.MatchError: Data (of class scala.reflect.internal.Types$AliasNoArgsTypeRef)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$dataTypeFor$1(ScalaReflection.scala:104)
>   at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.dataTypeFor(ScalaReflection.scala:88)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerFor$6(ScalaReflection.scala:573)
>   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
>   at scala.collection.immutable.List.foreach(List.scala:392)
>   at scala.collection.TraversableLike.map(TraversableLike.scala:238)
>   at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
>   at scala.collection.immutable.List.map(List.scala:298)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerFor$1(ScalaReflection.scala:562)
>   at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:432)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.$anonfun$serializerForType$1(ScalaReflection.scala:421)
>   at scala.reflect.internal.tpe.TypeConstraints$UndoLog.undo(TypeConstraints.scala:69)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects(ScalaReflection.scala:904)
>   at org.apache.spark.sql.catalyst.ScalaReflection.cleanUpReflectionObjects$(ScalaReflection.scala:903)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.cleanUpReflectionObjects(ScalaReflection.scala:49)
>   at org.apache.spark.sql.catalyst.ScalaReflection$.serializerForType(ScalaReflection.scala:413)
>   at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:55)
>   at org.apache.spark.sql.Encoders$.product(Encoders.scala:285)
>   at org.apache.spark.sql.LowPrioritySQLImplicits.newProductEncoder(SQLImplicits.scala:251)
>   at org.apache.spark.sql.LowPrioritySQLImplicits.newProductEncoder$(SQLImplicits.scala:251)
>   at org.apache.spark.sql.SQLImplicits.newProductEncoder(SQLImplicits.scala:32)
>   ... 48 elided{code}
> At first glance, I think this could be fixed by changing e.g.
> {code:java}
> getClassNameFromType(tpe) to
> getClassNameFromType(tpe.dealias)
> {code}
> in ScalaReflection.dataTypeFor. I will try to test that and submit a pull
> request shortly.
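
For readers who want to see the mechanism outside Spark, the following is a minimal sketch that uses only the scala-reflect runtime universe (Scala 2.12/2.13, with scala-reflect on the classpath). It is not the code from ScalaReflection; the object name DealiasSketch and the helper elementTypeOf are hypothetical and exist only for illustration. The helper applies the same TypeRef pattern quoted in the comment above, first to the aliased tuple component and then to its dealiased form.

```scala
import scala.reflect.runtime.universe._

object DealiasSketch {
  // Same alias as in the report.
  type Data = Array[Long]

  // Hypothetical helper (not part of Spark): applies the pattern from the
  // comment and returns the single type argument, if there is exactly one.
  def elementTypeOf(tpe: Type): Option[Type] = tpe match {
    case TypeRef(_, _, Seq(elementType)) => Some(elementType)
    case _                               => None
  }

  def main(args: Array[String]): Unit = {
    // First component of the tuple type (Data, Int), as in the report.
    val component: Type = typeOf[(Data, Int)].typeArgs.head

    // The alias itself carries no type arguments, so the pattern does not
    // match. With a bare `val TypeRef(...) = tpe` this is where a MatchError
    // would be thrown.
    println(s"aliased:   ${elementTypeOf(component)}")

    // dealias resolves Data to Array[Long], which has one type argument.
    println(s"dealiased: ${elementTypeOf(component.dealias)}")
  }
}
```

Calling dealias on a type that is not an alias returns the type unchanged, so applying it before the pattern match should be harmless for non-aliased arrays.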