Andoni Teso created SPARK-46679:
-----------------------------------

Summary: Encoders with multiple inheritance - Key not found: T
Key: SPARK-46679
URL: https://issues.apache.org/jira/browse/SPARK-46679
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.5.0, 3.4.2
Reporter: Andoni Teso
Attachments: spark_test.zip
Since version 3.4, I've been experiencing the following error when using encoders.

{code:java}
Exception in thread "main" java.util.NoSuchElementException: key not found: T
	at scala.collection.immutable.Map$Map1.apply(Map.scala:163)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:121)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:60)
	at org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:53)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:62)
	at org.apache.spark.sql.Encoders$.bean(Encoders.scala:179)
	at org.apache.spark.sql.Encoders.bean(Encoders.scala)
	at org.example.Main.main(Main.java:26)
{code}

I'm attaching the code I use to reproduce the error locally.

The issue is in the JavaTypeInference class when it tries to find the encoder for a ParameterizedType with the value Team<T>. When JavaTypeUtils.getTypeArguments(pt).asScala.toMap runs, it returns the type variable T again, but this time resolved to a Company object, while pt.getRawType is Team. This ends up putting a (Team, Company) tuple into the typeVariables map, which then causes the lookup to fail when searching for TypeVariables.

In my case, I've worked around it by doing the following:

{code:scala}
case tv: TypeVariable[_] =>
  encoderFor(typeVariables.head._2, seenTypeSet, typeVariables)
case pt: ParameterizedType =>
  encoderFor(pt.getRawType, seenTypeSet, typeVariables)
{code}

I haven't submitted a pull request because this doesn't seem to be the optimal solution, and it might break other parts of the code. Additional validations or conditions may need to be added.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
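For reference, the resolution step that JavaTypeInference performs can be illustrated with plain JDK reflection. The bean hierarchy below is a hypothetical sketch (the names Team/Company mirror the types mentioned in the report, not the attached spark_test.zip reproducer): the generic superclass of the concrete bean is a ParameterizedType whose raw type is Team and whose type variable T resolves to Company.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

// Hypothetical bean hierarchy mirroring the Team<T>/Company names in the report.
class Company { }

class Team<T> {
    private T owner;
    public T getOwner() { return owner; }
    public void setOwner(T owner) { this.owner = owner; }
}

class CompanyTeam extends Team<Company> { }

public class TypeVariableDemo {
    public static void main(String[] args) {
        // The generic superclass of CompanyTeam is the ParameterizedType Team<Company>.
        ParameterizedType pt = (ParameterizedType) CompanyTeam.class.getGenericSuperclass();

        // The raw type is Team, whose declared type variable is named "T"...
        TypeVariable<?>[] vars = ((Class<?>) pt.getRawType()).getTypeParameters();
        // ...while the actual type argument resolves T to Company.
        Type[] actual = pt.getActualTypeArguments();

        System.out.println(vars[0].getName() + " -> "
                + ((Class<?>) actual[0]).getSimpleName()); // prints: T -> Company
    }
}
```

An encoder walking this hierarchy must key its typeVariables map by the TypeVariable "T" for the later lookup to succeed; a map keyed any other way produces exactly the "key not found: T" failure above.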