[ https://issues.apache.org/jira/browse/SPARK-46679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andoni Teso updated SPARK-46679:
--------------------------------
    Description: 
Since Spark 3.4, I've been hitting the following error when using bean encoders.
{code:java}
Exception in thread "main" java.util.NoSuchElementException: key not found: T
    at scala.collection.immutable.Map$Map1.apply(Map.scala:163)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:121)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
    at 
scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at 
scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at 
scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.$anonfun$encoderFor$1(JavaTypeInference.scala:140)
    at 
scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at 
scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at 
scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:138)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:60)
    at 
org.apache.spark.sql.catalyst.JavaTypeInference$.encoderFor(JavaTypeInference.scala:53)
    at 
org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:62)
    at org.apache.spark.sql.Encoders$.bean(Encoders.scala:179)
    at org.apache.spark.sql.Encoders.bean(Encoders.scala)
    at org.example.Main.main(Main.java:26) {code}
I'm attaching the code I use to reproduce the error locally.  [^spark_test.zip]

The issue is in the JavaTypeInference class when it resolves the encoder for a ParameterizedType such as Team<T>. Calling JavaTypeUtils.getTypeArguments(pt).asScala.toMap returns the type variable T again, but this time bound to Company, while pt.getRawType is Team. This ends up putting a (Team, Company) entry into the typeVariables map, so later lookups of TypeVariables fail with "key not found".
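The binding described above can be illustrated with plain JDK reflection, independent of Spark. This is only a minimal sketch: the class names Team, Company, and the concrete CompanyTeam subclass are assumptions modeled on the description, not the contents of the attached reproducer.
{code:java}
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.lang.reflect.TypeVariable;

public class TypeVariableDemo {
    // A generic bean, analogous to the Team<T> mentioned above.
    public static class Team<T> {
        private T member;
        public T getMember() { return member; }
        public void setMember(T member) { this.member = member; }
    }

    // The concrete type argument, analogous to Company.
    public static class Company {}

    // A hypothetical subclass that fixes T = Company, similar to the
    // hierarchy that Encoders.bean() walks when inferring an encoder.
    public static class CompanyTeam extends Team<Company> {}

    // Resolve what Team's type variable T is bound to for CompanyTeam.
    public static Type resolveTeamTypeArgument() {
        ParameterizedType pt =
            (ParameterizedType) CompanyTeam.class.getGenericSuperclass();
        // pt.getRawType() is Team; the actual argument for T is Company.
        // This (T of Team -> Company) pair is what ends up keyed
        // inconsistently in the typeVariables map described above.
        return pt.getActualTypeArguments()[0];
    }

    public static void main(String[] args) {
        TypeVariable<?> tv = Team.class.getTypeParameters()[0];
        System.out.println(tv.getName() + " -> " + resolveTeamTypeArgument());
    }
}
{code}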

In my case, I've resolved this by doing the following:
{code:scala}
// Resolve a type variable by falling back to the first known binding
case tv: TypeVariable[_] =>
  encoderFor(typeVariables.head._2, seenTypeSet, typeVariables)

// For a parameterized type, recurse on its raw type instead
case pt: ParameterizedType =>
  encoderFor(pt.getRawType, seenTypeSet, typeVariables) {code}
I haven't submitted a pull request because this may not be the optimal solution and could break other parts of the code; additional validations or conditions may be needed.



> Encoders with multiple inheritance - Key not found: T
> -----------------------------------------------------
>
>                 Key: SPARK-46679
>                 URL: https://issues.apache.org/jira/browse/SPARK-46679
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.2, 3.5.0
>            Reporter: Andoni Teso
>            Priority: Major
>         Attachments: spark_test.zip
>
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
