[jira] [Updated] (SPARK-43744) Spark Connect scala UDF serialization pulling in unrelated classes not available on server

2024-02-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-43744:
---
Labels: SPARK-43745 pull-request-available  (was: SPARK-43745)

> Spark Connect scala UDF serialization pulling in unrelated classes not 
> available on server
> --
>
> Key: SPARK-43744
> URL: https://issues.apache.org/jira/browse/SPARK-43744
> Project: Spark
> Issue Type: Bug
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Juliusz Sompolski
> Assignee: Zhen Li
> Priority: Major
> Labels: SPARK-43745, pull-request-available
> Fix For: 3.5.0
>
>
> [https://github.com/apache/spark/pull/41487] moved the "interrupt all - 
> background queries, foreground interrupt" and "interrupt all - foreground 
> queries, background interrupt" tests from ClientE2ETestSuite into a new 
> isolated suite, SparkSessionE2ESuite, to avoid an inexplicable UDF 
> serialization issue.
>  
> When these tests are moved back to ClientE2ETestSuite and run with
> {code:java}
> build/mvn clean install -DskipTests -Phive
> build/mvn test -pl connector/connect/client/jvm -Dtest=none -DwildcardSuites=org.apache.spark.sql.ClientE2ETestSuite{code}
>  
> the tests fail with
> {code:java}
> 23/05/22 15:44:11 ERROR SparkConnectService: Error during: execute. UserId: . SessionId: 0f4013ca-3af9-443b-a0e5-e339a827e0cf.
> java.lang.NoClassDefFoundError: org/apache/spark/sql/connect/client/SparkResult
> at java.lang.Class.getDeclaredMethods0(Native Method)
> at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
> at java.lang.Class.getDeclaredMethod(Class.java:2128)
> at java.io.ObjectStreamClass.getPrivateMethod(ObjectStreamClass.java:1643)
> at java.io.ObjectStreamClass.access$1700(ObjectStreamClass.java:79)
> at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:520)
> at java.io.ObjectStreamClass$3.run(ObjectStreamClass.java:494)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:494)
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:391)
> at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:681)
> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2005)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1852)
> at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1815)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1640)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
> at java.io.ObjectInputStream.readArray(ObjectInputStream.java:2119)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1657)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
> at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
> at org.apache.spark.util.Utils$.deserialize(Utils.scala:148)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.org$apache$spark$sql$connect$planner$SparkConnectPlanner$$unpackUdf(SparkConnectPlanner.scala:1353)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner$TypedScalaUdf$.apply(SparkConnectPlanner.scala:761)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformTypedMapPartitions(SparkConnectPlanner.scala:531)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformMapPartitions(SparkConnectPlanner.scala:495)
> at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformRelation(SparkConnectPlanner.scala:143)
> at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.handlePlan(SparkConnectStreamHandler.scala:100)
> at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.$anonfun$handle$2(SparkConnectStreamHandler.scala:87)
> at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> at 
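
The trace above fails while ObjectInputStream is resolving a class descriptor (readClassDesc, then ObjectStreamClass.lookup, then getDeclaredMethods0), which suggests that the serialized UDF names a class whose members reference SparkResult. The sketch below is not taken from the Spark code base and does not claim to be the confirmed root cause; it only illustrates the general Java serialization behaviour that a serializable lambda records its defining class and any captured `this`, so classes unrelated to the function's logic end up in the stream. All names in it (ClosureCaptureSketch, ClientOnlyResult, Suite, Isolated, SerFn) are hypothetical stand-ins.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

class ClosureCaptureSketch {

    /** Hypothetical stand-in for a client-only class such as SparkResult. */
    static class ClientOnlyResult implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    /** A serializable function type, playing the role of a UDF. */
    interface SerFn extends Function<Integer, Integer>, Serializable {}

    /** Hypothetical stand-in for a test suite that also holds client-side state. */
    static class Suite implements Serializable {
        private static final long serialVersionUID = 1L;
        private final ClientOnlyResult result = new ClientOnlyResult();
        private final int offset;

        Suite(int offset) {
            this.offset = offset;
        }

        // Reading the instance field `offset` makes the lambda capture `this`,
        // so the whole Suite (including `result`) is serialized with it.
        SerFn capturing() {
            return x -> x + offset;
        }
    }

    /** Hypothetical stand-in for an isolated suite with no client-side state. */
    static class Isolated {
        static SerFn plusOne() {
            return x -> x + 1;
        }
    }

    /** Serializes an object with plain Java serialization. */
    static byte[] serialize(Object o) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Returns true if the serialized bytes contain the given class name. */
    static boolean mentions(byte[] bytes, String className) {
        return new String(bytes, StandardCharsets.ISO_8859_1).contains(className);
    }

    public static void main(String[] args) {
        byte[] captured = serialize(new Suite(1).capturing());
        byte[] isolated = serialize(Isolated.plusOne());
        System.out.println("capturing lambda mentions ClientOnlyResult: "
            + mentions(captured, "ClientOnlyResult"));
        System.out.println("isolated lambda mentions ClientOnlyResult:  "
            + mentions(isolated, "ClientOnlyResult"));
    }
}
```

A receiver that lacks ClientOnlyResult on its classpath could not deserialize the first lambda, even though the function body never uses that class. Defining the function in a class with no such references keeps the stream free of them, which is in line with the workaround of moving the tests into an isolated suite.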

[jira] [Updated] (SPARK-43744) Spark Connect scala UDF serialization pulling in unrelated classes not available on server

2023-06-12 Thread Juliusz Sompolski (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Juliusz Sompolski updated SPARK-43744:
--
Summary: Spark Connect scala UDF serialization pulling in unrelated classes 
not available on server  (was: Maven test failure when "interrupt all" tests 
are moved to ClientE2ETestSuite)

> Spark Connect scala UDF serialization pulling in unrelated classes not 
> available on server
> --
>
> Key: SPARK-43744
> URL: https://issues.apache.org/jira/browse/SPARK-43744
> Project: Spark
> Issue Type: Bug
> Components: Connect
> Affects Versions: 3.5.0
> Reporter: Juliusz Sompolski
> Priority: Major
> Labels: SPARK-43745
>