Xinyi Yu created SPARK-48718:
--------------------------------

             Summary: Got incastable error when deserializer in cogroup is resolved during application of DeduplicateRelation rule
                 Key: SPARK-48718
                 URL: https://issues.apache.org/jira/browse/SPARK-48718
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Xinyi Yu

Running the following commands:

{code:java}
// Imports needed outside spark-shell; `spark` is assumed to be an active SparkSession.
import scala.jdk.CollectionConverters._
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{LongType, StructField, StructType}
import spark.implicits._

val lhs = spark.createDataFrame(
  List(Row(123L)).asJava,
  StructType(Seq(StructField("GROUPING_KEY", LongType)))
)
val rhs = spark.createDataFrame(
  List(Row(0L, 123L)).asJava,
  StructType(Seq(StructField("ID", LongType), StructField("GROUPING_KEY", LongType)))
)

// Group both sides by the shared key column.
val lhsKV = lhs.groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY"))
val rhsKV = rhs.groupByKey((r: Row) => r.getAs[Long]("GROUPING_KEY"))

val cogrouped = lhsKV.cogroup(rhsKV)(
  (a: Long, b: Iterator[Row], c: Iterator[Row]) => Iterator(0L)
)

// rhs appears on both sides of this join: directly, and inside cogrouped.
val joined = rhs.join(cogrouped, col("ID") === col("value"), "left")
{code}

fails with the following error:

{code:java}
java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.objects.AssertNotNull cannot be cast to org.apache.spark.sql.catalyst.analysis.UnresolvedDeserializer
{code}