[ https://issues.apache.org/jira/browse/SPARK-40535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk updated SPARK-40535:
-----------------------------
    Description:
The code below reproduces the issue:
{code:scala}
import org.apache.spark.sql.functions._

val df = spark.range(1,10,1,11)
df.observe("collectedList", collect_list("id")).collect()
{code}
instead of
{code}
Array(1, 2, 3, 4, 5, 6, 7, 8, 9)
{code}
it fails with the exception:
{code:java}
java.lang.NullPointerException
	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:641)
	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:602)
	at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:624)
	at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:205)
	at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:33)
{code}

  was:
The code below reproduces the issue:
{code:scala}
import org.apache.spark.sql.functions._

val df = spark.range(1,10,1,11)
df.observe("collectedList", collect_list("id")).collect()
{code}

> NPE from observe of collect_list
> --------------------------------
>
>                 Key: SPARK-40535
>                 URL: https://issues.apache.org/jira/browse/SPARK-40535
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Max Gekk
>            Priority: Major
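
One detail of the reproduction may be worth checking when triaging: spark.range(1,10,1,11) produces nine rows but asks for 11 partitions, so at least two partitions must be empty. The report does not say whether empty partitions are what triggers the NPE; that is an assumption, and the sketch below only inspects the partition layout. The app name and the local[2] master are placeholders, not taken from the report.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists and getOrCreate() simply reuses it.
// The master and app name are placeholders for running this standalone.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("SPARK-40535-partition-layout")
  .getOrCreate()

// Same Dataset as in the report: ids 1..9 spread over 11 partitions.
val df = spark.range(1, 10, 1, 11)

// glom() turns each partition into one array, so the lengths are rows per partition.
val rowsPerPartition: Array[Int] = df.rdd.glom().map(_.length).collect()

rowsPerPartition.zipWithIndex.foreach { case (n, i) =>
  println(s"partition $i: $n row(s)")
}

// Nine rows cannot fill 11 partitions, so this count is at least 2.
val emptyPartitions = rowsPerPartition.count(_ == 0)
println(s"empty partitions: $emptyPartitions of ${rowsPerPartition.length}")
```

If the failure disappears when the last argument of spark.range is lowered to 9 or fewer, that would point towards empty partitions; this experiment has not been run as part of this report.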