[ https://issues.apache.org/jira/browse/SPARK-40535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-40535:
------------------------------------

    Assignee:     (was: Apache Spark)

> NPE from observe of collect_list
> --------------------------------
>
>                 Key: SPARK-40535
>                 URL: https://issues.apache.org/jira/browse/SPARK-40535
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Max Gekk
>            Priority: Major
>
> The code below reproduces the issue:
> {code:scala}
> import org.apache.spark.sql.functions._
> val df = spark.range(1,10,1,11)
> df.observe("collectedList", collect_list("id")).collect()
> {code}
> instead of
> {code}
> Array(1, 2, 3, 4, 5, 6, 7, 8, 9)
> {code}
> it fails with the NPE:
> {code:java}
> java.lang.NullPointerException
>     at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:641)
>     at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.getBufferObject(interfaces.scala:602)
>     at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.serializeAggregateBufferInPlace(interfaces.scala:624)
>     at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:205)
>     at org.apache.spark.sql.execution.AggregatingAccumulator.withBufferSerialized(AggregatingAccumulator.scala:33)
> {code}