[ https://issues.apache.org/jira/browse/SPARK-38099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
L. C. Hsieh resolved SPARK-38099.
---------------------------------
    Resolution: Invalid

> Query using an aggregation on a literal value with an empty underlying
> dataframe returns null
> ----------------------------------------------------------------------
>
>                 Key: SPARK-38099
>                 URL: https://issues.apache.org/jira/browse/SPARK-38099
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>        Environment: Windows 10, Spark 3.2.0, Java 11.
>            Reporter: Laurens Versluis
>            Priority: Major
>
> Running a query with an aggregation function such as average on a literal
> value, with an empty dataframe in the FROM clause, causes Spark to return
> null.
> Minimal reproducible example using Spark 3.2.0 with Java 11:
>
> {code:java}
> import static scala.collection.JavaConverters.asScalaBuffer;
>
> import java.util.List;
> import org.apache.spark.sql.Row;
> import org.apache.spark.sql.types.DataTypes;
> import org.apache.spark.sql.types.StructField;
> import org.apache.spark.sql.types.StructType;
>
> // Assumes an existing SparkSession named sparkSession.
> sparkSession.emptyDataFrame().createOrReplaceTempView("empty");
> StructType someSchema = new StructType(new StructField[]{DataTypes.createStructField("a", DataTypes.StringType, false)});
> final Row aRow = Row.fromSeq(asScalaBuffer(List.of("a")));
> sparkSession.createDataFrame(List.of(aRow), someSchema).createOrReplaceTempView("non_empty");
> sparkSession.sql("SELECT avg(1)").show();                 // standalone query works
> sparkSession.sql("SELECT avg(1) FROM empty").show();      // empty DF gives null
> sparkSession.sql("SELECT avg(1) FROM non_empty").show();  // works with any non-empty DF
> {code}
> Output is as follows:
> {noformat}
> +------+
> |avg(1)|
> +------+
> |   1.0|
> +------+
>
> +------+
> |avg(1)|
> +------+
> |  null|
> +------+
>
> +------+
> |avg(1)|
> +------+
> |   1.0|
> +------+
> {noformat}
> I would expect the second query to also return 1.0; it seems that any
> non-empty DataFrame returns 1.0.
>
> Out of curiosity: is Spark Catalyst doing some empty-DataFrame
> optimization that affects the result?
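A note on the resolution: under standard SQL semantics, an aggregate such as AVG returns NULL over zero input rows (while COUNT returns 0), and SELECT avg(1) without a FROM clause evaluates against an implicit one-row relation, so the aggregate sees one row and returns 1.0. The minimal sketch below illustrates that reading; it assumes the same pre-existing SparkSession named sparkSession and the "empty" temp view registered in the report above.

{code:java}
// A minimal sketch, assuming an existing SparkSession named sparkSession
// and the "empty" temp view registered in the report above.

// COUNT over zero rows is 0, not NULL.
sparkSession.sql("SELECT count(1) FROM empty").show();

// AVG (like most aggregates) over zero rows is NULL, matching the
// reported output: there is no input row to average.
sparkSession.sql("SELECT avg(1) FROM empty").show();

// With no FROM clause the query runs over an implicit one-row relation,
// so the aggregate sees exactly one input row and returns 1.0.
sparkSession.sql("SELECT avg(1)").show();
{code}

Under that reading, the null result is expected SQL behavior rather than a Catalyst empty-DataFrame optimization, which is presumably why the issue was resolved as Invalid.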