[jira] [Commented] (SPARK-38604) ceil and floor return different types when called from scala than sql
[ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509915#comment-17509915 ] Apache Spark commented on SPARK-38604: -- User 'revans2' has created a pull request for this issue: https://github.com/apache/spark/pull/35925

> ceil and floor return different types when called from scala than sql
> ----------------------------------------------------------------------
>
>                 Key: SPARK-38604
>                 URL: https://issues.apache.org/jira/browse/SPARK-38604
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Robert Joseph Evans
>            Priority: Critical
>
> In Spark 3.3.0, SPARK-37475 ([PR|https://github.com/apache/spark/pull/34729]) added support for a scale parameter to floor and ceil. There was [discussion|https://github.com/apache/spark/pull/34729#discussion_r761157050] about potential incompatibilities, specifically with respect to the return types. It looks like it was [decided|https://github.com/apache/spark/pull/34729#discussion_r767446855] to keep the old behavior if no scale parameter is passed in, but use the new functionality if a scale is passed in.
>
> But the Scala API was not updated to do the same thing as the SQL API.
> {code:scala}
> scala> spark.range(1).selectExpr("id", "ceil(id) as one_arg_sql", "ceil(id, 0) as two_arg_sql").select(col("*"), ceil(col("id")).alias("one_arg_func"), ceil(col("id"), lit(0)).alias("two_arg_func")).printSchema
> root
>  |-- id: long (nullable = false)
>  |-- one_arg_sql: long (nullable = true)
>  |-- two_arg_sql: decimal(20,0) (nullable = true)
>  |-- one_arg_func: decimal(20,0) (nullable = true)
>  |-- two_arg_func: decimal(20,0) (nullable = true)
>
> scala> spark.range(1).selectExpr("cast(id as double) as id").selectExpr("id", "ceil(id) as one_arg_sql", "ceil(id, 0) as two_arg_sql").select(col("*"), ceil(col("id")).alias("one_arg_func"), ceil(col("id"), lit(0)).alias("two_arg_func")).printSchema
> root
>  |-- id: double (nullable = false)
>  |-- one_arg_sql: long (nullable = true)
>  |-- two_arg_sql: decimal(30,0) (nullable = true)
>  |-- one_arg_func: decimal(30,0) (nullable = true)
>  |-- two_arg_func: decimal(30,0) (nullable = true)
> {code}
> Because the Python code calls into this too, it has the same problem. I suspect the Java and R APIs expose it as well, but I didn't check.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
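Until the Scala API is brought in line with the SQL behavior, one possible workaround (a sketch, not something proposed in this ticket) is to cast the one-argument `ceil`/`floor` result back to the type the SQL expression produces. This assumes a running `SparkSession` named `spark`:

```scala
// Hypothetical workaround sketch, not from the ticket: ceil(Column) on a long
// input currently yields decimal(20,0); casting restores the long type that
// the SQL expression `ceil(id)` produces.
import org.apache.spark.sql.functions.{ceil, col}
import org.apache.spark.sql.types.LongType

val df = spark.range(1).select(
  col("id"),
  // Explicit cast back to long to match the one-argument SQL return type.
  ceil(col("id")).cast(LongType).alias("one_arg_func_as_long")
)
df.printSchema()
// one_arg_func_as_long should now print as long rather than decimal(20,0)
```

Note this only papers over the schema difference for downstream consumers; the underlying evaluation still goes through the decimal path until the fix lands.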
[jira] [Commented] (SPARK-38604) ceil and floor return different types when called from scala than sql
[ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509916#comment-17509916 ] Apache Spark commented on SPARK-38604: -- User 'revans2' has created a pull request for this issue: https://github.com/apache/spark/pull/35925
[jira] [Commented] (SPARK-38604) ceil and floor return different types when called from scala than sql
[ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509266#comment-17509266 ] Apache Spark commented on SPARK-38604: -- User 'revans2' has created a pull request for this issue: https://github.com/apache/spark/pull/35913
[jira] [Commented] (SPARK-38604) ceil and floor return different types when called from scala than sql
[ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509267#comment-17509267 ] Apache Spark commented on SPARK-38604: -- User 'revans2' has created a pull request for this issue: https://github.com/apache/spark/pull/35913
[jira] [Commented] (SPARK-38604) ceil and floor return different types when called from scala than sql
[ https://issues.apache.org/jira/browse/SPARK-38604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17509265#comment-17509265 ] Robert Joseph Evans commented on SPARK-38604: - I marked this as critical because it is technically a breaking change for all but the SQL API. I could see arguing for blocker, since it could be considered data corruption, but only under a very loose definition of that. Could someone assign this to me? I have a very simple patch that fixes it, but I don't appear to have permission to assign it to myself.