[ https://issues.apache.org/jira/browse/SPARK-40624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
xsys updated SPARK-40624:
-------------------------
Description:

h3. Describe the bug

Storing an invalid value (e.g. {{BigDecimal("1.0/0")}}) via {{spark-shell}} errors out during RDD creation. However, {{1.0/0}} evaluates to {{NULL}} if the value is inserted into a {{DECIMAL(20,10)}} column of a table via {{spark-sql}}.

h3. To Reproduce

On Spark 3.2.1 (commit {{4f25b3f712}}), using {{spark-sql}}:
{code:java}
$SPARK_HOME/bin/spark-sql{code}
Execute the following (evaluates to {{NULL}}):
{code:java}
spark-sql> create table decimal_vals(c1 DECIMAL(20,10)) stored as ORC;
spark-sql> insert into decimal_vals select 1.0/0;
spark-sql> select * from decimal_vals;
NULL{code}
Using {{spark-shell}}:
{code:java}
$SPARK_HOME/bin/spark-shell{code}
Execute the following (errors out during RDD creation):
{code:java}
scala> import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.{Row, SparkSession}

scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._

scala> val rdd = sc.parallelize(Seq(Row(BigDecimal("1.0/0"))))
java.lang.NumberFormatException
  at java.math.BigDecimal.<init>(BigDecimal.java:497)
  at java.math.BigDecimal.<init>(BigDecimal.java:383)
  at java.math.BigDecimal.<init>(BigDecimal.java:809)
  at scala.math.BigDecimal$.exact(BigDecimal.scala:126)
  at scala.math.BigDecimal$.apply(BigDecimal.scala:284)
  ... 49 elided{code}

h3. Expected behavior

We expect the two Spark interfaces ({{spark-sql}} and {{spark-shell}}) to behave consistently for the same data type and input combination ({{BigDecimal}}/{{DECIMAL(20,10)}} with {{1.0/0}}).

(was: the same description, without the {{import}} statements in the {{spark-shell}} reproduction)

> A DECIMAL value with division by 0 errors in DataFrame but evaluates to NULL
> in SparkSQL
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-40624
>                 URL: https://issues.apache.org/jira/browse/SPARK-40624
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 3.2.1
>            Reporter: xsys
>            Priority: Major
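For what it's worth, the {{spark-shell}} side of the discrepancy can be reproduced without Spark at all: {{"1.0/0"}} is a literal string, not an arithmetic expression, so {{java.math.BigDecimal}} fails to parse it; and the evaluated form {{1.0 / 0}} yields {{Double.PositiveInfinity}}, which {{BigDecimal}} cannot represent either. A minimal plain-Scala sketch (standard library only; the {{Try}} wrappers are just for illustration):

```scala
import scala.util.Try

// "1.0/0" is a literal string here, not an arithmetic expression, so
// java.math.BigDecimal tries to parse it and throws NumberFormatException.
val fromString = Try(BigDecimal("1.0/0"))
assert(fromString.isFailure)

// Evaluating the expression instead gives Double.PositiveInfinity ...
val quotient = 1.0 / 0
assert(quotient == Double.PositiveInfinity)

// ... which BigDecimal cannot represent, so this construction fails as well.
val fromDouble = Try(BigDecimal(quotient))
assert(fromDouble.isFailure)
```

On the {{spark-sql}} side, the {{NULL}} presumably comes from the default non-ANSI division semantics (with {{spark.sql.ansi.enabled=true}}, Spark 3.2 raises an error on divide-by-zero instead), which would explain why the two interfaces diverge for the same input.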
--
This message was sent by Atlassian Jira
(v8.20.10#820010)