[ https://issues.apache.org/jira/browse/SPARK-13612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15177428#comment-15177428 ]
Liang-Chi Hsieh edited comment on SPARK-13612 at 3/3/16 7:35 AM:
-----------------------------------------------------------------

Because the internal type for BigDecimal is Decimal(38, 18) by default (you can print the schema of x and y), the result scale of x("a") * y("b") will be 18 + 18 = 36. That is detected as an overflow, so you get a null value back. You can cast the decimal columns to a suitable precision and scale, e.g.:

{code}
val newX = x.withColumn("a", x("a").cast(DecimalType(10, 1)))
val newY = y.withColumn("b", y("b").cast(DecimalType(10, 1)))
newX.join(newY, newX("id") === newY("id")).withColumn("z", newX("a") * newY("b")).show
+---+----+---+----+------+
| id|   a| id|   b|     z|
+---+----+---+----+------+
|  1|10.0|  1|10.0|100.00|
+---+----+---+----+------+
{code}

was (Author: viirya):
Because the internal type for BigDecimal is Decimal(38, 18) by default (you can print the schema of x and y), the result scale of x("a") * y("b") will be 18 + 18 = 36. That is detected as an overflow, so you get a null value back.
You can cast the decimal columns to a suitable precision and scale, e.g.:

{code}
val newX = x.withColumn("a", x("a").cast(DecimalType(10, 1)))
val newY = y.withColumn("b", y("b").cast(DecimalType(10, 1)))
newX.join(newY, newX("id") === newY("id")).withColumn("z", newX("a") * newY("b")).show
+---+----+---+----+------+
| id|   a| id|   b|     z|
+---+----+---+----+------+
|  1|10.0|  1|10.0|100.00|
+---+----+---+----+------+
{code}

> Multiplication of BigDecimal columns not working as expected
> ------------------------------------------------------------
>
>                 Key: SPARK-13612
>                 URL: https://issues.apache.org/jira/browse/SPARK-13612
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Varadharajan
>
> Please consider the below snippet:
> {code}
> case class AM(id: Int, a: BigDecimal)
> case class AX(id: Int, b: BigDecimal)
> val x = sc.parallelize(List(AM(1, 10))).toDF
> val y = sc.parallelize(List(AX(1, 10))).toDF
> x.join(y, x("id") === y("id")).withColumn("z", x("a") * y("b")).show
> {code}
> output:
> {code}
> | id|                   a| id|                   b|   z|
> |  1|10.00000000000000...|  1|10.00000000000000...|null|
> {code}
> Here the multiplication of the columns ("z") returns null instead of 100.
> For now we are using the workaround below, but this definitely looks like a
> serious issue.
> {code}
> x.join(y, x("id") === y("id")).withColumn("z", x("a") / (expr("1") /
> y("b"))).show
> {code}
> {code}
> | id|                   a| id|                   b|                   z|
> |  1|10.00000000000000...|  1|10.00000000000000...|100.0000000000000...|
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
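The overflow described above follows the SQL-style typing rule Spark applies to a decimal multiplication: the result of Decimal(p1, s1) * Decimal(p2, s2) has precision p1 + p2 + 1 and scale s1 + s2, and Spark's maximum decimal precision is 38. A minimal plain-Scala sketch of that arithmetic (no Spark required; the object and method names here are illustrative, not Spark API):

```scala
// Sketch of the precision/scale arithmetic behind the null result.
// Assumes the SQL rule: Decimal(p1,s1) * Decimal(p2,s2) -> Decimal(p1+p2+1, s1+s2),
// with 38 as the maximum supported precision (Spark's DecimalType limit).
object DecimalOverflowSketch {
  val MaxPrecision = 38

  // Unbounded result type of a decimal multiplication: (precision, scale).
  def multiplyResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) =
    (p1 + p2 + 1, s1 + s2)

  def main(args: Array[String]): Unit = {
    // BigDecimal columns default to Decimal(38, 18), so:
    val (p, s) = multiplyResultType(38, 18, 38, 18)
    println(s"result type: Decimal($p, $s)")   // Decimal(77, 36)
    println(s"overflows: ${p > MaxPrecision}") // true -> null in Spark 1.6

    // After casting both sides to DecimalType(10, 1), the product fits:
    val (pc, sc) = multiplyResultType(10, 1, 10, 1)
    println(s"after cast: Decimal($pc, $sc)")  // Decimal(21, 2)
  }
}
```

This is why the cast workaround in the comment restores the expected value: shrinking the input precision and scale keeps the product's type within the 38-digit limit.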