[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168989#comment-16168989 ]
Olivier Blanvillain edited comment on SPARK-22036 at 9/16/17 5:22 PM:
----------------------------------------------------------------------

It's surprising because in this case the resulting value seems to fit within the range of representable values:
{code:java}
scala> val result = BigDecimal(-0.1267333984375) * BigDecimal(-1000.1)
result: scala.math.BigDecimal = 126.74607177734375

scala> sqlContext.createDataset(List(result)).head == result
res10: Boolean = true
{code}
Also, Spark silently loses BigDecimal precision in other circumstances:
{code:java}
scala> val tooPrecise = BigDecimal("126.74607177734375111111111")
tooPrecise: scala.math.BigDecimal = 126.74607177734375111111111

scala> val ds = sqlContext.createDataset(List(tooPrecise))
ds: org.apache.spark.sql.Dataset[scala.math.BigDecimal] = [value: decimal(38,18)]

scala> ds.head
res14: scala.math.BigDecimal = 126.746071777343751111

scala> ds.select(ds("value") * BigDecimal(1)).head
res15: org.apache.spark.sql.Row = [126.746071777343751111]
{code}
> I am not sure of what should be done in this case

Given that Spark's BigDecimal has bounded precision, I would consider following what is done for the other numeric representations and returning the closest representable value in case of overflow.
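As a minimal sketch of that suggestion, assuming only the decimal(38,18) column type shown above: rounding to 18 fractional digits in plain Scala already reproduces the value the Dataset hands back. This is only an illustration of "closest representable value", not Spark's actual code path:
{code:java}
// Illustration only: emulate rounding to Spark's default decimal(38,18)
// column type with scala.math.BigDecimal (no Spark involved).
import scala.math.BigDecimal.RoundingMode

val tooPrecise = BigDecimal("126.74607177734375111111111")

// Keep at most 18 fractional digits, rounding to the nearest value.
val rounded = tooPrecise.setScale(18, RoundingMode.HALF_UP)
// rounded: scala.math.BigDecimal = 126.746071777343751111
{code}
Whether Spark should do something similar on overflow, rather than returning null, is the open question raised above.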
> BigDecimal multiplication sometimes returns null
> ------------------------------------------------
>
>                 Key: SPARK-22036
>                 URL: https://issues.apache.org/jira/browse/SPARK-22036
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Olivier Blanvillain
>
> The multiplication of two BigDecimal numbers sometimes returns null. We discovered this issue while doing property-based testing for the frameless project. Here is a minimal reproduction:
> {code:java}
> object Main extends App {
>   import org.apache.spark.{SparkConf, SparkContext}
>   import org.apache.spark.sql.SparkSession
>   import spark.implicits._
>
>   val conf = new SparkConf().setMaster("local[*]").setAppName("REPL").set("spark.ui.enabled", "false")
>   val spark = SparkSession.builder().config(conf).appName("REPL").getOrCreate()
>   implicit val sqlContext = spark.sqlContext
>
>   case class X2(a: BigDecimal, b: BigDecimal)
>
>   val ds = sqlContext.createDataset(List(X2(BigDecimal(-0.1267333984375), BigDecimal(-1000.1))))
>   val result = ds.select(ds("a") * ds("b")).collect.head
>   println(result) // [null]
> }
> {code}
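A possible workaround for the reproduction above, sketched rather than taken from the report: casting the two columns to narrower decimal types before multiplying keeps the inferred result scale small enough that values of 100 or more stay representable, so the product should no longer be nulled out. The (38, 14) and (38, 4) precisions are illustrative choices that happen to hold the two inputs exactly.
{code:java}
// Hypothetical workaround, continuing from the reproduction above.
import org.apache.spark.sql.types.DecimalType

val workaround = ds
  .select(ds("a").cast(DecimalType(38, 14)) * ds("b").cast(DecimalType(38, 4)))
  .collect
  .head
println(workaround) // expected to print a non-null product instead of [null]
{code}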