I’m trying to track down what seems to be a very slight imprecision in our
Spark application: two of our columns, which should net out to exactly
zero, are coming back with very small non-zero values. The only thing
I’ve found out of place is that a case class field we populate with
BigDecimal("0") ends up as 0E-18 after it goes through a Dataset, and I
don’t know whether there is any appreciable difference between that and
the plain 0 value that BigDecimal otherwise produces. Here’s a contrived
example:

scala> case class Data(num: BigDecimal)
defined class Data

scala> val x = Data(0)
x: Data = Data(0)

scala> x.num
res9: BigDecimal = 0

scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
y: Data = Data(0E-18)

scala> y.num
res12: BigDecimal = 0E-18

scala> BigDecimal("1") - 1
res15: scala.math.BigDecimal = 0
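
For what it’s worth, here’s a quick check in plain Scala outside of Spark
(just a sketch; the value names are mine) that suggests 0 and 0E-18 only
differ in scale:

// 0 and 0E-18 have the same numeric value; they differ only in scale.
val zero   = BigDecimal("0")       // scale 0
val zero18 = BigDecimal("0E-18")   // scale 18, what the Dataset gives back

zero == zero18                             // true  -- scala.math.BigDecimal equality is compare-based
zero.bigDecimal.equals(zero18.bigDecimal)  // false -- java.math.BigDecimal.equals also compares scale
zero18.signum                              // 0

// Rescaling a plain zero to 18 digits reproduces the same rendering,
// so presumably this is just the column's fixed decimal scale being applied.
zero.setScale(18).toString                 // "0E-18"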

Am I looking at anything valuable?

Efe
