Yes, thanks for elaborating, Michael. The other thing I wanted to highlight is that in this specific case the value is exactly zero (0E-18 = 0*10^(-18) = 0).
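Here is a quick way to check this in plain Scala, without Spark. It is only a sketch: the object and value names are made up for illustration, and the string "0E-18" stands in for the value Spark handed back. The two zeros are numerically equal and differ only in their scale.

```scala
object ZeroScale {
  // A "plain" zero and the representation that came back from Spark.
  val plainZero: BigDecimal  = BigDecimal(0)        // scale 0, prints as 0
  val scaledZero: BigDecimal = BigDecimal("0E-18")  // scale 18, prints as 0E-18

  // scala.math.BigDecimal equality is numeric, so the scale is ignored.
  val sameValue: Boolean = plainZero == scaledZero  // true
  val isZero: Boolean    = scaledZero.signum == 0   // true

  // The underlying java.math.BigDecimal.equals also compares the scale;
  // compareTo does not. This is the only place the two values differ.
  val javaEquals: Boolean = plainZero.bigDecimal.equals(scaledZero.bigDecimal)    // false
  val javaCompare: Int    = plainZero.bigDecimal.compareTo(scaledZero.bigDecimal) // 0

  def main(args: Array[String]): Unit =
    println(s"$scaledZero has scale ${scaledZero.scale}, equal to 0: $sameValue")
}
```

So unless some code compares the underlying java.math.BigDecimal values with equals, the 0E-18 form should not behave differently from 0.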
On Mon, Oct 24, 2016 at 8:50 PM, Michael Matsko <m...@gwmail.gwu.edu> wrote:
> Efe,
>
> I think Jakob's point is that there is no problem. When you deal with
> real numbers, you don't get exact representations of numbers. There is
> always some slop in representations, things don't ever cancel out exactly.
> Testing reals for equality to zero will almost never work.
>
> Look at Goldberg's paper
> https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
> for a quick intro.
>
> Mike
>
> On Oct 24, 2016, at 10:36 PM, Efe Selcuk <efema...@gmail.com> wrote:
>
> Okay, so this isn't contributing to any kind of imprecision. I suppose I
> need to go digging further then. Thanks for the quick help.
>
> On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky <ja...@odersky.com> wrote:
>>
>> What you're seeing is merely a strange representation, 0E-18 is zero.
>> The E-18 represents the precision that Spark uses to store the decimal
>>
>> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky <ja...@odersky.com> wrote:
>> > An even smaller example that demonstrates the same behaviour:
>> >
>> > Seq(Data(BigDecimal(0))).toDS.head
>> >
>> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk <efema...@gmail.com> wrote:
>> >> I’m trying to track down what seems to be a very slight imprecision in
>> >> our Spark application; two of our columns, which should be netting out
>> >> to exactly zero, are coming up with very small fractions of non-zero
>> >> value. The only thing that I’ve found out of place is that a case class
>> >> entry into a Dataset we’ve generated with BigDecimal("0") will end up
>> >> as 0E-18 after it goes through Spark, and I don’t know if there’s any
>> >> appreciable difference between that and the actual 0 value, which can
>> >> be generated with BigDecimal.
>> >>
>> >> Here’s a contrived example:
>> >>
>> >> scala> case class Data(num: BigDecimal)
>> >> defined class Data
>> >>
>> >> scala> val x = Data(0)
>> >> x: Data = Data(0)
>> >>
>> >> scala> x.num
>> >> res9: BigDecimal = 0
>> >>
>> >> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
>> >> y: Data = Data(0E-18)
>> >>
>> >> scala> y.num
>> >> res12: BigDecimal = 0E-18
>> >>
>> >> scala> BigDecimal("1") - 1
>> >> res15: scala.math.BigDecimal = 0
>> >>
>> >> Am I looking at anything valuable?
>> >>
>> >> Efe