Yes, thanks for elaborating, Michael.
The other thing that I wanted to highlight was that in this specific
case the value is actually exactly zero (0E-18 = 0*10^(-18) = 0).
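
To make that concrete, here is a minimal sketch in plain Scala (no Spark needed) comparing an ordinary zero with a zero that carries a scale of 18. The object and value names are made up for illustration, and the string "0E-18" is only standing in for what comes back from the Dataset:

```scala
object ZeroScaleCheck {
  // An ordinary zero, and a zero written with 18 decimal places of scale.
  val plainZero: BigDecimal = BigDecimal(0)
  val scaledZero: BigDecimal = BigDecimal("0E-18") // stand-in for the value Spark returns

  // Scala's BigDecimal compares by numeric value, so these are equal.
  val sameValue: Boolean = plainZero == scaledZero // true
  val isZero: Boolean = scaledZero.signum == 0     // true

  // The only difference is the scale carried alongside the value.
  val scales: (Int, Int) = (plainZero.scale, scaledZero.scale) // (0, 18)

  // java.math.BigDecimal.equals also compares scale, so it reports false;
  // compareTo looks at the numeric value only.
  val javaEquals: Boolean = plainZero.bigDecimal.equals(scaledZero.bigDecimal)  // false
  val javaCompare: Int = plainZero.bigDecimal.compareTo(scaledZero.bigDecimal)  // 0

  def main(args: Array[String]): Unit = {
    println(s"sameValue=$sameValue isZero=$isZero scales=$scales")
    println(s"javaEquals=$javaEquals javaCompare=$javaCompare")
  }
}
```

So if something downstream compares the underlying java.math.BigDecimal values with equals, the scale difference could matter, but arithmetically the value is still exactly zero.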

On Mon, Oct 24, 2016 at 8:50 PM, Michael Matsko <m...@gwmail.gwu.edu> wrote:
> Efe,
>
> I think Jakob's point is that there is no problem.  When you deal with
> real numbers, you don't get exact representations of numbers.  There is
> always some slop in representations, things don't ever cancel out exactly.
> Testing reals for equality to zero will almost never work.
>
> Look at Goldberg's paper
> https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
> for a quick intro.
>
> Mike
>
> On Oct 24, 2016, at 10:36 PM, Efe Selcuk <efema...@gmail.com> wrote:
>
> Okay, so this isn't contributing to any kind of imprecision. I suppose I
> need to go digging further then. Thanks for the quick help.
>
> On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky <ja...@odersky.com> wrote:
>>
>> What you're seeing is merely a strange representation; 0E-18 is zero.
>> The E-18 reflects the scale (18 decimal places) that Spark uses to store
>> the decimal.
>>
>> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky <ja...@odersky.com> wrote:
>> > An even smaller example that demonstrates the same behaviour:
>> >
>> >     Seq(Data(BigDecimal(0))).toDS.head
>> >
>> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk <efema...@gmail.com> wrote:
>> >> I’m trying to track down what seems to be a very slight imprecision in our
>> >> Spark application; two of our columns, which should be netting out to
>> >> exactly zero, are coming up with very small fractions of non-zero value. The
>> >> only thing that I’ve found out of place is that a case class entry into a
>> >> Dataset we’ve generated with BigDecimal("0") will end up as 0E-18 after it
>> >> goes through Spark, and I don’t know if there’s any appreciable difference
>> >> between that and the actual 0 value, which can be generated with BigDecimal.
>> >> Here’s a contrived example:
>> >>
>> >> scala> case class Data(num: BigDecimal)
>> >> defined class Data
>> >>
>> >> scala> val x = Data(0)
>> >> x: Data = Data(0)
>> >>
>> >> scala> x.num
>> >> res9: BigDecimal = 0
>> >>
>> >> scala> val y = Seq(x, x.copy()).toDS.reduce((a, b) => a.copy(a.num + b.num))
>> >> y: Data = Data(0E-18)
>> >>
>> >> scala> y.num
>> >> res12: BigDecimal = 0E-18
>> >>
>> >> scala> BigDecimal("1") - 1
>> >> res15: scala.math.BigDecimal = 0
>> >>
>> >> Am I looking at anything valuable?
>> >>
>> >> Efe
