Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
I should have noted that I understand the 0E-18 notation (exponential form, I
think) and that in the normal case it is no different from 0; I just wanted to
make sure there wasn't something tricky going on, since the representation was
seemingly changing.

Michael, that's a fair point. I keep operating under the assumption that
BigDecimal guarantees exact results, but I realize there is probably some
math happening that produces results that can't be represented perfectly.
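
For example, here's a tiny plain-Scala sketch (nothing Spark-specific) of what I mean;
even BigDecimal has to round when a result doesn't fit the working MathContext, which
leaves a tiny non-zero residue behind:

// plain Scala: division is rounded to the default MathContext (34 digits)
val third = BigDecimal(1) / 3
val residue = third * 3 - 1            // not exactly zero; a tiny left-over from rounding
residue == BigDecimal(0)               // false
residue.abs < BigDecimal("1E-30")      // true -- the residue is on the order of 1E-34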

Thanks guys. I'm good now.

On Mon, Oct 24, 2016 at 8:57 PM Jakob Odersky  wrote:

> Yes, thanks for elaborating Michael.
> The other thing that I wanted to highlight was that in this specific
> case the value is actually exactly zero (0E-18 = 0*10^(-18) = 0).
>
> On Mon, Oct 24, 2016 at 8:50 PM, Michael Matsko  wrote:
> > Efe,
> >
> > I think Jakob's point is that there is no problem.  When you deal with
> > real numbers, you don't get exact representations of numbers.  There is
> > always some slop in representations; things don't ever cancel out exactly.
> > Testing reals for equality to zero will almost never work.
> >
> > Look at Goldberg's paper
> > https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
> > for a quick intro.
> >
> > Mike
> >
> > On Oct 24, 2016, at 10:36 PM, Efe Selcuk  wrote:
> >
> > Okay, so this isn't contributing to any kind of imprecision. I suppose I
> > need to go digging further then. Thanks for the quick help.
> >
> > On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky  wrote:
> >>
> >> What you're seeing is merely a strange representation, 0E-18 is zero.
> >> The E-18 represents the precision that Spark uses to store the decimal
> >>
> >> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky  wrote:
> >> > An even smaller example that demonstrates the same behaviour:
> >> >
> >> > Seq(Data(BigDecimal(0))).toDS.head
> >> >
> >> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
> >> >> I’m trying to track down what seems to be a very slight imprecision in our
> >> >> Spark application; two of our columns, which should be netting out to
> >> >> exactly zero, are coming up with very small fractions of non-zero value. The
> >> >> only thing that I’ve found out of place is that a case class entry into a
> >> >> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after it
> >> >> goes through Spark, and I don’t know if there’s any appreciable difference
> >> >> between that and the actual 0 value, which can be generated with BigDecimal.
> >> >> Here’s a contrived example:
> >> >>
> >> >> scala> case class Data(num: BigDecimal)
> >> >> defined class Data
> >> >>
> >> >> scala> val x = Data(0)
> >> >> x: Data = Data(0)
> >> >>
> >> >> scala> x.num
> >> >> res9: BigDecimal = 0
> >> >>
> >> >> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num +
> >> >> b.num))
> >> >> y: Data = Data(0E-18)
> >> >>
> >> >> scala> y.num
> >> >> res12: BigDecimal = 0E-18
> >> >>
> >> >> scala> BigDecimal("1") - 1
> >> >> res15: scala.math.BigDecimal = 0
> >> >>
> >> >> Am I looking at anything valuable?
> >> >>
> >> >> Efe
>


Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
Yes, thanks for elaborating Michael.
The other thing that I wanted to highlight was that in this specific
case the value is actually exactly zero (0E-18 = 0*10^(-18) = 0).
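
A quick way to convince yourself of that in a plain Scala shell (no Spark needed)
is to compare values rather than their string representations:

// 0E-18 is an unscaled value of 0 at scale 18, i.e. exactly zero
BigDecimal("0E-18").scale                    // 18
BigDecimal("0E-18").signum                   // 0  -- the value is zero
BigDecimal("0E-18").compare(BigDecimal(0))   // 0  -- numerically equal to 0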

On Mon, Oct 24, 2016 at 8:50 PM, Michael Matsko  wrote:
> Efe,
>
> I think Jakob's point is that there is no problem.  When you deal with
> real numbers, you don't get exact representations of numbers.  There is
> always some slop in representations; things don't ever cancel out exactly.
> Testing reals for equality to zero will almost never work.
>
> Look at Goldberg's paper
> https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
> for a quick intro.
>
> Mike
>
> On Oct 24, 2016, at 10:36 PM, Efe Selcuk  wrote:
>
> Okay, so this isn't contributing to any kind of imprecision. I suppose I
> need to go digging further then. Thanks for the quick help.
>
> On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky  wrote:
>>
>> What you're seeing is merely a strange representation, 0E-18 is zero.
>> The E-18 represents the precision that Spark uses to store the decimal
>>
>> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky  wrote:
>> > An even smaller example that demonstrates the same behaviour:
>> >
>> > Seq(Data(BigDecimal(0))).toDS.head
>> >
>> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
>> >> I’m trying to track down what seems to be a very slight imprecision in
>> >> our
>> >> Spark application; two of our columns, which should be netting out to
>> >> exactly zero, are coming up with very small fractions of non-zero
>> >> value. The
>> >> only thing that I’ve found out of place is that a case class entry into
>> >> a
>> >> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after
>> >> it
>> >> goes through Spark, and I don’t know if there’s any appreciable
>> >> difference
>> >> between that and the actual 0 value, which can be generated with
>> >> BigDecimal.
>> >> Here’s a contrived example:
>> >>
>> >> scala> case class Data(num: BigDecimal)
>> >> defined class Data
>> >>
>> >> scala> val x = Data(0)
>> >> x: Data = Data(0)
>> >>
>> >> scala> x.num
>> >> res9: BigDecimal = 0
>> >>
>> >> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num +
>> >> b.num))
>> >> y: Data = Data(0E-18)
>> >>
>> >> scala> y.num
>> >> res12: BigDecimal = 0E-18
>> >>
>> >> scala> BigDecimal("1") - 1
>> >> res15: scala.math.BigDecimal = 0
>> >>
>> >> Am I looking at anything valuable?
>> >>
>> >> Efe




Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Michael Matsko
Efe,

I think Jakob's point is that there is no problem.  When you deal with
real numbers, you don't get exact representations of numbers.  There is always
some slop in representations; things don't ever cancel out exactly.  Testing
reals for equality to zero will almost never work.
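
If you need a zero test, compare against a tolerance instead of testing for exact
equality. A rough sketch (the threshold here is arbitrary, pick one that suits your data):

// treat anything within an application-specific tolerance as zero
def nearZero(x: BigDecimal, eps: BigDecimal = BigDecimal("1E-12")): Boolean =
  x.abs <= eps

nearZero(BigDecimal("0E-18"))   // true
nearZero(BigDecimal("0.5"))     // false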

Look at Goldberg's paper
https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Double/paper.pdf
for a quick intro.

Mike

> On Oct 24, 2016, at 10:36 PM, Efe Selcuk  wrote:
> 
> Okay, so this isn't contributing to any kind of imprecision. I suppose I need 
> to go digging further then. Thanks for the quick help.
> 
>> On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky  wrote:
>> What you're seeing is merely a strange representation, 0E-18 is zero.
>> The E-18 represents the precision that Spark uses to store the decimal
>> 
>> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky  wrote:
>> > An even smaller example that demonstrates the same behaviour:
>> >
>> > Seq(Data(BigDecimal(0))).toDS.head
>> >
>> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
>> >> I’m trying to track down what seems to be a very slight imprecision in our
>> >> Spark application; two of our columns, which should be netting out to
>> >> exactly zero, are coming up with very small fractions of non-zero value. 
>> >> The
>> >> only thing that I’ve found out of place is that a case class entry into a
>> >> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after it
>> >> goes through Spark, and I don’t know if there’s any appreciable difference
>> >> between that and the actual 0 value, which can be generated with 
>> >> BigDecimal.
>> >> Here’s a contrived example:
>> >>
>> >> scala> case class Data(num: BigDecimal)
>> >> defined class Data
>> >>
>> >> scala> val x = Data(0)
>> >> x: Data = Data(0)
>> >>
>> >> scala> x.num
>> >> res9: BigDecimal = 0
>> >>
>> >> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + 
>> >> b.num))
>> >> y: Data = Data(0E-18)
>> >>
>> >> scala> y.num
>> >> res12: BigDecimal = 0E-18
>> >>
>> >> scala> BigDecimal("1") - 1
>> >> res15: scala.math.BigDecimal = 0
>> >>
>> >> Am I looking at anything valuable?
>> >>
>> >> Efe


Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
Okay, so this isn't contributing to any kind of imprecision. I suppose I
need to go digging further then. Thanks for the quick help.

On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky  wrote:

> What you're seeing is merely a strange representation, 0E-18 is zero.
> The E-18 represents the precision that Spark uses to store the decimal
>
> On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky  wrote:
> > An even smaller example that demonstrates the same behaviour:
> >
> > Seq(Data(BigDecimal(0))).toDS.head
> >
> > On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
> >> I’m trying to track down what seems to be a very slight imprecision in our
> >> Spark application; two of our columns, which should be netting out to
> >> exactly zero, are coming up with very small fractions of non-zero value. The
> >> only thing that I’ve found out of place is that a case class entry into a
> >> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after it
> >> goes through Spark, and I don’t know if there’s any appreciable difference
> >> between that and the actual 0 value, which can be generated with BigDecimal.
> >> Here’s a contrived example:
> >>
> >> scala> case class Data(num: BigDecimal)
> >> defined class Data
> >>
> >> scala> val x = Data(0)
> >> x: Data = Data(0)
> >>
> >> scala> x.num
> >> res9: BigDecimal = 0
> >>
> >> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
> >> y: Data = Data(0E-18)
> >>
> >> scala> y.num
> >> res12: BigDecimal = 0E-18
> >>
> >> scala> BigDecimal("1") - 1
> >> res15: scala.math.BigDecimal = 0
> >>
> >> Am I looking at anything valuable?
> >>
> >> Efe
>


Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
What you're seeing is merely a strange representation: 0E-18 is zero.
The E-18 represents the precision that Spark uses to store the decimal.
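
If it helps, Spark 2.x maps a Scala BigDecimal field to decimal(38,18) by default,
which is why values come back carrying 18 digits after the decimal point. You can
check the inferred schema from the spark-shell (a sketch; the output shown is what
I'd expect, not verbatim):

case class Data(num: BigDecimal)
Seq(Data(BigDecimal(0))).toDS.printSchema
// root
//  |-- num: decimal(38,18) (nullable = true)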

On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky  wrote:
> An even smaller example that demonstrates the same behaviour:
>
> Seq(Data(BigDecimal(0))).toDS.head
>
> On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
>> I’m trying to track down what seems to be a very slight imprecision in our
>> Spark application; two of our columns, which should be netting out to
>> exactly zero, are coming up with very small fractions of non-zero value. The
>> only thing that I’ve found out of place is that a case class entry into a
>> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after it
>> goes through Spark, and I don’t know if there’s any appreciable difference
>> between that and the actual 0 value, which can be generated with BigDecimal.
>> Here’s a contrived example:
>>
>> scala> case class Data(num: BigDecimal)
>> defined class Data
>>
>> scala> val x = Data(0)
>> x: Data = Data(0)
>>
>> scala> x.num
>> res9: BigDecimal = 0
>>
>> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
>> y: Data = Data(0E-18)
>>
>> scala> y.num
>> res12: BigDecimal = 0E-18
>>
>> scala> BigDecimal("1") - 1
>> res15: scala.math.BigDecimal = 0
>>
>> Am I looking at anything valuable?
>>
>> Efe




Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
An even smaller example that demonstrates the same behaviour:

Seq(Data(BigDecimal(0))).toDS.head

On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk  wrote:
> I’m trying to track down what seems to be a very slight imprecision in our
> Spark application; two of our columns, which should be netting out to
> exactly zero, are coming up with very small fractions of non-zero value. The
> only thing that I’ve found out of place is that a case class entry into a
> Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after it
> goes through Spark, and I don’t know if there’s any appreciable difference
> between that and the actual 0 value, which can be generated with BigDecimal.
> Here’s a contrived example:
>
> scala> case class Data(num: BigDecimal)
> defined class Data
>
> scala> val x = Data(0)
> x: Data = Data(0)
>
> scala> x.num
> res9: BigDecimal = 0
>
> scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
> y: Data = Data(0E-18)
>
> scala> y.num
> res12: BigDecimal = 0E-18
>
> scala> BigDecimal("1") - 1
> res15: scala.math.BigDecimal = 0
>
> Am I looking at anything valuable?
>
> Efe




[Spark 2] BigDecimal and 0

2016-10-24 Thread Efe Selcuk
I’m trying to track down what seems to be a very slight imprecision in our
Spark application; two of our columns, which should be netting out to
exactly zero, are coming up with very small fractions of non-zero value.
The only thing that I’ve found out of place is that a case class entry into
a Dataset we’ve generated with BigDecimal(“0”) will end up as 0E-18 after
it goes through Spark, and I don’t know if there’s any appreciable
difference between that and the actual 0 value, which can be generated with
BigDecimal. Here’s a contrived example:

scala> case class Data(num: BigDecimal)
defined class Data

scala> val x = Data(0)
x: Data = Data(0)

scala> x.num
res9: BigDecimal = 0

scala> val y = Seq(x, x.copy()).toDS.reduce( (a,b) => a.copy(a.num + b.num))
y: Data = Data(0E-18)

scala> y.num
res12: BigDecimal = 0E-18

scala> BigDecimal("1") - 1
res15: scala.math.BigDecimal = 0

Am I looking at anything valuable?

Efe