RE: Short-circuit Logic

2013-06-12 Thread Carlos Nepomuceno

> From: oscar.j.benja...@gmail.com
> Date: Thu, 30 May 2013 23:57:28 +0100
> Subject: Re: Short-circuit Logic
> To: carlosnepomuc...@outlook.com
> CC: python-list@python.org
>
> On 30 May 2013 22:03, Carlos Nepomuceno  wrote:
>>> Here's another way, mathematically equivalent (although not necessarily
>>> equivalent using floating point computations!) which avoids the divide-by-
>>> zero problem:
>>>
>>> abs(a - b) < epsilon*a
>>
>> That's wrong! If abs(a) < abs(a-b)/epsilon you will break the commutative 
>> law.
>
> There is no commutative law for relative tolerance floating point
> comparisons. If you want to compare with a relative tolerance then you
> you should choose carefully what your tolerance is to be relative to
> (and how big your relative tolerance should be).

Of course there is! It might not suit your specific needs, though.

I'll just quote Knuth because it's pretty damn good:

"A. An axiomatic approach. Although the associative law is not valid, the 
commutative law

u (+) v == v (+) u (2)

does hold, and this law can be a valuable conceptual asset in programming and 
in the analysis of programs. This example suggests that we should look for
important laws that are satisfied by (+), (-), (*), and (/); it is not 
unreasonable to say that floating point routines should be designed to preserve 
as many of the ordinary mathematical laws as possible. If more axioms are 
valid, it becomes easier to write good programs, and programs also become more 
portable from
machine to machine."
TAOCP, Vol. 2, p. 214


> In some applications it's obvious which of a or b you should use to
> scale the tolerance but in others it is not or you should compare with
> something more complex. For an example where it is obvious, when
> testing numerical code I might write something like:
>
> eps = 1e-7
> true_answer = 123.4567879
> estimate = myfunc(5)
> assert abs(estimate - true_answer) < eps * abs(true_answer)
>
>
> Oscar   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-06-12 Thread Oscar Benjamin
On 30 May 2013 22:03, Carlos Nepomuceno  wrote:
>> Here's another way, mathematically equivalent (although not necessarily
>> equivalent using floating point computations!) which avoids the divide-by-
>> zero problem:
>>
>> abs(a - b) < epsilon*a
>
> That's wrong! If abs(a) < abs(a-b)/epsilon you will break the commutative law.

There is no commutative law for relative tolerance floating point
comparisons. If you want to compare with a relative tolerance then you
you should choose carefully what your tolerance is to be relative to
(and how big your relative tolerance should be).

In some applications it's obvious which of a or b you should use to
scale the tolerance but in others it is not or you should compare with
something more complex. For an example where it is obvious, when
testing numerical code I might write something like:

eps = 1e-7
true_answer = 123.4567879
estimate = myfunc(5)
assert abs(estimate - true_answer) < eps * abs(true_answer)


Oscar
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Short-circuit Logic

2013-06-01 Thread Carlos Nepomuceno

> From: steve+comp.lang.pyt...@pearwood.info
> Subject: Re: Short-circuit Logic
> Date: Fri, 31 May 2013 08:45:13 +
> To: python-list@python.org
>
> On Fri, 31 May 2013 17:09:01 +1000, Chris Angelico wrote:
>
>> On Fri, May 31, 2013 at 3:13 PM, Steven D'Aprano
>>  wrote:
>>> What makes you think that the commutative law is relevant here?
>>>
>>>
>> Equality should be commutative. If a == b, then b == a. Also, it's
>> generally understood that if a == c and b == c, then a == b, though
>> there are more exceptions to that (especially in loosely-typed
>> languages).
>
> Who is talking about equality? Did I just pass through the Looking Glass
> into Wonderland again? *wink*
>
> We're talking about *approximate equality*, which is not the same thing,
> despite the presence of the word "equality" in it. It is non-commutative,
> just like other comparisons like "less than" and "greater than or equal
to". Nobody gets their knickers in a twist because the >= operator is non-
> commutative.

Approximate equality CAN be commutative! I have just shown you that at the 
beginning, using the following criterion:

|v-u| <= ε*max(|u|,|v|)

Which is implemented as fpc_aeq():

import sys

def fpc_aeq(u, v, eps=sys.float_info.epsilon):
    au = abs(u)
    av = abs(v)
    return abs(v - u) <= eps * (au if au > av else av)  # |v-u| <= ε*max(|u|,|v|)
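A self-contained version of the max-based criterion above, with the commutativity claim checked on a few sample pairs (the pairs are mine, chosen for illustration):

```python
import sys

def aeq(u, v, eps=sys.float_info.epsilon):
    """Tolerant equality per Knuth: |v - u| <= eps * max(|u|, |v|)."""
    return abs(v - u) <= eps * max(abs(u), abs(v))

# max(|u|, |v|) is symmetric in u and v, so the comparison commutes.
for u, v in [(0.1 + 0.2, 0.3), (1e16, 1e16 + 1.0), (-0.0, 0.0)]:
    assert aeq(u, v) == aeq(v, u)

assert aeq(0.1 + 0.2, 0.3)   # True, although 0.1 + 0.2 != 0.3 exactly
assert not aeq(1.0, 1.1)     # far outside the tolerance
```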


> Approximate equality is not just non-commutative, it's also intransitive.
> I'm reminded of a story about Ken Iverson, the creator of APL. Iverson
> was a strong proponent of what he called "tolerant equality", and APL
> defined the = operator as a relative approximate equal, rather than the
> more familiar exactly-equal operator most programming languages use.
>
> In an early talk Ken was explaining the advantages of tolerant
> comparison. A member of the audience asked incredulously,
> “Surely you don’t mean that when A=B and B=C, A may not equal C?”
> Without skipping a beat, Ken replied, “Any carpenter knows that!”
> and went on to the next question. — Paul Berry

That's true! But it's a consequence of floating-point numbers (a discrete set 
representing a continuous one -- the real numbers).
Out of context, as you put it, it looks like approximate equality is 
non-commutative, but that's wrong.

Did you read the paper[1] you have suggested? Because SHARP APL in fact uses 
the same criterion I have mentioned, and supports it extensively, to the point 
of applying it by default to many primitive functions, according to Lathwell[2], 
which is reference 19 of [1].

"less than      a<b
not equal       a≠b
floor           ⌊a
ceiling         ⌈a
membership      a∊b
index of        a⍳b"


I'll quote Lathwell. He called "tolerant comparison" what we are now calling 
"approximate equality".

"Tolerant comparison considers two numbers to be equal if they are within some 
neighborhood. The neighborhood has a radius of ⎕ct times the larger of the two 
in absolute value."

He says "larger of the two", which means "max(|u|,|v|)". So your reference just 
reaffirms what TAOCP has demonstrated to be the best practice.

I really don't know what the fuck you are arguing about.

Can you show me at least one case where the commutative law wouldn't benefit 
the use of the approximate equality operator?

[1] http://www.jsoftware.com/papers/APLEvol.htm
[2] http://www.jsoftware.com/papers/satn23.htm


> The intransitivity of [tolerant] equality is well known in
> practical situations and can be easily demonstrated by sawing
> several pieces of wood of equal length. In one case, use the
> first piece to measure subsequent lengths; in the second case,
> use the last piece cut to measure the next. Compare the lengths
> of the two final pieces.
> — Richard Lathwell, APL Comparison Tolerance, APL76, 1976
>
> See also here:
>
> http://www.jsoftware.com/papers/APLEvol.htm
>
> (search for "fuzz" or "tolerance".)
>
>
>
> --
> Steven
> --
> http://mail.python.org/mailman/listinfo/python-list   
>   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-31 Thread Stefan Drees

On 2013-05-30 08:29:41 +, Steven D'Aprano said:

On Thu, 30 May 2013 10:22:02 +0300, Jussi Piitulainen wrote:

I wonder why floating-point errors are not routinely discussed in terms
of ulps (units in last position). ...

That is an excellent question! ...
I have a module that works with ULPs. I may clean it up and publish it.
Would there be interest in seeing it in the standard library? ...


I am definitely interested in seeing this in the Python standard library. 
But having read the lines following your proposal, and the excellent 
article from Bruce that Carlos pointed to in this thread, maybe 
a package on PyPI first, to ground somewhat the presumably massive 
discussion thread on python-ideas :-?)


All the best,

Stefan.


--
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-31 Thread Roy Smith
In article <51a86319$0$29966$c3e8da3$54964...@news.astraweb.com>,
 Steven D'Aprano  wrote:

> In an early talk Ken was explaining the advantages of tolerant
> comparison. A member of the audience asked incredulously, 
> “Surely you don’t mean that when A=B and B=C, A may not equal C?”
> Without skipping a beat, Ken replied, “Any carpenter knows that!”
> and went on to the next question. — Paul Berry

And any good carpenter also knows it's better to copy than to measure.  
Let's say I have a door frame and I need to trim a door to fit it 
exactly.  I can do one of two things.

First, I could take out my tape measure and measure that the frame is 29 
and 11/32 inches wide.  Then, carry that tape measure to the door, 
measure off 29 and 11/32 inches, and make a mark.

Or, I could take a handy stick of wood which is 30-something inches 
long, lay it down at the bottom of the door frame with one end up snug 
against one side, and make a mark at the other side of the frame.  Then 
carry my stick to the door and keep trimming until it's the same width 
as the marked section on the stick.

Google for "story stick".

The tape measure is like digital floating point.  It introduces all 
sorts of ways for errors to creep in and people who care about getting 
doors to properly fit into door frames understand all that.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-31 Thread Steven D'Aprano
On Fri, 31 May 2013 17:09:01 +1000, Chris Angelico wrote:

> On Fri, May 31, 2013 at 3:13 PM, Steven D'Aprano
>  wrote:
>> What makes you think that the commutative law is relevant here?
>>
>>
> Equality should be commutative. If a == b, then b == a. Also, it's
> generally understood that if a == c and b == c, then a == b, though
> there are more exceptions to that (especially in loosely-typed
> languages).

Who is talking about equality? Did I just pass through the Looking Glass 
into Wonderland again? *wink*

We're talking about *approximate equality*, which is not the same thing, 
despite the presence of the word "equality" in it. It is non-commutative, 
just like other comparisons like "less than" and "greater than or equal 
to". Nobody gets their knickers in a twist because the >= operator is non-
commutative.

Approximate equality is not just non-commutative, it's also intransitive. 
I'm reminded of a story about Ken Iverson, the creator of APL. Iverson 
was a strong proponent of what he called "tolerant equality", and APL 
defined the = operator as a relative approximate equal, rather than the 
more familiar exactly-equal operator most programming languages use.

In an early talk Ken was explaining the advantages of tolerant
comparison. A member of the audience asked incredulously, 
“Surely you don’t mean that when A=B and B=C, A may not equal C?”
Without skipping a beat, Ken replied, “Any carpenter knows that!”
and went on to the next question. — Paul Berry

 
The intransitivity of [tolerant] equality is well known in
practical situations and can be easily demonstrated by sawing
several pieces of wood of equal length. In one case, use the
first piece to measure subsequent lengths; in the second case,
use the last piece cut to measure the next. Compare the lengths
of the two final pieces.
— Richard Lathwell, APL Comparison Tolerance, APL76, 1976 


See also here:

http://www.jsoftware.com/papers/APLEvol.htm

(search for "fuzz" or "tolerance".)



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-31 Thread Steven D'Aprano
On Fri, 31 May 2013 09:42:38 +0300, Carlos Nepomuceno wrote:

>> From: steve+comp.lang.pyt...@pearwood.info Subject: Re: Short-circuit
>> Logic
>> Date: Fri, 31 May 2013 05:13:51 + To: python-list@python.org
>>
>> On Fri, 31 May 2013 00:03:13 +0300, Carlos Nepomuceno wrote:
>>>> From: steve+comp.lang.pyt...@pearwood.info Subject: Re: Short-circuit
>>>> Logic
>>>> Date: Thu, 30 May 2013 05:42:17 + To: python-list@python.org
>>> [...]
>>>> Here's another way, mathematically equivalent (although not
>>>> necessarily equivalent using floating point computations!) which
>>>> avoids the divide-by- zero problem:
>>>>
>>>> abs(a - b) < epsilon*a
>>>
>>> That's wrong! If abs(a) < abs(a-b)/epsilon you will break the
>>> commutative law. For example:
>>
>> What makes you think that the commutative law is relevant here?
> 
> How can you not see?

I can ask the same thing about you. How can you see that it is not 
relevant?


> I'll requote a previous message:

Thanks, but that's entirely irrelevant. It says nothing about the 
commutative law.

[...]
> Since we are considering Chris's supposition ("to compare floating point
> numbers") it's totally relevant to understand how that operation can be
> correctly implemented.

Of course! But what does that have to do with the commutative law?


>> Many things break the commutative law, starting with division and
>> subtraction:
>>
>> 20 - 10 != 10 - 20
>>
>> 1/2 != 2/1
>>
>> Most comparison operators other than equality and inequality:
>>
>> (23 < 42) != (42 < 23)
[...]
> That is totally irrelevant in this case. The commutative law is
> essential to the equality operation.

That's fine, but we're not talking about equality, we're talking about 
*approximately equality* or *almost equal*. Given the simple definition 
of relative error under discussion, the commutative law does not hold. 
The mere fact that it does not hold is no big deal. It doesn't hold for 
many comparison operators.

Nor does the transitive law hold, even using absolute epsilon:

eps = 0.5
a = 1.1
b = 1.5
c = 1.9

then a ≈ b, and b ≈ c, but a ≉ c.
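Steven's numbers can be checked mechanically (a minimal sketch of the absolute-epsilon comparison he describes):

```python
eps = 0.5  # absolute tolerance, as in the example above

def approx(x, y):
    return abs(x - y) < eps

a, b, c = 1.1, 1.5, 1.9
assert approx(a, b)      # |1.1 - 1.5| = 0.4 < 0.5
assert approx(b, c)      # |1.5 - 1.9| = 0.4 < 0.5
assert not approx(a, c)  # |1.1 - 1.9| = 0.8, so transitivity fails
```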


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-31 Thread Chris Angelico
On Fri, May 31, 2013 at 3:13 PM, Steven D'Aprano
 wrote:
> What makes you think that the commutative law is relevant here?
>

Equality should be commutative. If a == b, then b == a. Also, it's
generally understood that if a == c and b == c, then a == b, though
there are more exceptions to that (especially in loosely-typed
languages).

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Short-circuit Logic

2013-05-30 Thread Carlos Nepomuceno

> From: steve+comp.lang.pyt...@pearwood.info
> Subject: Re: Short-circuit Logic
> Date: Fri, 31 May 2013 05:13:51 +
> To: python-list@python.org
>
> On Fri, 31 May 2013 00:03:13 +0300, Carlos Nepomuceno wrote:
>
>> 
>>> From: steve+comp.lang.pyt...@pearwood.info Subject: Re: Short-circuit
>>> Logic
>>> Date: Thu, 30 May 2013 05:42:17 + To: python-list@python.org
>> [...]
>>> Here's another way, mathematically equivalent (although not necessarily
>>> equivalent using floating point computations!) which avoids the
>>> divide-by- zero problem:
>>>
>>> abs(a - b) < epsilon*a
>>
>> That's wrong! If abs(a) < abs(a-b)/epsilon you will break the
>> commutative law. For example:
>
> What makes you think that the commutative law is relevant here?

How can you not see?

I'll requote a previous message:

}On Thu, 30 May 2013 13:45:13 +1000, Chris Angelico wrote:
} 
}> Let's suppose someone is told to compare floating point numbers by
}> seeing if the absolute value of the difference is less than some
}> epsilon. 
} 
}Which is usually the wrong way to do it! Normally one would prefer 
}*relative* error, not absolute:
 
Since we are considering Chris's supposition ("to compare floating point 
numbers") it's totally relevant to understand how that operation can be 
correctly implemented.


> Many things break the commutative law, starting with division and
> subtraction:
>
> 20 - 10 != 10 - 20
>
> 1/2 != 2/1
>
> Most comparison operators other than equality and inequality:
>
> (23 < 42) != (42 < 23)
>
> String concatenation:
>
> "Hello" + "World" != "World" + "Hello"
>
> Many operations in the real world:
>
> put on socks, then shoes != put on shoes, then socks.
>

That is totally irrelevant in this case. The commutative law is essential to 
the equality operation.

> But you are correct that approximately-equal using *relative* error is
> not commutative. (Absolute error, on the other hand, is commutative.) As
> I said, any form of "approximate equality" has gotchas. But this gotcha
> is simple to overcome:
>
> abs(a - b) < eps*max(abs(a), abs(b))
>
> (Knuth's "approximately equal to" which you give.)
>
>
>> This discussion reminded me of TAOCP and I paid a visit and bring the
>> following functions:
>
> "TAOCP"?

The Art of Computer Programming[1]! An old book full of excellent stuff! A MUST 
READ ;)

http://www-cs-faculty.stanford.edu/~uno/taocp.html

[1] Knuth, Donald (1981). The Art of Computer Programming, 2nd ed., Vol. 2, 
p. 218. Addison-Wesley. ISBN 0-201-03822-6.

>
>
> --
> Steven
> --
> http://mail.python.org/mailman/listinfo/python-list   
>   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Steven D'Aprano
On Fri, 31 May 2013 00:03:13 +0300, Carlos Nepomuceno wrote:

> 
>> From: steve+comp.lang.pyt...@pearwood.info Subject: Re: Short-circuit
>> Logic
>> Date: Thu, 30 May 2013 05:42:17 + To: python-list@python.org
> [...]
>> Here's another way, mathematically equivalent (although not necessarily
>> equivalent using floating point computations!) which avoids the
>> divide-by- zero problem:
>>
>> abs(a - b) < epsilon*a
> 
> That's wrong! If abs(a) < abs(a-b)/epsilon you will break the
> commutative law. For example:

What makes you think that the commutative law is relevant here?

Many things break the commutative law, starting with division and 
subtraction:

20 - 10 != 10 - 20

1/2 != 2/1

Most comparison operators other than equality and inequality:

(23 < 42) != (42 < 23)

String concatenation:

"Hello" + "World" != "World" + "Hello"

Many operations in the real world:

put on socks, then shoes != put on shoes, then socks.


But you are correct that approximately-equal using *relative* error is 
not commutative. (Absolute error, on the other hand, is commutative.) As 
I said, any form of "approximate equality" has gotchas. But this gotcha 
is simple to overcome: 

abs(a - b) < eps*max(abs(a), abs(b))

(Knuth's "approximately equal to" which you give.)
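Worth noting in hindsight: Python 3.5 later standardized exactly this symmetric, max-based comparison as math.isclose (PEP 485), so the gotcha discussed here now has a stdlib answer:

```python
import math

a, b = 1.0, 1.0 + 1e-10

# isclose tests abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol),
# which is symmetric in a and b, hence commutative.
assert math.isclose(a, b, rel_tol=1e-9) == math.isclose(b, a, rel_tol=1e-9)
assert math.isclose(a, b, rel_tol=1e-9)
assert not math.isclose(a, b, rel_tol=1e-12)  # a tighter tolerance rejects the pair
```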


> This discussion reminded me of TAOCP and I paid a visit and bring the
> following functions:

"TAOCP"?


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Michael Torrie
On 05/30/2013 07:10 PM, Nobody wrote:
> This is why technical drawings which include regularly-spaced features
> will normally specify the positions of features relative to their
> neighbours instead of (or as well as) relative to some origin.

If I am planting trees, putting in fence posts, or drilling lots of
little holes in steel, I am actually more likely to measure from the
origin (or one arbitrary position).  I trust that the error accumulated
as the tape measure marks were printed on the tape is less than the
error I'd accumulate by digging a hole and measuring from there to the
next hole.  And when drilling a series of holes I'll definitely never
measure hole to hole to mark.  If I measure from the origin, then any
error for a hole is limited to that hole as much as possible, rather
than passed on to subsequent hole positions.  If I were making a server
rack, for example, having the holes consistently near their desired
positions would be necessary.  Tolerances are such that my hole can be
off by as much as 1/16" from its desired position and it would still be
fine, but not if each hole was off by an additional 1/16".  I guess what
I've described is accuracy vs. precision.  In the case of the server
rack accuracy is important, and precision can be more coarse depending
on the screw size and the mount type (threaded hole vs. square hole with
snap-in nut).
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread 88888 Dihedral
On Thursday, 30 May 2013 10:28:57 UTC+8, Steven D'Aprano wrote:
> On Wed, 29 May 2013 10:50:47 -0600, Ian Kelly wrote:
>
>> On Wed, May 29, 2013 at 8:33 AM, rusi  wrote:
>>> 0.0 == 0.0 implies 5.4 == 5.4
>>> is not a true statement is what (I think) Steven is saying. 0 (or if
>>> you prefer 0.0) is special and is treated specially.
>>
>> It has nothing to do with 0 being special.  A floating point number will
>> always equal itself (except for nan, which is even more special), and in
>> particular 5.4 == 5.4.  But if you have two different calculations that
>> produce 0, or two different calculations that produce 5.4, you might
>> actually get two different numbers that approximate 0 or 5.4 thanks to
>> rounding error.  If you then compare those two ever-so-slightly
>> different numbers, you will find them unequal.
>
> EXACTLY!
>
> The problem does not lie with the *equality operator*, it lies with the
> calculations. And that is an intractable problem -- in general, floating
> point is *hard*. So the problem occurs when we start with a perfectly
> good statement of the facts:
>
> "If you naively test the results of a calculation for equality without
> understanding what you are doing, you will often get surprising results"
>
> which then turns into a general heuristic that is often, but not always,
> reasonable:
>
> "In general, you should test for floating point *approximate* equality,
> in some appropriate sense, rather than exact equality"
>
> which then gets mangled to:
>
> "Never test floating point numbers for equality"
>
> and then implemented badly by people who have no clue what they are doing
> and have misunderstood the nature of the problem, leading to either:
>
> * de facto exact equality testing, only slower and with the *illusion* of
> avoiding equality, e.g. "abs(x-y) < sys.float_info.epsilon" is just a
> long and slow way of saying "x == y" when both numbers are sufficiently
> large;
>
> * incorrectly accepting non-equal numbers as "equal" just because they
> happen to be "close".
>
> The problem is that there is *no one right answer*, except "have everyone
> become an expert in floating point, then judge every case on its merits",
> which will never happen.
>
> But if nothing else, I wish that we can get past the rank superstition
> that you should "never" test floats for equality. That would be a step
> forward.
>
> --
> Steven
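Steven's first bullet, that abs(x-y) < sys.float_info.epsilon is just a slow spelling of x == y for sufficiently large values, can be verified directly (a sketch; naive_close is my name for the anti-pattern, not anything from the thread):

```python
import sys

eps = sys.float_info.epsilon  # ~2.22e-16: the gap between 1.0 and the next float up

def naive_close(x, y):
    # The anti-pattern: a fixed absolute tolerance of machine epsilon.
    return abs(x - y) < eps

# Any two *distinct* floats with magnitude >= 2.0 differ by at least
# 2*eps, so for such values this test can only pass when x == y exactly.
a = 1000.0
assert not naive_close(a, a + 1e-10)  # rejected, although very close in relative terms
assert naive_close(a, 1000.0)         # passes only because the values are identical
```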

The string used to represent a floating-point number in a computer
language is normally written in decimal, with only a limited number of
digits.

Anyway, with the advances in A/D converters over the past 10 years,
which are reflected in the antenna/transmitter parts of phones, Python's
long integers can really beat low-cost 32- and 64-bit floating-point
computations in scientific calculations.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Fri, May 31, 2013 at 10:13 AM, Rick Johnson
 wrote:
> What if you need to perform operations on a sequence (more than once) in a 
> non-linear fashion? What if you need to modify the sequence whilst looping? 
> In many cases your simplistic "for loop" will fail miserably.


What has this to do with the original question of iterating across
integers? What you're now saying is that both the meaning of the
current index and the top boundary can change during iteration; that's
unrelated to whether to use equality or inequality for comparisons.

Oh wait. Rick's back. He's been away so long that I stopped looking
for his name in the headers.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Roy Smith
In article ,
 Nobody  wrote:

> On Thu, 30 May 2013 19:38:31 -0400, Dennis Lee Bieber wrote:
> 
> > Measuring 1 foot from the 1000 foot stake leaves you with any error
> > from datum to the 1000 foot, plus any error from the 1000 foot, PLUS any
> > azimuth error which would contribute to shortening the datum distance.
> 
> First, let's ignore azimuthal error.
> 
> If you measure both distances from the same origin, and you have a
> measurement error of 0.1% (i.e. 1/1000), then the 1000' measurement will
> actually be between 999' and 1001', while the 1001' measurement will be
> between 1000' and 1002' (to the nearest whole foot).
> 
> Meaning that the distance from the 1000' stake to the 1001' stake could be
> anywhere between -1' and 3' (i.e. the 1001' stake could be measured as
> being closer than the 1000' stake).
> 
> This is why technical drawings which include regularly-spaced features
> will normally specify the positions of features relative to their
> neighbours instead of (or as well as) relative to some origin.

Not to mention "Do not scale drawing" warnings.  Do they still put that 
on drawings?  It was standard practice back when I was learning drafting.

> When you're dealing with relative error, the obvious question is
> "relative to what?".

Exactly.  Most programmers are very poorly trained in these sorts of 
things (not to mention crypto, UX, etc).  I put myself in that camp too.  
I know just enough about floating point to understand that I don't 
really know what I'm doing.  I would never write a program where 
numerical accuracy was critical (say, stress analysis of a new airframe 
or a nuclear power plant control system) without having somebody who 
really knew that stuff on the team.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Nobody
On Thu, 30 May 2013 19:38:31 -0400, Dennis Lee Bieber wrote:

>   Measuring 1 foot from the 1000 foot stake leaves you with any error
> from datum to the 1000 foot, plus any error from the 1000 foot, PLUS any
> azimuth error which would contribute to shortening the datum distance.

First, let's ignore azimuthal error.

If you measure both distances from the same origin, and you have a
measurement error of 0.1% (i.e. 1/1000), then the 1000' measurement will
actually be between 999' and 1001', while the 1001' measurement will be
between 1000' and 1002' (to the nearest whole foot).

Meaning that the distance from the 1000' stake to the 1001' stake could be
anywhere between -1' and 3' (i.e. the 1001' stake could be measured as
being closer than the 1000' stake).

This is why technical drawings which include regularly-spaced features
will normally specify the positions of features relative to their
neighbours instead of (or as well as) relative to some origin.

When you're dealing with relative error, the obvious question is
"relative to what?".
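The interval arithmetic behind the -1' to 3' figure can be checked in a few lines (a sketch; the helper name is mine, not from the thread):

```python
# 0.1% relative measurement error on two distances taken from the same origin.
rel_err = 0.001

def bounds(true_len):
    # Interval a single measurement of true_len may fall in.
    return true_len * (1 - rel_err), true_len * (1 + rel_err)

lo1, hi1 = bounds(1000.0)  # roughly (999.0, 1001.0)
lo2, hi2 = bounds(1001.0)  # roughly (999.999, 1002.001)

# The stake-to-stake distance lies anywhere in [lo2 - hi1, hi2 - lo1]:
gap_min, gap_max = lo2 - hi1, hi2 - lo1
assert gap_min < 0 < gap_max   # the "1001 ft" stake may even measure closer
assert round(gap_min) == -1 and round(gap_max) == 3
```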

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Rick Johnson
> On Fri, May 31, 2013 at 2:58 AM, rusi wrote:
> > On May 30, 5:58 pm, Chris Angelico wrote:
> > > The alternative would be an infinite number of iterations, which is far 
> > > far worse.
> >
> > There was one heavyweight among programming teachers -- E.W. Dijkstra
> > -- who had some rather extreme views on this.
> > 
> > He taught that when writing a loop of the form
> >
> > i = 0
> > while i < n:
> >   some code
> >   i += 1
> >
> > one should write the loop test as i != n rather than i <
> > n, precisely because if i got erroneously initialized to
> > some value greater than n, (and thereby broke the loop
> > invariant), it would loop infinitely rather than stop
> > with a wrong result.
> > 
> 
> And do you agree or disagree with him? :) I disagree with
> Dijkstra on a number of points, and this might be one of
> them. When you consider that the obvious Pythonic version
> of that code:
> 
> for i in range(n,m):
> some code

Maybe from your limited viewpoint. What if you need to perform operations on a 
sequence (more than once) in a non-linear fashion? What if you need to modify 
the sequence whilst looping? In many cases your simplistic "for loop" will fail 
miserably. 

py> lst = range(5)
py> for n in lst:
... print lst.pop()
4
3
2

Oops, can't do that with a for loop!

py> lst = range(5)
py> while len(lst):
... print lst.pop()
4
3
2
1
0
-- 
http://mail.python.org/mailman/listinfo/python-list


RE: Short-circuit Logic

2013-05-30 Thread Carlos Nepomuceno

> To: python-list@python.org
> From: wlfr...@ix.netcom.com
> Subject: Re: Short-circuit Logic
> Date: Thu, 30 May 2013 19:38:31 -0400
>
> On Thu, 30 May 2013 08:48:59 -0400, Roy Smith  declaimed
> the following in gmane.comp.python.general:
>
>>
>> Analysis of error is a complicated topic (and is much older than digital
>> computers). These sorts of things come up in the real world, too. For
>> example, let's say I have two stakes driven into the ground 1000 feet
>> apart. One of them is near me and is my measurement datum.
>>
>> I want to drive a third stake which is 1001 feet away from the datum.
>> Do I measure 1 foot from the second stake, or do I take out my
>> super-long tape measure and measure 1001 feet from the datum?
>
> On the same azimuth? Using the "super long tape" and ensuring it
> traverses the 1000 foot stake is probably going to be most accurate --
> you only have the uncertainty of the positioning of the tape on the
> datum, and the small uncertainty of azimuth over the 1000 foot stake.
> And even the azimuth error isn't contributing to the distance error.
>
> Measuring 1 foot from the 1000 foot stake leaves you with any error
> from datum to the 1000 foot, plus any error from the 1000 foot, PLUS any
> azimuth error which would contribute to shortening the datum distance.

Just because you have more sources of error doesn't mean you have less 
accurate measurements.

In fact, errors may compensate for each other. It all depends on the bias 
(accuracy) and variation (precision) involved in the measurements you are 
considering.

> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.com HTTP://wlfraed.home.netcom.com/
>
> --
> http://mail.python.org/mailman/listinfo/python-list   
>   
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Nobody
On Thu, 30 May 2013 12:07:40 +0300, Jussi Piitulainen wrote:

> I suppose this depends on the complexity of the process and the amount
> of data that produced the numbers of interest. Many individual
> floating point operations are required to be within an ulp or two of
> the mathematically correct result, I think, and the rounding error
> when parsing a written representation of a number should be similar.

Elementary operations (+, -, *, /, %, sqrt) are supposed to be within
+/- 0.5 ULP (for round-to-nearest), i.e. the actual result should be the
closest representable value to the exact result.

Transcendental functions should ideally be within +/- 1 ULP, i.e. the
actual result should be one of the two closest representable values to the
exact result. Determining the closest value isn't always feasible due to
the "table-maker's dilemma", i.e. the fact that regardless of the number
of digits used for intermediate results, the upper and lower bounds
can remain on opposite sides of the dividing line.
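The ULP distances these bounds refer to can be measured directly with the struct module (a hedged sketch; the bit-twiddling assumes IEEE 754 doubles, which CPython uses on all mainstream platforms):

```python
import math
import struct

def ulp_diff(x, y):
    """Distance between two finite doubles, counted in representable steps."""
    def ordinal(f):
        # Reinterpret the IEEE 754 bit pattern as a signed 64-bit integer,
        # then flip negative floats so ordinal order matches float order.
        n = struct.unpack('<q', struct.pack('<d', f))[0]
        return n if n >= 0 else -(n & 0x7fffffffffffffff)
    return abs(ordinal(x) - ordinal(y))

# sqrt is correctly rounded (within 0.5 ULP), so squaring its result
# lands within a couple of ULPs of the exact value:
assert ulp_diff(math.sqrt(2.0) ** 2, 2.0) <= 2
assert ulp_diff(1.0, 1.0000000000000002) == 1  # adjacent doubles
```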

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Fri, May 31, 2013 at 5:22 AM, Steven D'Aprano
 wrote:
> On Thu, 30 May 2013 16:40:52 +, Steven D'Aprano wrote:
>
>> On Fri, 31 May 2013 01:56:09 +1000, Chris Angelico wrote:
>
>>> You're assuming you can casually hit Ctrl-C to stop an infinite loop,
>>> meaning that it's trivial. It's not. Not everything lets you do that;
>>> or possibly halting the process will halt far more than you intended.
>>> What if you're editing live code in something that's had uninterrupted
>>> uptime for over a year?
>>
>> Then more fool you for editing live code.
>
> Ouch! That came out much harsher than it sounded in my head :(
>
> Sorry Chris, that wasn't intended as a personal attack against you, just
> as a comment on the general inadvisability of modifying code on the fly
> while it is being run.

Apology accepted :)

You're right that, in theory, a staging area is a Good Thing. But it's
not always feasible. At work, we have a lot of Pike code that really
does keep running indefinitely (okay, we have yet to get anywhere near
a year's uptime for administrative reasons, but it'll be plausible
once we go live; the >1year figure came from one of my personal
projects). While all's going well, code changes follow a sane
progression:

dev -> alpha -> beta -> live

with testing at every stage. What happens when we get a problem,
though? Maybe some process is leaking resources, maybe we come under
some kind of crazy DOS attack, whatever. We need a solution, and we
need to not break things for the currently-connected clients. That
means editing the live code. Of course, there are *some* protections;
the new code won't be switched in unless it passes the compilation
phase (think "except ImportError: keep_existing_code", kinda), and
hopefully I would at least spin it up on my dev box before pushing it
to live, but even so, there's every possibility that there'll be a
specific case that I didn't think of - remembering that we're not
talking about iteration from constant to constant, but from variable
to constant or constant to variable or variable to variable. That's
why I would prefer, in language design, for a 'failed loop' to result
in no iterations than an infinite number of them. The infinite loop
might be easily caught on my dev test - but only if I pass the code
through that exact code path.

But to go back to your point about editing live code: You backed down
from the implication that it's *foolish*, but I would maintain it at a
weaker level. Editing code in a running process is a *rare* thing to
do. MOST programming is not done that way. It's like the old joke
about the car mechanic and the heart surgeon (see eg
http://www.medindia.net/jokes/viewjokes.asp?hid=200 if you haven't
heard it, and I will be spoiling the punch line in the next line or
so); most programmers are mechanics, shutting down the system to do
any work on it, but very occasionally there are times when you need to
do it with the engine running. It's like C compilers. Most of us never
write them, but a few people (relatively) actually need to drop to the
uber-low-level coding and think about how it all works in assembly
language. For everyone else, thinking about machine code is an utter
waste of time/effort, but that doesn't mean that it's folly for a
compiler writer. Does that make sense?

ChrisA


RE: Short-circuit Logic

2013-05-30 Thread Carlos Nepomuceno

> From: steve+comp.lang.pyt...@pearwood.info
> Subject: Re: Short-circuit Logic
> Date: Thu, 30 May 2013 05:42:17 +
> To: python-list@python.org
[...]
> Here's another way, mathematically equivalent (although not necessarily
> equivalent using floating point computations!) which avoids the divide-by-
> zero problem:
>
> abs(a - b) < epsilon*a

That's wrong! If abs(a) < abs(a-b)/epsilon you will break the commutative law. 
For example:

import sys
eps = sys.float_info.epsilon
def almost_equalSD(a,b):
    return abs(a-b) < eps*a

#special case
a=1
b=1/(1-eps)
almost_equalSD(a,b) == almost_equalSD(b,a)

Returns False.

This discussion reminded me of TAOCP, so I paid it a visit and brought back the
following functions:


#Floating Point Comparison Operations
#Knuth, Donald (1981). The Art of Computer Programming, 2nd ed., Vol. 2, p. 218.
#Addison-Wesley. ISBN 0-201-03822-6.
import sys

#floating point comparison: u ≺ v(ε) "definitely less than" (definition 21)
def fpc_dlt(u,v,eps=sys.float_info.epsilon):
    au=abs(u)
    av=abs(v)
    return (v-u) > (eps*(au if au>av else av))  # v-u > ε*max(|u|,|v|)

#floating point comparison: u ~ v(ε) "approximately equal to" (definition 22)
def fpc_aeq(u,v,eps=sys.float_info.epsilon):
    au=abs(u)
    av=abs(v)
    return abs(v-u) <= (eps*(au if au>av else av))  # |v-u| <= ε*max(|u|,|v|)

#floating point comparison: u ≻ v(ε) "definitely greater than" (definition 23)
def fpc_dgt(u,v,eps=sys.float_info.epsilon):
    au=abs(u)
    av=abs(v)
    return (u-v) > (eps*(au if au>av else av))  # u-v > ε*max(|u|,|v|)

#floating point comparison: u ≈ v(ε) "essentially equal to" (definition 24)
def fpc_eeq(u,v,eps=sys.float_info.epsilon):
    au=abs(u)
    av=abs(v)
    return abs(v-u) <= (eps*(au if au<av else av))  # |v-u| <= ε*min(|u|,|v|)
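A quick, self-contained check of the symmetry point raised earlier in the
thread: scaling the tolerance by max(|u|,|v|), as in Knuth's definition 22,
makes the comparison order-independent (sketch only; eps is the machine
epsilon as before):

```python
import sys

eps = sys.float_info.epsilon

def almost_equal_max(u, v, eps=eps):
    # Knuth's "approximately equal" (definition 22): tolerance scaled
    # by max(|u|, |v|), which makes the test symmetric in u and v.
    return abs(v - u) <= eps * max(abs(u), abs(v))

# The special case that broke the one-sided test earlier in the thread:
a = 1.0
b = 1.0 / (1.0 - eps)
print(almost_equal_max(a, b) == almost_equal_max(b, a))  # True: symmetric
```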
> Whichever method you choose, there are gotchas to watch out for.
>
>> http://xkcd.com/1047/
>
> Nice!
>
>
> --
> Steven


Re: Short-circuit Logic

2013-05-30 Thread Ian Kelly
On Thu, May 30, 2013 at 1:30 PM, Neil Cerutti  wrote:
> On 2013-05-30, Chris Angelico  wrote:
>> On Thu, May 30, 2013 at 3:10 PM, Steven D'Aprano
>> wrote:
>>> # Wrong, don't do this!
>>> x = 0.1
>>> while x != 17.3:
>>>     print(x)
>>>     x += 0.1
>>
>> Actually, I wouldn't do that with integers either.
>
> I propose borrowing the concept of significant digits from the
> world of Physics.
>
> The above has at least three significant digits. With that scheme
> x would approximately equal 17.3 when 17.25 <= x < 17.35.
>
> But I don't see immediately how to calculate 17.25 and 17.35 from
> 17.3, 00.1 and 3 significant digits.

How about this:

while round(x, 1) != round(17.3, 1):
    pass

The second round call may be unnecessary.  I would expect the parser
to ensure that round(17.3, 1) == 17.3, but I'm not certain that is the
case.
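A quick check of both points (illustrative, not in the original message): the
round-based loop does terminate, and round(17.3, 1) does give back the same
float as the literal 17.3:

```python
# Is round(17.3, 1) the same float as the literal 17.3?  And does
# repeated 0.1 addition eventually round-equal 17.3?
print(round(17.3, 1) == 17.3)  # True: rounding to 1 place is a no-op here
x = 0.1
while round(x, 1) != 17.3:
    x += 0.1
print(abs(x - 17.3) < 1e-9)    # True: accumulated error stays tiny
```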


Re: Short-circuit Logic

2013-05-30 Thread Neil Cerutti
On 2013-05-30, Chris Angelico  wrote:
> On Thu, May 30, 2013 at 3:10 PM, Steven D'Aprano
> wrote:
>> # Wrong, don't do this!
>> x = 0.1
>> while x != 17.3:
>>     print(x)
>>     x += 0.1
>
> Actually, I wouldn't do that with integers either.

I propose borrowing the concept of significant digits from the
world of Physics.

The above has at least three significant digits. With that scheme
x would approximately equal 17.3 when 17.25 <= x < 17.35.

But I don't see immediately how to calculate 17.25 and 17.35 from
17.3, 00.1 and 3 significant digits.
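One way to get those bounds (a sketch of one possible convention, not
something from the post): take half the step size on either side of the
target value:

```python
# Half-step band around the target: 17.3 +/- 0.1/2 gives the
# 17.25 <= x < 17.35 window (up to float rounding of the endpoints).
target, step = 17.3, 0.1
lo, hi = target - step / 2, target + step / 2
print(lo, hi)        # approximately 17.25 and 17.35
x = 17.3000001
print(lo <= x < hi)  # True: x counts as "approximately 17.3"
```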

-- 
Neil Cerutti


Re: Short-circuit Logic

2013-05-30 Thread Steven D'Aprano
On Thu, 30 May 2013 16:40:52 +, Steven D'Aprano wrote:

> On Fri, 31 May 2013 01:56:09 +1000, Chris Angelico wrote:

>> You're assuming you can casually hit Ctrl-C to stop an infinite loop,
>> meaning that it's trivial. It's not. Not everything lets you do that;
>> or possibly halting the process will halt far more than you intended.
>> What if you're editing live code in something that's had uninterrupted
>> uptime for over a year?
> 
> Then more fool you for editing live code.

Ouch! That came out much harsher than it sounded in my head :(

Sorry Chris, that wasn't intended as a personal attack against you, just 
as a comment on the general inadvisability of modifying code on the fly 
while it is being run.


-- 
Steven


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Fri, May 31, 2013 at 2:58 AM, rusi  wrote:
> On May 30, 5:58 pm, Chris Angelico  wrote:
>> The alternative would be an infinite number of iterations, which is far far 
>> worse.
>
> There was one heavyweight among programming teachers -- E.W. Dijkstra
> -- who had some rather extreme views on this.
>
> He taught that when writing a loop of the form
>
> i = 0
> while i < n:
>   some code
>   i += 1
>
> one should write the loop test as i != n rather than i < n, precisely
> because if i got erroneously initialized to some value greater than n,
> (and thereby broke the loop invariant), it would loop infinitely
> rather than stop with a wrong result.

And do you agree or disagree with him? :)

I disagree with Dijkstra on a number of points, and this might be one of them.

When you consider that the obvious Pythonic version of that code:

for i in range(n,m):
    some code

loops over nothing and does not go into an infinite loop (or throw an
exception) when n >= m, you have to at least acknowledge that I'm in
agreement with Python core code here :) That doesn't mean it's right,
of course, but it's at least a viewpoint that someone has seen fit to
enshrine in important core functionality.

ChrisA


Re: Short-circuit Logic

2013-05-30 Thread Ethan Furman

On 05/30/2013 08:56 AM, Chris Angelico wrote:

On Fri, May 31, 2013 at 1:02 AM, Ethan Furman  wrote:

On 05/30/2013 05:58 AM, Chris Angelico wrote:

If you iterate from 1000 to 173, you get nowhere. This is the expected
behaviour; this is what a C-style for loop would be written as, it's
what range() does, it's the normal thing. Going from a particular
starting point to a particular ending point that's earlier than the
start results in no iterations. The alternative would be an infinite
number of iterations, which is far far worse.


If the bug is the extra three zeros (maybe it should have been two), then
silently skipping the loop is the "far, far worse" scenario.  With the
infinite loop you at least know something went wrong, and you know it pretty
darn quick (since you are testing, right? ;).


You're assuming you can casually hit Ctrl-C to stop an infinite loop,
meaning that it's trivial. It's not. Not everything lets you do that;
or possibly halting the process will halt far more than you intended.
What if you're editing live code in something that's had uninterrupted
uptime for over a year? Doing nothing is much safer than getting stuck
in an infinite loop. And yes, I have done exactly that, though not in
Python. Don't forget, your start/stop figures mightn't be constants,
so you might not see it in testing. I can't imagine ANY scenario where
you'd actually *want* the infinite loop behaviour, while there are
plenty where you want it to skip the loop, and would otherwise have to
guard it with an if.


We're not talking about skipping the loop on purpose, but on accident. 
Sure, taking a system down is no fun -- on the other hand, how much data 
corruption can occur before somebody realises there's a problem, and 
then how long to track it down to a silently, accidently, skipped loop?


--
~Ethan~


Re: Short-circuit Logic

2013-05-30 Thread rusi
On May 30, 5:58 pm, Chris Angelico  wrote:
> The alternative would be an infinite number of iterations, which is far far 
> worse.

There was one heavyweight among programming teachers -- E.W. Dijkstra
-- who had some rather extreme views on this.

He taught that when writing a loop of the form

i = 0
while i < n:
  some code
  i += 1

one should write the loop test as i != n rather than i < n, precisely
because if i got erroneously initialized to some value greater than n,
(and thereby broke the loop invariant), it would loop infinitely
rather than stop with a wrong result.


Re: Short-circuit Logic

2013-05-30 Thread Steven D'Aprano
On Fri, 31 May 2013 01:56:09 +1000, Chris Angelico wrote:

> On Fri, May 31, 2013 at 1:02 AM, Ethan Furman 
> wrote:
>> On 05/30/2013 05:58 AM, Chris Angelico wrote:
>>> If you iterate from 1000 to 173, you get nowhere. This is the expected
>>> behaviour; this is what a C-style for loop would be written as, it's
>>> what range() does, it's the normal thing. Going from a particular
>>> starting point to a particular ending point that's earlier than the
>>> start results in no iterations. The alternative would be an infinite
>>> number of iterations, which is far far worse.
>>
>> If the bug is the extra three zeros (maybe it should have been two),
>> then silently skipping the loop is the "far, far worse" scenario.  With
>> the infinite loop you at least know something went wrong, and you know
>> it pretty darn quick (since you are testing, right? ;).
> 
> You're assuming you can casually hit Ctrl-C to stop an infinite loop,
> meaning that it's trivial. It's not. Not everything lets you do that; or
> possibly halting the process will halt far more than you intended. What
> if you're editing live code in something that's had uninterrupted uptime
> for over a year? 

Then more fool you for editing live code.

By the way, this is Python. Editing live code is not easy, if it's 
possible at all.

But even when possible, it's certainly not sensible. You don't insist on 
your car mechanic giving your car a grease and oil change while you're 
driving at 100kmh down the freeway, and you shouldn't insist that your 
developers modify your code while it runs.

In any case, your arguing about such abstract, hypothetical ideas that, 
frankly, *anything at all* might be said about it. "What if Ctrl-C causes 
some great disaster?" can be answered with an equally hypothetical "What 
if Ctrl-C prevents some great disaster?"


> Doing nothing is much safer than getting stuck in an
> infinite loop. 

I disagree. And I agree. It all depends on the circumstances. But, given 
that we are talking about Python where infinite loops can be trivially 
broken out of, *in my experience* they are less-worse than silently doing 
nothing.

I've occasionally written faulty code that enters an infinite loop. When 
that happens, it's normally pretty obvious: something which should 
complete in a millisecond is still running after ten minutes. That's a 
clear, obvious, *immediate* sign that I've screwed up, which leads to me 
fixing the problem.

On the other hand, I've occasionally written faulty code that does 
nothing at all. The specific incident I am thinking of, I wrote a bunch 
of doctests which *weren't being run at all*. For nearly two weeks (not 
full time, but elapsed time) I was developing this code, before I started 
to get suspicious that *none* of the tests had failed, not even once. I 
mean, I'm not that good a programmer. Eventually I put in some deliberate 
errors, and they still didn't fail. 

In actuality, nearly every test was failing, my entire code base was 
rubbish, and I just didn't know it.

So, in this specific case, I would have *much* preferred an obvious 
failure (such as an infinite loop) than code that silently does the wrong 
thing.

We've drifted far from the original topic. There is a distinct difference 
between guarding against inaccuracies in floating point calculations:

# Don't do this!
total = 0.0
while total != 1.0:
    total += 0.1

and guarding against typos in source code:

total = 90  # Oops, I meant 0
while total != 10:
    total += 1

The second case is avoidable by paying attention when you code. The first 
case is not easily avoidable, because it reflects a fundamental 
difficulty with floating point types.
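The first loop's failure mode is easy to see directly (a minimal sketch of the
standard demonstration):

```python
# Why `while total != 1.0` never terminates: ten additions of the
# float 0.1 do not sum to exactly 1.0.
total = 0.0
for _ in range(10):
    total += 0.1
print(total)         # 0.9999999999999999, not 1.0
print(total == 1.0)  # False, so the != test would never become false
```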

As a general rule, "defensive coding" does not extend to the idea of 
defending against mistakes in your code. The compiler, linter or unit 
tests are supposed to do that. Occasionally, I will code defensively when 
initialising tedious data sets:

prefixes = ['y', 'z', 'a', 'f', 'p', 'n', 'µ', 'm', 
'k', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y']
assert len(prefixes) == 16


but that's about as far as I go.


-- 
Steven


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Fri, May 31, 2013 at 1:02 AM, Ethan Furman  wrote:
> On 05/30/2013 05:58 AM, Chris Angelico wrote:
>> If you iterate from 1000 to 173, you get nowhere. This is the expected
>> behaviour; this is what a C-style for loop would be written as, it's
>> what range() does, it's the normal thing. Going from a particular
>> starting point to a particular ending point that's earlier than the
>> start results in no iterations. The alternative would be an infinite
>> number of iterations, which is far far worse.
>
> If the bug is the extra three zeros (maybe it should have been two), then
> silently skipping the loop is the "far, far worse" scenario.  With the
> infinite loop you at least know something went wrong, and you know it pretty
> darn quick (since you are testing, right? ;).

You're assuming you can casually hit Ctrl-C to stop an infinite loop,
meaning that it's trivial. It's not. Not everything lets you do that;
or possibly halting the process will halt far more than you intended.
What if you're editing live code in something that's had uninterrupted
uptime for over a year? Doing nothing is much safer than getting stuck
in an infinite loop. And yes, I have done exactly that, though not in
Python. Don't forget, your start/stop figures mightn't be constants,
so you might not see it in testing. I can't imagine ANY scenario where
you'd actually *want* the infinite loop behaviour, while there are
plenty where you want it to skip the loop, and would otherwise have to
guard it with an if.

ChrisA


Re: Short-circuit Logic

2013-05-30 Thread Ethan Furman

On 05/30/2013 05:58 AM, Chris Angelico wrote:

On Thu, May 30, 2013 at 10:40 PM, Roy Smith  wrote:

if somebody were to accidentally drop three zeros into the source code:


x = 1000
while x < 173:
    print(x)
    x += 1


should the loop just quietly not execute (which is what it will do
here)?  Will that make your program correct again, or will it simply
turn this into a difficult to find bug?  If you're really worried about
that, why not:


If you iterate from 1000 to 173, you get nowhere. This is the expected
behaviour; this is what a C-style for loop would be written as, it's
what range() does, it's the normal thing. Going from a particular
starting point to a particular ending point that's earlier than the
start results in no iterations. The alternative would be an infinite
number of iterations, which is far far worse.


If the bug is the extra three zeros (maybe it should have been two), 
then silently skipping the loop is the "far, far worse" scenario.  With 
the infinite loop you at least know something went wrong, and you know 
it pretty darn quick (since you are testing, right? ;).


--
~Ethan~


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Thu, May 30, 2013 at 10:40 PM, Roy Smith  wrote:
> if somebody were to accidentally drop three zeros into the source code:
>
>> x = 1000
>> while x < 173:
>>     print(x)
>>     x += 1
>
> should the loop just quietly not execute (which is what it will do
> here)?  Will that make your program correct again, or will it simply
> turn this into a difficult to find bug?  If you're really worried about
> that, why not:

If you iterate from 1000 to 173, you get nowhere. This is the expected
behaviour; this is what a C-style for loop would be written as, it's
what range() does, it's the normal thing. Going from a particular
starting point to a particular ending point that's earlier than the
start results in no iterations. The alternative would be an infinite
number of iterations, which is far far worse.

ChrisA


Re: Short-circuit Logic

2013-05-30 Thread Roy Smith
In article ,
 Jussi Piitulainen  wrote:

> I wonder why floating-point errors are not routinely discussed in
> terms of ulps (units in last position).

Analysis of error is a complicated topic (and is much older than digital 
computers).  These sorts of things come up in the real world, too.  For 
example, let's say I have two stakes driven into the ground 1000 feet 
apart.  One of them is near me and is my measurement datum.

I want to drive a third stake which is 1001 feet away from the datum.  
Do I measure 1 foot from the second stake, or do I take out my 
super-long tape measure and measure 1001 feet from the datum?


Re: Short-circuit Logic

2013-05-30 Thread Roy Smith
In article ,
 Chris Angelico  wrote:

> On Thu, May 30, 2013 at 3:10 PM, Steven D'Aprano
>  wrote:
> > # Wrong, don't do this!
> > x = 0.1
> > while x != 17.3:
> >     print(x)
> >     x += 0.1
> >
> 
> Actually, I wouldn't do that with integers either. There are too many
> ways that a subsequent edit could get it wrong and go infinite, so I'd
> *always* use an inequality for that:
> 
> x = 1
> while x < 173:
>     print(x)
>     x += 1

There's a big difference between these two.  In the first case, using 
less-than instead of testing for equality, you are protecting against 
known and expected floating point behavior.

In the second case, you're protecting against some vague, unknown, 
speculative future programming botch.  So, what *is* the right behavior 
if somebody were to accidentally drop three zeros into the source code:

> x = 1000
> while x < 173:
>     print(x)
>     x += 1

should the loop just quietly not execute (which is what it will do 
here)?  Will that make your program correct again, or will it simply 
turn this into a difficult to find bug?  If you're really worried about 
that, why not:

> x = 1
> while x != 173:
>     assert x < 173
>     print(x)
>     x += 1


Re: Short-circuit Logic

2013-05-30 Thread Jussi Piitulainen
Steven D'Aprano writes:

> On Thu, 30 May 2013 10:22:02 +0300, Jussi Piitulainen wrote:
> 
> > I wonder why floating-point errors are not routinely discussed in
> > terms of ulps (units in last position). There is a recipe for
> > calculating the difference of two floating point numbers in ulps,
> > and it's possible to find the previous or next floating point
> > number, but I don't know of any programming language having
> > built-in support for these.

...

> But we now have IEEE 754, and C has conquered the universe, so it's
> reasonable for programming languages to offer an interface for
> accessing floating point objects in terms of ULPs. Especially for a
> language like Python, which only has a single float type.

Yes, that's what I'm thinking, that there is now a ubiquitous floating
point format or two, so the properties of the format could be used.

> I have a module that works with ULPs. I may clean it up and publish it. 
> Would there be interest in seeing it in the standard library?

Yes, please.

> There are some subtleties here also. Firstly, how many ULP should
> you care about? Three, as you suggest below, is awfully small, and
> chances are most practical, real-world calculations could not
> justify 3 ULP.  Numbers that we normally care about, like "0.01mm",
> probably can justify thousands of ULP when it comes to C-doubles,
> which Python floats are.

I suppose this depends on the complexity of the process and the amount
of data that produced the numbers of interest. Many individual
floating point operations are required to be within an ulp or two of
the mathematically correct result, I think, and the rounding error
when parsing a written representation of a number should be similar.
Either these add up to produce large errors, or the computation is
approximate in other ways in addition to using floating point.

One could develop a kind of sense for such differences. Ulps could be
a tangible measure when comparing different algorithms. (That's what I
tried to do with them in the first place. And that's how I began to
notice their absence when floating point errors are discussed.)

> Another subtlety: small-but-positive numbers are millions of ULP
> away from small-but-negative numbers. Also, there are issues to do
> with +0.0 and -0.0, NANs and the INFs.

The usual suspects ^_^ and no reason to dismiss the ulp when the
competing kinds of error have their corresponding subtleties. A matter
of education, I'd say.

Thank you much for an illuminating discussion.


Re: Short-circuit Logic

2013-05-30 Thread Steven D'Aprano
On Thu, 30 May 2013 10:22:02 +0300, Jussi Piitulainen wrote:

> I wonder why floating-point errors are not routinely discussed in terms
> of ulps (units in last position). There is a recipe for calculating the
> difference of two floating point numbers in ulps, and it's possible to
> find the previous or next floating point number, but I don't know of any
> programming language having built-in support for these.

That is an excellent question!

I think it is because the traditional recipes for "close enough" equality 
either pre-date any standardization of floating point types, or because 
they're written by people who are thinking about abstract floating point 
numbers and not considering the implementation.

Prior to most compiler and hardware manufacturers standardizing on IEEE 
754, there was no real way to treat float's implementation  in a machine 
independent way. Every machine laid their floats out differently, or used 
different number of bits. Some even used decimal, and in the case of a 
couple of Russian machines, trinary. (Although that's going a fair way 
back.)

But we now have IEEE 754, and C has conquered the universe, so it's 
reasonable for programming languages to offer an interface for accessing 
floating point objects in terms of ULPs. Especially for a language like 
Python, which only has a single float type.

I have a module that works with ULPs. I may clean it up and publish it. 
Would there be interest in seeing it in the standard library?


> Why isn't this considered the most natural measure of a floating point
> result being close to a given value? The meaning is roughly this: how
> many floating point numbers there are between these two.

There are some subtleties here also. Firstly, how many ULP should you 
care about? Three, as you suggest below, is awfully small, and chances 
are most practical, real-world calculations could not justify 3 ULP. 
Numbers that we normally care about, like "0.01mm", probably can justify 
thousands of ULP when it comes to C-doubles, which Python floats are.

Another subtlety: small-but-positive numbers are millions of ULP away 
from small-but-negative numbers. Also, there are issues to do with +0.0 
and -0.0, NANs and the INFs.


-- 
Steven


Re: Short-circuit Logic

2013-05-30 Thread Chris Angelico
On Thu, May 30, 2013 at 3:42 PM, Steven D'Aprano
 wrote:
> On Thu, 30 May 2013 13:45:13 +1000, Chris Angelico wrote:
>
>> Let's suppose someone is told to compare floating point numbers by
>> seeing if the absolute value of the difference is less than some
>> epsilon.
>
> Which is usually the wrong way to do it! Normally one would prefer
> *relative* error, not absolute:
>
> # absolute error:
> abs(a - b) < epsilon
>
>
> # relative error:
> abs(a - b)/a < epsilon

I was picking an epsilon based on a, though, which comes to pretty
much the same thing as the relative error calculation you're using.

> But using relative error also raises questions:
>
> - what if a is negative?
>
> - why relative to a instead of relative to b?
>
> - what if a is zero?
>
> The first, at least, is easy to solve: take the absolute value of a.

One technique I saw somewhere is to use the average of a and b. But
probably better is to take the lower absolute value (ie the larger
epsilon). However, there's still the question of what epsilon should
be - what percentage of a or b you take to mean equal - and that one
is best answered by looking at the original inputs.

Take these guys, for instance. Doing the same thing I was, only with
more accuracy.

http://www.youtube.com/watch?v=ZNiRzZ66YN0

ChrisA


Re: Short-circuit Logic

2013-05-30 Thread Jussi Piitulainen
Steven D'Aprano writes:

> On Thu, 30 May 2013 13:45:13 +1000, Chris Angelico wrote:
> 
> > Let's suppose someone is told to compare floating point numbers by
> > seeing if the absolute value of the difference is less than some
> > epsilon.
> 
> Which is usually the wrong way to do it! Normally one would prefer
> *relative* error, not absolute:
> 
> # absolute error:
> abs(a - b) < epsilon
> 
> 
> # relative error:
> abs(a - b)/a < epsilon
> 

...

I wonder why floating-point errors are not routinely discussed in
terms of ulps (units in last position). There is a recipe for
calculating the difference of two floating point numbers in ulps, and
it's possible to find the previous or next floating point number, but
I don't know of any programming language having built-in support for
these.

Why isn't this considered the most natural measure of a floating point
result being close to a given value? The meaning is roughly this: how
many floating point numbers there are between these two.

"close enough" if abs(ulps(a, b)) < 3 else "not close enough"

"equal" if ulps(a, b) == 0 else "not equal"

There must be some subtle technical issues here, too, but it puzzles
me that this measure of closeness is not often even discussed when
absolute and relative error are discussed - and computed using the
same approximate arithmetic whose accuracy is being measured. Scary.
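For IEEE 754 doubles the recipe alluded to above fits in a few lines (a
sketch; the bit reinterpretation assumes the usual mapping of a double's
bit pattern to a signed 64-bit integer, so that adjacent floats get
adjacent integers):

```python
import struct

def ordinal(x):
    # Reinterpret the double's bits as a signed 64-bit integer, then
    # remap negatives so that consecutive floats map to consecutive
    # integers (with -0.0 and +0.0 both mapping to 0).
    n = struct.unpack('<q', struct.pack('<d', x))[0]
    return n if n >= 0 else -(n + 2**63)

def ulps(a, b):
    # Signed count of representable doubles between a and b.
    return ordinal(b) - ordinal(a)

print(ulps(1.0, 1.0 + 2**-52))  # 1: these are adjacent floats
print(ulps(1.0, 1.0))           # 0: exactly equal
```

With this, the "close enough" test from the post becomes
`abs(ulps(a, b)) < 3`, computed in exact integer arithmetic rather than
in the floating point being measured.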

Got light?


Re: Short-circuit Logic

2013-05-29 Thread Steven D'Aprano
On Thu, 30 May 2013 13:45:13 +1000, Chris Angelico wrote:

> Let's suppose someone is told to compare floating point numbers by
> seeing if the absolute value of the difference is less than some
> epsilon. 

Which is usually the wrong way to do it! Normally one would prefer 
*relative* error, not absolute:

# absolute error:
abs(a - b) < epsilon


# relative error:
abs(a - b)/a < epsilon


One problem with absolute error is that it can give an entirely spurious 
image of "fuzziness", when in reality it is actually performing the same 
exact equality as == only slower and more verbosely. If a and b are 
sufficiently large, the smallest possible difference between a and b may 
be greater than epsilon (for whichever epsilon you pick). When that 
happens, you might as well just use == and be done with it.

But using relative error also raises questions:

- what if a is negative?

- why relative to a instead of relative to b?

- what if a is zero?

The first, at least, is easy to solve: take the absolute value of a. But 
strangely, you rarely see programming books mention that, so I expect 
that there is a lot of code in the real world that assumes a is positive 
and does the wrong thing when it isn't.

Here's another way, mathematically equivalent (although not necessarily 
equivalent using floating point computations!) which avoids the divide-by-
zero problem:

abs(a - b) < epsilon*a


Whichever method you choose, there are gotchas to watch out for.

> http://xkcd.com/1047/

Nice!


-- 
Steven


Re: Short-circuit Logic

2013-05-29 Thread Steven D'Aprano
On Wed, 29 May 2013 20:23:00 -0400, Dave Angel wrote:

> Even in a pure decimal system of (say)
> 40 digits, I could type in a 42 digit number and it would get quantized.
>   So just because two 42 digit numbers are different doesn't imply that
> the 40 digit internal format would be.

Correct, and we can demonstrate it using Python:

py> from decimal import *
py> getcontext().prec = 3
py> a = +Decimal('1.')
py> b = +Decimal('1.0009')
py> a == b
True


(By default, the Decimal constructor does not honour the current 
precision. To force it to do so, use the unary + operator.)




-- 
Steven


Re: Short-circuit Logic

2013-05-29 Thread Chris Angelico
On Thu, May 30, 2013 at 3:10 PM, Steven D'Aprano
 wrote:
> # Wrong, don't do this!
> x = 0.1
> while x != 17.3:
>     print(x)
>     x += 0.1
>

Actually, I wouldn't do that with integers either. There are too many
ways that a subsequent edit could get it wrong and go infinite, so I'd
*always* use an inequality for that:

x = 1
while x < 173:
    print(x)
    x += 1

Well, in Python I'd use for/range, but the equivalent still applies. A
range() is still based on an inequality:

>>> list(range(1,6))
[1, 2, 3, 4, 5]
>>> list(range(1,6,3))
[1, 4]

Stops once it's no longer less than the end. That's safe, since Python
can't do integer wraparound.
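The rounding that makes repeated 0.1 steps miss 17.3 is visible in miniature with just three additions (a standard illustration, added here, not from the post):

```python
x = 0.1 + 0.1 + 0.1
print(x)         # 0.30000000000000004
print(x == 0.3)  # False: a "while x != 0.3" loop would sail straight past
```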

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Steven D'Aprano
On Wed, 29 May 2013 07:27:40 -0700, Ahmed Abdulshafy wrote:

> On Tuesday, May 28, 2013 3:48:17 PM UTC+2, Steven D'Aprano wrote:
>> On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:
>> 
>> > That may be true for integers, but for floats, testing for equality
>> > is not always precise
>> 
>> Incorrect. Testing for equality is always precise, and exact. The
>> problem is not the *equality test*, but that you don't always have the
>> number that you think you have. The problem lies elsewhere, not equality!
>> 
>> Steven
> 
> Well, this is taken from my python shell>
> 
> >>> 0.33455857352426283 == 0.33455857352426282
> True

This is an excellent example of misunderstanding what you are seeing. 
Both 0.33455857352426283 and 0.33455857352426282 represent the same 
float, so it is hardly a surprise that they compare equal -- they compare 
equal because they are equal.

py> a, b = 0.33455857352426283, 0.33455857352426282
py> a.as_integer_ratio()
(6026871468229899, 18014398509481984)
py> b.as_integer_ratio()
(6026871468229899, 18014398509481984)
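Another way to confirm the two literals denote the same float, added here as a cross-check of the as_integer_ratio() result above:

```python
a = 0.33455857352426283
b = 0.33455857352426282
print(a.hex())             # the exact bit pattern, as a hex literal
print(a.hex() == b.hex())  # True: both literals produce the same double
```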

You've made a common error: neglecting to take into account the finite 
precision of floats. Floats are not mathematical "real numbers", with 
infinite precision. The error is more obvious if we exaggerate it:

py> 0.3 == 0.30000000000000001
True

Most people who have seen an ordinary four-function calculator will 
realise that the issue here is *not* that the equality operator == is 
wrongly stating that two unequal numbers are equal, but that just because 
you enter 0.300...1 doesn't mean that all those decimal places are 
actually used.


> Anyway, man, those were not my words anyway, most programming books I've
> read state so. Here's an excerpt from the Python book, I'm currently
> reading>
> 
> ">>> 0.0, 5.4, -2.5, 8.9e-4
> (0.0, 5.4000000000000004, -2.5, 0.00088999999999999995)
> 
> 
> The inexactness is not a problem specific to Python—all programming
> languages have this problem with floating-point numbers."

I'm not denying that floats are tricky to use correctly, or that testing 
for exact equality is *sometimes* the wrong thing to do:

# Wrong, don't do this!
x = 0.1
while x != 17.3:
print(x)
x += 0.1


I'm just saying that a simple minded comparison with 
sys.float_info.epsilon is *also* often wrong.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Chris Angelico
On Thu, May 30, 2013 at 12:28 PM, Steven D'Aprano
 wrote:
> * de facto exact equality testing, only slower and with the *illusion* of
> avoiding equality, e.g. "abs(x-y) < sys.float_info.epsilon" is just a
> long and slow way of saying "x == y" when both numbers are sufficiently
> large;
>

The problem here, I think, is that "epsilon" has two meanings:

* sys.float_info.epsilon, which is an extremely specific value (the
smallest x such that 1.0 + x != 1.0)

* the mathematical concept, which is where the other got its name from.

Let's suppose someone is told to compare floating point numbers by
seeing if the absolute value of the difference is less than some
epsilon. They look up "absolute value" and find abs(); they look up
"epsilon" and think they've found it. Trouble is, they've found the
wrong epsilon... and really, there's an engineering issue here too.
Here's one of my favourite examples of equality comparisons:

http://xkcd.com/1047/

# Let's say we measured this accurately to one part in 40
x = one_light_year_in_meters

y = pow(99,8)
x == y  # False
abs(x-y) < x/40  # True

Measurement accuracy is usually far FAR worse than floating-point
accuracy. It's pretty pointless to compare for some kind of "equality"
that ignores this. Say you measure the diameter and circumference of a
circle, accurate to one meter, and got values of 79 and 248; does this
mean that pi is less than 3.14? No - in fact:

pi = 248/79  # true division: 3.13924...
# math.pi = 3.141592653589793
abs(pi-math.pi) < pi/79  # True

Worst error is 1 in 79, so all comparisons are done with epsilon
derived from that.
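The light-year comparison can be run with concrete figures. The values below are rough approximations supplied here, not from the post: one light year is about 9.4607e15 m, and 99**8 is about 9.2274e15.

```python
x = 9.4607e15   # one light year in metres, to ~5 digits (assumed here)
y = 99.0 ** 8   # xkcd 1047's approximation, about 9.2274e15

print(x == y)               # False: not remotely bit-identical
print(abs(x - y) < x / 40)  # True: equal to within one part in 40
```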

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Steven D'Aprano
On Wed, 29 May 2013 10:50:47 -0600, Ian Kelly wrote:

> On Wed, May 29, 2013 at 8:33 AM, rusi  wrote:
>> 0.0 == 0.0 implies 5.4 == 5.4
>> is not a true statement is what (I think) Steven is saying. 0 (or if
>> you prefer 0.0) is special and is treated specially.
> 
> It has nothing to do with 0 being special.  A floating point number will
> always equal itself (except for nan, which is even more special), and in
> particular 5.4 == 5.4.  But if you have two different calculations that
> produce 0, or two different calculations that produce 5.4, you might
> actually get two different numbers that approximate 0 or 5.4 thanks to
> rounding error.  If you then compare those two ever-so-slightly
> different numbers, you will find them unequal.

EXACTLY!

The problem does not lie with the *equality operator*, it lies with the 
calculations. And that is an intractable problem -- in general, floating 
point is *hard*. So the problem occurs when we start with a perfectly 
good statement of the facts:

"If you naively test the results of a calculation for equality without 
understanding what you are doing, you will often get surprising results"

which then turns into a general heuristic that is often, but not always, 
reasonable:

"In general, you should test for floating point *approximate* equality, 
in some appropriate sense, rather than exact equality"

which then gets mangled to:

"Never test floating point numbers for equality"

and then implemented badly by people who have no clue what they are doing 
and have misunderstood the nature of the problem, leading to either:

* de facto exact equality testing, only slower and with the *illusion* of 
avoiding equality, e.g. "abs(x-y) < sys.float_info.epsilon" is just a 
long and slow way of saying "x == y" when both numbers are sufficiently 
large;

* incorrectly accepting non-equal numbers as "equal" just because they 
happen to be "close".
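The first failure mode is easy to demonstrate. For operands of magnitude 2 or more, any two distinct floats differ by at least one ULP, which already exceeds sys.float_info.epsilon, so the "tolerant" test accepts exactly the same pairs as == (example constructed for this point):

```python
import sys

a = 1000.0
b = 1000.0 + 1e-13   # rounds up to the next representable float above 1000.0

print(a == b)                               # False: they are distinct floats
print(abs(a - b) < sys.float_info.epsilon)  # False too: no tolerance gained
```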


The problem is that there is *no one right answer*, except "have everyone 
become an expert in floating point, then judge every case on its merits", 
which will never happen.

But if nothing else, I wish that we can get past the rank superstition 
that you should "never" test floats for equality. That would be a step 
forward.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Dave Angel

On 05/29/2013 12:50 PM, Ian Kelly wrote:

On Wed, May 29, 2013 at 8:33 AM, rusi  wrote:

0.0 == 0.0 implies 5.4 == 5.4
is not a true statement is what (I think) Steven is saying.
0 (or if you prefer 0.0) is special and is treated specially.


It has nothing to do with 0 being special.  A floating point number
will always equal itself (except for nan, which is even more special),
and in particular 5.4 == 5.4.  But if you have two different
calculations that produce 0, or two different calculations that
produce 5.4, you might actually get two different numbers that
approximate 0 or 5.4 thanks to rounding error.  If you then compare
those two ever-so-slightly different numbers, you will find them
unequal.



Rounding error is just one of the problems.  Usually less obvious is 
quantization error.  If you represent a floating-point number in decimal, but 
you're using a binary floating point representation, it just might change.


Another error is roundoff error.  Even in a pure decimal system of (say) 
40 digits, I could type in a 42 digit number and it would get quantized. 
 So just because two 42 digit numbers are different doesn't imply that 
the 40 digit internal format would be.
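That 40-digit scenario can be reproduced with the decimal module by setting a 40-digit context (the digit strings below are constructed here for illustration):

```python
from decimal import Decimal, getcontext

getcontext().prec = 40   # simulate a 40-significant-digit machine

# Two 42-digit inputs that differ only in digits 41 and 42:
a = +Decimal('1.' + '0' * 39 + '11')
b = +Decimal('1.' + '0' * 39 + '22')
print(a == b)   # True: both were quantized to the same 40-digit value
```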



--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Ian Kelly
On Wed, May 29, 2013 at 8:33 AM, rusi  wrote:
> 0.0 == 0.0 implies 5.4 == 5.4
> is not a true statement is what (I think) Steven is saying.
> 0 (or if you prefer 0.0) is special and is treated specially.

It has nothing to do with 0 being special.  A floating point number
will always equal itself (except for nan, which is even more special),
and in particular 5.4 == 5.4.  But if you have two different
calculations that produce 0, or two different calculations that
produce 5.4, you might actually get two different numbers that
approximate 0 or 5.4 thanks to rounding error.  If you then compare
those two ever-so-slightly different numbers, you will find them
unequal.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread rusi
On May 29, 7:27 pm, Ahmed Abdulshafy  wrote:
> On Tuesday, May 28, 2013 3:48:17 PM UTC+2, Steven D'Aprano wrote:
> > On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:
>
> > > That may be true for integers, but for floats, testing for equality is
>
> > > not always precise
>
> > Incorrect. Testing for equality is always precise, and exact. The problem
>
> > is not the *equality test*, but that you don't always have the number
>
> > that you think you have. The problem lies elsewhere, not equality!
>
> > Steven
>
> Well, this is taken from my python shell>
>
> >>> 0.33455857352426283 == 0.33455857352426282
>
> True
>
> Anyway, man, those were not my words anyway, most programming books I've read 
> state so. Here's an excerpt from the Python book, I'm currently reading>
>
> ">>> 0.0, 5.4, -2.5, 8.9e-4
> (0.0, 5.4000000000000004, -2.5, 0.00088999999999999995)
>
> The inexactness is not a problem specific to Python—all programming languages 
> have this problem with floating-point numbers."

0.0 == 0.0 implies 5.4 == 5.4
is not a true statement is what (I think) Steven is saying.
0 (or if you prefer 0.0) is special and is treated specially.

Naturally if you reach (nearabout) 0.0 by some numerical process that's
another matter...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Chris Angelico
On Thu, May 30, 2013 at 12:27 AM, Ahmed Abdulshafy  wrote:
> Well, this is taken from my python shell>
>
> >>> 0.33455857352426283 == 0.33455857352426282
> True


>>> 0.33455857352426283,0.33455857352426282
(0.3345585735242628, 0.3345585735242628)

They're not representably different.
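"Not representably different" can be checked at the bit level: packing each literal as an IEEE-754 double yields identical bytes (a quick check added here):

```python
import struct

a = 0.33455857352426283
b = 0.33455857352426282
print(struct.pack('<d', a) == struct.pack('<d', b))  # True: same 64 bits
```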

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-29 Thread Ahmed Abdulshafy
On Tuesday, May 28, 2013 3:48:17 PM UTC+2, Steven D'Aprano wrote:
> On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:
> 
> > That may be true for integers, but for floats, testing for equality is
> > not always precise
> 
> Incorrect. Testing for equality is always precise, and exact. The problem 
> is not the *equality test*, but that you don't always have the number 
> that you think you have. The problem lies elsewhere, not equality!
> 
> Steven

Well, this is taken from my python shell>

>>> 0.33455857352426283 == 0.33455857352426282
True

Anyway, man, those were not my words anyway, most programming books I've read 
state so. Here's an excerpt from the Python book, I'm currently reading>

">>> 0.0, 5.4, -2.5, 8.9e-4
(0.0, 5.4000000000000004, -2.5, 0.00088999999999999995)


The inexactness is not a problem specific to Python—all programming languages 
have this problem with floating-point numbers."
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Steven D'Aprano
On Tue, 28 May 2013 15:14:03 +, Grant Edwards wrote:

> On 2013-05-28, Steven D'Aprano 
> wrote:
>> On Tue, 28 May 2013 01:39:09 -0700, Ahmed Abdulshafy wrote:
>>
>>> He just said that the way to test for zero equality is x == 0, and I
>>> meant that this is true for integers but not necessarily for floats.
>>> And that's not specific to Python.
>>
>> Can you show me a value of x where x == 0.0 returns False, but x
>> actually isn't zero?
> 
> I'm confused.  Don't all non-zero values satisfy your conditions?

Of course they do :-(

I meant "but x actually *is* zero". Sorry for the confusion. I blame the 
terrists.



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Grant Edwards
On 2013-05-28, Steven D'Aprano  wrote:
> On Tue, 28 May 2013 01:39:09 -0700, Ahmed Abdulshafy wrote:
>
>> He just said that the way to test for zero equality is x == 0, and I
>> meant that this is true for integers but not necessarily for floats. And
>> that's not specific to Python.
>
> Can you show me a value of x where x == 0.0 returns False, but x actually 
> isn't zero?

I'm confused.  Don't all non-zero values satisfy your conditions?

>>> x = 1.0
>>> x == 0.0
False
>>> x is 0.0
False



-- 
Grant Edwards   grant.b.edwardsYow! I'm dressing up in
  at   an ill-fitting IVY-LEAGUE
  gmail.comSUIT!!  Too late...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Chris Angelico
On Tue, May 28, 2013 at 11:48 PM, Steven D'Aprano
 wrote:
> py> y = 1e17 + x  # x is not zero, so y should be > 1e17
> py> 1/(1e17 - y)
> Traceback (most recent call last):
>   File "", line 1, in 
> ZeroDivisionError: float division by zero

You don't even need to go for 1e17. By definition:

>>> sys.float_info.epsilon+1.0==1.0
False
>>> sys.float_info.epsilon+2.0==2.0
True

Therefore the same can be done with 2 as you did with 1e17.

>>> y = 2 + sys.float_info.epsilon
>>> 1/(2-y)
Traceback (most recent call last):
  File "", line 1, in 
1/(2-y)
ZeroDivisionError: float division by zero

Of course, since we're working with a number greater than epsilon, we
need to go a little further, but we can still work with small numbers:

>>> x = sys.float_info.epsilon * 2   # Definitely greater than epsilon
>>> y = 4 + x
>>> 1/(4-y)
Traceback (most recent call last):
  File "", line 1, in 
1/(4-y)
ZeroDivisionError: float division by zero


ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Steven D'Aprano
On Tue, 28 May 2013 01:39:09 -0700, Ahmed Abdulshafy wrote:

> He just said that the way to test for zero equality is x == 0, and I
> meant that this is true for integers but not necessarily for floats. And
> that's not specific to Python.

Can you show me a value of x where x == 0.0 returns False, but x actually 
isn't zero?

Built-in floats only, if you subclass you can do anything you like:

class Cheating(float):
    def __eq__(self, other):
        return False


-- 
Steven

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Steven D'Aprano
On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:

> That may be true for integers, but for floats, testing for equality is
> not always precise

Incorrect. Testing for equality is always precise, and exact. The problem 
is not the *equality test*, but that you don't always have the number 
that you think you have. The problem lies elsewhere, not equality!
Unfortunately, people who say "never test floats for equality" have 
misdiagnosed the problem, or they are giving a simple work-around which 
can be misleading to those who don't understand what is actually going on.

Any floating point libraries that support IEEE-754 semantics can 
guarantee a few things, including:

x == 0.0 if, and only if, x actually equals zero.

This was not always the case for all floating point systems prior to 
IEEE-754. In his foreword to the Apple Numerics Manual, William Kahan 
describes a Capriciously Designed Computer where 1/x can give a Division 
By Zero error even though x != 0. Fortunately, if you are programming in 
Python on Intel-compatible hardware, you do not have to worry about 
nightmares like that.

Let me repeat that: in Python, you can trust that if x == 0.0 returns 
False, then x is definitely not zero.

In any case, the test that you show is not a good test. I have already 
shown that it wrongly treats many non-zero numbers which can be 
distinguished from zero as if they were zero. But worse, it also fails as 
a guard against numbers which cannot be distinguished from zero!

py> import sys
py> epsilon = sys.float_info.epsilon
py> x = 1e-10
py> x < epsilon  # Is x so tiny it looks like zero?
False
py> y = 1e17 + x  # x is not zero, so y should be > 1e17
py> 1/(1e17 - y)
Traceback (most recent call last):
  File "", line 1, in 
ZeroDivisionError: float division by zero


So as you can see, testing for "zero" by comparing to machine epsilon 
does not save you from Zero Division errors.


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Mark Lawrence

On 28/05/2013 09:39, Ahmed Abdulshafy wrote:


And that's not specific to Python.



Using google products is also not specific to Python.  However, wherever 
it's used it's a PITA, as people are forced into reading double-spaced 
crap.  Please check out the link in my signature.


--
If you're using GoogleCrap™ please read this 
http://wiki.python.org/moin/GoogleGroupsPython.


Mark Lawrence

--
http://mail.python.org/mailman/listinfo/python-list


RE: Short-circuit Logic

2013-05-28 Thread Carlos Nepomuceno

> Date: Tue, 28 May 2013 01:39:09 -0700
> Subject: Re: Short-circuit Logic
> From: abdulsh...@gmail.com
[...]
>> What Steven wrote is entirely correct: sys.float_info.epsilon is the
>>
>> smallest value x such that 1.0 and 1.0+x have distinct floating-point
>>
>> representations. It has no relevance for comparing to zero.
>
> He just said that the way to test for zero equality is x == 0, and I meant 
> that this is true for integers but not necessarily for floats. And that's not 
> specific to Python.


Have you read [1]? There's a section "Infernal Zero" that discusses this problem. 
I think it's very interesting to know! ;)

Just my 49.98¢! lol


[1] 
http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-28 Thread Ahmed Abdulshafy
On Tuesday, May 28, 2013 2:10:05 AM UTC+2, Nobody wrote:
> On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:
> 
> > On Sunday, May 26, 2013 2:13:47 PM UTC+2, Steven D'Aprano wrote:
> >
> >> What the above actually tests for is whether x is so small that (1.0+x)
> >> cannot be distinguished from 1.0, which is not the same thing. It is
> >> also quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (0.00000001+x)?
> >> Or (1000000.0+x)?
> > 
> > That may be true for integers,
> 
> What may be true for integers?
> 
> > but for floats, testing for equality is not always precise
> 
> And your point is?
> 
> What Steven wrote is entirely correct: sys.float_info.epsilon is the
> smallest value x such that 1.0 and 1.0+x have distinct floating-point
> representations. It has no relevance for comparing to zero.

He just said that the way to test for zero equality is x == 0, and I meant that 
this is true for integers but not necessarily for floats. And that's not 
specific to Python.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-27 Thread Nobody
On Mon, 27 May 2013 13:11:28 -0700, Ahmed Abdulshafy wrote:

> On Sunday, May 26, 2013 2:13:47 PM UTC+2, Steven D'Aprano wrote:
>
>> What the above actually tests for is whether x is so small that (1.0+x)
>> cannot be distinguished from 1.0, which is not the same thing. It is
>> also quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (0.00000001+x)?
>> Or (1000000.0+x)?
> 
> That may be true for integers,

What may be true for integers?

> but for floats, testing for equality is not always precise

And your point is?

What Steven wrote is entirely correct: sys.float_info.epsilon is the
smallest value x such that 1.0 and 1.0+x have distinct floating-point
representations. It has no relevance for comparing to zero.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-27 Thread Nobody
On Sun, 26 May 2013 04:11:56 -0700, Ahmed Abdulshafy wrote:

> I'm having a hard time wrapping my head around short-circuit logic that's
> used by Python, coming from a C/C++ background; so I don't understand why
> the following condition is written this way!>
> 
>  if not allow_zero and abs(x) < sys.float_info.epsilon:
>     print("zero is not allowed")
> 
> The purpose of this snippet is to print the given line when allow_zero is
> False and x is 0.

I don't understand your confusion. The above is directly equivalent to the
following C code:

if (!allow_zero && fabs(x) < DBL_EPSILON)
    printf("zero is not allowed\n");

In either case, the use of short-circuit evaluation isn't necessary here;
it would work just as well with a strict[1] "and" operator.

Short-circuit evaluation is useful if the second argument is expensive to
compute, or (more significantly) if the second argument should not be
evaluated if the first argument is false; e.g. if x is a pointer then:

if (x && *x) ...

relies upon short-circuit evaluation to avoid dereferencing a null pointer.

On an unrelated note: the use of the "epsilon" value here is
almost certainly wrong. If the intention is to determine if the result of
a calculation is zero to within the limits of floating-point accuracy,
then it should use a value which is proportional to the values used in
the calculation.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-27 Thread Ahmed Abdulshafy
On Sunday, May 26, 2013 2:13:47 PM UTC+2, Steven D'Aprano wrote:
> On Sun, 26 May 2013 04:11:56 -0700, Ahmed Abdulshafy wrote:
> 
> 
> > Hi,
> > I'm having a hard time wrapping my head around short-circuit logic
> > that's used by Python, coming from a C/C++ background; so I don't
> > understand why the following condition is written this way!
> > 
> >     if not allow_zero and abs(x) < sys.float_info.epsilon:
> >         print("zero is not allowed")
> 
> Follow the logic.
> 
> If allow_zero is a true value, then "not allow_zero" is False, and the 
> "and" clause cannot evaluate to true. (False and X is always False.) So 
> print is not called.
> 
> If allow_zero is a false value, then "not allow_zero" is True, and the 
> "and" clause depends on the second argument. (True and X is always X.) So
> abs(x) < sys.float_info.epsilon is tested, and if that is True, print is 
> called.
> 
> By the way, I don't think much of this logic. Values smaller than epsilon 
> are not necessarily zero:
> 
> py> import sys
> py> epsilon = sys.float_info.epsilon
> py> x = epsilon/1000000
> py> x == 0
> False
> py> x * 3 == 0
> False
> py> x + epsilon == 0
> False
> py> x + epsilon == epsilon
> False
> 
> The above logic throws away many perfectly good numbers and treats them 
> as zero even though they aren't.
> 
> > The purpose of this snippet is to print the given line when allow_zero
> > is False and x is 0.
> 
> Then the snippet utterly fails at that, since it prints the line for many 
> values of x which can be distinguished from zero. The way to test whether 
> x equals zero is:
> 
> x == 0
> 
> What the above actually tests for is whether x is so small that (1.0+x) 
> cannot be distinguished from 1.0, which is not the same thing. It is also 
> quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (0.00000001+x)? Or 
> (1000000.0+x)?
> 
> -- 
> Steven

That may be true for integers, but for floats, testing for equality is not 
always precise
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-27 Thread Ahmed Abdulshafy
On Sunday, May 26, 2013 1:11:56 PM UTC+2, Ahmed Abdulshafy wrote:
> Hi,
> 
> I'm having a hard time wrapping my head around short-circuit logic that's 
> used by Python, coming from a C/C++ background; so I don't understand why the 
> following condition is written this way!>
> 
> 
> 
>  if not allow_zero and abs(x) < sys.float_info.epsilon:
> 
>     print("zero is not allowed")
> 
> 
> 
> The purpose of this snippet is to print the given line when allow_zero is 
> False and x is 0.

Thank you guys! you gave me valuable insights! But regarding my original post, 
I don't know why for the past two days I was looking at the code *only* this 
way>
 if ( not allow_zero and abs(x) ) < sys.float_info.epsilon:

I feel so stupid now :-/, may be it's the new syntax confusing me :)! Thanks 
again guys.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-26 Thread Cameron Simpson
On 27May2013 06:59, Vito De Tullio  wrote:
| Cameron Simpson wrote:
| >   if s is not None and len(s) > 0:
| > ... do something with the non-empty string `s` ...
| > 
| > In this example, None is a sentinel value for "no valid string" and
| > calling "len(s)" would raise an exception because None doesn't have
| > a length.
| 
| obviously in this case an `if s: ...` is more than sufficient :P

:P

My fault for picking too similar a test.

Cheers,
-- 
Cameron Simpson 

Death is life's way of telling you you've been fired.   - R. Geis
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-26 Thread Vito De Tullio
Cameron Simpson wrote:

>   if s is not None and len(s) > 0:
> ... do something with the non-empty string `s` ...
> 
> In this example, None is a sentinel value for "no valid string" and
> calling "len(s)" would raise an exception because None doesn't have
> a length.

obviously in this case an `if s: ...` is more than sufficient :P

-- 
ZeD

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-26 Thread rusi
On May 27, 5:40 am, Steven D'Aprano  wrote:
> On Sun, 26 May 2013 16:22:26 -0400, Roy Smith wrote:
> > In article ,
> >  Terry Jan Reedy  wrote:
>
> >> On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:
>
> >> >       if not allow_zero and abs(x) < sys.float_info.epsilon:
> >> >                  print("zero is not allowed")
>
> >> The reason for the order is to do the easy calculation first and the
> >> harder one only if the first passes.
>
> > This is a particularly egregious case of premature optimization.  You're
> > worried about how long it takes to execute abs(x)?  That's silly.
>
> I don't think it's a matter of premature optimization so much as the
> general principle "run code only if it needs to run". Hence, first you
> check the flag to decide whether or not you care whether x is near zero,
> and *only if you care* do you then check whether x is near zero.
>
> # This is silly:
> if x is near zero:
>     if we care:
>         handle near zero condition()
>
> # This is better:
> if we care:
>     if x is near zero
>         handle near zero condition()
>
> Not only is this easier to understand because it matches how we do things
> in the real life, but it has the benefit that if the "near zero"
> condition ever changes to become much more expensive, you don't have to
> worry about reordering the tests because they're already in the right
> order.
>
> --
> Steven

Three points:

3. These arguments are based on a certain assumption: that the inputs
are evenly distributed statistically.
If however that is not so, ie say:
"We-care" is mostly true
and
"x-is-near-zero" is more often false
then doing the near-zero test first would be advantageous

Well thats the 3rd point...

2. Niklaus Wirth deliberately did not use short-circuit boolean
operators in his languages because he found these kinds of distinctions
to deteriorate into irrelevance and miss out the more crucial
questions of correctness

1. As Roy pointed out in his initial response to the OP:
"I don't understand your confusion... None of  applies to
your example"
its not at all clear to me that anything being said has anything to do
with what the OP asked!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-26 Thread Cameron Simpson
On 27May2013 00:40, Steven D'Aprano  
wrote:
| On Sun, 26 May 2013 16:22:26 -0400, Roy Smith wrote:
| 
| > In article ,
| >  Terry Jan Reedy  wrote:
| > 
| >> On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:
| >> 
| >> >   if not allow_zero and abs(x) < sys.float_info.epsilon:
| >> >       print("zero is not allowed")
| >> 
| >> The reason for the order is to do the easy calculation first and the
| >> harder one only if the first passes.
| > 
| > This is a particularly egregious case of premature optimization.  You're
| > worried about how long it takes to execute abs(x)?  That's silly.
| 
| I don't think it's a matter of premature optimization so much as the 
| general principle "run code only if it needs to run". Hence, first you 
| check the flag to decide whether or not you care whether x is near zero, 
| and *only if you care* do you then check whether x is near zero.
| 
| # This is silly:
| if x is near zero:
|     if we care:
|         handle near zero condition()
| 
| # This is better:
| if we care:
|     if x is near zero:
|         handle near zero condition()
| 
| 
| Not only is this easier to understand because it matches how we do things 
| in the real life, but it has the benefit that if the "near zero" 
| condition ever changes to become much more expensive, you don't have to 
| worry about reordering the tests because they're already in the right 
| order.

I wouldn't even go that far, though nothing you say above is wrong.

Terry's assertion "The reason for the order is to do the easy
calculation first and the harder one only if the first passes" is
only sometimes that case, though well worth considering if the
second test _is_ expensive.

There are other reasons also. The first is of course your response,
that if the first test fails there's no need to even bother with
the second one. Faster, for free!

The second is that sometimes the first test is a guard against even
being able to perform the second test. Example:

  if s is not None and len(s) > 0:
... do something with the non-empty string `s` ...

In this example, None is a sentinel value for "no valid string" and
calling "len(s)" would raise an exception because None doesn't have
a length.

With short circuiting logic you can write this clearly and intuitively in one 
line
without extra control structure like the nested ifs above.
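The sentinel-guard pattern, as a runnable sketch (the function name and sample strings are invented here):

```python
def first_word(s):
    # Short-circuit guard: len(s) is evaluated only when s is not None,
    # so the None sentinel never reaches len() and cannot raise TypeError.
    if s is not None and len(s) > 0:
        return s.split()[0]
    return None

print(first_word("hello world"))  # hello
print(first_word(""))             # None: empty string fails the length test
print(first_word(None))           # None: the guard stops before len(None)
```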

Cheers,
-- 
Cameron Simpson 

Who are all you people and why are you in my computer?  - Kibo
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Short-circuit Logic

2013-05-26 Thread Steven D'Aprano
On Sun, 26 May 2013 16:22:26 -0400, Roy Smith wrote:

> In article ,
>  Terry Jan Reedy  wrote:
> 
>> On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:
>> 
>> >   if not allow_zero and abs(x) < sys.float_info.epsilon:
>> >      print("zero is not allowed")
>> 
>> The reason for the order is to do the easy calculation first and the
>> harder one only if the first passes.
> 
> This is a particularly egregious case of premature optimization.  You're
> worried about how long it takes to execute abs(x)?  That's silly.

I don't think it's a matter of premature optimization so much as the 
general principle "run code only if it needs to run". Hence, first you 
check the flag to decide whether or not you care whether x is near zero, 
and *only if you care* do you then check whether x is near zero.

# This is silly:
if x is near zero:
    if we care:
        handle near zero condition()

# This is better:
if we care:
    if x is near zero:
        handle near zero condition()


Not only is this easier to understand because it matches how we do things 
in the real life, but it has the benefit that if the "near zero" 
condition ever changes to become much more expensive, you don't have to 
worry about reordering the tests because they're already in the right 
order.
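The ordering can be made visible by counting calls to the expensive test.
This is a sketch, with `near_zero` standing in for any costly check:

```python
calls = {"near_zero": 0}

def near_zero(x):
    # Stand-in for an expensive "is x near zero?" test.
    calls["near_zero"] += 1
    return abs(x) < 1e-12

def handle(x, we_care):
    # Cheap flag first: the expensive test runs only if we care.
    if we_care and near_zero(x):
        return "near zero"
    return "ok"

handle(0.0, we_care=False)  # near_zero is never called
handle(0.0, we_care=True)   # near_zero is called once
```

With the tests in the other order, `near_zero` would run on every call,
whether or not the result mattered.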



-- 
Steven


Re: Short-circuit Logic

2013-05-26 Thread Terry Jan Reedy

On 5/26/2013 4:22 PM, Roy Smith wrote:

In article ,
  Terry Jan Reedy  wrote:


On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:


if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")


The reason for the order is to do the easy calculation first and the
harder one only if the first passes.


This is a particularly egregious case of premature optimization.  You're
worried about how long it takes to execute abs(x)?  That's silly.


This is a particularly egregious case of premature response. You're 
ignoring an extra name lookup and two extra attribute lookups. That's silly.


That's beside the fact that one *must* choose, so any difference is a 
reason to act rather than being frozen like Buridan's ass.

http://en.wikipedia.org/wiki/Buridan%27s_ass

If you wish, replace 'The reason' with 'A reason'. I also see the logical 
flow as better with the order given.








Re: Short-circuit Logic

2013-05-26 Thread Roy Smith
In article ,
 Terry Jan Reedy  wrote:

> On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:
> 
> > if not allow_zero and abs(x) < sys.float_info.epsilon:
> >     print("zero is not allowed")
> 
> The reason for the order is to do the easy calculation first and the 
> harder one only if the first passes.

This is a particularly egregious case of premature optimization.  You're 
worried about how long it takes to execute abs(x)?  That's silly.


Re: Short-circuit Logic

2013-05-26 Thread Terry Jan Reedy

On 5/26/2013 7:11 AM, Ahmed Abdulshafy wrote:


if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")


The reason for the order is to do the easy calculation first and the 
harder one only if the first passes.






Re: Short-circuit Logic

2013-05-26 Thread Steven D'Aprano
On Sun, 26 May 2013 04:11:56 -0700, Ahmed Abdulshafy wrote:

> Hi,
> I'm having a hard time wrapping my head around short-circuit logic
> that's used by Python, coming from a C/C++ background; so I don't
> understand why the following condition is written this way!
> 
> if not allow_zero and abs(x) < sys.float_info.epsilon:
>     print("zero is not allowed")

Follow the logic.

If allow_zero is a true value, then "not allow_zero" is False, and the 
"and" clause cannot evaluate to true. (False and X is always False.) So 
print is not called.

If allow_zero is a false value, then "not allow_zero" is True, and the 
"and" clause depends on the second argument. (True and X is always X.) So
abs(x) < sys.float_info.epsilon is tested, and if that is True, print is 
called.
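The evaluation order can be checked directly by logging which operands run
(a sketch; `probe` is just a helper for the demonstration):

```python
calls = []

def probe(name, value):
    # Record that this operand was evaluated, then return its value.
    calls.append(name)
    return value

# Left operand is false, so the right operand is never evaluated.
probe("left", False) and probe("right", True)
print(calls)  # ['left']

calls.clear()
# Left operand is true, so the right operand decides the result.
probe("left", True) and probe("right", True)
print(calls)  # ['left', 'right']
```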

By the way, I don't think much of this logic. Values smaller than epsilon 
are not necessarily zero:

py> import sys
py> epsilon = sys.float_info.epsilon
py> x = epsilon/2
py> x == 0
False
py> x * 3 == 0
False
py> x + epsilon == 0
False
py> x + epsilon == epsilon
False

The above logic throws away many perfectly good numbers and treats them 
as zero even though they aren't.


> The purpose of this snippet is to print the given line when allow_zero
> is False and x is 0.

Then the snippet utterly fails at that, since it prints the line for many 
values of x which can be distinguished from zero. The way to test whether 
x equals zero is:

x == 0

What the above actually tests for is whether x is so small that (1.0+x) 
cannot be distinguished from 1.0, which is not the same thing. It is also 
quite arbitrary. Why 1.0? Why not (0.0001+x)? Or (10000.0+x)?



-- 
Steven


Re: Short-circuit Logic

2013-05-26 Thread Roy Smith
In article <5f101d70-e51f-4531-9153-c92ee2486...@googlegroups.com>,
 Ahmed Abdulshafy  wrote:

> Hi,
> I'm having a hard time wrapping my head around short-circuit logic that's 
> used by Python, coming from a C/C++ background; so I don't understand why the 
> following condition is written this way!
> 
> if not allow_zero and abs(x) < sys.float_info.epsilon:
>     print("zero is not allowed")
> 
> The purpose of this snippet is to print the given line when allow_zero is 
> False and x is 0.

I don't understand your confusion.  Short-circuit evaluation works in 
Python exactly the same way it works in C.  When you have a boolean 
operation, the operands are evaluated left-to-right, and evaluation 
stops as soon as the truth value of the expression is known.

In C, you would write:

if (p && p->foo) {
    blah();
}

to make sure that you don't dereference a null pointer.  A similar 
example in Python might be:

if d and d["foo"]:
    blah()

which protects against trying to access an element of a dictionary if 
the dictionary is None (which might happen if d was an optional argument 
to a method and wasn't passed on this invocation).
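The same guard can be exercised directly; `has_foo` is a made-up wrapper
around that condition:

```python
def has_foo(d=None):
    # Without the short-circuit, d["foo"] would raise
    # TypeError when d is None.
    return bool(d and d["foo"])

print(has_foo())            # False: d is falsey, subscript never runs
print(has_foo({"foo": 1}))  # True
print(has_foo({"foo": 0}))  # False: "foo" present but falsey
```

Note the guard only protects against a false `d`; `has_foo({"bar": 1})`
would still raise KeyError because the key itself is missing.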

But, none of that applies to your example.  The condition is

not allow_zero and abs(x) < sys.float_info.epsilon:

it's safe to evaluate "abs(x) < sys.float_info.epsilon" no matter what 
the value of "not allow_zero".  For the purposes of understanding your 
code, you can pretend that short-circuit evaluation doesn't exist!

So, what is your code doing that you don't understand?


Short-circuit Logic

2013-05-26 Thread Ahmed Abdulshafy
Hi,
I'm having a hard time wrapping my head around short-circuit logic that's used 
by Python, coming from a C/C++ background; so I don't understand why the 
following condition is written this way!

if not allow_zero and abs(x) < sys.float_info.epsilon:
    print("zero is not allowed")

The purpose of this snippet is to print the given line when allow_zero is False 
and x is 0.