Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Steven D'Aprano writes:

 > I don't think so. Floating point == represents *numeric* equality,

There is no such thing as floating point == in Python.  You can apply
== to two floating point numbers, but == (at the language level)
handles any two numbers, as well as pairs of things that aren't
numbers in the Python language.  So it's a design decision to include
NaNs at all, and another design decision to follow IEEE in giving them
behavior that violates the definition of equivalence relation for ==.

 > In an early post, you suggested that NANs don't have a value, or that 
 > they have a value which is not a value. I don't think that's a good way 
 > to look at it. I think the obvious way to think of it is that NAN's 
 > value is Not A Number, exactly like it says on the box. Now, if 
 > something is not a number, obviously you cannot compare it numerically:

And if Python can't do something you ask it to do, it raises an
exception.  Why should this be different?  Obviously, it's a question of
expedience.

 > I'm not sure what you're referring to here. Is it that containers such 
 > as lists and dicts are permitted to optimize equality tests with 
 > identity tests for speed?

No, when I say I'm fuzzy I'm referring to the fact that although I
understand the logical rationale for IEEE 754 NaN behavior, I don't
really understand the ins and outs well enough to judge for myself
whether it's a good idea for Python to follow that model and turn ==
into something that is not an equivalence relation.

I'm not going to argue for a change, I just want to know where I stand.

 > Basically, and I realise that many people disagree with their decision 
 > (notably Bertrand Meyer of Eiffel fame, and our own Mark
 > Dickinson),

Indeed.  So "it's the standard" does not mean there is a consensus of
experts.  I'm willing to delegate to a consensus of expert opinion,
but not when some prominent local expert(s) disagree -- then I'd like
to understand well enough to come to my own conclusions.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Raymond Hettinger

On Jul 7, 2014, at 4:37 PM, Andreas Maier  wrote:

> I do not really buy into the arguments that try to show how identity and 
> value are somehow the same. They are not, not even in Python.
> 
> The argument I can absolutely buy into is that the implementation cannot be 
> changed within a major release. So the real question is how we document it.

Once every few years, someone discovers IEEE-754, learns that NaNs
aren't supposed to be equal to themselves and becomes inspired
to open an old debate about whether to wreck Python in an effort
to make the world safe for NaNs.  And somewhere along the way,
people forget that practicality beats purity.

Here are a few thoughts on the subject that may or may not add
a little clarity ;-)

* Python already has IEEE-754 compliant NaNs:

   assert float('NaN') != float('NaN')

* Python already has the ability to filter-out NaNs:

   [x for x in container if not math.isnan(x)]

* In the numeric world, the most common use of NaNs is for
  missing data (much like we usually use None).  The property
  of not being equal to itself is primarily useful in
  low level code optimized to run a calculation to completion
  without running frequent checks for invalid results
  (much like #N/A is used in MS Excel).

* Python also lets containers establish their own invariants
  to establish correctness, improve performance, and make it
  possible to reason about our programs:

   for x in c:
       assert x in c

* Containers like dicts and sets have always used the rule
  that identity-implies equality.  That is central to their
  implementation.  In particular, the check of interned
  string keys relies on identity to bypass a slow
  character-by-character comparison to verify equality.

* Traditionally, a relation R is considered an equivalence
  relation if it is reflexive, symmetric, and transitive:

  R(x, x) -> True
  R(x, y) -> R(y, x)
  R(x, y) ^ R(y, z) -> R(x, z)

* Knowingly or not, programs tend to assume that all of those
  hold.  Test suites in particular assume that if you put
  something in a container that assertIn() will pass.

* Here are some examples of cases where non-reflexive objects
  would jeopardize the pragmatism of being able to reason
  about the correctness of programs:

  s = SomeSet()
  s.add(x)
  assert x in s

  s.remove(x)  # See collections.abc.MutableSet.remove
  assert not s

  s.clear()    # See collections.abc.MutableSet.clear
  assert not s

* What the above code does is up to the implementer of the
  container.  If you use the Set ABC, you can choose to
  implement __contains__() and discard() to use straight
  equality or identity-implies equality.  Nothing prevents
  you from making containers that are hard to reason about.

* The builtin containers make the choice for identity-implies
  equality so that it is easier to build fast, correct code.
  For the most part, this has worked out great (dictionaries
  in particular have had identity checks built in for almost
  twenty years).

* Years ago, there was a debate about whether to add an __is__()
  method to allow overriding the is-operator.  The push for the
  change was the "pure" notion that "all operators should be
  customizable".  However, the idea was rejected based on the
  "practical" notions that it would wreck our ability to reason
  about code, that it would slow down all code that used identity checks,
  that library modules (ours and third-party) already made
  deep assumptions about what "is" means, and that people would
  shoot themselves in the foot with hard to find bugs.
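
The identity-implies-equality rule in the points above can be checked
against the builtin containers.  This is only a minimal sketch: the
helper name contains() is made up for illustration, and it mirrors how
the language reference describes `in` for builtin sequences (identity
first, then equality).

```python
def contains(seq, x):
    # Identity is checked first, then equality, mirroring the documented
    # behaviour of `in` for the builtin containers.
    return any(e is x or e == x for e in seq)

nan = float('nan')
items = [1.0, nan, 3.0]

reflexive = (nan == nan)                    # False: a NaN is not equal to itself
in_list = nan in items                      # True: the identity check succeeds
in_set = nan in {nan}                       # True, for the same reason
in_dict = nan in {nan: 'missing'}           # True
by_equality = any(e == nan for e in items)  # False: equality alone never matches
other_nan_in_list = float('nan') in items   # False: a different NaN object
```

So the invariant "for x in c: assert x in c" holds for the builtin
containers even when c holds a NaN, because the lookup never gets as
far as asking whether the NaN equals itself.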

Personally, I see no need to make the same mistake by removing
the identity-implies-equality rule from the built-in containers.
There's no need to upset the apple cart for nearly zero benefit.

IMO, the proposed quest for purity is misguided.
There are many practical reasons to let the builtin
containers continue to work as they do now.


Raymond


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 06:08 PM, Ben Hoyt wrote:


>> Just like an attribute does not imply a system call, having a
>> method named 'is_dir' /does/ imply a system call, and not
>> having one can be just as misleading.
>
> Why does a method imply a system call? os.path.join() and str.lower()
> don't make system calls. Isn't it just a matter of clear
> documentation? Anyway -- less philosophical discussion below.


In this case because the names are exactly the same as the os versions which 
/do/ make a system call.



> I presume you're suggesting that is_dir/is_file/is_symlink should be
> regular attributes, and accessing them should never do a system call.
> But what if the system doesn't support d_type (eg: Solaris) or the
> d_type value is DT_UNKNOWN (can happen on Linux, OS X, BSD)? The
> options are:


So if I'm finally understanding the root problem here:

  - listdir returns a list of strings, one for each filename and one for
each directory, and keeps no other O/S supplied info.

  - os.walk, which uses listdir, then needs to go back to the O/S and
refetch the thrown-away information

  - so it's slow.

The solution:

  - have scandir /not/ throw away the O/S supplied info

and the new problem:

  - not all O/Ses provide the same (or any) extra info about the
directory entries

Have I got that right?

If so, I still like the attribute idea better (surprise!), we just need to revisit the 'ensure_lstat' (or whatever it's 
called) parameter:  instead of a true/false value, it could have a scale:


  - 0 = whatever the O/S gives us

  - 1 = at least the is_dir/is_file (whatever the other normal one is),
and if the O/S doesn't give it to us for free then call lstat

  - 2 = we want it all -- call lstat if necessary on this platform

After all, the programmer should know up front how much of the extra info will be needed for the work that is trying to 
be done.
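
A rough sketch of how such a scale might behave follows.  Everything
here is hypothetical: the names scan() and Entry are invented for
illustration, and the helper is built on os.listdir() and os.lstat()
rather than on any scandir implementation.  Since os.listdir() supplies
names only, level 0 in this sketch has nothing extra to report.

```python
import os
import stat
from collections import namedtuple

# Hypothetical record type; is_dir and lstat are None when not fetched.
Entry = namedtuple('Entry', 'name is_dir lstat')

def scan(path, level=0):
    """Yield Entry tuples, fetching more information as level increases.

    level 0: names only (all that os.listdir() gives us for free)
    level 1: call lstat, but keep only the directory/non-directory type
    level 2: keep the full lstat result as well
    """
    for name in os.listdir(path):
        if level == 0:
            yield Entry(name, None, None)
            continue
        st = os.lstat(os.path.join(path, name))
        is_dir = stat.S_ISDIR(st.st_mode)
        yield Entry(name, is_dir, st if level >= 2 else None)
```

A caller that only needs names would use scan(path), one that walks a
tree would ask for level 1, and one that sums file sizes would ask for
level 2 and read entry.lstat.st_size.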




> We have a choice before us, a fork in the road. :-) We can choose one
> of these options for the scandir API:
>
> 1) The current PEP 471 approach. This solves the issue with d_type
> being missing or DT_UNKNOWN, it doesn't require onerror, and it's a
> really tidy API that doesn't explode with AttributeErrors if you write
> code on Windows (without thinking too hard) and then move to Linux. I
> think all of these points are important -- the cross-platform one not
> the least, because we want to make it easy, even *trivial*, for people
> to write cross-platform code.


Yes, but we don't want a function that sucks equally on all platforms.  ;)



> 2) Nick Coghlan's model of only fetching the lstat value if
> ensure_lstat=True, and including an onerror callback for error
> handling when scandir calls lstat internally. However, as described,
> we'd also need an ensure_type=True option, so that scandir() isn't way
> slower than listdir() if you actually don't want the is_X values and
> d_type is missing/unknown.


With the multi-level version of 'ensure_lstat' we do not need an extra 
'ensure_type'.

For reference, here's what get_tree_size() looks like with this approach, not 
including error handling with onerror:

  def get_tree_size(path):
      total = 0
      for entry in os.scandir(path, ensure_lstat=1):
          if entry.is_dir:
              total += get_tree_size(entry.full_name)
          else:
              total += entry.lstat_result.st_size
      return total

And if we added the onerror here it would be a line fragment, as opposed to the extra four lines (at least) for the 
try/except in the first example (which I cut).
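
For comparison, here is a runnable sketch against os.scandir() as, to
the best of my knowledge, it later shipped in Python 3.5.  The names
differ from the ones used in this thread: entries expose `path` rather
than `full_name`, and `stat(follow_symlinks=False)` rather than
`lstat()` or `lstat_result`, and there is no ensure_lstat parameter.
Error handling is left out, as in the examples above.

```python
import os

def get_tree_size(path):
    """Return the total size in bytes of the files under path."""
    total = 0
    for entry in os.scandir(path):
        # follow_symlinks=False keeps the lstat-like semantics discussed
        # in this thread: a symlink to a directory is not descended into.
        if entry.is_dir(follow_symlinks=False):
            total += get_tree_size(entry.path)
        else:
            total += entry.stat(follow_symlinks=False).st_size
    return total
```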



Finally:

Thank you for writing scandir, and this PEP.  Excellent work.

Oh, and +1 for option 2, slightly modified.  :)

--
~Ethan~


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 06:33:31PM +0100, MRAB wrote:

> The log of a negative number is a complex number.

Only in complex arithmetic. In real arithmetic, the log of a negative 
number isn't a number at all.
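
The two views can be put side by side in Python.  A small sketch using
only the standard math and cmath modules; the variable names are
arbitrary.  Note that math.log() raises for a negative argument rather
than handing back a NaN.

```python
import cmath
import math

try:
    real_result = math.log(-3.0)
except ValueError:
    # Real arithmetic: Python raises instead of returning a NaN here.
    real_result = None

# Complex arithmetic: the principal value is log(3) + pi*j.
complex_result = cmath.log(-3.0)
```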

-- 
Steven


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
> I did better than that -- I read the whole thing!  ;)

Thanks. :-)

> -1 on the PEP's implementation.
>
> Just like an attribute does not imply a system call, having a
> method named 'is_dir' /does/ imply a system call, and not
> having one can be just as misleading.

Why does a method imply a system call? os.path.join() and str.lower()
don't make system calls. Isn't it just a matter of clear
documentation? Anyway -- less philosophical discussion below.

> If we have this:
>
> size = 0
> for entry in scandir('/some/path'):
>     size += entry.st_size
>
>   - on Windows, this should Just Work (if I have the names correct ;)
>   - on Posix, etc., this should fail noisily with either an AttributeError
> ('entry' has no 'st_size') or a TypeError (cannot add None)
>
> and the solution is equally simple:
>
> for entry in scandir('/some/path', stat=True):
>
>   - if not Windows, perform a stat call at the same time

I'm not totally opposed to this, which is basically a combination of
Nick Coghlan's and Paul Moore's recent proposals mentioned in the PEP.
However, as discussed on python-dev, there are some edge cases it
doesn't handle very well, and it's messier to handle errors (requires
onerror as you mention below).

I presume you're suggesting that is_dir/is_file/is_symlink should be
regular attributes, and accessing them should never do a system call.
But what if the system doesn't support d_type (eg: Solaris) or the
d_type value is DT_UNKNOWN (can happen on Linux, OS X, BSD)? The
options are:

1) scandir() would always call lstat() in the case of missing/unknown
d_type. If so, scandir() is actually more expensive than listdir(),
and as a result it's no longer safe to implement listdir in terms of
scandir:

def listdir(path='.'):
    return [e.name for e in scandir(path)]

2) Or would it be better to have another flag like scandir(path,
type=True) to ensure the is_X type info is fetched? This is explicit,
but also getting kind of unwieldy.

3) A third option is for the is_X attributes to be absent in this case
(hasattr tests required, and the user would do the lstat manually).
But as I noted on python-dev recently, you basically always want is_X,
so this leads to unwieldy code that's twice as long as it needs
to be. See here:
https://mail.python.org/pipermail/python-dev/2014-July/135312.html

4) I gather in your proposal above, scandir will call lstat() if
stat=True? Except where does it put the values? Surely it should
return an existing stat_result object, rather than stuffing everything
onto the DirEntry, or throwing away some values on Linux? In this
case, I'd prefer Nick Coghlan's approach of ensure_lstat and a
.stat_result attribute. However, this still has the "what if d_type is
missing or DT_UNKNOWN" issue.

It seems to me that making is_X() methods handles this exact scenario
-- methods are so you don't have to do the dirty work.

So yes, the real world is messy due to missing is_X values, but I
think it's worth getting this right, and is_X() methods can do this
while keeping the API simple and cross-platform.

> Now, of course, we might get errors.  I am not a big fan of wrapping 
> everything in try/except, particularly when we already have a model to follow 
> -- os.walk:

I don't mind the onerror too much if we went with this kind of
approach. It's not quite as nice as a standard try/except around the
method call, but it's definitely workable and has a precedent with
os.walk().

It seems a bit like we're going around in circles here, and I think we
have all the information and options available to us, so I'm going to
SUMMARIZE.

We have a choice before us, a fork in the road. :-) We can choose one
of these options for the scandir API:

1) The current PEP 471 approach. This solves the issue with d_type
being missing or DT_UNKNOWN, it doesn't require onerror, and it's a
really tidy API that doesn't explode with AttributeErrors if you write
code on Windows (without thinking too hard) and then move to Linux. I
think all of these points are important -- the cross-platform one not
the least, because we want to make it easy, even *trivial*, for people
to write cross-platform code.

For reference, here's what get_tree_size() looks like with this
approach, not including error handling with try/except:

def get_tree_size(path):
    total = 0
    for entry in os.scandir(path):
        if entry.is_dir():
            total += get_tree_size(entry.full_name)
        else:
            total += entry.lstat().st_size
    return total

2) Nick Coghlan's model of only fetching the lstat value if
ensure_lstat=True, and including an onerror callback for error
handling when scandir calls lstat internally. However, as described,
we'd also need an ensure_type=True option, so that scandir() isn't way
slower than listdir() if you actually don't want the is_X values and
d_type is missing/unknown.

For reference, here's what get_tree_size() looks like with this
approach, not including error handlin

Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 01:22 PM, Ethan Furman wrote:


> I think caching the attributes for DirEntry is fine, but let's do it as a
> snapshot of that moment in time, not name now, and attributes in 30 minutes
> when we finally get to you because we had a lot of processing/files ahead
> of you (you being a DirEntry ;) .


This bit is wrong, I think, since scandir is a generator -- there wouldn't be much time passing between the direntry 
call and the stat call in any case.  Hopefully my other points still hold.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 12:34 PM, Ben Hoyt wrote:


>> Better to just have the attributes be None if they were not fetched.  None
>> is better than hasattr anyway, at least in the respect of not having to
>> catch exceptions to function properly.
>
> The thing is, is_dir() and lstat() are not attributes (for a good
> reason). Please read the relevant "Rejected ideas" sections and let us
> know what you think. :-)


I did better than that -- I read the whole thing!  ;)

-1 on the PEP's implementation.

Just like an attribute does not imply a system call, having a method named 'is_dir' /does/ imply a system call, and not 
having one can be just as misleading.


If we have this:

size = 0
for entry in scandir('/some/path'):
    size += entry.st_size

  - on Windows, this should Just Work (if I have the names correct ;)
  - on Posix, etc., this should fail noisily with either an AttributeError
('entry' has no 'st_size') or a TypeError (cannot add None)

and the solution is equally simple:

for entry in scandir('/some/path', stat=True):

  - if not Windows, perform a stat call at the same time

Now, of course, we might get errors.  I am not a big fan of wrapping everything in try/except, particularly when we 
already have a model to follow -- os.walk:


for entry in scandir('/some/path', stat=True, onerror=record_and_skip):

If we don't care if an error crashes the script, leave off onerror.

If we don't need st_size and friends, leave off stat=True.

If we get better performance on Windows instead of Linux, that's okay.

scandir is going into os because it may not behave the same on every platform.  Heck, even some non-os modules 
(multiprocessing comes to mind) do not behave the same on every platform.


I think caching the attributes for DirEntry is fine, but let's do it as a snapshot of that moment in time, not name now, 
and attributes in 30 minutes when we finally get to you because we had a lot of processing/files ahead of you (you being 
a DirEntry ;) .


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
>> I think you're misunderstanding is_dir() and is_file(), as these don't
>> actually call os.stat(). All DirEntry methods either call nothing or
>> os.lstat() to get the stat info on the entry itself (not the
>> destination of the symlink).
>
>
> Oh. Extract of your PEP: "is_dir(): like os.path.isdir(), but much cheaper".
>
> genericpath.isdir() and genericpath.isfile() use os.stat(), whereas
> posixpath.islink() uses os.lstat().
>
> Is it a mistake in the PEP?

Ah, you're dead right -- this is basically a bug in the PEP, as
DirEntry.is_dir() is not like os.path.isdir() in that it is based on
the entry itself (like lstat), not following the link.

I'll improve the wording here and update the PEP.
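
The difference can be seen with the existing os functions.  A minimal
sketch: the two helper names below are made up for illustration and are
not part of the PEP or of the os module.

```python
import os
import stat

def is_dir_following_links(path):
    # What os.path.isdir() does: it is based on os.stat(), so a symlink
    # that points at a directory counts as a directory.
    return os.path.isdir(path)

def is_dir_of_entry_itself(path):
    # lstat-based: looks at the entry itself, so a symlink is reported
    # as a symlink and never as a directory.
    try:
        return stat.S_ISDIR(os.lstat(path).st_mode)
    except OSError:
        return False
```

For a real directory both helpers agree; for a symlink to a directory
only the first one returns True.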

-Ben


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Victor Stinner
Le mardi 8 juillet 2014, Ben Hoyt  a écrit :

>
> > It is not clear to me which methods share the cache.
> >
> > On UNIX, is_dir() and is_file() call os.stat(); whereas lstat() and
> > is_symlink() call os.lstat().
> >
> > If os.stat() says that the file is not a symlink, I guess that you can
> > use os.stat() result for lstat() and is_symlink() methods?
> >
> > In the worst case, if the path is a symlink, would it be possible that
> > os.stat() and os.lstat() become "inconsistent" if the symlink is
> > modified between the two calls? If yes, I don't think that it's an
> > issue, it's just good to know it.
> >
> > For symlinks, readdir() returns the status of the linked file or of the
> symlink?
>
> I think you're misunderstanding is_dir() and is_file(), as these don't
> actually call os.stat(). All DirEntry methods either call nothing or
> os.lstat() to get the stat info on the entry itself (not the
> destination of the symlink).


Oh. Extract of your PEP: "is_dir(): like os.path.isdir(), but much cheaper".

genericpath.isdir() and genericpath.isfile() use os.stat(), whereas
posixpath.islink() uses os.lstat().

Is it a mistake in the PEP?

> In light of this, I don't think what you're describing above is an issue.

I'm not saying that there is an issue, I'm just trying to understand.

Victor


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
> Better to just have the attributes be None if they were not fetched.  None
> is better than hasattr anyway, at least in the respect of not having to
> catch exceptions to function properly.

The thing is, is_dir() and lstat() are not attributes (for a good
reason). Please read the relevant "Rejected ideas" sections and let us
know what you think. :-)

-Ben


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ethan Furman

On 07/08/2014 11:05 AM, Ben Hoyt wrote:

>> Only exposing what the OS provides for free will make the API too difficult
>> to use in the common case. But is there a nice way to expand the API that
>> will allow the user who is trying to avoid extra expense know what
>> information is already available?
>>
>> Even if the initial version doesn't have a way to check what information is
>> there for free, ensuring there is a clean way to add this in the future
>> would be really nice.
>
> We could easily add ".had_type" and ".had_lstat" properties (not sure
> on the names), that would be true if the is_X information and lstat
> information was fetched, respectively. Basically both would always be
> True on Windows, but on POSIX only had_type would be True if d_type is
> present and != DT_UNKNOWN.
>
> I don't feel this is actually necessary, but it's not hard to add.
>
> Thoughts?


Better to just have the attributes be None if they were not fetched.  None is better than hasattr anyway, at least in 
the respect of not having to catch exceptions to function properly.


--
~Ethan~


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
> Only exposing what the OS provides for free will make the API too difficult
> to use in the common case. But is there a nice way to expand the API that
> will allow the user who is trying to avoid extra expense know what
> information is already available?
>
> Even if the initial version doesn't have a way to check what information is
> there for free, ensuring there is a clean way to add this in the future
> would be really nice.

We could easily add ".had_type" and ".had_lstat" properties (not sure
on the names), that would be true if the is_X information and lstat
information was fetched, respectively. Basically both would always be
True on Windows, but on POSIX only had_type would be True if d_type is
present and != DT_UNKNOWN.

I don't feel this is actually necessary, but it's not hard to add.

Thoughts?

-Ben


Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
> I remember a pending question on python-dev:
>
> - Martin von Loewis asked if the scandir generator would have send()
> and close() methods as any Python generator. I didn't see a reply on
> the mailing (nor in the PEP).

Good call. Looks like you're referring to this message:
https://mail.python.org/pipermail/python-dev/2014-July/135324.html

I'm not actually familiar with the purpose of .close() and
.send()/.throw() on generators. Do you typically call these functions
manually, or are they called automatically by the generator protocol?
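
For what it's worth, here is a small sketch of what close() does on an
ordinary Python generator.  The generator below is only a stand-in for
something that holds an OS resource such as a directory handle; the
names read_entries and log are invented for the example.  As far as I
understand it, a plain for loop only ever calls next(), so close() is
something you call yourself (or that runs when the generator object is
finalized).

```python
def read_entries(names, log):
    # Stand-in for a generator that owns a resource; `log` records
    # when the cleanup code runs.
    try:
        for name in names:
            yield name
    finally:
        log.append('closed')

log = []
gen = read_entries(['a', 'b', 'c'], log)
first = next(gen)   # the generator is now suspended inside the try block
gen.close()         # raises GeneratorExit at the yield, so the finally runs
```

After gen.close() the cleanup has happened even though the iteration
was abandoned part-way through, which is the case that matters for a
directory handle.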

> It is not clear to me which methods share the cache.
>
> On UNIX, is_dir() and is_file() call os.stat(); whereas lstat() and
> is_symlink() call os.lstat().
>
> If os.stat() says that the file is not a symlink, I guess that you can
> use os.stat() result for lstat() and is_symlink() methods?
>
> In the worst case, if the path is a symlink, would it be possible that
> os.stat() and os.lstat() become "inconsistent" if the symlink is
> modified between the two calls? If yes, I don't think that it's an
> issue, it's just good to know it.
>
> For symlinks, readdir() returns the status of the linked file or of the 
> symlink?

I think you're misunderstanding is_dir() and is_file(), as these don't
actually call os.stat(). All DirEntry methods either call nothing or
os.lstat() to get the stat info on the entry itself (not the
destination of the symlink). In light of this, I don't think what
you're describing above is an issue.

-Ben


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread MRAB

On 2014-07-08 17:57, Steven D'Aprano wrote:
[snip]


> In particular, reflexivity for NANs was dropped for a number of reasons,
> some stronger than others:
>
> - One of the weaker reasons for NAN non-reflexivity is that it preserved
>   the identity x == y <=> x - y == 0. Although that is the cornerstone
>   of real arithmetic, it's violated by IEEE-754 INFs, so violating it
>   for NANs is not a big deal either.
>
> - Dropping reflexivity preserves the useful property that NANs compare
>   unequal to everything.
>
> - Practicality beats purity: dropping reflexivity allowed programmers
>   to identify NANs without waiting years or decades for programming
>   languages to implement isnan() functions. E.g. before Python had
>   math.isnan(), I made my own:
>
>   def isnan(x):
>       return isinstance(x, float) and x != x
>
> - Keeping reflexivity for NANs would have implied some pretty nasty
>   things, e.g. if log(-3) == log(-5), then -3 == -5.


The log of a negative number is a complex number.


> Basically, and I realise that many people disagree with their decision
> (notably Bertrand Meyer of Eiffel fame, and our own Mark Dickinson), the
> IEEE-754 committee led by William Kahan decided that the problems caused
> by having NANs compare unequal to themselves were much less than the
> problems that would have been caused without it.
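
The first point in the list above, that INFs already violate
x == y <=> x - y == 0, is easy to check.  A short sketch using nothing
beyond builtin floats and the math module; the isnan() helper is the
pre-math.isnan() idiom quoted above.

```python
import math

inf = float('inf')

equal = (inf == inf)                     # True
difference = inf - inf                   # a NaN, not 0.0
difference_is_zero = (difference == 0)   # False

def isnan(x):
    # Relies on NaN being the only float that is unequal to itself.
    return isinstance(x, float) and x != x
```

So inf == inf holds while inf - inf == 0 does not, and the result of
the subtraction is itself a NaN that isnan() recognises.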





Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Chris Angelico
On Wed, Jul 9, 2014 at 3:00 AM, Steven D'Aprano  wrote:
> On Tue, Jul 08, 2014 at 04:58:33PM +0200, Anders J. Munch wrote:
>
>> For two NaNs computed differently to compare equal is no worse than 2+2
>> comparing equal to 1+3.  You're comparing values, not their history.
>
> a = -23
> b = -42
> if log(a) == log(b):
>     print "a == b"

That could also happen from rounding error, though.

>>> a = 2.0**52
>>> b = a+1.0
>>> a == b
False
>>> log(a) == log(b)
True

Any time you do any operation on numbers that are close together but
not equal, you run the risk of getting results that, in
finite-precision floating point, are deemed equal, even though
mathematically they shouldn't be (two unequal numbers MUST have
unequal logarithms).

ChrisA


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 04:58:33PM +0200, Anders J. Munch wrote:

> For two NaNs computed differently to compare equal is no worse than 2+2 
> comparing equal to 1+3.  You're comparing values, not their history.

a = -23
b = -42
if log(a) == log(b):
    print "a == b"


-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 04:53:50PM +0900, Stephen J. Turnbull wrote:
> Chris Angelico writes:
> 
>  > The reason NaN isn't equal to itself is because there are X bit
>  > patterns representing NaN, but an infinite number of possible
>  > non-numbers that could result from a calculation.
> 
> I understand that.  But you're missing at least two alternatives that
> involve raising on some calculations involving NaN, as well as the
> fact that forcing inequality of two NaNs produced by equivalent
> calculations is arguably just as wrong as allowing equality of two
> NaNs produced by the different calculations.  

I don't think so. Floating point == represents *numeric* equality, not
(for example) equality in the sense of "All Men Are Created Equal". Not
even numeric equality in the most general sense, but specifically in the
sense of (approximately) real-valued numbers, so it's an extremely 
precise definition of "equal", not fuzzy in any way.

In an early post, you suggested that NANs don't have a value, or that 
they have a value which is not a value. I don't think that's a good way 
to look at it. I think the obvious way to think of it is that NAN's 
value is Not A Number, exactly like it says on the box. Now, if 
something is not a number, obviously you cannot compare it numerically:

"Considered as numbers, is the sound of rain on a tin roof
 numerically equal to the sight of a baby smiling?"

Some might argue that the only valid answer to this question is "Mu",

https://en.wikipedia.org/wiki/Mu_%28negative%29#.22Unasking.22_the_question

but if we're forced to give a Yes/No True/False answer, then clearly
False is the only sensible answer. No, Virginia, Santa Claus is not the 
same number as Santa Claus.

To put it another way, if x is not a number, then x != y for all 
possible values of y -- including x.

[Disclaimer: despite the name, IEEE-754 arguably does not intend NANs to 
be Not A Number in the sense that Santa Claus is not a number, but more 
like "it's some number, but it's impossible to tell which". However, 
despite that, the standard specifies behaviour which is best thought of 
in terms of the Santa Claus model.]



> That's where things get
> fuzzy for me -- in Python I would expect that preserving invariants
> would be more important than computational efficiency, but evidently
> it's not.  

I'm not sure what you're referring to here. Is it that containers such 
as lists and dicts are permitted to optimize equality tests with 
identity tests for speed?

py> NAN = float('NAN')
py> a = [1, 2, NAN, 4]
py> NAN in a  # identity is checked before equality
True
py> any(x == NAN for x in a)
False


When this came up for discussion last time, the clear consensus was that 
this is reasonable behaviour. NANs and other such "weird" objects are 
too rare and too specialised for built-in classes to carry the burden of 
having to allow for them. If you want a "NAN-aware list", you can make 
one yourself.
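
One possible shape for such a list, as a rough sketch only. The class
name NanAwareList is made up for this example, and it overrides just
membership and counting; a fuller version would presumably also need
index() and remove().

```python
class NanAwareList(list):
    """Hypothetical list subclass that never uses the identity shortcut."""

    def __contains__(self, item):
        # Only a genuine == match counts, so a NaN is never "in" the list.
        return any(item == x for x in self)

    def count(self, item):
        # Same idea for counting: compare with == and nothing else.
        return sum(1 for x in self if item == x)


NAN = float("nan")
plain = [1, 2, NAN, 4]
aware = NanAwareList(plain)

print(NAN in plain, NAN in aware)
```

The trade-off is that every membership test now runs a Python-level
comparison per element, which is the burden the built-in list avoids.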


> I assume that I would have a better grasp on why Python
> chose to go this way rather than that if I understood IEEE 754 better.

See the answer by Stephen Canon here:

http://stackoverflow.com/questions/1565164/

[quote]

It is not possible to specify a fixed-size arithmetic type that 
satisfies all of the properties of real arithmetic that we know and 
love. The 754 committee has to decide to bend or break some of them. 
This is guided by some pretty simple principles:

When we can, we match the behavior of real arithmetic.
When we can't, we try to make the violations as predictable and as 
easy to diagnose as possible.

[end quote]


In particular, reflexivity for NANs was dropped for a number of reasons, 
some stronger than others:

- One of the weaker reasons for NAN non-reflexivity is that it preserved
  the identity x == y <=> x - y == 0. Although that is the cornerstone 
  of real arithmetic, it's violated by IEEE-754 INFs, so violating it
  for NANs is not a big deal either.

- Dropping reflexivity preserves the useful property that NANs compare 
  unequal to everything.

- Practicality beats purity: dropping reflexivity allowed programmers
  to identify NANs without waiting years or decades for programming 
  languages to implement isnan() functions. E.g. before Python had 
  math.isnan(), I made my own:

  def isnan(x):
      return isinstance(x, float) and x != x

- Keeping reflexivity for NANs would have implied some pretty nasty
  things, e.g. if log(-3) == log(-5), then -3 == -5.
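
The first and third points above can be checked directly with Python
floats. This is only a sketch: the variable names are arbitrary, and
isnan() is repeated here so the snippet stands alone.

```python
import math

inf = float("inf")
nan = float("nan")

# inf == inf holds, yet inf - inf is NaN rather than 0, so the identity
# x == y <=> x - y == 0 already fails for infinities.
inf_equal = (inf == inf)
inf_difference = inf - inf


def isnan(x):
    """NaN test that relies only on NaN comparing unequal to itself."""
    return isinstance(x, float) and x != x


print(inf_equal, inf_difference, isnan(inf_difference), isnan(1.0))
```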


Basically, and I realise that many people disagree with their decision 
(notably Bertrand Meyer of Eiffel fame, and our own Mark Dickinson), the 
IEEE-754 committee led by William Kahan decided that the problems caused 
by having NANs compare unequal to themselves were much less than the 
problems that would have been caused without it.



-- 
Steven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/

Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Janzert

On 7/8/2014 9:52 AM, Ben Hoyt wrote:

DirEntry fields being "static" attribute-only objects
-----------------------------------------------------

In `this July 2014 python-dev message
`_,
Paul Moore suggested a solution that was a "thin wrapper round the OS
feature", where the ``DirEntry`` object had only static attributes:
``name``, ``full_name``, and ``is_X``, with the ``st_X`` attributes
only present on Windows. The idea was to use this simpler, lower-level
function as a building block for higher-level functions.

At first there was general agreement that simplifying in this way was
a good thing. However, there were two problems with this approach.
First, the assumption is the ``is_dir`` and similar attributes are
always present on POSIX, which isn't the case (if ``d_type`` is not
present or is ``DT_UNKNOWN``). Second, it's a much harder-to-use API
in practice, as even the ``is_dir`` attributes aren't always present
on POSIX, and would need to be tested with ``hasattr()`` and then
``os.stat()`` called if they weren't present.
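
To make the second problem concrete, here is a rough sketch of what
calling code might have looked like under that rejected design. The
entry objects are stand-ins built with SimpleNamespace, and the helper
entry_is_dir() is invented for this illustration; neither is part of
any proposed API.

```python
import os
import stat
from types import SimpleNamespace


def entry_is_dir(entry):
    """Report whether a hypothetical static-attribute entry is a directory.

    `entry` always has `full_name`; `is_dir` is present only when the OS
    supplied the file type for free.
    """
    if hasattr(entry, "is_dir"):
        return entry.is_dir
    # The type was not provided, so pay for an explicit stat() call.
    return stat.S_ISDIR(os.stat(entry.full_name).st_mode)


# Stand-ins for both cases, each pointing at the current directory.
known = SimpleNamespace(name=".", full_name=os.curdir, is_dir=True)
unknown = SimpleNamespace(name=".", full_name=os.curdir)

print(entry_is_dir(known), entry_is_dir(unknown))
```

Every caller would need this hasattr() dance, which is the usability
cost described above.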



Only exposing what the OS provides for free will make the API too 
difficult to use in the common case. But is there a nice way to expand 
the API that will let a user who is trying to avoid extra expense 
know what information is already available?


Even if the initial version doesn't have a way to check what information 
is there for free, ensuring there is a clean way to add this in the 
future would be really nice.


Janzert



Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Anders J. Munch

Chris Angelico wrote:


> This is off-topic for this thread, but still...
>
> The trouble is that your "arguably just as wrong" is an
> indistinguishable case. If you don't want two different calculations'
> NaNs to *ever* compare equal, the only solution is to have all NaNs
> compare unequal

For two NaNs computed differently to compare equal is no worse than 2+2 
comparing equal to 1+3.  You're comparing values, not their history.


You've prompted me to get a rant on the subject off my chest: I just posted an 
article on NaN comparisons to python-list.


regards, Anders



Re: [Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Victor Stinner
Hi,

2014-07-08 15:52 GMT+02:00 Ben Hoyt :
> After some very good python-dev feedback on my first version of PEP
> 471, I've updated the PEP to clarify a few things and added various
> "Rejected ideas" subsections. Here's a link to the new version (I've
> also copied the full text below):

Thanks, the new PEP looks better.

> * Removed the "open issues" section, as the three open issues have
> either been included (full_name) or rejected (windows_wildcard)

I remember a pending question on python-dev:

- Martin von Loewis asked if the scandir generator would have send()
and close() methods like any Python generator. I didn't see a reply on
the mailing list (nor in the PEP).
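
For reference, this is what those methods look like on an ordinary
Python generator. The generator function names() is made up for the
example; whether the object returned by scandir() would behave the same
way is exactly the open question, so this says nothing about scandir
itself.

```python
def names():
    """A plain generator function, used only to inspect generator methods."""
    yield "a"
    yield "b"


gen = names()
has_send = hasattr(gen, "send")
has_close = hasattr(gen, "close")

first = next(gen)
gen.close()              # finalizes the generator early
remaining = list(gen)    # a closed generator yields nothing more

print(has_send, has_close, first, remaining)
```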

> One known error in the PEP is that the "Notes" sections should be
> top-level sections, not be subheadings of "Examples". If someone would
> like to give me ("benhoyt") commit access to the peps repo, I can fix
> this and any other issues that come up.

Or just send me your new PEP ;-)

> Notes on caching
> ----------------
>
> The ``DirEntry`` objects are relatively dumb -- the ``name`` and
> ``full_name`` attributes are obviously always cached, and the ``is_X``
> and ``lstat`` methods cache their values (immediately on Windows via
> ``FindNextFile``, and on first use on POSIX systems via a ``stat``
> call) and never refetch from the system.

It is not clear to me which methods share the cache.

On UNIX, is_dir() and is_file() call os.stat(); whereas lstat() and
is_symlink() call os.lstat().

If os.stat() says that the file is not a symlink, I guess that you can
use os.stat() result for lstat() and is_symlink() methods?

In the worst case, if the path is a symlink, would it be possible that
os.stat() and os.lstat() become "inconsistent" if the symlink is
modified between the two calls? If yes, I don't think that it's an
issue, it's just good to know it.
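
A small sketch of the difference between the two calls, using only
os.stat(), os.lstat() and a temporary directory. The helper describe()
and the file names are invented for the example, and creating the
symlink may need extra privileges on Windows.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (followed_is_dir, is_symlink) for a path."""
    followed = os.stat(path)      # follows symlinks
    unfollowed = os.lstat(path)   # describes the link itself
    return stat.S_ISDIR(followed.st_mode), stat.S_ISLNK(unfollowed.st_mode)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target_dir")
    link = os.path.join(tmp, "link")
    os.mkdir(target)
    os.symlink(target, link)

    link_result = describe(link)
    target_result = describe(target)
    # For a non-symlink, both calls describe the same file.
    same_for_target = os.path.samestat(os.stat(target), os.lstat(target))

print(link_result, target_result, same_for_target)
```

For the non-symlink case the two results describe the same file, which
is why a single cached result could plausibly serve both.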

For symlinks, does readdir() return the status of the linked file or of the symlink?

Victor


Re: [Python-Dev] buildbot.python.org down again?

2014-07-08 Thread Guido van Rossum
May the true owner of buildbot.python.org stand up!

(But I do think there may well not be anyone who feels they own it. And
that's a problem for its long term viability.)

Generally speaking, as an organization we should set up a process for
managing ownership of *all* infrastructure in a uniform way. I don't mean
to say that we need to manage all infrastructure uniformly, just that we
need to have a process for identifying and contacting the owner(s) for each
piece of infrastructure, as well as collecting other information that
people besides the owners might need to know. You can use a wiki page for
that list for all I care, but have a process for what belongs there,
how/when to update it, and even an owner for the wiki page! Stuff like this
shouldn't be just in a few people's heads (even if they are board members)
nor should it be in a file in a repo that nobody has ever heard of.


On Tue, Jul 8, 2014 at 12:33 AM, Donald Stufft  wrote:

>
> On Jul 8, 2014, at 12:58 AM, Nick Coghlan  wrote:
>
>
> On 7 Jul 2014 10:47, "Guido van Rossum"  wrote:
> >
> > It would still be nice to know who "the appropriate persons" are. Too
> > much of our infrastructure seems to be maintained by house elves or the ITA.
>
> I volunteered to be the board's liaison to the infrastructure team, and
> getting more visibility around what the infrastructure *is* and how it's
> monitored and supported is going to be part of that. That will serve a
> couple of key purposes:
>
> - making the points of escalation clearer if anything breaks or needs
> improvement (although "infrastruct...@python.org" is a good default
> choice)
> - making the current "todo" list of the infrastructure team more visible
> (both to calibrate resolution time expectations and to provide potential
> contributors an idea of what's involved)
>
> Noah has already set up http://status.python.org/ to track service
> status, I can see about getting buildbot.python.org added to the list.
>
> Cheers,
> Nick.
>
>
> We (the infrastructure team) were actually looking earlier about
> buildbot.python.org and we’re not entirely sure who "owns"
> buildbot.python.org. Unfortunately a lot of the *.python.org services
> are in a similar state where there is no clear owner. Generally we've
> not wanted to just step in and take over for fear of stepping on
> someones toes but it appears that perhaps buildbot.p.o has no owner?
>
> -
> Donald Stufft
> PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
>
>


-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Updates to PEP 471, the os.scandir() proposal

2014-07-08 Thread Ben Hoyt
Hi folks,

After some very good python-dev feedback on my first version of PEP
471, I've updated the PEP to clarify a few things and added various
"Rejected ideas" subsections. Here's a link to the new version (I've
also copied the full text below):

http://legacy.python.org/dev/peps/pep-0471/ -- new PEP as HTML
http://hg.python.org/peps/rev/0da4736c27e8 -- changes

Specifically, I've made these changes (not an exhaustive list):

* Clarified wording in several places, for example "Linux and OS X" ->
"POSIX-based systems"
* Added a new "Notes on exception handling" section
* Added a thorough "Rejected ideas" section with the various ideas
that have been discussed previously and rejected for various reasons
* Added a description of the .full_name attribute, which folks seemed
to generally agree is a good idea
* Removed the "open issues" section, as the three open issues have
either been included (full_name) or rejected (windows_wildcard)

One known error in the PEP is that the "Notes" sections should be
top-level sections, not be subheadings of "Examples". If someone would
like to give me ("benhoyt") commit access to the peps repo, I can fix
this and any other issues that come up.

I'd love to see this finalized! If you're going to comment with
suggestions to change the API, please ensure you've first read the
"rejected ideas" sections in the PEP as well as the relevant
python-dev discussion (linked to in the PEP).

Thanks,
Ben


PEP: 471
Title: os.scandir() function -- a better and faster directory iterator
Version: $Revision$
Last-Modified: $Date$
Author: Ben Hoyt 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-May-2014
Python-Version: 3.5
Post-History: 27-Jun-2014, 8-Jul-2014


Abstract
========

This PEP proposes including a new directory iteration function,
``os.scandir()``, in the standard library. This new function adds
useful functionality and increases the speed of ``os.walk()`` by 2-10
times (depending on the platform and file system) by significantly
reducing the number of times ``stat()`` needs to be called.


Rationale
=========

Python's built-in ``os.walk()`` is significantly slower than it needs
to be, because -- in addition to calling ``os.listdir()`` on each
directory -- it executes the ``stat()`` system call or
``GetFileAttributes()`` on each file to determine whether the entry is
a directory or not.

But the underlying system calls -- ``FindFirstFile`` /
``FindNextFile`` on Windows and ``readdir`` on POSIX systems --
already tell you whether the files returned are directories or not, so
no further system calls are needed. Further, the Windows system calls
return all the information for a ``stat_result`` object, such as file
size and last modification time.

In short, you can reduce the number of system calls required for a
tree function like ``os.walk()`` from approximately 2N to N, where N
is the total number of files and directories in the tree. (And because
directory trees are usually wider than they are deep, it's often much
better than this.)
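
The "approximately 2N" pattern can be sketched as follows. This is a
simplified stand-in for what a tree walker does per directory, not the
actual ``os.walk()`` source; the function name ``subdirs_via_listdir``
and the counter are invented for illustration.

```python
import os
import stat
import tempfile


def subdirs_via_listdir(path):
    """List subdirectories with listdir(), counting the stat() calls made."""
    stat_calls = 0
    subdirs = []
    for name in os.listdir(path):                # one directory read ...
        st = os.stat(os.path.join(path, name))   # ... plus one stat() per entry
        stat_calls += 1
        if stat.S_ISDIR(st.st_mode):
            subdirs.append(name)
    return sorted(subdirs), stat_calls


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for fname in ("a.txt", "b.txt"):
        open(os.path.join(tmp, fname), "w").close()
    found, calls = subdirs_via_listdir(tmp)

print(found, calls)
```

Each entry costs one extra ``stat()`` here, even though the directory
read may already have reported the entry type.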

In practice, removing all those extra system calls makes ``os.walk()``
about **8-9 times as fast on Windows**, and about **2-3 times as fast
on POSIX systems**. So we're not talking about micro-optimizations.
See more `benchmarks here`_.

.. _`benchmarks here`: https://github.com/benhoyt/scandir#benchmarks

Somewhat relatedly, many people (see Python `Issue 11406`_) are also
keen on a version of ``os.listdir()`` that yields filenames as it
iterates instead of returning them as one big list. This improves
memory efficiency for iterating very large directories.

So, as well as providing a ``scandir()`` iterator function for calling
directly, Python's existing ``os.walk()`` function could be sped up a
huge amount.

.. _`Issue 11406`: http://bugs.python.org/issue11406


Implementation
==============

The implementation of this proposal was written by Ben Hoyt (initial
version) and Tim Golden (who helped a lot with the C extension
module). It lives on GitHub at `benhoyt/scandir`_.

.. _`benhoyt/scandir`: https://github.com/benhoyt/scandir

Note that this module has been used and tested (see "Use in the wild"
section in this PEP), so it's more than a proof-of-concept. However,
it is marked as beta software and is not extensively battle-tested.
It will need some cleanup and more thorough testing before going into
the standard library, as well as integration into ``posixmodule.c``.



Specifics of proposal
=====================

Specifically, this PEP proposes adding a single function to the ``os``
module in the standard library, ``scandir``, that takes a single,
optional string as its argument::

    scandir(path='.') -> generator of DirEntry objects

Like ``listdir``, ``scandir`` calls the operating system's directory
iteration system calls to get the names of the files in the ``path``
directory, but it's different from ``listdir`` in two ways:

* Instead of returning bare filename strings, it returns lightweight
  ``DirEnt

Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Chris Angelico
On Tue, Jul 8, 2014 at 5:53 PM, Stephen J. Turnbull  wrote:
> But you're missing at least two alternatives that
> involve raising on some calculations involving NaN, as well as the
> fact that forcing inequality of two NaNs produced by equivalent
> calculations is arguably just as wrong as allowing equality of two
> NaNs produced by the different calculations.

This is off-topic for this thread, but still...

The trouble is that your "arguably just as wrong" is an
indistinguishable case. If you don't want two different calculations'
NaNs to *ever* compare equal, the only solution is to have all NaNs
compare unequal - otherwise, two calculations might happen to produce
the same bitpattern, as there are only a finite number of them
available.

> That's where things get
> fuzzy for me -- in Python I would expect that preserving invariants
> would be more important than computational efficiency, but evidently
> it's not.

What invariant is being violated for efficiency? As I see it, it's one
possible invariant (things should be equal to themselves) coming up
against another possible invariant (one way of generating NaN is
unequal to any other way of generating NaN).

Raising an exception is, of course, the purpose of signalling NaNs
rather than quiet NaNs, which is a separate consideration from how
they compare.
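
To illustrate the quiet/signalling distinction, here is a sketch using
the decimal module, chosen only because it exposes both kinds of NaN.
It is a stand-in for the idea and not a claim about how binary floats
behave in Python.

```python
from decimal import Decimal, InvalidOperation, localcontext

with localcontext() as ctx:
    ctx.traps[InvalidOperation] = True   # make the signal raise

    quiet_result = Decimal("NaN") + 1    # a quiet NaN propagates silently

    try:
        Decimal("sNaN") + 1              # a signalling NaN raises instead
        raised = False
    except InvalidOperation:
        raised = True

print(quiet_result, raised)
```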

ChrisA


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Chris Angelico writes:

 > The reason NaN isn't equal to itself is because there are X bit
 > patterns representing NaN, but an infinite number of possible
 > non-numbers that could result from a calculation.

I understand that.  But you're missing at least two alternatives that
involve raising on some calculations involving NaN, as well as the
fact that forcing inequality of two NaNs produced by equivalent
calculations is arguably just as wrong as allowing equality of two
NaNs produced by different calculations.  That's where things get
fuzzy for me -- in Python I would expect that preserving invariants
would be more important than computational efficiency, but evidently
it's not.  I assume that I would have a better grasp on why Python
chose to go this way rather than that if I understood IEEE 754 better.



Re: [Python-Dev] buildbot.python.org down again?

2014-07-08 Thread Donald Stufft

On Jul 8, 2014, at 12:58 AM, Nick Coghlan  wrote:

> 
> On 7 Jul 2014 10:47, "Guido van Rossum"  wrote:
> >
> > It would still be nice to know who "the appropriate persons" are. Too much 
> > of our infrastructure seems to be maintained by house elves or the ITA.
> 
> I volunteered to be the board's liaison to the infrastructure team, and 
> getting more visibility around what the infrastructure *is* and how it's 
> monitored and supported is going to be part of that. That will serve a couple 
> of key purposes:
> 
> - making the points of escalation clearer if anything breaks or needs 
> improvement (although "infrastruct...@python.org" is a good default choice)
> - making the current "todo" list of the infrastructure team more visible 
> (both to calibrate resolution time expectations and to provide potential 
> contributors an idea of what's involved)
> 
> Noah has already set up http://status.python.org/ to track service status, I 
> can see about getting buildbot.python.org added to the list.
> 
> Cheers,
> Nick.
> 
> 

We (the infrastructure team) were actually looking into buildbot.python.org
earlier, and we’re not entirely sure who "owns" it.
Unfortunately a lot of the *.python.org services are in a similar state, where
there is no clear owner. Generally we've not wanted to just step in and take
over for fear of stepping on someone's toes, but it appears that perhaps
buildbot.p.o has no owner?

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Chris Angelico
On Tue, Jul 8, 2014 at 5:01 PM, Stephen J. Turnbull  wrote:
> I agree with Steven d'A that this rule is not part of the language
> definition and shouldn't be, but it's the rule of thumb I find hardest
> to imagine *ever* wanting to break in my own code (although I sort of
> understand why the IEEE 754 committee found they had to).

The reason NaN isn't equal to itself is because there are X bit
patterns representing NaN, but an infinite number of possible
non-numbers that could result from a calculation. Is
float("inf")-float("inf") equal to float("inf")/float("inf")? There
are three ways NaN equality could have been defined:

1) All NaNs are equal, as if NaN is some kind of "special number".
2) NaNs are equal if they have the exact same bit pattern, and unequal otherwise.
3) All NaNs are unequal, even if they have the same bit pattern.

The first option is very dangerous, because it'll mean that "NaN
pollution" can actually result in unexpected equality. The second
looks fine - a NaN is equal to itself, for instance - but it suffers
from the pigeonhole problem, in that eventually you'll have two
numbers which resulted from different calculations and happen to have
the same bit pattern. The third is what IEEE went with. It's the
sanest option.
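
The two calculations from the question above can be inspected like
this. The helper bits() is made up for the example, and whether the two
NaNs share a bit pattern depends on the platform, so the sketch prints
that rather than assuming it.

```python
import math
import struct


def bits(x):
    """Return the 64-bit pattern of a float as a hex string."""
    return struct.pack(">d", x).hex()


inf = float("inf")
nan_a = inf - inf
nan_b = inf / inf

# Platform-dependent: the patterns may or may not be identical.
same_pattern = bits(nan_a) == bits(nan_b)

# Under option 3 the comparison is False whatever the patterns are.
print(bits(nan_a), bits(nan_b), same_pattern, nan_a == nan_b)
```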

ChrisA


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Stephen J. Turnbull
Rob Cliffe writes:

 > > Why? What value (pun intended) is there in adding an explicit statement
 > > of value to every single class?

 > It troubles me a bit that "value" seems to be a fuzzy concept - it has 
 > an obvious meaning for some types (int, float, list etc.) but for 
 > callable objects you tell me that their value is the object itself,

Value is *abstract* and implicit, but not fuzzy: it's what you compare
when you test for equality.  It's abstract in the sense that "inside
of Python" an object's value has to be an object (everything is an
object).  Now, the question is "do we need a canonical representation
of objects' values?"  Ie, do we need a mapping from every object
conceivable within Python to a specific object that is its value?
Since Python generally allows, even prefers, duck-typing, the answer
presumably is "no".  (Maybe you can think of Python programs you'd
like to write where the answer is "yes", but I don't have any
examples.)  And in fact there is no such mapping in Python.

So the answer I propose is that an object's value needs a
representation in Python, but that representation doesn't need to be
unique.  Any object is a representation of its own value, and if you
need two different objects to be equal to each other, you must define
their __eq__ methods to produce that result.
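
A minimal sketch of that point. The classes Plain and Celsius are
invented for the example: Plain keeps the default comparison, while
Celsius defines __eq__ so that distinct objects can be equal.

```python
class Plain:
    """No __eq__ defined: the default comparison falls back to identity."""


class Celsius:
    """Hypothetical value class: equal whenever the degrees match."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __eq__(self, other):
        if not isinstance(other, Celsius):
            return NotImplemented
        return self.degrees == other.degrees

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.degrees)


p = Plain()
a, b = Celsius(20), Celsius(20)

print(p == p, Plain() == Plain(), a is b, a == b)
```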

This (the fact that any object represents its value, and so can be
used as "the" standard of comparison for that value) is why it's so
important that equality be reflexive, symmetric, and transitive, and
why we really want to be careful about creating objects like NaN whose
definition is "my value isn't a value", and therefore "a = float('NaN');
a == a" evaluates to False.

I agree with Steven d'A that this rule is not part of the language
definition and shouldn't be, but it's the rule of thumb I find hardest
to imagine *ever* wanting to break in my own code (although I sort of
understand why the IEEE 754 committee found they had to).

 > How can we say if an object is mutable if we don't know what its
 > value is?

Mutability is a different question.  You can define a class whose
instances have mutable attributes but nonetheless all compare
equal regardless of the contents of those attributes.

OTOH, the test for mutability is to try to mutate it.  If that doesn't
raise, it's mutable.
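
Both halves of that can be sketched as below. The class Tag and the
helper is_mutable() are invented for this example; note that
is_mutable() really does perform the mutation when it succeeds, so it
is a probe for illustration, not a side-effect-free check.

```python
import operator


class Tag:
    """Hypothetical class: mutable attribute, yet all instances compare equal."""

    def __init__(self, note):
        self.note = note

    def __eq__(self, other):
        if not isinstance(other, Tag):
            return NotImplemented
        return True


def is_mutable(obj, mutate):
    """Try a mutation and report whether it went through without raising."""
    try:
        mutate(obj)
    except (TypeError, AttributeError):
        return False
    return True


t = Tag("old")
tag_mutable = is_mutable(t, lambda o: setattr(o, "note", "new"))
list_mutable = is_mutable([1], lambda o: operator.setitem(o, 0, 2))
tuple_mutable = is_mutable((1,), lambda o: operator.setitem(o, 0, 2))

print(tag_mutable, t == Tag("other"), list_mutable, tuple_mutable)
```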

Steve