On 11 July 2011 23:21, Bengt Richter <b...@oz.net> wrote:
> On 07/11/2011 01:36 PM William ML Leslie wrote:
>>
>> On 11 July 2011 20:29, Bengt Richter<b...@oz.net> wrote:
>>>
>>> On 07/10/2011 09:13 PM Laura Creighton wrote:
>>>>
>>>> What do we want to happen when somebody -- say in a C extension -- takes
>>>> the id of an object
>>>> that is scheduled to be removed when the gc next runs?
>>>
>>> IMO taking the id should increment the object ref counter
>>> and prevent the garbage collection, until the id value itself is garbage
>>> collected.
>>
>> This significantly changes the meaning of id() in a way that will
>> break existing code.
>>
> Do you have an example of existing code that depends on the integer-cast
> value of a dangling pointer??
I mean that id creating a reference will break existing code. id()
has always returned an integer, and the existence of some integer in
some python code has never prevented some otherwise unrelated object
from being collected. Existing code will not make sure that it cleans
up the return value of id(), as nowhere has id() ever kept a reference
to the object passed in.

I know that you are suggesting that id returns something that is /not/
an integer, but that is also a language change. People have always
been able to assume that they can % format ids as decimals or
hexadecimals.

> Or do you mean that id's must be allowed to be compared == to integers,
> which my example prohibits? (I didn't define __cmp__, BTW, just lazy ;-)

Good, __cmp__ has been deprecated for over 10 years now.

>> If you want an object reference, just use one. If you want them to be
>> persistent, build a dictionary from id to object.
>
> Yes, dictionary is one way to bind an object and thus make sure its id is
> valid.
>
> But it would be overkill to use a dictionary to guarantee object id
> persistence
> just for the duration of an expression such as id(x.a) == id(y.a)

But id is not about persistence. The lack of persistence is one of
its key features.

That said, I do think id()'s current behaviour is overkill. I just
don't think we can change it in a way that will fit existing usage.
And cleaning it up properly is far too much work.

>> You can already do
>> this yourself in pure python, and it doesn't have the side-effect of
>> bloating id().
>
> My examples *are* in pure python ;-)

As is copy.py. We've seen several examples on this thread where you
can build additional features on top of what id() gives you without
changing id(). So we've no need to break id() in any of the ways that
have been suggested here.

>> Otherwise, such a suggestion should go through the usual process for
>> such a significant change to a language primitive.
>>
> Sure, but I only really want to understand the real (well, *intended* ;-)
> meaning of the id function, so I am putting forth illustrative examples
> to identify aspects of its current and possible behavior.

The notion of identity is important in any stateful language.
Referential equivalence, which is a slightly more complicated (yet
much better defined) idea, says that x and y are equivalent when no
operation can tell the difference between the two objects. 'is' is an
approximation that is at least accurate for mutability of python
objects. In order for x to "be" y, assignments like x.__class__ = Foo
must have exactly the same effect as y.__class__ = Foo. You could
presumably write a type in the implementation language that was in no
way discernible from the real x, but if x is y, you *know* there is no
difference.

What id() does is attempt to distil 'the thing compared' when 'is' is
used. On cpython, it just returns the integer value of the pointer to
the object, because on cpython that is cheap and does the job (and
hey, it *is* the thing compared when you do 'is' on cpython).

On pypy, things are slightly more complicated. Pypy is written in
python, which has no concept of pointers. It translates to the JVM
and the (safe) CLI, neither of which has a direct analogue of the
pointer. And even when C or LLVM is used as the backend, objects may
move around in memory. Having id() return different values after a
collection cycle would be very confusing. So, pypy implements its
own, quite clever mechanism for creating ids. It is described in a
blog post, if you'd like to read it.

The definition of id(), according to docs.python.org, is:

    Return the “identity” of an object. This is an integer (or long
    integer) which is guaranteed to be unique and constant for this
    object during its lifetime. Two objects with non-overlapping
    lifetimes may have the same id() value.
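To make that concrete, here is a rough sketch in pure python. The Node
class and the same_object/remember/lookup helpers are names made up for
this mail, not anything from the thread or the standard library. It
only illustrates three things: id() gives back a plain integer that
% formats like any other, it agrees with 'is' while both objects are
alive, and if you want persistence you hold the reference yourself in
a dictionary from id to object.

```python
class Node:
    """Hypothetical placeholder type, used only for illustration."""


def same_object(x, y):
    # While both objects are alive, comparing ids agrees with 'is'.
    return id(x) == id(y)


a = Node()
b = a        # a second name for the same object
c = Node()   # a distinct object, alive at the same time as a

# id() returns a plain integer, so % formatting works as it always has.
ident = id(a)
as_decimal = "%d" % ident
as_hex = "%x" % ident

# The integer holds no reference. If an id has to stay valid, keep the
# object alive yourself: here, in a dictionary from id to object.
registry = {}


def remember(obj):
    # Storing obj in the dictionary is what keeps it (and its id) alive.
    registry[id(obj)] = obj
    return id(obj)


def lookup(ident):
    # Only meaningful for ids that were handed out by remember().
    return registry[ident]
```

Deleting the entry from registry is what lets the object be collected
again. The integer by itself never keeps anything alive.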
> Also, a new id could live alongside the old ;-)

It's just that the problems you are attempting to fix are already
solved, and they are only vaguely related to what a python programmer
understands id() to mean. If, according to cpython,
"1003 is not 1000 + 3", then programmers can't rely on any excellent
new behaviour for id() *anyway*.

OTOH, the "identity may not even be preserved for primitive types"
issue is an observable difference to cpython and is fixable, even if
it is a silly thing to rely on.

--
William Leslie
_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev