Marko Rauhamaa wrote:

> Skip Montanaro <s...@pobox.com>:
>
>> The use of "is" or "is not" is the right thing to do when the object
>> of the comparison is known to be a singleton.
>
> Object identity stirs a lot of passions on this forum. I'm guessing the
> reason is that it is not defined very clearly (<URL:
> https://docs.python.org/3/library/functions.html#id>):

Identity is defined very clearly. As you just quoted:

>    id(object)
>
>    Return the “identity” of an object. This is an integer which is
>    guaranteed to be unique and constant for this object during its
>    lifetime. Two objects with non-overlapping lifetimes may have the
>    same id() value.

Python identity is represented by an integer, and it is guaranteed to be
unique and constant for the lifetime of the object. It may or may not be
reused once the object no longer exists. That's all you need to know about
identity; that's *all there is to know* about identity in Python.

[Actually, to be pedantic, one also needs to state that the objects have to
be part of a single Python process. Object X in one process, and object Y in
another process, may have the same id() but still be considered distinct.]

>    CPython implementation detail: This is the address of the object
>    in memory.

I really wish CPython didn't do that, or at least not admit to it. It does
nothing but confuse people.

> The "is" relation can be defined trivially through the id() function:
>
>    X is Y iff id(X) == id(Y)

Except that id() is a built-in function and can be shadowed or
monkey-patched, while the `is` operator is a keyword and cannot be. But
apart from that minor point, I agree.

> What remains is the characterization of the (total) id() function. For
> example, we can stipulate that:
>
>    X = Y
>    assert(id(X) == id(Y))
>    # assignment preserves identity

That's not a property of identity. That's a property of *assignment*. So you
cannot use that fact to define identity in Python, since there could be
another language with the *exact* same definition of identity but that does
copy-on-assignment instead.

> (assuming X and Y are not modified in other threads or signal handlers).
>
> We know further that:
>
>    i = id(X)
>    time.sleep(T)
>    assert(i == id(X))
>    # the identity does not change over time

That would be the part of the definition that says the identity is constant.
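
[To illustrate the point about id() being shadowable while `is` is not,
here is a minimal sketch. The helper name compare_with_shadowed_id is made
up for this example, and the local id() that always returns 0 is a
deliberately broken stand-in, not anything a real program should do.]

```python
import builtins


def compare_with_shadowed_id(a, b):
    """Compare a and b via a shadowed id() and via the `is` operator."""

    def id(obj):
        # Hypothetical shadow of the builtin: every object gets "identity" 0.
        return 0

    # The local name `id` now hides builtins.id inside this function,
    # but nothing can rebind the `is` operator.
    return id(a) == id(b), a is b


x = object()
y = object()  # a second object, alive at the same time as x

shadowed_says_same, is_says_same = compare_with_shadowed_id(x, y)
print(shadowed_says_same)  # True: the shadowed id() cannot tell them apart
print(is_says_same)        # False: `is` still distinguishes the two objects
print(builtins.id(x) == builtins.id(y))  # False: the real id() is unique for live objects
```

[Both x and y are kept alive for the whole comparison, so the uniqueness
guarantee of the real id() applies to them.]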

>    def f(x, y):
>        return id(x) == id(y)
>    assert(f(X, X))
>    # parameter passing preserves the identity

Again, that's not a property of identity. There could be a language just
like Python in all respects, including identity, except that parameters are
passed by value.

[snip more examples of things which tell us nothing about identity]

> The nice thing about these kinds of formal definitions is
> that they make no metaphysical reference to "objects" or "lifetime" (or
> the CPython implementation).

They are not metaphysical. They are concrete. You cannot understand the
semantics of identity in Python without understanding Python's execution
model. Python's execution model contains objects (which are not metaphysical
woo, but a concrete computer science data structure), and identity in Python
is defined in terms of objects. Not values, or names, or namespaces, but
objects.

> They can also be converted into
> implementation conformance statements and test cases.

True, but in most of the examples you show, they will be tests of some other
aspect of Python, e.g. that assignment (name binding) preserves identity.
They don't help us understand identity, because there are other models for
assignment, and there are an infinite number of things which could also be
preserved by assignment but aren't identity. E.g. "the number of zero bits
in the object struct".

> A much tougher task is to define when id(X) != id(Y). After all, all of
> the above would be satisfied by this implementation of id():
>
>    def id(x):
>        return 0

That fails the definition given, that identities are *unique*.

Defining non-identity for objects that exist simultaneously is simple:

    id(X) != id(Y) iff not (X is Y)

We don't have good notation for discussing objects which exist at different
times, but we can fake it with the rule:

    "if either X or Y or both raise NameError, then we deem `X is Y` to be
    false"

> The nonidentity will probably have to be defined separately for each
> builtin datatype.

That is incorrect. See below.

> For example, for integers and strings we know only
> that:
>
>    assert(X == Y or id(X) != id(Y))
>    # inequality implies nonidentity

That tells us something about string equality. It tells us nothing about
identity. Python's concept of identity applies equally to all types. Read
the definition again: it refers to objects, but without caring about the
type of objects.

Let's just consider two types:

    str:   assert (X == Y or id(X) != id(Y)) always passes
    float: assert (X == Y or id(X) != id(Y)) sometimes fails

Proof of the second case:

py> X = Y = float('nan')
py> assert (X == Y or id(X) != id(Y))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

This demonstrates that the condition (X == Y or id(X) != id(Y)) fails to
tell us anything useful about identity, since it is sometimes true and
sometimes false.

You cannot understand identity from first principles, precisely because it
is not a metaphysical concept in Python. In Python it is defined by and in
terms of the concrete programming model of the language.

-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list