Perhaps a little background to my original post will defuse some of the controversy. While working during an airline flight, I ran into an unexpected outcome from using the * replication operator to initialize an array of lists. When I modified a single element of the array an entire column changed. Having no reference books or internet access available, I tried to understand what was going on by creating some small arrays on the command line to see if there was a difference between explicit initialization and initialization with range() and the * operator.
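The effect described above can be reproduced with a short session along these lines. This is only a sketch, not the original script: it assumes Python 3 (so range() is wrapped in list()), and the names explicit, replicated and independent are illustrative.

```python
# Explicit initialization: two distinct inner lists.
explicit = [[1, 2], [1, 2]]

# Replication: list * n repeats references to the SAME inner list.
replicated = [list(range(1, 3))] * 2

# A comprehension builds a fresh inner list on every iteration.
independent = [list(range(1, 3)) for _ in range(2)]

# All three print the same and compare equal with ==.
assert explicit == replicated == independent

# Identity checks expose the difference before any mutation.
assert explicit[0] is not explicit[1]
assert replicated[0] is replicated[1]
assert independent[0] is not independent[1]

# Mutating "one" element changes a whole column in the replicated case.
explicit[0][0] = 99
replicated[0][0] = 99
print(explicit)    # [[99, 2], [1, 2]]
print(replicated)  # [[99, 2], [99, 2]]
```

Comparing elements with `is` (or comparing their id() values) is probably the simplest diagnostic for this kind of aliasing, since == only looks at the values.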
The arrays looked identical when printed and a == b returned True. Yet the arrays were clearly not equivalent because mutating the corresponding elements produced different outcomes. I put the problem aside until the next day, when I looked at it some more and created the example script I posted. Just as I was about to hit the Send button, I realized that the * operator must have been creating references instead of copies. And then I appended the now much debated opinion that == should have detected the difference.

(As an aside, may I point out that Python In A Nutshell states on page 46 "The result of S*n or n*S is the concatenation of n copies of S". It might be well to put a warning in a future edition that this is not strictly the case.)

My viewpoint is that of a working professional software consultant. I'm essentially a pragmatist with no strong 'religious' views about languages and methodologies. As I noted in an earlier reply, I don't realistically expect Python to change the behavior of the == operator. I do think that a problem arose when it was adopted from C and extended to allow comparison of containers. In C, you can use it to compare integers, floats, and pointers, and everyone understands that *p == *q does not imply p == q. Moreover, compilers issue warnings about comparisons between different types.

Basically, I'm looking for simple diagnostic tools that make it easy to understand what's really going on when code produces an unexpected result. A 'strengthened equivalence' operator, to use your terminology, would have been useful to me.

As to constructing pseudocode for such an operator, I've appended a working script below. The counterexamples and questions from Slawomir, Maric, and Jim were really useful in sharpening my thinking about the matter. I'm sure there are many ways to break it. For example, tuples have no index method, so one would have to be written.
Still, I hope it will serve to move the discussion beyond terms like 'crazy' and 'handwaving' and 'ill-founded'. I haven't used such pejoratives in any of my posts and would appreciate the same courtesy.

Cheers,
Mike

'''
StrongEquality -- a first cut at the definition proposed by M. Ellis.
Author: Michael F. Ellis, Ellis & Grant, Inc.
'''

def indices(item,seq):
    '''Utility function that returns a list of indices where item occurs in seq'''
    ## Always returns a list (empty if item is absent), each index once.
    return [i for i in range(len(seq)) if seq[i] == item]

def StrongEquality(a,b):
    '''True if a and b are numerically and "structurally" equal'''
    if a is b: return True
    if a != b: return False
    ## At this point we know a and b have the same length and
    ## evaluate numerically equivalent.  We now need to figure out
    ## whether there are any references to identical objects in non-corresponding
    ## positions of a & b (per Slawomir's example). We also need to inspect
    ## a and b for non-matching patterns of identical references (per my example)
    ida=[] ; idb=[]
    for i in range(len(a)):
        if a[i] is b[i]:
            continue
        if isinstance(a[i], (int, float, str)) and isinstance(b[i], (int, float, str)):
            continue ## we already know they're numerically equal

        ida.append(id(a[i]))
        idb.append(id(b[i]))
        ## We know that ida[n] is not idb[n] for all n because we omitted all
        ## cases where a[i] is b[i].  Therefore Slawomir's example is detected if
        ## any id appears in both lists.
        for n in ida:
            if n in idb: return False
        ## Next we test for my example. I'm sure this can be coded
        ## more elegantly ...
        for j in range(len(ida)):
            if indices(ida[j],ida) != indices(idb[j],idb): return False
        ## Lastly, recurse ...
        if not StrongEquality(a[i],b[i]): return False

    return True

if __name__=='__main__':
    ## Rudimentary test cases
    assert StrongEquality(1,1)
    assert not StrongEquality(0,1)

    ## Slawomir's example
    x, y, z = [1],[1],[1]
    a, b = [x,y], [y,z]
    c, d = [[1],[1]], [[1],[1]]
    assert StrongEquality(c,d)
    assert a == b
    assert not StrongEquality(a,b)

    ## My example
    a = [[[1,2],[1,2]],[[1,2],[1,2]]]
    b = [[list(range(1,3))]*2]*2   ## list() so range yields a real list
    assert a==b
    assert not StrongEquality(a,b)

    print("All tests ok.")

Alex Martelli wrote:
> <[EMAIL PROTECTED]> wrote:
>
> > Hi Alex,
> > With all due respect to your well-deserved standing in the Python
> > community, I'm not convinced that equality shouldn't imply invariance
> > under identical operations.
>
> So, why aren't you satisfying my request?  Provide a simple concrete
> definition of what your idea of equality WOULD behave like.  I notice
> that your lack of response stands out like a sore thumb -- all you're
> providing is a set of constraints you desire and a collection of
> ill-founded analogies and handwaving.  Traditional mathematics does not
> support the concept of "change", nor the distinction between equality
> and identity; the "real world" has no way to define what modifications
> are "identical" except by their effects (if the results differ, either
> the original equality was ill-posited or the modifications were not
> "identical").  But the real world DOES have the concept of "performing
> exactly the same sequence of operational steps", and, by THAT definition
> of "equal modifications", then your assertion:
>
> > make identical modifications to the engines of two identical
> > automobiles, I expect the difference in performance to be identical.
>
> is ill-founded -- or, rather, your *expectation* may be ill-founded.
>
> Take two systems of any significant complexity that are similar enough
> to be called "identical" by ALL observers (because trying to ascertain
> the differences, if any, would inevitably perturb the systems
> irretrievably by Heisenberg's effect -- i.e., there are no OBSERVABLE
> differences, which by Occam's Razor requires you to posit the systems
> are equal, because you cannot prove otherwise -- and entities must not
> be multiplied beyond necessity, so supposing that "observably equal"
> systems are indeed equal is Occam-compliant).
>
> Now, perform "identical" (ditto) modifications: in the real world, due
> to quantum effects, there WILL be sub-observable differences in what
> you're doing to the first one and to the second one.  If the systems are
> unstable to start with, they may well amplify those differences to
> observable proportions -- and there you are: the effect of the "equal"
> change on "equal" system may easily become observably unequal.
> Philosophically, you may classify this as an "observation" of both
> systems, which reasoning backwards lead you to posit that either the
> systems were NOT equal to start with or the modifications weren't...
> that is, IF you also posit determinism, which, as well we know, is an
> unwarrantedly strong hypothesis for systems in which the differences at
> quantum level matter.  Feel free to follow Einstein (and diverse
> light-years away from the last few decades of physics) in positing that
> there MUST exist "hidden variables" (unobservable except maybe in
> destructive, irreversible ways) explaining the difference -- I'll stick
> with the mainstream of physics and claim your expectation was badly
> founded to start with.
>
> I can debate epistemology with the best, but this is not really the
> proper forum for this -- starting with the crucial distinction, what it
> means, in mathematics OR in the real world, to state that two systems
> are "equal but NOT identical"?  In the end, such debates tend to prove
> rather futile and unproductive, however.
>
> In the world of programming languages, we cut through the chase by
> requesting *operational* (Brouwer-ian, mathematically speaking)
> definitions.  Provide the *operational* definition of how you WANT
> equality checking to work, contrast it with my simple two-lines one, and
> THEN we can have a meaningful debate of which one is the correct one to
> use in the core of a programming language that has the (blessing and
> curse of) mutable data objects...
>
>
> Alex
--
http://mail.python.org/mailman/listinfo/python-list