I think in this case, though, if we say for the sake of argument that the guaranteed semantics of a dictionary lookup are zero or more calls to __hash__ plus zero or more calls to __eq__, then two back-to-back dictionary lookups wouldn't have any observable difference from doing only one, unless you start making assumptions about the behavior of the implementation. To me there is a much bigger gap between seeing a dictionary lookup and knowing the exact sequence of user functions that get called than there is for, say, "a < b".
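To make "observable" concrete, here's the kind of experiment I have in mind (the counts in the comments are what current CPython happens to do; nothing obliges another implementation to match them, which is rather my point):

    class Key:
        hash_calls = 0
        eq_calls = 0

        def __hash__(self):
            Key.hash_calls += 1
            return 42  # constant hash, fine for a demo

        def __eq__(self, other):
            Key.eq_calls += 1
            return self is other

    k = Key()
    d = {k: "value"}  # one __hash__ call for the insert

    d[k]  # first lookup
    d[k]  # second, back-to-back lookup

    # On CPython this prints "3 0": __hash__ ran once for the insert and
    # once per lookup, while __eq__ never ran at all, because CPython
    # compares stored keys by identity before falling back to __eq__.
    print(Key.hash_calls, Key.eq_calls)

So even on CPython the exact call sequence already depends on details like the identity shortcut; I have a hard time reading an exact number of __hash__ / __eq__ calls out of "d[k]".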
I would feel differently if the question were whether it's OK to fold something like

    x = a < b
    y = a < b

into a single comparison, since I'd agree with the way you described it: you look at this code and expect __lt__ to be called twice. I guess maybe I just disagree about whether dictionaries are contractually bound to call __hash__ and __eq__? For instance, maybe the dict could keep a small cache of recently-looked-up elements to skip the hashing / equality tests when they get accessed again; I have no idea whether that would be profitable, but it seems like a roughly-equivalent change that is still "doing two dictionary lookups", except the second one simply wouldn't call back into the user's Python code. Or maybe the dictionary could be implemented as a simple list for small sizes and skip calling __hash__ entirely until it decides to switch to a hash-table strategy; again, I'd still say it's "doing the lookups", it just calls a different set of Python functions.

> Although I have tentatively said I think this is okay, it is a change in
> actual semantics of Python code: what you write is no longer what gets
> run. That makes this *very* different from changing the implementation
> of sort -- by analogy, its more like changing the semantics of
>
>     a = f(x) + f(x)
>
> to only call f(x) once. I don't think you would call that an
> implementation detail, would you? Even if justified -- f(x) is a pure,
> deterministic function with no side-effects -- it would still be a
> change to the high-level behaviour of the code.

To me, the optimization stage of a compiler's job is to transform a program into an equivalent one that runs faster, where equivalence is defined with respect to a certain set of rules describing the behavior of the language. If f() can really be proven to be deterministic, free of side effects and runtime introspection, and to return a type that supports the identity "x + x == 2 * x" (quite a bit of work for a dynamic-language JIT, but definitely possible!), then I'd say we simply have fairly different understandings of the "high-level behavior" the runtime is contracted to follow. As a simpler example, I think the runtime should be entirely free to condense "a = 1 + 1" into "a = 2" without doing the addition.

Anyway, as I alluded to with the __lt__ / __gt__ usage in sort(), just because I might want alternative implementations to have the flexibility to do something doesn't mean it's reasonable to say it's so. I'm biased, since the implementation I'm working on uses std::unordered_map to implement Python dictionaries, and I have no idea how that's actually implemented, and I'd rather not have to :)
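Coming back to the small-cache idea from a few paragraphs up, here's a rough sketch of the sort of thing I mean. Everything about it is hypothetical (the class, the one-entry cache, the handling of deletion); it's just meant to show a dict that still "does two lookups" while only calling into user code once:

    class CachingDict:
        # Hypothetical dict variant that remembers the last key object it
        # successfully looked up. An immediately repeated lookup with that
        # same object is answered from the cache, calling neither __hash__
        # nor __eq__.

        _MISSING = object()

        def __init__(self):
            self._table = {}                  # the real storage
            self._cached_key = self._MISSING
            self._cached_value = None

        def __setitem__(self, key, value):
            self._table[key] = value
            self._cached_key = self._MISSING  # conservatively invalidate

        def __getitem__(self, key):
            if key is self._cached_key:       # identity test only:
                return self._cached_value     # no __hash__, no __eq__
            value = self._table[key]          # normal path, may call both
            self._cached_key = key
            self._cached_value = value
            return value

Both lookups in back-to-back "d[k]; d[k]" still happen in any reasonable sense of the word; the second just never reenters the user's Python code. Whether the cache would pay for itself I have no idea, and a real version would need care around deletion and mutated keys, but it doesn't strike me as violating any dict contract.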
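And on the "a = 1 + 1" example: as far as I can tell, CPython already agrees there, since its compiler folds the constant before the code ever runs:

    import dis

    def f():
        a = 1 + 1
        return a

    # On current CPython the disassembly shows the constant 2 being
    # loaded directly (LOAD_CONST ... (2)); no addition survives into
    # the bytecode.
    dis.dis(f)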