On 6/26/06, Ka-Ping Yee <[EMAIL PROTECTED]> wrote:
> On Mon, 26 Jun 2006, Guido van Rossum wrote:
> > Can you please edit the PEP yourself to add this? That will be most
> > efficient.
>
> I've done so, and tried to clarify the next line to match (see below).
>
> > With school I, if you want to
> > optimize using a hash table (as in PEP 275 Solution 1) you have to
> > catch and discard exceptions in hash(), and a bug in hash() can still
> > lead this optimization astray
>
> Right.  As written, the problem "a buggy hash might generate an
> incorrect match" is not specific to School I; it's a problem with
> any approach that is implemented by a hash lookup.  School II is
> necessarily implemented this way; School I might or might not be.
> So i think the part that says:
>
>     the hash function might have a bug or a side effect; if we
>     generate code that believes the hash, a buggy hash might
>     generate an incorrect match
>
> doesn't belong there, and i'd like your consent to remove it.
I'd rather keep it, but clarify that school II considers the outcome of
hash() the official semantics, while school I + dict-based optimization
would create optimized code that doesn't match the language specs. My
point being that not following your own specs is a much more severe sin
than trusting a buggy hash(). If hash() is buggy, school II's official
spec means that the bug affects the outcome, and that's that; but
because school I specifies the semantics as based on an if/elif chain,
using a buggy hash() means not following the spec. If we choose school
I, a user may assume that a buggy hash() doesn't affect the outcome,
because the outcome is defined in terms of == tests only.

> On the other hand, this criticism:
>
>     if we generate code that catches errors in the hash to
>     fall back on an if/elif chain, we might hide genuine bugs
>
> is indeed specific to School I + hashing.

Right.

> > Right.  School I appears just as keen as school II to use hashing to
> > optimize things, but isn't prepared to pay the price in semantics;
>
> Ok.  Then there's an inconsistency with the definition of School I:
>
>     School I wants to define the switch statement in term of
>     an equivalent if/elif chain
>
> To clear this up, i've edited the first line of the School II
> paragraph, which previously said:
>
>     School II sees nothing but trouble in that approach
>
> It seems clear that by "that approach" you meant "trying to achieve
> if/elif semantics while using hash optimization" rather than the
> more general definition of School I that was given.

Right.  Thanks for the clarification; indeed, the only problem I have
with the "clean" school I approach (no hash-based optimization) is that
there's no optimization, and we end up once more having to tweak the
ordering of the cases based on our expectation of their frequency
(which may not match reality).
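The divergence between the two schools can be sketched in plain Python
(a hypothetical example, not from the PEP: a class whose __hash__
disagrees with its __eq__):

```python
class Weird:
    """Hypothetical value whose hash() disagrees with its == behavior."""
    def __eq__(self, other):
        return other == 1          # compares equal to 1...
    def __hash__(self):
        return hash(2)             # ...but hashes like 2 (the "bug")

def switch_school_1(x):
    # School I semantics: an explicit chain of == tests.
    if x == 1:
        return "one"
    elif x == 2:
        return "two"
    return "default"

def switch_school_2(x):
    # School II semantics: a dict lookup that trusts hash().
    table = {1: "one", 2: "two"}
    return table.get(x, "default")

w = Weird()
switch_school_1(w)   # "one"     -- follows the == tests
switch_school_2(w)   # "default" -- hash(w) probes the slot for 2,
                     #             but 2 == w is False, so no match
```

Under school II the second answer is simply what the spec says; under
school I + dict-based optimization it would be a violation of the spec.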
Most school I proponents (perhaps you're the only exception) have
claimed that optimization is desirable, but added that it would be easy
to add hash-based optimization. IMO it's not so easy in light of the
various failure modes of hash(). (A possible solution would be to use
hashing only if the expression's type is one of a small set of trusted
builtins, and not a subclass; we can trust int.__hash__, str.__hash__
and a few others.)

> I believe there are a few voices here (and i count myself among them)
> that consider the semantics more important than the speed and are in
> School I but aren't treating hash optimization as the quintessence
> of 'switch', and we shouldn't leave them out.

This is an important distinction; thanks for pointing it out. Perhaps
we can introduce schools Ia and Ib, where Ia is "clean but unoptimized"
and Ib is "if/elif semantics, with hash-based optimization desirable
when possible".

Another distinction between the two schools is that school Ib will have
a hard time optimizing switches based on named constants. I don't
believe that the "callback if a variable affecting the expression's
value is ever assigned to" approach will work.

School II is definitely more pragmatist; I really don't see much wrong
with defining that it works a certain way -- not exactly what you would
expect, but with the same effect in *most* common cases -- and then
explaining away the odd behavior in various corner cases, as long as I
don't care about those corner cases. This is pretty much how I defend
the semantics of default parameter values: it doesn't matter that they
are computed at function definition time instead of call time, as long
as you only use immutable constants (which could be named constants, as
long as they're immutable), which is the only use case I care about.
There are many other examples of odd Python semantics that favor a
relatively simple implementation and glorify that implementation as the
official semantics.
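The "trusted builtins" idea could look roughly like this (a sketch
only; the whitelist, the switch() helper and its parameters are my own
illustration, not anything specified in the PEP):

```python
# Assumed whitelist of exact types whose __hash__ is known to agree
# with their __eq__; subclasses are deliberately excluded because they
# may override either method.
TRUSTED_TYPES = (int, str, bytes)

def switch(x, table, default, chain):
    """Dispatch on x: hash-based when safe, if/elif semantics otherwise.

    table  -- dict mapping case values to results (the optimization)
    chain  -- list of (value, result) pairs in source order (the spec)
    """
    # type() rather than isinstance(): only the exact builtin is trusted.
    if type(x) in TRUSTED_TYPES:
        return table.get(x, default)
    # Untrusted type: honor the school I spec with explicit == tests.
    for value, result in chain:
        if x == value:
            return result
    return default

cases = {1: "one", 2: "two"}
order = [(1, "one"), (2, "two")]
switch(2, cases, "default", order)        # fast path: dict lookup
switch(2.0, cases, "default", order)      # slow path: == tests
```

A float subject takes the slow path here even though float.__hash__ is
also well behaved; the whitelist trades some missed optimizations for
never trusting a hash() the compiler can't vouch for.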
I find this much preferable over weasel words a la traditional language
standards, which give optimizers a lot of leeway at the cost of
differences in behavior between different compilers or optimization
levels. For a real example from C++, read "Double Checked Locking is
Broken":
http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
(I first heard about this from Scott Meyers at the '06 ACCU conference
in Oxford; an earlier version of his talk is also online:
http://www.aristeia.com/Papers/DDJ_Jul_Aug_2004_revised.pdf).

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com