On Wed, Oct 12, 2016 at 5:36 AM Paul Moore <p.f.mo...@gmail.com> wrote:
> On 12 October 2016 at 11:16, Steven D'Aprano <st...@pearwood.info> wrote: > > On Wed, Oct 12, 2016 at 12:25:16AM +0000, Elliot Gorokhovsky wrote: > > > >> Regarding generalization: the general technique for special-casing is > you > >> just substitute all type checks with 1 or 0 by applying the type > assumption > >> you're making. That's the only way to guarantee it's safe and compliant. > > > > I'm confused -- I don't understand how *removing* type checks can > > possible guarantee the code is safe and compliant. > > > > It's all very well and good when you are running tests that meet your > > type assumption, but what happens if they don't? If I sort a list made > > up of (say) mixed int and float (possibly including subclasses), does > > your "all type checks are 1 or 0" sort segfault? If not, why not? > > Where's the safety coming from? > > My understanding is that the code does a pre-check that all the > elements of the list are the same type (float, for example). This is a > relatively quick test (O(n) pointer comparisons). Yes, that's correct. I'd like to emphasize that I'm not "*removing* type checks" -- I'm checking them in advance, and then substituting in the values I already know are correct. To put it rigorously: there are expressions of the form PyWhatever_Check. I can be eager or lazy about how I calculate these. The current implementation is lazy: it waits until the values are actually called for before calculating them. This is expensive, because they are called for many, many times. My implementation is eager: I calculate all the values in advance, and then if they all happen to be the same, I plug in that value (1 or 0 as the case may be) wherever it appears in the code. If they don't happen to all be the same, like for "mixed int and float", then I just don't change anything and use the default implementation. The code for this is really very simple: int keys_are_all_same_type = 1; PyTypeObject* key_type = lo.keys[0]->ob_type; for (i=0; i< saved_ob_size; i++){ if (lo.keys[i]->ob_type != key_type){ keys_are_all_same_type = 0; break; } } if (keys_are_all_same_type){ if (key_type == &PyUnicode_Type) compare_function = unsafe_unicode_compare; if (key_type == &PyLong_Type) compare_function = unsafe_long_compare; if (key_type == &PyFloat_Type) compare_function = unsafe_float_compare; else compare_function = key_type->tp_richcompare; } else { compare_function = PyObject_RichCompare; } Those unsafe_whatever* functions are derived by substituting in, like I said, the known values for the typechecks (known since keys_are_all_same_type=1 and key_type = whatever) in the existing implementations of the compare functions. Hope everything is clear now! Elliot
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/