Greg Price <[email protected]> added the comment:
Hmm, I'm a bit confused because:
* Your patch at GH-15251 replaces a number of calls to PyLong_FromLong with
calls to the new _PyLong_FromUnsignedChar.
* That function, in turn, just calls PyLong_FromSize_t.
* And that function begins:

    PyObject *
    PyLong_FromSize_t(size_t ival)
    {
        PyLongObject *v;
        size_t t;
        int ndigits = 0;

        if (ival < PyLong_BASE)
            return PyLong_FromLong((long)ival);
        // ...

* So, it seems like after your patch we still end up calling PyLong_FromLong at
each of these callsites, just after a couple more indirections than before.

Given the magic of compilers and of hardware branch prediction, it wouldn't
surprise me at all if those indirections didn't make anything slower... but if
the measurements are coming out *faster*, then I feel like something else must
be going on. ;-)

Ohhh, I see -- I bet it's that inside _PyLong_FromUnsignedChar, the compiler can
see that `is_small_int(ival)` is always true, so the whole function just turns
into a call to get_small_int. Whereas when compiling a call to PyLong_FromLong
from some other file (another translation unit), it can't see the body of
PyLong_FromLong, so it can't make that optimization.
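
Here is a minimal sketch of the effect I'm guessing at. It is a simplified model, not CPython's actual code: the `small_ints` array of plain ints, `from_long`, `from_unsigned_char`, and the `slow_path_calls` counter are all hypothetical stand-ins, and only the names `is_small_int` and `get_small_int` and the small-int range of -5 to 256 mirror the real source. It also assumes an 8-bit unsigned char.

```c
#include <stddef.h>

/* Range of the small-int cache: -5 .. 256, as in CPython. */
#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

/* Hypothetical stand-in for the table of preallocated small int objects. */
static int small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

/* Hypothetical counter so we can observe when the slow path runs. */
static int slow_path_calls = 0;

static inline int
is_small_int(long ival)
{
    return -NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS;
}

static inline int *
get_small_int(long ival)
{
    return &small_ints[ival + NSMALLNEGINTS];
}

/* Model of PyLong_FromLong: cache lookup first, then a slow path.
 * The real slow path allocates a new object; this one only counts. */
static int *
from_long(long ival)
{
    if (is_small_int(ival))
        return get_small_int(ival);
    slow_path_calls++;
    return NULL;
}

/* Model of the new helper. An 8-bit unsigned char is always in 0..255,
 * so once from_long is inlined here the compiler may be able to prove
 * the range check true and drop both the branch and the slow path. */
static inline int *
from_unsigned_char(unsigned char c)
{
    return from_long(c);
}
```

Whether a given compiler actually folds the check depends on the optimization level and on the body of `from_long` being visible at the call site, which is exactly what a caller in another translation unit doesn't have without LTO.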
Two questions, then:
* How do the measurements look under LTO? I wonder if with LTO the linker is
able to make the same optimization that this change helps the compiler make.
* Is there a particular reason to call PyLong_FromSize_t specifically? It seems
like PyLong_FromLong is the natural default (and what we use in the rest of the
code), and it's what this ends up calling anyway.
----------
nosy: +Greg Price
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37837>
_______________________________________