Greg Price <gnpr...@gmail.com> added the comment:

Hmm, I'm a bit confused because:

* Your patch at GH-15251 replaces a number of calls to PyLong_FromLong with 
calls to the new _PyLong_FromUnsignedChar.

* That function, in turn, just calls PyLong_FromSize_t.

* And that function begins:

PyObject *
PyLong_FromSize_t(size_t ival)
{
    PyLongObject *v;
    size_t t;
    int ndigits = 0;

    if (ival < PyLong_BASE)
        return PyLong_FromLong((long)ival);
// ...


* So it seems that after your patch we still end up calling PyLong_FromLong at 
each of these call sites, just after a couple more indirections than before.

Given the magic of compilers and of hardware branch prediction, it wouldn't 
surprise me at all if those indirections didn't make anything slower... but if 
the measurements are coming out *faster*, then I feel like something else must 
be going on. ;-)

Ohhh, I see -- I bet it's that in _PyLong_FromUnsignedChar, once the calls to 
PyLong_FromSize_t and PyLong_FromLong are inlined, the compiler can see that 
`is_small_int(ival)` is always true, so the whole function just turns into a 
call to get_small_int.  Whereas when compiling a call to PyLong_FromLong from 
some other file (other translation unit), it can't see that and can't make the 
optimization.
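
To make that guess concrete, here is a toy sketch of the pattern. It is not the 
code in longobject.c: toy_int, toy_from_long, toy_from_uchar, small_ints and 
big_int_slot are all made-up names, and the cache range is only assumed to 
match the usual [-5, 257). The point is just that when the argument type is 
unsigned char (assuming 8-bit chars) and the general function is visible in 
the same translation unit, an optimizing compiler has enough information to 
drop the range check after inlining.

```c
#include <assert.h>

/* Hypothetical stand-ins for the small-int cache; names and layout are
   invented for illustration, only the [-5, 257) range is assumed. */
#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

typedef struct { long value; } toy_int;

static toy_int small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
static toy_int big_int_slot;  /* placeholder for "allocate a new object" */

#define is_small_int(ival) \
    (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS)

/* Return the cached object for a value known to be in range. */
static toy_int *
get_small_int(long ival)
{
    assert(is_small_int(ival));
    toy_int *v = &small_ints[ival + NSMALLNEGINTS];
    v->value = ival;
    return v;
}

/* General entry point: the range check has to happen at run time,
   because a caller may pass any long. */
toy_int *
toy_from_long(long ival)
{
    if (is_small_int(ival))
        return get_small_int(ival);
    big_int_slot.value = ival;  /* stand-in for the slow path */
    return &big_int_slot;
}

/* Narrow entry point: with 8-bit chars the argument is always in
   [0, 255], so once toy_from_long is inlined here the compiler can
   prove is_small_int(ival) is true and keep only the fast path. */
static inline toy_int *
toy_from_uchar(unsigned char ival)
{
    return toy_from_long((long)ival);
}
```

Both entry points return the same cached object for a value like 200; the 
difference is only in how much work the compiler can remove. A caller in 
another file that only sees a declaration of toy_from_long gets no such help 
unless something like LTO makes the body visible.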

Two questions, then:

* How do the measurements look under LTO? I wonder if with LTO the linker is 
able to make the same optimization that this change helps the compiler make.

* Is there a particular reason to specifically call PyLong_FromSize_t? Seems 
like PyLong_FromLong is the natural default (and what we default to in the rest 
of the code), and it's what this ends up calling anyway.

----------
nosy: +Greg Price

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37837>
_______________________________________