Greg Price <gnpr...@gmail.com> added the comment:

Very interesting, thanks!

It looks like with LTO enabled, this optimization has no effect at all.

This change adds significant complexity, and it seems like the hoped-for payoff 
is entirely in terms of performance on rather narrowly focused microbenchmarks. 
In general, I think that sets the bar rather high for making sure the 
performance gains are meaningful enough to justify the increase in complexity 
in the code.

In particular, I expect almost anyone running Python and concerned with 
performance to be using LTO.  (It's standard in distro builds of Python, so 
that probably covers most users already.)  That means if the optimization 
doesn't do anything in the presence of LTO, it doesn't really count. ;-)
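
For anyone wanting to check whether a given interpreter was configured with LTO 
before comparing numbers, here is a minimal sketch. It assumes a POSIX build 
where `sysconfig` exposes `CONFIG_ARGS`; the helper name `configured_with_lto` 
is made up for illustration. It only looks for the `--with-lto` configure 
option, so a build that enables LTO purely through compiler flags such as 
`-flto` would not be detected.

```python
import shlex
import sysconfig


def configured_with_lto(config_args):
    """Return True if a configure argument string requests LTO.

    Hypothetical helper: it inspects only the --with-lto option and
    ignores LTO enabled directly through CFLAGS/LDFLAGS.
    """
    if not config_args:
        # CONFIG_ARGS can be missing, e.g. on Windows builds.
        return False
    for arg in shlex.split(config_args):
        if arg == "--with-lto":
            return True
        if arg.startswith("--with-lto="):
            # Treat an explicit "no" as LTO being disabled.
            return arg.split("=", 1)[1] != "no"
    return False


# Report on the interpreter running this script.
print("configured with LTO:",
      configured_with_lto(sysconfig.get_config_var("CONFIG_ARGS")))
```

Running this under each build being benchmarked makes it easier to tell which 
timing results came from an LTO build.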

---

Now, I am surprised at the specifics of the result! I expected that LTO would 
probably pick up the equivalent optimization on its own, so that this change 
would have no effect. Instead, it looks like with LTO, this microbenchmark 
performs the same as it does without LTO *before* this change. That suggests 
that LTO may instead be blocking this optimization.

In that case, there may still be an opportunity: if you can work out why the 
change doesn't help under LTO, maybe you can find a way to make this 
optimization happen under LTO after all.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37837>
_______________________________________