On Sun, Mar 26, 2017 at 12:22 AM, Andres Freund <and...@anarazel.de> wrote:
>> At least with current gcc (6.3.1 on Fedora 25) at -O2,
>> what I see is multiple places jumping to the same indirect jump
>> instruction :-(. It's not a total disaster: as best I can tell, all the
>> uses of EEO_JUMP remain distinct. But gcc has chosen to implement about
>> 40 of the 71 uses of EEO_NEXT by jumping to the same couple of
>> instructions that increment the "op" register and then do an indirect
>> jump :-(.
>
> Yea, I see some of that too - "usually" when there's more than just the
> jump in common. I think there's some gcc variables that influence this
> (min-crossjump-insns (5), max-goto-duplication-insns (8)). Might be
> worthwhile experimenting with setting them locally via a pragma or such.
> I think Aants wanted to experiment with that, too.
I haven't had time to research this properly, but initial tests show
that with GCC 6.2, adding #pragma GCC optimize ("no-crossjumping")
fixes the merging of the op tail jumps. Some quick and dirty
benchmarking suggests that the benefit for the interpreter is about 15%
(a 5% speedup on a workload that spends about 1/3 of its time in
ExecInterpExpr).

My idea of prefetching op->resnull/resvalue to local vars before the
indirect jump turned out to be somewhere between a tiny benefit and no
effect, so it is certainly not worth the extra complexity.

Clang 3.8 does the correct thing out of the box and is a couple of
percent faster than GCC with the pragma.

Regards,
Ants Aasma
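
For illustration, here is a minimal sketch of where such a pragma could sit relative to a computed-goto dispatch loop. This is not the code from execExprInterp.c: the DemoOpcode/DemoStep types, the DEMO_DISPATCH/DEMO_NEXT macros, and demo_interp are hypothetical names invented for the example, and DEMO_NEXT only approximates what EEO_NEXT does (advance "op", then jump indirectly). The pragma line itself is the one from the tests above; guarding it so that only GCC sees it is my assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * GCC only: turn off cross-jumping for the functions defined after this
 * point, so that the dispatch tail of each opcode is not merged into one
 * shared indirect jump. The guard is an assumption for this sketch.
 */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC optimize ("no-crossjumping")
#endif

/* Hypothetical opcodes and step layout, for illustration only. */
typedef enum { OP_LOAD, OP_ADD, OP_MUL, OP_DONE } DemoOpcode;

typedef struct
{
    DemoOpcode opcode;
    long       arg;
} DemoStep;

long
demo_interp(const DemoStep *op)
{
    /* Label addresses (GCC/Clang "labels as values" extension). */
    static const void *const dispatch_table[] = {
        &&CASE_OP_LOAD, &&CASE_OP_ADD, &&CASE_OP_MUL, &&CASE_OP_DONE
    };
    long acc = 0;

    assert(op != NULL);

/* Indirect jump to the handler of the current step. */
#define DEMO_DISPATCH() goto *dispatch_table[op->opcode]
/* Advance to the next step, then dispatch: one such tail per opcode. */
#define DEMO_NEXT() do { op++; DEMO_DISPATCH(); } while (0)

    DEMO_DISPATCH();

CASE_OP_LOAD:
    acc = op->arg;
    DEMO_NEXT();

CASE_OP_ADD:
    acc += op->arg;
    DEMO_NEXT();

CASE_OP_MUL:
    acc *= op->arg;
    DEMO_NEXT();

CASE_OP_DONE:
    return acc;
}
```

Whether the tails stay separate still has to be checked in the generated assembly (for example with gcc -O2 -S), since that depends on the compiler version and flags.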