On 8/30/2023 9:44 AM, Max Chernoff wrote:
None of these patches should have any performance implications, yet
"sys" is ~40% slower than either "2023" or "2023-initial", which makes
me think that the LuaTeX currently in TL23 ("sys") was erroneously
compiled with "-O0" or something.
i'm a bit puzzled now:
so the tl 2023 bin is slower but when you compile fresh the performance
is the same? if so ... then why waste time on it if generating a new bin
solves the problem?
i can indeed imagine different compile flags being used, although i
wonder if that will give a 50% performance gain .. i did observe small
differences between gcc versions and a few options but we're talking low
percentages here
Is the same fontloader used for every test (which assumes stability
over 2017-2023)?
Yes.
Is the luatex build.sh used or some tex live one?
I used the LuaTeX "build.sh" for all the "20xx" binaries, albeit with a
few tiny patches. Very detailed build instructions here:
https://tug.org/~mseven/luatex.html#binary-details
"2023-initial" would have been built with the standard TL build script.
"sys" was probably built with the TL script, but it's a special ad-hoc
release so something different might have happened.
i'm not sure if there was a reason for that (afaik texlive goes for a
more conservative older compilation just to make sure it works on older
os versions)
Benchmark 3: PATH=/tmp/texlive-testing/2023-initial/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
Time (mean ± σ): 3.958 s ± 0.039 s [User: 3.933 s, System: 0.024 s]
Range (min … max): 3.908 s … 4.031 s 10 runs
Benchmark 4: PATH=/tmp/texlive-testing/sys/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
Time (mean ± σ): 5.519 s ± 0.054 s [User: 5.489 s, System: 0.029 s]
Range (min … max): 5.464 s … 5.589 s 10 runs
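
The thread mentions both "~40%" and "50%", so it may help to work the ratio out from the two means reported above. This is only a sketch of the arithmetic; the function names are made up for illustration, and only the two mean times come from the benchmark output.

```python
# Mean wall-clock times in seconds, copied from Benchmark 3 and Benchmark 4 above.
INITIAL_MEAN_S = 3.958  # "2023-initial"
SYS_MEAN_S = 5.519      # "sys"


def slowdown_percent(baseline_s: float, candidate_s: float) -> float:
    """How much longer the candidate takes, relative to the baseline."""
    return (candidate_s / baseline_s - 1.0) * 100.0


def time_saved_percent(slow_s: float, fast_s: float) -> float:
    """How much run time is saved by going from the slow to the fast binary."""
    return (1.0 - fast_s / slow_s) * 100.0


slowdown = slowdown_percent(INITIAL_MEAN_S, SYS_MEAN_S)
saved = time_saved_percent(SYS_MEAN_S, INITIAL_MEAN_S)

print(f"sys is {slowdown:.1f}% slower than 2023-initial")
print(f"2023-initial needs {saved:.1f}% less time than sys")
```

With these means, "sys" takes a bit under 40% longer than "2023-initial", and "2023-initial" needs a bit under 30% less time than "sys". Which of the two figures applies depends on which binary is taken as the baseline.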
All 4 tests use the exact same TL23 texmf trees, so the TeX code, Lua
code, and all the other binaries are identical; only the "luahbtex" and
"luatex" binaries are different. The only difference between the
"2023-initial" and "sys" binaries should be the socket and debug/popen
patches; they (probably) used the same compilers and the source should
be otherwise identical.
These are as you mentioned irrelevant there.
Yet the current TL23 LuaTeX binary ("sys") is 50% slower than the
initially-released TL23 LuaTeX binary ("2023-initial"), so something
weird is definitely going on. My guess would be that the current LuaTeX
binaries were compiled with -O0 while all the other binaries use -O3, or
something similar. Just a guess though.
Factorial does little (it is not spread all over token space): some mem
access for registers and a small number of macro tokens that likely sit
in the cpu cache, plus a macro whose body gets larger every iteration (so
that is actually the bottleneck, as it involves copying tokens). As you
start in ini mode, tokens are not scattered that much.
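
To make the "copying tokens" point concrete, here is a toy model of a macro whose body grows by a fixed amount per iteration and is copied in full each time it is redefined. It is not LuaTeX's token list code and it assumes nothing about how factorial.tex is written; the function and parameter names are hypothetical.

```python
def copied_tokens(iterations: int, tokens_per_step: int = 1) -> int:
    """Count token copies in a toy model of a growing macro body.

    Assumption (not taken from LuaTeX): every iteration copies the whole
    existing body once and then appends `tokens_per_step` new tokens.
    """
    body_length = 0
    copies = 0
    for _ in range(iterations):
        copies += body_length           # copy the body built so far
        body_length += tokens_per_step  # the body grows for the next round
    return copies


# Doubling the number of iterations roughly quadruples the copying work.
for n in (1000, 2000, 4000):
    print(n, copied_tokens(n))
```

In this model the work grows quadratically with the iteration count, so the copying loop would dominate the run time, and how well the compiler optimizes that loop would matter more than anything else in the engine.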
So, do you see the same 50% drop with the current luatex when you
compile without -O3?
One thing I can imagine is that less inlining is applied on the lua end,
but older compilers (doesn't tl still use gcc 7?) were not that
aggressive in that anyway. (fwiw, link time optimization gains at most a
few % on tex, but that's even more recent.)
In factorial the bottleneck might well be in \the, which (because we're
a utf engine) has a couple of calls that might benefit from inlining,
but imo that is unlikely to explain the 50% drop.
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------