On 8/30/2023 9:44 AM, Max Chernoff wrote:

None of these patches should have any performance implications, yet
"sys" is ~40% slower than either "2023" or "2023-initial", which makes
me think that the LuaTeX currently in TL23 ("sys") was erroneously
compiled with "-O0" or something.

i'm a bit puzzled now:

so the tl 2023 bin is slower but when you compile fresh the performance is the same? if so ... then why waste time on it if generating a new bin solves the problem?

i can indeed imagine different compile flags being used, although i wonder if that will give a 50% performance gain .. i did observe small differences between gcc versions and a few options but we're talking low percentages here

Is the same
fontloader used for every test (which assumes stability over 2017 - 2023)?

Yes.

Is the luatex build.sh used or some tex live one?

I used the LuaTeX "build.sh" for all the "20xx" binaries, albeit with a
few tiny patches. Very detailed build instructions here:

    https://tug.org/~mseven/luatex.html#binary-details

"2023-initial" would have been built with the standard TL build script.
"sys" was probably built with the TL script, but it's a special ad-hoc
release so something different might have happened.

i'm not sure if there was a reason for that (afaik texlive goes for a more conservative older compilation just to make sure it works on older os versions)

    Benchmark 3: PATH=/tmp/texlive-testing/2023-initial/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
      Time (mean ± σ):      3.958 s ±  0.039 s    [User: 3.933 s, System: 0.024 s]
      Range (min … max):    3.908 s …  4.031 s    10 runs
    Benchmark 4: PATH=/tmp/texlive-testing/sys/bin/x86_64-linux:/bin/ luatex -ini factorial.tex
      Time (mean ± σ):      5.519 s ±  0.054 s    [User: 5.489 s, System: 0.029 s]
      Range (min … max):    5.464 s …  5.589 s    10 runs

All 4 tests use the exact same TL23 texmf trees, so the TeX code, Lua
code, and all the other binaries are identical; only the "luahbtex" and
"luatex" binaries are different. The only difference between the
"2023-initial" and "sys" binaries should be the socket and debug/popen
patches; they (probably) used the same compilers and the source should
be otherwise identical.

These are, as you mentioned, irrelevant there.

Yet the current TL23 LuaTeX binary ("sys") is 50% slower than the
initially-released TL23 LuaTeX binary ("2023-initial"), so something
weird is definitely going on. My guess would be that the current LuaTeX
binaries were compiled with -O0 while all the other binaries use -O3, or
something similar. Just a guess though.
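
For reference, the relative slowdown can be read straight off the two means in the benchmark output above (3.958 s for "2023-initial", 5.519 s for "sys"). A minimal Python sketch of that arithmetic; the variable names are illustrative only:

```python
# Mean wall-clock times taken from the benchmark output (seconds).
mean_2023_initial = 3.958
mean_sys = 5.519

# How much longer "sys" takes, relative to "2023-initial".
slowdown = (mean_sys - mean_2023_initial) / mean_2023_initial

# How much run time is saved going from "sys" to "2023-initial".
time_saved = (mean_sys - mean_2023_initial) / mean_sys

print(f"sys is {slowdown:.1%} slower than 2023-initial")
print(f"2023-initial takes {time_saved:.1%} less time than sys")
```

With these numbers the slowdown comes out at roughly 39%, which matches the "~40%" quoted at the top; "50%" appears to be a looser rounding of the same measurement.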

Factorial does little (it is not spread all over token space): some memory access for registers, and a small number of macro tokens that likely sit in the cpu cache. It also builds a macro whose body gets larger every iteration (so that is actually the bottleneck, as it involves copying tokens). As you start in ini mode, tokens are not scattered that much.

So, do you see the same 50% drop with the current luatex when you compile without -O3?
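
One way to check this would be to time two locally compiled binaries side by side. The following is only a rough Python sketch: the helper name time_command is made up for illustration, and the command shown is a stand-in so that the snippet runs anywhere. For the real comparison it would be replaced by the luatex invocation from the benchmarks, once per build.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=10):
    """Run cmd `runs` times; return (mean, stdev) of wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), stdev


# Stand-in command (assumption): replace with something like
# ["/path/to/build/luatex", "-ini", "factorial.tex"], where the path is
# a placeholder for each build being compared.
demo_cmd = [sys.executable, "-c", "pass"]
mean, stdev = time_command(demo_cmd, runs=3)
print(f"{mean:.3f} s ± {stdev:.3f} s")
```

Running it once against a build made with -O3 and once against a build made without it would show whether the optimization level alone accounts for a drop of this size.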

One thing I can imagine is that less inlining is applied at the lua end, but older compilers (doesn't tl still use gcc 7?) were not that aggressive in that anyway. (fwiw, link time optimization at most gains a few % on tex but that's even more recent.)

In factorial the bottleneck is likely in \the, which (because we're a utf engine) has a couple of calls that might benefit from inlining, but imo that is unlikely to account for the 50% drop.

Hans


-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
