On Tue, 20 Jun 2023 at 15:14, Axel Kittenberger <axk...@gmail.com> wrote:
> Hello, > > First let me say, I really don't want this to be a complaint. I'm just > wondering. > > I considered switching my department's default compiler from pdflatex to > lualatex. > Some subtle differences were to be expected and with test documents so far > easily catered for. > The output is okay. > > However what surprised me is a complicated test document which took ~150 > seconds with pdflatex now takes 210 seconds with lualatex. > > Trying to figure out if this is some of the many packages it does, I kept > simplifying. > > --- laliptest.tex --- > \documentclass{article} > \input{plipsum} > \begin{document} > \newcount\ii > \ii=100 > \loop > \lipsum{1-100} > \advance\ii-1 > \ifnum \ii>0 > \repeat > \end{document} > --------- > > This most simple document doesn't use any package, but plipsum which can > be replaced with plain text too. Compile time results: > > pdflatex: user 0m1.920s (3.1 MB result) > lualatex: user 0m17.565s (3.8 MB result) > > 8 times slower. > > Probably \loggingall shows if they load the same set of macro > Versions tested with: > pdfTeX 3.141592653-2.6-1.40.24 (TeX Live 2022/Debian) > This is LuaHBTeX, Version 1.15.0 (TeX Live 2022/Debian) > > Since LaTeX also includes a lot of stuff already, same tests with plain > TeX. > > --- liptest.tex --- > \input{plipsum} > \newcount\i > \i=100 > \loop > \lipsum{1-100} > \advance\i-1 > \ifnum \i>0 > \repeat > \end > --------- > pdftex: user 0m1.053s (2.9 MB result) > luatex: user 0m1.943s (3.1 MB result) > > This isn't as bad as the LaTeX variants, but still almost a factor two. > Searching about this online turns up results about microtype or front > loading etc. > Both cannot be an issue, since microtype is off and frontloading must be a > fixed offset, but the compile time increases linearly with document length. > > This now took me a while, but I managed to compile luatex with "-gp" to > create a gprof profile and this is > the result: > ---------- > Flat profile: > > Each sample counts as 0.01 seconds. > % cumulative self self total > time seconds seconds calls s/call s/call name > 14.63 0.42 0.42 2409555 0.00 0.00 longest_match > 8.71 0.67 0.25 295700 0.00 0.00 hnj_hyphen_hyphenate > 8.19 0.91 0.24 52832741 0.00 0.00 get_sa_item > 6.62 1.10 0.19 773 0.00 0.00 deflate_slow > 3.48 1.20 0.10 30117352 0.00 0.00 char_info > 2.79 1.28 0.08 10000 0.00 0.00 ext_do_line_break > 2.79 1.36 0.08 773 0.00 0.00 compress_block > 2.09 1.42 0.06 2978422 0.00 0.00 calc_pdfpos > 2.09 1.48 0.06 515855 0.00 0.00 handle_lig_word > 1.74 1.53 0.05 14032575 0.00 0.00 char_exists > 1.74 1.58 0.05 4689611 0.00 0.00 flush_node > 1.74 1.63 0.05 2896557 0.00 0.00 output_one_char > 1.74 1.68 0.05 227877 0.00 0.00 hash_normalized > 1.74 1.73 0.05 41510 0.00 0.00 hlist_out > 1.74 1.78 0.05 23020 0.00 0.00 fix_node_list > 1.74 1.83 0.05 2319 0.00 0.00 adler32_z > 1.39 1.87 0.04 227877 0.00 0.00 > hash_insert_normalized > 1.39 1.91 0.04 39615 0.00 0.00 fm_scan_line > 1.39 1.95 0.04 11510 0.00 0.00 hnj_hyphenation > 1.05 1.98 0.03 3831639 0.00 0.00 get_x_token > 1.05 2.01 0.03 2896557 0.00 0.00 get_charinfo_whd > 1.05 2.04 0.03 2382502 0.00 0.00 add_kern_before > 1.05 2.07 0.03 303962 0.00 0.00 luaS_hash > 1.05 2.10 0.03 10000 0.00 0.00 ext_post_line_break > ------- > So it's not like there is one function that takes the bulk of the slowdown > as I expected (and often > happens in reality an innocent looking small thing takes so much) > > longest_match() is something from zlib. > > I'm just really surprised, I keep following this project for a while now, > since I consider it highly interesting and thought since I read one of the > major steps was rewriting the TeX core from somewhat idiosyncratic WEB to > C, I expected it to be even a bit faster... > > And this is the profile of pdftex in comparison. > ---------- > Flat profile: > > Each sample counts as 0.01 seconds. > % cumulative self self total > time seconds seconds calls s/call s/call nam > 29.48 0.51 0.51 2362906 0.00 0.00 longest_match > 13.29 0.74 0.23 5876210 0.00 0.00 zdividescaled > 11.56 0.94 0.20 775 0.00 0.00 deflate_slow > 4.62 1.02 0.08 41510 0.00 0.00 pdfhlistout > 4.62 1.10 0.08 774 0.00 0.00 compress_block > 3.47 1.16 0.06 1 0.06 1.59 maincontrol > 2.89 1.21 0.05 423 0.00 0.00 inflate_fast > 2.31 1.25 0.04 227877 0.00 0.00 > hash_insert_normalized > 2.31 1.29 0.04 41510 0.00 0.00 zhpack > 1.73 1.32 0.03 17821585 0.00 0.00 zeffectivechar > 1.73 1.35 0.03 825830 0.00 0.00 zpdfprintint > 1.73 1.38 0.03 260088 0.00 0.00 read_line > 1.73 1.41 0.03 223361 0.00 0.00 pqdownheap > 1.73 1.44 0.03 39615 0.00 0.00 fm_scan_line > 1.45 1.47 0.03 1274937 0.00 0.00 zgetnode > 1.16 1.49 0.02 2896157 0.00 0.00 zadvcharwidth > 1.16 1.51 0.02 579800 0.00 0.00 ztrybreak > 1.16 1.53 0.02 227877 0.00 0.00 hash_normalized > 1.16 1.55 0.02 26742 0.00 0.00 zflushnodelist > 0.87 1.56 0.02 1274936 0.00 0.00 zfreenode > 0.58 1.57 0.01 4160738 0.00 0.00 getnext > 0.58 1.58 0.01 3419912 0.00 0.00 zgetautokern > 0.58 1.59 0.01 2896161 0.00 0.00 hasfmentry > 0.58 1.60 0.01 2896159 0.00 0.00 isscalable > 0.58 1.61 0.01 2896157 0.00 0.00 zpdfprintchar > -------- > Both weren't exactly the same version as tested previously, I self > compiled the newest texlive tagged as release. > (This is LuaTeX, Version 1.16.0 (TeX Live 2023)) > (pdfTeX 3.141592653-2.6-1.40.25 (TeX Live 2023)) > > Runtimes when compiled with -O3 are almost the same as the native debian > above, and I profiled the plain TeX variants only. > > So zlib also takes a bulk, in relation even larger. So not the culprit. > Different implementation of hyphenation seems to be one factor I'd "blame" > > Turning it off with \language=255 improves it: > > pdftex: user 0m1.029s > luatex: user 0m1.596s > > but there is still more. > which is get_sa_item()/char_info(). > > And reading the comments managed-sa.c, it seems main the issue is being > sparse? So I guess the way to improve that would be to be not sparse? > > Anyway, that was my report to this, unfortunately I'm holding off pushing > it as the new default compiler for us, since the slowdown is a bad sell for > something which only sometimes is userful. > > PS: personally I use Lua to calculate some complexer drawing for tikz, as > using a "normal" programming > language is much easier to me than doing more complicated pgf macros. But > also in the end it just generates .tex code, which I simply feed into > pdflatex, it's only this gets complicated which files people ought to > change and which are autogenerated .tex files. > > Kind regards, Axel > It could be but luatex by design is different from pdftex in supporting the opentype fonts which by themselves are a complex subject. Also luatex opens more tex internals, and this comes with a price (in these kinds of tests I usually put into the loop also xetex, just to complete the picture). Anyway thank you very much , I will look at the test more carefully. -- luigi