Hi,
Do you think the following is a reasonable performance improvement for LuaTeX?
Thank you.
------
In the hnj_hyphenation function, currently the large array `utf8word` is zeroed
out every time the function is called.
This is unnecessary: the string is always null-terminated before it is used.
Moreover, the function processes multiple words per call without zeroing the
array between words, so even with the current code there can be nonzero bytes
after the null terminator each time the array is used.
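To illustrate the argument, here is a minimal standalone sketch. It is not the
LuaTeX code: `BUF_LEN`, `store_word` and `total_length` are hypothetical names
made up for this example. The buffer is reused for several words without ever
being zeroed, and each read stops at the terminator that was just written, so
stale bytes after it are never looked at.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size; stands in for (4 * MAX_WORD_LEN) + 1. */
#define BUF_LEN 1024

/* Copy len bytes of src into buf and null-terminate the result. */
static void store_word(char *buf, const char *src, size_t len)
{
    assert(len < BUF_LEN);
    memcpy(buf, src, len);
    buf[len] = '\0';
}

/* Store each word in the same buffer and read it back with strlen.
 * Returns the sum of the lengths that were read back. */
static size_t total_length(const char *const *words, size_t n)
{
    char buf[BUF_LEN];  /* deliberately NOT zero-initialized */
    size_t total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* A shorter word leaves bytes of the previous word after its
         * terminator; strlen never reads past the terminator. */
        store_word(buf, words[i], strlen(words[i]));
        total += strlen(buf);
    }
    return total;
}
```

Every byte that is read was written during the same iteration, so adding
`= { 0 }` to `buf` would only add a `BUF_LEN`-byte clear on every call.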
For a plain TeX file where macro expansion takes negligible time (such as
tex.tex, the source code of TeX), this single change reduces the compilation
time by about 24%. On my machine, the runtime decreases from 3.064s to
2.327s.
To generate `tex.tex`, you can download `tex.web` from the CTAN package `tex`
and run `weave tex.web`.
The patch is attached below.
diff --git a/source/texk/web2c/luatexdir/lang/texlang.c b/source/texk/web2c/luatexdir/lang/texlang.c
index bc912de..a7a741b 100644
--- a/source/texk/web2c/luatexdir/lang/texlang.c
+++ b/source/texk/web2c/luatexdir/lang/texlang.c
@@ -943,7 +943,7 @@ void hnj_hyphenation(halfword head, halfword tail)
     int lchar, i;
     struct tex_language *lang;
     lang_variables langdata;
-    char utf8word[(4 * MAX_WORD_LEN) + 1] = { 0 };
+    char utf8word[(4 * MAX_WORD_LEN) + 1];
     int wordlen = 0;
     char *hy = utf8word;
     char *replacement = NULL;