Paul Eggert <[email protected]> writes:
> On 2026-03-29 13:27, Bruno Haible wrote:
>> what is the problem? The .c files are only compiled when needed.
>
> OK, but the tarball is significantly larger than it needs to be, and a
> lot of those .c files are not really needed even when they are
> compiled.
>
> Some of the problem is longstanding: the printf-posix and
> fprintf-posix bring in a lot of floating-point and multibyte stuff
> that gzip doesn't need. It's the usual problem that getting printf
> exactly right means a lot of code - but gzip doesn't need printf to be
> exactly right.
>
> But some of the problem is new: there's more multibyte and multithread
> stuff now, none of which gzip needs.
>
> From the Gnulib point of view, perhaps tuning gzip's use of Gnulib is
> a waste of time or even counterproductive, because it's yet another
> combination of modules that needs testing (by gzip). But from gzip's
> point of view, Gnulib's bloat is becoming a bigger and bigger problem:
> there are many more files, they all do something maybe, but probably
> not, and it's increasing the confusion level.
>
>> In any case, you can use gnulib-tool to determine the reason why a
>> file is included:
>
> I find gnulib-tool hard to use for this sort of thing. If I see a file
> m4/xyz.m4 that I don't know why it's there, it's not easy for me to
> see the path from m4/xyz.m4 to the Gnulib modules that bootstrap.conf
> specifies. It'd be nice if gnulib-tool had an option to do that.

Well, we have --find:

    $ gnulib-tool --find m4/pid_t.m4
    dirent-h
    fcntl-h
    sched-h
    signal-h
    spawn-h
    sys_stat-h
    sys_types-h
    sys_wait-h
    termios-h
    unistd-h

You can use that with --extract-recursive-dependents to get a list of
modules that may have imported that file:

    $ gnulib-tool --extract-recursive-dependents \
        $(gnulib-tool --find m4/pid_t.m4)
    acl
    acl-tests
    alphasort
    areadlinkat
    [...]

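To narrow that list down to the modules a package actually requests,
the two commands can be combined in a small shell function. This is
only a sketch: why_file and the GNULIB_TOOL variable are names made up
here, not part of gnulib-tool, and it ignores both --avoid and
conditional dependencies, so it can report a module that does not
really pull the file in.

```sh
# why_file FILE MODULE...
# Print each MODULE that provides FILE or recursively depends on a
# module that provides it.  GNULIB_TOOL is a hypothetical override for
# the gnulib-tool path.
why_file () {
  file=$1; shift
  tool=${GNULIB_TOOL:-gnulib-tool}
  providers=$("$tool" --find "$file") || return 1
  # Candidates: the providers themselves plus everything that depends
  # on any of them.  Module names contain no whitespace, so the
  # unquoted expansion splits them one per argument.
  candidates=$(
    printf '%s\n' $providers
    "$tool" --extract-recursive-dependents $providers
  )
  for m in "$@"; do
    printf '%s\n' "$candidates" | grep -Fxq -- "$m" && printf '%s\n' "$m"
  done
  return 0
}
```

It would be called as 'why_file m4/pid_t.m4 yesno fprintf-posix', with
whatever modules bootstrap.conf lists in place of the last two
arguments.
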
>> Looking at the starting points in gzip/bootstrap.conf, I can see at least
>> this dependency chain:
>> yesno → rpmatch → regex → wctype → iswxdigit
>
> Unfortunately that example underscores the need for the gnulib-tool
> option, as that dependency chain does not exist for gzip. gzip is
> bootstrapped by passing '--avoid rpmatch' to gnulib-tool; see gzip's
> bootstrap.conf.

This would probably be more difficult than having extra code for each
module to handle single-threaded and/or non-localized programs.

Looking at the dependencies for rpmatch:

    $ gnulib-tool --extract-dependencies rpmatch
    stdlib-h
    extensions
    bool            [test $HAVE_RPMATCH = 0]
    gettext-h       [test $HAVE_RPMATCH = 0]
    gnulib-i18n     [test $HAVE_RPMATCH = 0]
    regex           [test $HAVE_RPMATCH = 0]
    strdup          [test $HAVE_RPMATCH = 0]
    streq           [test $HAVE_RPMATCH = 0]

If one were to simply avoid every module that rpmatch depends on, the
build would certainly break. For example, I estimate that bool is used
by a majority of modules.
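
The bool estimate could be checked roughly like this. Again a sketch:
dependent_share and GNULIB_TOOL are hypothetical names, it counts
recursive dependents rather than direct ones, and it drops the *-tests
modules from both counts so that the two numbers are comparable.

```sh
# dependent_share MODULE
# Print how many non-test modules depend, directly or indirectly, on
# MODULE, out of all non-test modules that gnulib-tool --list reports.
dependent_share () {
  tool=${GNULIB_TOOL:-gnulib-tool}
  total=$("$tool" --list | grep -v -e '-tests$' | grep -c .)
  users=$("$tool" --extract-recursive-dependents "$1" \
            | grep -v -e '-tests$' | grep -c .)
  printf '%s: %s of %s modules\n' "$1" "$users" "$total"
}
```

Running 'dependent_share bool' in a Gnulib checkout would print the two
counts on one line.
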
Collin