Hi Berny,
On Wed, 21 Apr 2021, Bernhard Voelker wrote:
> shouldn't it use the 'argv-iter' gnulib module (like du.c and wc.c)
> instead of directly using the underlying ...
>
> +#include "readtokens0.h"
I considered this, too! :)
I think the short answer is that du and wc don't actually need to keep any
per-file information around beyond the running totals, so "at scale" the
argv-iter version of files0 processing allows for constant memory usage in
du and wc regardless of the number of input items.
But for ls, even if using argv-iter would result in one fewer copy of the
filenames, the memory requirements still scale the same way because of
sorting (the fileinfo for each entry has to be kept until the end).
[If anything, I would look more carefully into whether gobble_file in ls.c
really needs to make a copy of each filename, rather than just using the
command-line arguments or names from the files0 input directly.]
So I ended up following a pattern closer to what's in sort.c. Note that
argv-iter is not used there either, perhaps for a similar reason (it's not
going to bring the memory requirements down to O(1), the way it can for
wc and du).
The one case where ls might see an improvement from argv-iter is unsorted,
non-columnated output (so, specifically -U1), where the names are only
needed once. But in that case there's no need for the '--files0-from'
option's global sort or summary info -- you could use 'xargs -0 ls -U1 ...'
instead for identical output.
It just didn't seem like there was a strong argument for it (the memory
scaling is the same regardless). And the other frank truth is that, if you
look at wc.c and du.c for examples, adding argv_iterator processing would
significantly complicate the code.
Those were my thoughts anyway :)
Have a good one,
Carl