unarchive 20874
forcemerge 20874 25832
stop

On 21/02/17 18:40, Assaf Gordon wrote:
> Hello,
>
>> On Feb 21, 2017, at 19:55, Holger Wolff <holger-bug-coreut...@wolffh.de>
>> wrote:
>>
>> Incorrect numeric suffixes are sometimes produced when going beyond number
>> 89:
>> Assume a file "test.txt" with 1000 lines, and the command
>>
>> $ split -d -l 10 test.txt test_
>>
>> I expect files test_00 through test_99, but what I get are test_00 through
>> test_89 and test_9000 through test_9009.
>
> Thank you for the bug report.
>
> I can confirm this is reproducible in the latest revision.
>
> The immediate reason is that without a starting value,
> coreutils' split has a feature to 'widen' the filename,
> but the logic to widen it follows the alphabet widening
> and doesn't work well for numeric widening.
>
> That is, when not using numeric-suffixes,
> 'yz' (the last two letters) are widened to 'zaaa':
>
> $ seq 1000 | split -l 1 - foo_
>
> will result in:
>
> ...
> foo_yy
> foo_yz
> foo_zaaa
> foo_zaab
> ...
>
> And you are seeing the last two digits ('89')
> widened by the same logic (to '9000').
>
>
> Technically, if 'numeric_suffix_start'
> is left as 'null' in the parsing of --numeric-suffix:
> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/split.c#n1455
>
> then the widening logic behaves as if those were letters, not digits
> in 'split.c:next_file_name()':
> http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/split.c#n403
>
>
>
> An immediate band-aid of defaulting to numeric_suffix_start=0
> will result in unintended consequences (a regression, perhaps):
> If more files need to be created, an explicit numeric start value prevents
> filename widening (this wasn't the case in your example because 1000 lines
> fit in 100 files of 10 lines):
>
> # Works, filenames will be widened to 9010.
> $ seq 1001 | split -l 10 --numeric-suffix - foo_
>
> # Widening is not allowed (from default of 2 digits), split fails:
> $ seq 1001 | split -l 10 --numeric-suffix=0 - foo_
> split: output file suffixes exhausted
>
>
> What do others think: default to no-widening for numeric suffixes,
> or add code to 'next_file_name()' for numeric widening?

This was discussed at http://bugs.gnu.org/20874

I'm not sure anything needs to be done here: for backward compatibility
with concatenation operations that expect lexical sort order, we keep
the current auto-widening scheme.

cheers,
Pádraig
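
For reference, the auto-widening order discussed in this thread can be
modelled with a short Python sketch. It is only an illustration, not the
C code in split.c: the name `suffixes` and its parameters are made up
here, and the sequence is inferred from the examples quoted above
(00..89 followed by 9000, and yz followed by zaaa). The "naive" list is a
hypothetical alternative that widens 99 to 100, shown for comparison.

```python
from itertools import islice, product


def suffixes(alphabet="0123456789", width=2):
    """Yield suffixes in auto-widening order (hypothetical model, not split.c).

    The last alphabet character is never used as the first generated
    character; once the other first characters are used up, it is appended
    to a fixed prefix and the generated part becomes one character wider.
    """
    last = alphabet[-1]
    prefix = ""
    while True:
        first_chars = alphabet[:-1]        # e.g. "0".."8", keeping "9" in reserve
        rest = [alphabet] * (width - 1)    # remaining positions use every character
        for combo in product(first_chars, *rest):
            yield prefix + "".join(combo)
        prefix += last                     # "9", then "99", ...
        width += 1                         # generated part grows: 2, 3, 4, ...


names = list(islice(suffixes(), 100))
print(names[88:92])            # ['88', '89', '9000', '9001']
print(names == sorted(names))  # True: lexical order matches creation order

# Hypothetical alternative that simply widens 99 -> 100.
naive = [f"{i:02d}" for i in range(101)]
print(naive == sorted(naive))  # False: "100" sorts before "11"
```

With the default arguments this gives 00..89 and then 9000..9009 for 100
output files, as in the original report, and the names stay in creation
order under a plain lexical sort. That ordering is what a concatenation
such as `cat test_*` depends on when the shell expands the glob lexically.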