On Mon, Apr 26, 2021 at 12:04:45PM +0200, Thomas Schmitt wrote:
> > what accounts for the three missing characters (namely SPACE, TAB,
> > and NEWLINE)?
>
> They get eaten by the shell parser if you do not use quotation marks:
>
>   $ echo $COMP_WORDBREAKS | wc -c
>   11
>   $ echo "$COMP_WORDBREAKS" | wc -c
>   14

Not the parser, technically.  The correct term is word splitting.  See
below if you want more details.

> So to see all characters (including the newline added by "echo") i do:
>
>   $ echo "$COMP_WORDBREAKS" | hd
>   00000000  20 09 0a 22 27 3e 3c 3d  3b 7c 26 28 3a 0a  | .."'><=;|&(:.|
>   0000000e

Even better, when you're dealing with arbitrary data that may include
characters which echo might interpret:

  printf %s "$COMP_WORDBREAKS" | hd

It looks like there aren't any in this particular case, but it's a good
habit to develop for future cases.

Word splitting: an unquoted substitution such as $COMP_WORDBREAKS
undergoes two more rounds of alterations: word splitting, and pathname
expansion.

The word splitting round uses the contents of the IFS variable, or the
default value of "space tab newline" if IFS is unset.  Each character
of IFS is treated as a word delimiter, and may cause a split.

The characters of IFS are divided into two types: whitespace, and
non-whitespace.  All consecutive IFS whitespace characters are grouped
together and treated as a single delimiter.  Also, any single IFS
non-whitespace character may be surrounded by any number of adjacent
IFS whitespace characters, and that whole group is treated as a single
delimiter.

Finally, any leading or trailing IFS whitespace characters are trimmed
from the value and discarded.

In your "$COMP_WORDBREAKS", you can see that the value begins with
space, tab and newline.  Those are all IFS whitespace characters, so
they're discarded.  The rest of the value is free of IFS whitespace
characters, so there are no further alterations.  The result is the
single word

  "'><=;|&(:

which is then passed to the pathname expansion round.  (There are no
globbing characters in this value, so pathname expansion will not occur.
But in general, it's a thing you need to be aware of.)

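If you want to check that arithmetic without depending on bash-completion
being loaded, here is a minimal sketch.  The variable name wb is made up
for illustration; it is simply assigned the same 13 bytes that the hex
dump above shows (minus the newline that echo added).

```shell
# wb is a hypothetical stand-in for COMP_WORDBREAKS, built from the bytes
# in the hex dump: space, tab, newline, then "'><=;|&(:
wb=$' \t\n"\'><=;|&(:'

unset IFS            # fall back to the default: space, tab, newline

# Deliberately unquoted: the value undergoes word splitting (and pathname
# expansion, which finds no globbing characters here).
set -- $wb
words=$#
first=$1

echo "quoted length:   ${#wb}"      # 13 characters; echo would add a 14th
echo "unquoted words:  $words"      # 1
echo "unquoted length: ${#first}"   # 10 characters; echo would add an 11th
```

The 13 versus 10 here corresponds to the 14 versus 11 reported by wc -c,
once you account for the newline written by echo.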
Some simple demonstrations:

  $ string=' hi there '
  $ printf '<%s> ' "$string" ; echo
  < hi there >
  $ printf '<%s> ' $string ; echo
  <hi> <there>
  $ IFS=h
  $ printf '<%s> ' $string ; echo
  < > <i t> <ere >
  $ IFS='h '
  $ printf '<%s> ' $string ; echo
  <> <i> <t> <ere>

The last one shows an IFS value with both whitespace and non-whitespace
characters in it.  The leading spaces are trimmed, leaving h as the first
character.  As a non-whitespace character, that one is *not* trimmed, so
it delimits an initial empty field from the rest of the string.

An example with pathname expansion:

  $ IFS=$' \t\n'
  $ string='/* a comment */'
  $ cd /tmp
  $ echo $string
  /backup /bin /boot /chroot /command /dev /etc /hd /home /initrd.img
  /initrd.img.old /lib /lib64 /lost+found /media /mnt /opt /package /proc
  /root /run /sbin /service /srv /stuff /sys /tmp /usr /var /vmlinuz
  /vmlinuz.old a comment dumps/ ssh-6i8aLIWw2QgZ/ ssh-T5JLPWVvU9xw/
  systemd-private-d50cab7eaba04f88b49c7e97e3d1043b-ModemManager.service-iD7GSh/
  systemd-private-d50cab7eaba04f88b49c7e97e3d1043b-ntp.service-GeOK4e/
  systemd-private-d50cab7eaba04f88b49c7e97e3d1043b-systemd-logind.service-se4pDh/
  Temp-10bb8392-9d61-4469-bf31-5b5ef6c29a88/
  Temp-1ffbc72d-62a7-46d4-a403-481053966525/

Word splitting occurs first, giving the four words

  /*  a  comment  */

and then pathname expansion (globbing) occurs on the first and last
words.  All of the resulting words are given as arguments to echo.

All of this is why proper quoting is absolutely essential when working
with the shell.  It cannot possibly be said enough times.
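
For comparison, here is a rough sketch of the same string with and
without quotes, run in a scratch directory so the result doesn't depend
on whatever happens to be in /tmp.  The helper count_args and the two
directory names are invented for this illustration.

```shell
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%s\n' "$#"; }

IFS=$' \t\n'
string='/* a comment */'

# Scratch directory with two known subdirectories (made-up names).
scratch=$(mktemp -d) || exit 1
mkdir "$scratch/dumps" "$scratch/Temp-example"
cd "$scratch" || exit 1

# Unquoted: split into four words, then /* and */ are globbed.
unquoted=$(count_args $string)

# Quoted: no splitting, no globbing; one argument, exactly as assigned.
quoted=$(count_args "$string")

echo "unquoted: $unquoted arguments"
echo "quoted:   $quoted argument"
printf '<%s>\n' "$string"

# Clean up the scratch directory.
cd / && rm -rf -- "$scratch"
```

The unquoted count depends on how many entries your root directory has,
so it will vary from system to system.  The quoted count is always 1,
and the final printf shows the string unchanged.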