Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin' -DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc' -DSYS_BASH_LOGOUT='/etc/bash.bash_logout' -DNON_INTERACTIVE_LOGIN_SHELLS
uname output: Linux articuno 5.11.5-arch1-1 #1 SMP PREEMPT Tue, 09 Mar 2021 18:56:28 +0000 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu
Bash Version: 5.1
Patch Level: 4
Release Status: release

Description:
    Two-byte Unicode characters are corrupted under certain
    circumstances, in roughly one out of every five runs of the
    command below.

Repeat-By:
    1) Create a UTF-8 file that contains 510 'A' characters, a newline
       ('\n'), and one two-byte Unicode character (e.g. the Russian
       letter 'Я'). The resulting file should be exactly 513 bytes.
       The file can be created with the following Python 3 script:

           #!/usr/bin/env python
           with open('./a', 'w', encoding='UTF-8') as out:
               out.write('A' * 510 + '\n' + 'Я')

    2) Create a simple bash function that echoes its second argument:

           foo() {
               echo "$2"
           }

    3) Run the following command:

           foo $(cat ./a)

    4) In roughly one run out of five the letter is corrupted: you
       get 'd0 ?? af 0a' instead of the expected 'd0 af 0a'.

-- 
Кириллов Дмитрий Сергеевич
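
For convenience, the Repeat-By steps could be automated with a small bash script along the following lines. This is only a sketch: the helper names make_input and count_corrupted, the temporary file, and the run count of 100 are assumptions made for illustration and are not part of the original report. It builds the 513-byte file without Python, calls foo repeatedly, and counts the runs whose output bytes differ from 'd0 af 0a'.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction helper; every name except foo is an assumption.

# Write 510 'A' characters, a newline, and U+042F (bytes d0 af) to the path.
make_input() {
    local path=$1
    {
        printf 'A%.0s' {1..510}   # prints 'A' once per brace-expansion argument
        printf '\n\xd0\xaf'       # newline followed by the UTF-8 bytes of 'Я'
    } > "$path"
}

# The function from the report: echo the second argument.
foo() {
    echo "$2"
}

# Run foo $(cat file) the given number of times and print how many runs
# produced bytes other than d0 af 0a.
count_corrupted() {
    local path=$1 runs=$2 bad=0 i hex
    for ((i = 0; i < runs; i++)); do
        # The command substitution is deliberately unquoted, as in the
        # report, so the file content is split into two words.
        hex=$(foo $(cat "$path") | od -An -tx1 | tr -d ' \n')
        if [[ $hex != d0af0a ]]; then
            bad=$((bad + 1))
        fi
    done
    echo "$bad"
}

input=$(mktemp)
make_input "$input"
echo "corrupted runs: $(count_corrupted "$input" 100) of 100"
rm -f "$input"
```

On a build that is not affected, the reported count should stay at 0. A non-zero count would match the behaviour described in step 4. The script should be run under a UTF-8 locale, as in the original steps.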