Configuration Information [Automatically generated, do not change]:
   Machine: x86_64
   OS: linux-gnu
   Compiler: gcc
   Compilation CFLAGS: -march=x86-64 -mtune=generic -O2 -pipe -fno-plt
   -DDEFAULT_PATH_VALUE='/usr/local/sbin:/usr/local/bin:/usr/bin'
   -DSTANDARD_UTILS_PATH='/usr/bin' -DSYS_BASHRC='/etc/bash.bashrc'
   -DSYS_BASH_LOGOUT='/etc/bash.bash_logout'
   -DNON_INTERACTIVE_LOGIN_SHELLS
   uname output: Linux articuno 5.11.5-arch1-1 #1 SMP PREEMPT Tue, 09 Mar
   2021 18:56:28 +0000 x86_64 GNU/Linux
   Machine Type: x86_64-pc-linux-gnu

   Bash Version: 5.1
   Patch Level: 4
   Release Status: release

   Description:
   Two-byte UTF-8 characters are corrupted under certain circumstances,
   in roughly one out of every five runs of the command below.

   Repeat-By:
   1) Create a UTF-8 file that contains 510 'A' characters, a newline
   ('\n'), and one two-byte UTF-8 character (e.g. the Russian letter 'Я').
   The file size should then be exactly 513 bytes.
   The file can be created with the following Python 3 script:

   #!/usr/bin/env python

   with open('./a', 'w', encoding='UTF-8') as out:
      out.write('A' * 510 + '\n' + 'Я')
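
   The byte layout of that file can be checked with a short sketch like
   the one below. It rebuilds the content in memory rather than reading
   ./a. The 512-byte chunk size (ASSUMED_CHUNK) is only an illustrative
   assumption: the report does not say which buffer size, if any, is
   involved in the corruption.

```python
# Layout of the test file from step 1, rebuilt in memory (no file is written).
data = ('A' * 510 + '\n' + 'Я').encode('utf-8')

size = len(data)                 # 510 + 1 + 2 bytes
lead_offset = data.index(0xD0)   # offset of the first byte of 'Я'
trail_offset = lead_offset + 1   # offset of the second byte of 'Я'

# Hypothetical chunk size, chosen for illustration only; the report does
# not state what buffer size bash uses internally.
ASSUMED_CHUNK = 512
straddles = lead_offset // ASSUMED_CHUNK != trail_offset // ASSUMED_CHUNK

print(size, lead_offset, trail_offset, data[-2:].hex(), straddles)
# prints: 513 511 512 d0af True
```

   Under that assumption, the two bytes of 'Я' fall on opposite sides of
   a 512-byte boundary, which may be why the exact file size matters.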

   2) Create a simple bash function that echoes its second argument:
   foo()
   {
      echo "$2"
   }
   3) Run the following command:
   foo $(cat ./a)
   4) In roughly one out of every five attempts the letter is corrupted
   (the output bytes are 'd0 ?? af 0a' instead of 'd0 af 0a').
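
   Because the failure is intermittent, counting bad runs may be easier
   than watching for them by hand. The sketch below is one possible way
   to do that. The helper names (is_corrupted, count_corrupted) are
   hypothetical. It also assumes that a non-interactive `bash -c` shows
   the same behaviour as the interactive shell used in the report, which
   has not been verified here.

```python
import subprocess

# Bytes that step 3 should print: 'Я' in UTF-8 followed by echo's newline.
EXPECTED = b'\xd0\xaf\n'

# Steps 2 and 3 from the report, passed to a non-interactive bash.
SCRIPT = 'foo() { echo "$2"; }; foo $(cat ./a)'


def is_corrupted(output: bytes) -> bool:
    """Return True when the output differs from the expected three bytes."""
    return output != EXPECTED


def count_corrupted(attempts: int = 100) -> int:
    """Run the command `attempts` times and count the corrupted outputs."""
    bad = 0
    for _ in range(attempts):
        result = subprocess.run(['bash', '-c', SCRIPT], capture_output=True)
        if is_corrupted(result.stdout):
            bad += 1
    return bad
```

   Calling count_corrupted(100) from the directory that contains ./a,
   under a UTF-8 locale, returns the number of runs whose output was not
   exactly 'd0 af 0a'.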

   --

   Кириллов Дмитрий Сергеевич
