** Description changed:

- ## DRAFT ###
  [Impact]
  valgrind on bionic (arm64) crashes and errors out as follows:
  
  ARM64 front end: branch_etc
  disInstr(arm64): unhandled instruction 0xD5380000
  disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
  ==11950== valgrind: Unrecognised instruction at address 0x4014c90.
  ==11950==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
  ==11950==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
  ==11950==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
  ==11950==    by 0x40018C3: _dl_start_final (rtld.c:414)
  ==11950==    by 0x4001B47: _dl_start (rtld.c:523)
  ==11950==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)
  ==11950== Your program just tried to execute an instruction that Valgrind
  ==11950== did not recognise.  There are two possible reasons for this.
  ==11950== 1. Your program has a bug and erroneously jumped to a non-code
  ==11950==    location.  If you are running Memcheck and you just saw a
  ==11950==    warning about a bad jump, it's probably your program's fault.
  ==11950== 2. The instruction is legitimate but Valgrind doesn't handle it,
  ==11950==    i.e. it's Valgrind's fault.  If you think this is the case or
  ==11950==    you are not sure, please let us know and we'll try to fix it.
  ==11950== Either way, Valgrind will now raise a SIGILL signal which will
  ==11950== probably kill your program.
- ==11950== 
+ ==11950==
  ==11950== Process terminating with default action of signal 4 (SIGILL)
  ==11950==  Illegal opcode at address 0x4014C90
  ==11950==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
  ==11950==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
  ==11950==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
  ==11950==    by 0x40018C3: _dl_start_final (rtld.c:414)
  ==11950==    by 0x4001B47: _dl_start (rtld.c:523)
  ==11950==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)
  
+ The crash occurs because Valgrind simulates the CPU while it runs a
+ process: it disassembles every instruction the process executes and
+ inserts its own instrumentation at run time. In this case, Valgrind
+ does not recognize the MIDR_EL1 system register named in the "mrs %0,
+ midr_el1" instruction, which reads the CPU's Main ID Register into the
+ variable id:
+ 
+ asm volatile ("mrs %0, midr_el1" : "=r"(id));
+ 
+ Because Valgrind cannot decode this instruction, it raises SIGILL and
+ the process dies.
+ 
+ 
+ https://www.kernel.org/doc/Documentation/arm64/cpu-feature-registers.txt
+ ....
+ d) CPU Identification :
+     MIDR_EL1 is exposed to help identify the processor. On a
+     heterogeneous system, this could be racy (just like getcpu()). The
+     process could be migrated to another CPU by the time it uses the
+     register value, unless the CPU affinity is set. Hence, there is no
+     guarantee that the value reflects the processor that it is
+     currently executing on. The REVIDR is not exposed due to this
+     constraint, as REVIDR makes sense only in conjunction with the
+     MIDR. Alternately, MIDR_EL1 and REVIDR_EL1 are exposed via sysfs
+     at:
+ 
+       /sys/devices/system/cpu/cpu$ID/regs/identification/
+                                                     \- midr
+                                                     \- revidr
  
  [Test Case]
  
  1) Write a 'Hello World' program:
  ----
  #include <stdio.h>
  
  int main(void) {
      printf("Hello World!\n");
      return 0;
  }
  ----
  
  2) Build it:
  $ cc -o hello hello.c
  
  3) Then run valgrind on it:
  $ valgrind ./hello
  
  [Regression Potential]
  
+ The regression potential is low.
+ 
+ The symptom appears when Valgrind disassembles code inside glibc
+ (sysdeps/unix/sysv/linux/aarch64/cpu-features.c):
+ 
+ Even when HWCAP_CPUID is not supported, glibc falls back to assigning
+ 0 to the midr variable, so reading MIDR_EL1 does not appear to be a
+ critical feature to support.
+ 
+ Additionally, the fix is already in Ubuntu (disco and later).
+ 
+ If a regression does occur, it will be limited to the ARM
+ architecture and should not affect other CPU architectures.
+ 
  [Other information]
  
- Upstream fix: 
+ Upstream fix:
  
https://sourceware.org/git/?p=valgrind.git;a=commit;h=fbbb696c5d1e93d4ac6cb548c68bb3f443ceef42
  
  * Only affecting Bionic:
  
  # git describe --contains fbbb696c5d1e93d4ac6cb548c68bb3f443ceef42
  VALGRIND_3_14_0~96
  
  # rmadison valgrind
- => valgrind | 1:3.13.0-2ubuntu2.1      | bionic-updates  
-    valgrind | 1:3.14.0-2ubuntu6        | disco                      
-    valgrind | 1:3.15.0-1ubuntu3.1      | eoan-updates    
-    valgrind | 1:3.15.0-1ubuntu5        | focal          
- 
+ => valgrind | 1:3.13.0-2ubuntu2.1      | bionic-updates
+    valgrind | 1:3.14.0-2ubuntu6        | disco
+    valgrind | 1:3.15.0-1ubuntu3.1      | eoan-updates
+    valgrind | 1:3.15.0-1ubuntu5        | focal
  
  [Original Description]
  
  I'm performing Valgrind testing on an ElPotato running an Ubuntu Bionic
  Aarch64 image. My program is dying as described in
  https://bugs.kde.org/show_bug.cgi?id=381556 :
  
  ```
  $ valgrind --track-origins=yes --suppressions=cryptopp.supp ./cryptest.exe v
  ==12969== Memcheck, a memory error detector
  ==12969== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
  ==12969== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
  ==12969== Command: ./cryptest.exe v
  ==12969==
  ARM64 front end: branch_etc
  disInstr(arm64): unhandled instruction 0xD5380000
  disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
  ==12969== valgrind: Unrecognised instruction at address 0x4014c90.
  ==12969==    at 0x4014C90: init_cpu_features (cpu-features.c:72)
  ==12969==    by 0x4014C90: dl_platform_init (dl-machine.h:208)
  ==12969==    by 0x4014C90: _dl_sysdep_start (dl-sysdep.c:231)
  ==12969==    by 0x40018C3: _dl_start_final (rtld.c:414)
  ==12969==    by 0x4001B47: _dl_start (rtld.c:523)
  ==12969==    by 0x40011C7: ??? (in /lib/aarch64-linux-gnu/ld-2.27.so)
  ...
  ```
  
  Here's a similar Red Hat issue report:
  https://bugzilla.redhat.com/show_bug.cgi?id=1467952 .
  
  Please pick up the patch in the 381556 bug report.
  
  -----
  
  $ lsb_release -rd
  Description:    Ubuntu 18.04.2 LTS
  Release:        18.04
  
  $ apt-cache policy valgrind
  valgrind:
    Installed: 1:3.13.0-2ubuntu2.1
    Candidate: 1:3.13.0-2ubuntu2.1
    Version table:
   *** 1:3.13.0-2ubuntu2.1 500
          500 http://ports.ubuntu.com bionic-updates/main arm64 Packages
          100 /var/lib/dpkg/status
       1:3.13.0-2ubuntu2 500
          500 http://ports.ubuntu.com bionic/main arm64 Packages

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1826811

Title:
  Valgrind unhandled instruction 0xD5380000 on Aarch64

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/+subscriptions
