Launchpad has imported 15 comments from the remote bug at https://bugzilla.redhat.com/show_bug.cgi?id=1464211.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2017-06-22T16:23:16+00:00 fweimer wrote:

+++ This bug was initially created as a clone of Bug #1464085 +++

valgrind currently does not know anything about the CPUID flag added to the HWCAP auxv entry in kernel 4.11. It passes this flag through to applications, but it will then choke when the application uses it, like this:

ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==924== valgrind: Unrecognised instruction at address 0x11f548.
==924==    at 0x11F548: init_cpu_features (cpu-features.c:32)
==924==    by 0x11F548: dl_platform_init (dl-machine.h:241)
==924==    by 0x11F548: _dl_sysdep_start (dl-sysdep.c:231)
==924==    by 0x10981B: _dl_start_final (rtld.c:412)
==924==    by 0x109AAB: _dl_start (rtld.c:520)

The crashing instruction is the mrs in the glibc startup code, which means that currently no applications run under valgrind:

  if (hwcap & HWCAP_CPUID)
    {
      register uint64_t id = 0;
      asm volatile ("mrs %0, midr_el1" : "=r"(id));
      cpu_features->midr_el1 = id;
    }
  else
    cpu_features->midr_el1 = 0;

Perhaps valgrind should mask all the HWCAP bits it knows nothing about.

Workaround: Run with “LD_HWCAP_MASK=1”.
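
For readers who want to check that the unhandled word 0xD5380000 really is the midr_el1 read, here is a minimal sketch that splits the word into the standard A64 MRS fields. The struct and function names are illustrative only and are not taken from valgrind or glibc; the field layout and the MIDR_EL1 encoding (S3_0_C0_C0_0) are assumed from the Arm architecture manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an AArch64 MRS (system register read) instruction word.
   Hypothetical helper type for this sketch. */
struct mrs_fields {
    unsigned op0, op1, crn, crm, op2, rt;
};

/* Returns true and fills *out if insn is an MRS, false otherwise.
   MRS has bits [31:20] fixed at 0xD53. */
bool decode_mrs(uint32_t insn, struct mrs_fields *out)
{
    if ((insn >> 20) != 0xD53u)
        return false;
    out->op0 = 2u + ((insn >> 19) & 0x1u);  /* op0 is 2 or 3 for MRS */
    out->op1 = (insn >> 16) & 0x7u;
    out->crn = (insn >> 12) & 0xFu;
    out->crm = (insn >> 8) & 0xFu;
    out->op2 = (insn >> 5) & 0x7u;
    out->rt  = insn & 0x1Fu;                /* destination register Xt */
    return true;
}

/* MIDR_EL1 is the system register S3_0_C0_C0_0. */
bool is_midr_el1_read(uint32_t insn)
{
    struct mrs_fields f;
    return decode_mrs(insn, &f)
        && f.op0 == 3 && f.op1 == 0
        && f.crn == 0 && f.crm == 0 && f.op2 == 0;
}
```

Calling is_midr_el1_read(0xD5380000) returns true with Rt = 0, i.e. "mrs x0, midr_el1", which matches the glibc snippet quoted above.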
Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/0

------------------------------------------------------------------------
On 2017-06-23T10:52:41+00:00 mjw wrote:

See also upstream https://bugs.kde.org/show_bug.cgi?id=381556
arm64: Handle feature registers access on 4.11 Linux kernel or later

For now worked around in valgrind valgrind-3.13.0-3.fc27 as suggested in the original description of this bug:

--- a/coregrind/m_initimg/initimg-linux.c
+++ b/coregrind/m_initimg/initimg-linux.c
@@ -703,6 +703,12 @@ Addr setup_client_stack( void* init_sp,
                   (and anything above) are not supported by Valgrind. */
                auxv->u.a_val &= VKI_HWCAP_S390_TE - 1;
             }
+#           elif defined(VGP_arm64_linux)
+            {
+               /* Linux 4.11 started pupulating this for arm64, but we
+                  currently don't support any. */
+               auxv->u.a_val = 0;
+            }
 #           endif
             break;
 #           if defined(VGP_ppc64be_linux) || defined(VGP_ppc64le_linux)

Keeping this bug open to see how upstream resolves this.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/1

------------------------------------------------------------------------
On 2017-06-29T20:11:01+00:00 updates wrote:

valgrind-3.13.0-4.fc26 has been submitted as an update to Fedora 26.
https://bodhi.fedoraproject.org/updates/FEDORA-2017-4315a2f0cd

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/2

------------------------------------------------------------------------
On 2017-06-30T20:25:29+00:00 updates wrote:

valgrind-3.13.0-4.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-4315a2f0cd

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/3

------------------------------------------------------------------------
On 2017-07-07T23:05:15+00:00 updates wrote:

valgrind-3.13.0-4.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/4

------------------------------------------------------------------------
On 2018-06-13T19:15:57+00:00 rclark wrote:

(In reply to Mark Wielaard from comment #1)
> See also upstream https://bugs.kde.org/show_bug.cgi?id=381556
> arm64: Handle feature registers access on 4.11 Linux kernel or later
>
> For now worked around in valgrind valgrind-3.13.0-3.fc27 as suggested in the
> original description of this bug:
>
> --- a/coregrind/m_initimg/initimg-linux.c
> +++ b/coregrind/m_initimg/initimg-linux.c
> @@ -703,6 +703,12 @@ Addr setup_client_stack( void* init_sp,
>                    (and anything above) are not supported by Valgrind. */
>                 auxv->u.a_val &= VKI_HWCAP_S390_TE - 1;
>              }
> +#           elif defined(VGP_arm64_linux)
> +            {
> +               /* Linux 4.11 started pupulating this for arm64, but we
> +                  currently don't support any. */
> +               auxv->u.a_val = 0;
> +            }
>  #           endif
>              break;
>  #           if defined(VGP_ppc64be_linux) || defined(VGP_ppc64le_linux)
>
> Keeping this bug open to see how upstream resolves this.

hmm, I just saw the same issue on rawhide (valgrind 1:3.13.0-18.fc29).. did a patch get lost from the spec file?

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/5

------------------------------------------------------------------------
On 2018-06-13T19:24:33+00:00 mjw wrote:

(In reply to Rob Clark from comment #5)
> hmm, I just saw the same issue on rawhide (valgrind 1:3.13.0-18.fc29).. did
> a patch get lost from the spec file?

The patch (valgrind-3.13.0-arm64-hwcap.patch) is there (and still the same, no change upstream), and applied.

Is the issue exactly the same as in the description?
Could you paste the command line and the valgrind error message?

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/6

------------------------------------------------------------------------
On 2018-06-13T19:36:20+00:00 rclark wrote:

cmdline:

valgrind --leak-check=yes ./deqp-gles31 --deqp-case=dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1

(debugging some dEQP test crashes in mesa/freedreno)

output (without LD_HWCAP_MASK=1 which works around the issue) (also attached):

==32073== Memcheck, a memory error detector
==32073== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==32073== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info
==32073== Command: ./deqp-gles31 --deqp-visibility=hidden --deqp-case=dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1 --deqp-log-filename=results/dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1.qpa
==32073==
ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==32073== valgrind: Unrecognised instruction at address 0x40150cc.
==32073==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==32073==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==32073==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==32073==    by 0x40018C3: _dl_start_final (rtld.c:411)
==32073==    by 0x4001B3F: _dl_start (rtld.c:520)
==32073==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)
==32073== Your program just tried to execute an instruction that Valgrind
==32073== did not recognise. There are two possible reasons for this.
==32073== 1. Your program has a bug and erroneously jumped to a non-code
==32073==    location. If you are running Memcheck and you just saw a
==32073==    warning about a bad jump, it's probably your program's fault.
==32073== 2. The instruction is legitimate but Valgrind doesn't handle it,
==32073==    i.e. it's Valgrind's fault. If you think this is the case or
==32073==    you are not sure, please let us know and we'll try to fix it.
==32073== Either way, Valgrind will now raise a SIGILL signal which will
==32073== probably kill your program.
==32073==
==32073== Process terminating with default action of signal 4 (SIGILL): dumping core
==32073==  Illegal opcode at address 0x40150CC
==32073==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==32073==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==32073==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==32073==    by 0x40018C3: _dl_start_final (rtld.c:411)
==32073==    by 0x4001B3F: _dl_start (rtld.c:520)
==32073==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)

valgrind: m_coredump/coredump-elf.c:506 (fill_fpu): Assertion 'Unimplemented functionality' failed.
valgrind: valgrind

host stacktrace:
==32073==    at 0x3803E0FC: show_sched_status_wrk (m_libcassert.c:378)
==32073==    by 0x3803E22B: report_and_quit (m_libcassert.c:449)
==32073==    by 0x3803E387: vgPlain_assert_fail (m_libcassert.c:515)
==32073==    by 0x380706FB: fill_fpu.isra.4 (coredump-elf.c:506)
==32073==    by 0x380708CF: dump_one_thread (coredump-elf.c:563)
==32073==    by 0x380708CF: make_elf_coredump (coredump-elf.c:667)
==32073==    by 0x380708CF: vgPlain_make_coredump (coredump-elf.c:748)
==32073==    by 0x3805654F: default_action (m_signals.c:1937)
==32073==    by 0x3805654F: deliver_signal (m_signals.c:1997)
==32073==    by 0x38056D0B: vgPlain_synth_sigill (m_signals.c:2106)
==32073==    by 0x380982DB: vgPlain_scheduler (scheduler.c:1577)
==32073==    by 0x380A939F: thread_wrapper (syswrap-linux.c:103)
==32073==    by 0x380A939F: run_a_thread_NORETURN (syswrap-linux.c:156)
==32073==    by 0xFFFFFFFFFFFFFFFF: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 32073)
==32073==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==32073==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==32073==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==32073==    by 0x40018C3: _dl_start_final (rtld.c:411)
==32073==    by 0x4001B3F: _dl_start (rtld.c:520)
==32073==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using. Thanks.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/7

------------------------------------------------------------------------
On 2018-06-13T19:36:52+00:00 rclark wrote:

Created attachment 1451010
valgrind output

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/8

------------------------------------------------------------------------
On 2018-06-13T19:47:16+00:00 fweimer wrote:

That's from the midr_el1 read:

  /* If there was no useful tunable override, query the MIDR if the kernel
     allows it.  */
  if (midr == UINT64_MAX)
    {
      if (hwcap & HWCAP_CPUID)
        asm volatile ("mrs %0, midr_el1" : "=r"(midr));
      else
        midr = 0;
    }

So it looks like we get the wrong (host) hwcap value without masking.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/9

------------------------------------------------------------------------
On 2018-06-13T19:48:17+00:00 fweimer wrote:

It might be helpful to run “LD_SHOW_AUXV=1 /bin/true” with and without valgrind.
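
To make the masking idea concrete, here is a minimal sketch of reading AT_HWCAP and hiding bits before anything acts on them. It assumes glibc's getauxval() and that HWCAP_CPUID is bit 11 on arm64 Linux, as in the kernel's <asm/hwcap.h>. The macro and function names are made up for this sketch, and this is not how valgrind itself implements the workaround (the patch quoted earlier simply sets the guest value to 0, which corresponds to a supported mask of 0 here).

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (glibc 2.16 or later) */

/* Assumption: bit 11 of AT_HWCAP is HWCAP_CPUID on arm64 Linux.
   Defined locally so the sketch also builds on other architectures. */
#define SKETCH_HWCAP_CPUID (1UL << 11)

/* True if the given AT_HWCAP value advertises MRS access to ID registers. */
bool hwcap_has_cpuid(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_CPUID) != 0;
}

/* Keep only the bits listed in `supported`; every other bit is hidden. */
unsigned long mask_hwcap(unsigned long hwcap, unsigned long supported)
{
    return hwcap & supported;
}

/* AT_HWCAP as seen by the current process. */
unsigned long current_hwcap(void)
{
    return getauxval(AT_HWCAP);
}
```

With a mask that omits bit 11, hwcap_has_cpuid(mask_hwcap(current_hwcap(), mask)) is false, so code following the glibc pattern quoted above would take the "midr = 0" branch instead of issuing the mrs.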
Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/10

------------------------------------------------------------------------
On 2018-06-13T20:05:40+00:00 rclark wrote:

so, quick disclaimer, but I'm running a non-standard kernel atm, if any kernel config/etc could affect this, I can retry w/ a vanilla kernel (but not immediately, and possibly not on the same device)

(In reply to Florian Weimer from comment #10)
> It might be helpful to run “LD_SHOW_AUXV=1 /bin/true” with and without
> valgrind.

[robclark@db820c:~]$ LD_SHOW_AUXV=1 /bin/true
AT_SYSINFO_EHDR: 0xffff81924000
AT_HWCAP:        8ff
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0xaaaac8ba2040
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0xffff818f6000
AT_FLAGS:        0x0
AT_ENTRY:        0xaaaac8ba38d0
AT_UID:          1000
AT_EUID:         1000
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0
AT_RANDOM:       0xfffff1883f68
AT_EXECFN:       /bin/true
AT_PLATFORM:     aarch64
[robclark@db820c:~]$
[robclark@db820c:~]$ LD_SHOW_AUXV=1 valgrind --leak-check=yes /bin/true
AT_SYSINFO_EHDR: 0xffff9eb51000
AT_HWCAP:        8ff
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x400040
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0xffff9eb23000
AT_FLAGS:        0x0
AT_ENTRY:        0x4011d0
AT_UID:          1000
AT_EUID:         1000
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0
AT_RANDOM:       0xffffc66278c8
AT_EXECFN:       /usr/local/bin/valgrind
AT_PLATFORM:     aarch64
==1668== Memcheck, a memory error detector
==1668== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==1668== Using Valgrind-3.13.0.SVN and LibVEX; rerun with -h for copyright info
==1668== Command: /bin/true
==1668==
ARM64 front end: branch_etc
disInstr(arm64): unhandled instruction 0xD5380000
disInstr(arm64): 1101'0101 0011'1000 0000'0000 0000'0000
==1668== valgrind: Unrecognised instruction at address 0x40150cc.
==1668==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==1668==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==1668==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==1668==    by 0x40018C3: _dl_start_final (rtld.c:411)
==1668==    by 0x4001B3F: _dl_start (rtld.c:520)
==1668==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)
==1668== Your program just tried to execute an instruction that Valgrind
==1668== did not recognise. There are two possible reasons for this.
==1668== 1. Your program has a bug and erroneously jumped to a non-code
==1668==    location. If you are running Memcheck and you just saw a
==1668==    warning about a bad jump, it's probably your program's fault.
==1668== 2. The instruction is legitimate but Valgrind doesn't handle it,
==1668==    i.e. it's Valgrind's fault. If you think this is the case or
==1668==    you are not sure, please let us know and we'll try to fix it.
==1668== Either way, Valgrind will now raise a SIGILL signal which will
==1668== probably kill your program.
==1668==
==1668== Process terminating with default action of signal 4 (SIGILL): dumping core
==1668==  Illegal opcode at address 0x40150CC
==1668==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==1668==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==1668==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==1668==    by 0x40018C3: _dl_start_final (rtld.c:411)
==1668==    by 0x4001B3F: _dl_start (rtld.c:520)
==1668==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)

valgrind: m_coredump/coredump-elf.c:506 (fill_fpu): Assertion 'Unimplemented functionality' failed.
valgrind: valgrind

host stacktrace:
==1668==    at 0x3803E0FC: show_sched_status_wrk (m_libcassert.c:378)
==1668==    by 0x3803E22B: report_and_quit (m_libcassert.c:449)
==1668==    by 0x3803E387: vgPlain_assert_fail (m_libcassert.c:515)
==1668==    by 0x380706FB: fill_fpu.isra.4 (coredump-elf.c:506)
==1668==    by 0x380708CF: dump_one_thread (coredump-elf.c:563)
==1668==    by 0x380708CF: make_elf_coredump (coredump-elf.c:667)
==1668==    by 0x380708CF: vgPlain_make_coredump (coredump-elf.c:748)
==1668==    by 0x3805654F: default_action (m_signals.c:1937)
==1668==    by 0x3805654F: deliver_signal (m_signals.c:1997)
==1668==    by 0x38056D0B: vgPlain_synth_sigill (m_signals.c:2106)
==1668==    by 0x380982DB: vgPlain_scheduler (scheduler.c:1577)
==1668==    by 0x380A939F: thread_wrapper (syswrap-linux.c:103)
==1668==    by 0x380A939F: run_a_thread_NORETURN (syswrap-linux.c:156)
==1668==    by 0xFFFFFFFFFFFFFFFF: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 1668)
==1668==    at 0x40150CC: init_cpu_features (cpu-features.c:72)
==1668==    by 0x40150CC: dl_platform_init (dl-machine.h:208)
==1668==    by 0x40150CC: _dl_sysdep_start (dl-sysdep.c:231)
==1668==    by 0x40018C3: _dl_start_final (rtld.c:411)
==1668==    by 0x4001B3F: _dl_start (rtld.c:520)
==1668==    by 0x4001047: ??? (in /usr/lib64/ld-2.27.9000.so)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using. Thanks.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/11

------------------------------------------------------------------------
On 2018-06-13T20:14:44+00:00 mjw wrote:

hohum, so that shows the HWCAP of valgrind itself, which then execs /bin/true and crashes before showing the auxv

Maybe try:
LD_HWCAP_MASK=1 LD_SHOW_AUXV=1 valgrind -q /bin/true

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/12

------------------------------------------------------------------------
On 2018-06-14T11:50:06+00:00 rclark wrote:

heh, so this makes my problem a bit more obvious.. at one point in the past I had built my own valgrind (in /usr/local/bin which was ahead of /usr/bin in $PATH).. so in fact the problem all along was not with fedora's valgrind but pebkac ;-)

/me reaches for brown paper bag

------
[robclark@db820c:~]$ LD_HWCAP_MASK=1 LD_SHOW_AUXV=1 valgrind -q /bin/true
AT_SYSINFO_EHDR: 0xffffb56ca000
AT_HWCAP:        8ff
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x400040
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0xffffb569c000
AT_FLAGS:        0x0
AT_ENTRY:        0x4011d0
AT_UID:          1000
AT_EUID:         1000
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0
AT_RANDOM:       0xffffd156b538
AT_EXECFN:       /usr/local/bin/valgrind
AT_PLATFORM:     aarch64
AT_HWCAP:        8ff
AT_PAGESZ:       4096
AT_CLKTCK:       100
AT_PHDR:         0x108040
AT_PHENT:        56
AT_PHNUM:        9
AT_BASE:         0x4000000
AT_FLAGS:        0x0
AT_ENTRY:        0x1098d0
AT_UID:          1000
AT_EUID:         1000
AT_GID:          1000
AT_EGID:         1000
AT_SECURE:       0
AT_RANDOM:       0xfff000fda
AT_EXECFN:       /bin/true
AT_PLATFORM:     aarch64

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/13

------------------------------------------------------------------------
On 2018-06-14T12:40:40+00:00 mjw wrote:

(In reply to Rob Clark from comment #13)
> heh, so this makes my problem a bit more obvious.. at one point in the past
> I had built my own valgrind (in /usr/local/bin which was ahead of /usr/bin
> in $PATH).. so in fact the problem all along was not with fedora's valgrind
> but pebkac ;-)
>
> /me reaches for brown paper bag

No worries. Thanks for walking through it with us.
If there is any reason in the future to build an upstream valgrind please let me know. I am happy to backport any fixes to the fedora package.

Reply at: https://bugs.launchpad.net/ubuntu/+source/valgrind/+bug/1826811/comments/14


** Changed in: valgrind (Fedora)
       Status: Unknown => Fix Released

** Changed in: valgrind (Fedora)
   Importance: Unknown => Undecided

** Bug watch added: KDE Bug Tracking System #381556
   https://bugs.kde.org/show_bug.cgi?id=381556

--
https://bugs.launchpad.net/bugs/1826811

Title:
  Valgrind unhandled instruction 0xD5380000 on Aarch64