On 16 November 2017 at 13:25, Maxim Ostapenko <m.ostape...@samsung.com> wrote: > Hi Christophe, > > On 13/11/17 15:47, Christophe Lyon wrote: >> >> On 30 October 2017 at 16:21, Maxim Ostapenko <m.ostape...@samsung.com> >> wrote: >>> >>> On 30/10/17 17:08, Christophe Lyon wrote: >>>> >>>> On 30/10/2017 11:12, Maxim Ostapenko wrote: >>>>> >>>>> Hi, >>>>> >>>>> sorry for the late response. >>>>> >>>>> On 20/10/17 13:45, Christophe Lyon wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> On 19 October 2017 at 13:17, Jakub Jelinek <ja...@redhat.com> wrote: >>>>>>> >>>>>>> On Thu, Oct 19, 2017 at 02:07:24PM +0300, Maxim Ostapenko wrote: >>>>>>>>> >>>>>>>>> Is the patch (the merge + this incremental) ok for trunk? >>>>>>>> >>>>>>>> I think the patch is OK, just wondering about two things: >>>>>>> >>>>>>> Richi just approved the patch on IRC, so I'll commit, then we can >>>>>>> deal >>>>>>> with >>>>>>> follow-ups. >>>>>>> >>>>>> Does anyone else run these tests on arm? >>>>>> Since you applied this patch, I'm seeing lots of new errors and >>>>>> timeouts. >>>>>> I have been ignoring regression reports for *san because of >>>>>> yyrandomness >>>>>> in the results, but the timeouts are a major inconvenience in testing >>>>>> because it increases latency a lot in getting results, or worse I get >>>>>> no >>>>>> result at all because the validation job is killed before completion. >>>>>> >>>>>> Looking at some intermediate logs, I have noticed: >>>>>> ==24797==AddressSanitizer CHECK failed: >>>>>> /libsanitizer/asan/asan_poisoning.cc:34 >>>>>> "((AddrIsAlignedByGranularity(addr))) != (0)" (0x0, 0x0) >>>>>> #0 0x408d7d65 in AsanCheckFailed >>>>>> /libsanitizer/asan/asan_rtl.cc:67 >>>>>> #1 0x408ecd5d in __sanitizer::CheckFailed(char const*, int, char >>>>>> const*, unsigned long long, unsigned long long) >>>>>> /libsanitizer/sanitizer_common/sanitizer_termination.cc:77 >>>>>> #2 0x408d22d5 in __asan::PoisonShadow(unsigned long, unsigned >>>>>> long, unsigned char) /libsanitizer/asan/asan_poisoning.cc:34 >>>>>> #3 0x4085409b in __asan_register_globals >>>>>> /libsanitizer/asan/asan_globals.cc:368 >>>>>> #4 0x109eb in _GLOBAL__sub_I_00099_1_ten >>>>>> >>>>>> >>>>>> (/aci-gcc-fsf/builds/gcc-fsf-gccsrc-thumb/obj-arm-none-linux-gnueabi/gcc3/gcc/testsuite/gcc/alloca_big_alignment.exe+0x109eb) >>>>>> >>>>>> in MANY (193 in gcc) tests. >>>>>> >>>>>> and many others (152 in gcc) just time out individually (eg >>>>>> c-c++-common/asan/alloca_instruments_all_paddings.c) with no error in >>>>>> the logs besides Dejagnu's >>>>>> WARNING: program timed out. >>>>>> >>>>>> >>>>>> Since I'm using an apparently unusual setup, maybe I have to update it >>>>>> to cope with the new version, >>>>>> so I'd like to know if others are seeing the same problems on arm? >>>>>> >>>>>> I'm using qemu -R 0 to execute the test programs, encapsulated by >>>>>> proot (similar to chroot, but does not require root privileges). >>>>>> >>>>>> Am I missing something obvious? >>>>> >>>>> >>>>> I've caught the same error on my Arndale board. The issue seems to be >>>>> quite obvious: after merge, ASan requires globals array to be aligned >>>>> by >>>>> shadow granularity. >>>> >>>> >>>> Thanks for confirming. I've spent a lot of time investigating the >>>> timeout >>>> issues, that led to zombie processes and servers needing reboot. I've >>>> finally identified that going back to qemu-2.7 avoid the timeout issues >>>> (I've reported a qemu bug). >>>> >>>>> This trivial patch seems to fix the issue. Could you check it on your >>>>> setup? >>>>> >>>> I was just about to finally start looking at this sanity check problem, >>>> so >>>> thank you very much for sharing your patch. >>>> I manually tested it on the subset of my configs and it solves the >>>> assertion failure, thanks! >>>> However, I can notice many regressions compared to before the merge: >>>> c-c++-common/asan/alloca_instruments_all_paddings.c >>>> c-c++-common/asan/alloca_loop_unpoisoning.c >>>> c-c++-common/asan/alloca_safe_access.c >>>> c-c++-common/asan/asan-interface-1.c >>>> c-c++-common/asan/halt_on_error-1.c >>>> c-c++-common/asan/pr59063-1.c >>>> c-c++-common/asan/pr59063-2.c >>>> c-c++-common/asan/pr63316.c >>>> c-c++-common/asan/pr63888.c >>>> c-c++-common/asan/pr70712.c >>>> c-c++-common/asan/pr71480.c >>>> c-c++-common/asan/pr79944.c >>>> c-c++-common/asan/pr80308.c >>>> c-c++-common/asan/swapcontext-test-1.c >>>> gcc.dg/asan/nosanitize-and-inline.c >>>> gcc.dg/asan/pr79196.c >>>> gcc.dg/asan/pr80166.c >>>> gcc.dg/asan/pr81186.c >>>> gcc.dg/asan/use-after-scope-11.c >>>> gcc.dg/asan/use-after-scope-4.c >>>> gcc.dg/asan/use-after-scope-6.c >>>> gcc.dg/asan/use-after-scope-7.c >>>> gcc.dg/asan/use-after-scope-goto-1.c >>>> gcc.dg/asan/use-after-scope-goto-2.c >>>> gcc.dg/asan/use-after-scope-switch-1.c >>>> gcc.dg/asan/use-after-scope-switch-2.c >>>> gcc.dg/asan/use-after-scope-switch-3.c >>>> gcc.dg/asan/use-after-scope-switch-4.c >>>> >>>> out of which only >>>> c-c++-common/asan/swapcontext-test-1.c >>>> c-c++-common/asan/halt_on_error-1.c >>>> print something in gcc.log >>>> >>>> Do they pass for you? >>> >>> >>> Ah, I see. The problem is that after this merge LSan was enabled for ARM. >>> LSan sets atexit handler that calls internal_clone function that's not >>> supported in QEMU. >>> That's why these tests pass on board, but fail under QEMU. >>> Could you try set ASAN_OPTIONS=detect_leaks=0 in your environment? >>> >> Hi, >> >> I have a followup on this issue, after investigating a bit more. >> >> I filed a bug report against QEMU, and indeed it seems that it rejects >> clone() as called by the sanitizers on purpose, because it cannot support >> CLONE_UNTRACED. >> >> That being said, I was wondering why the same tests worked "better" >> with qemu-aarch64 (as opposed to qemu-arm). And I noticed that on aarch64, >> we have sanitizer_common/sanitizer_syscall_linux_aarch64.inc where >> internal_iserror returns true if retval >= -4095. >> >> On arm, we use sanitizer_common/sanitizer_syscall_generic.inc where >> internal_iserror returns true if retval == -1. >> >> But on the upstream sanitizer repo, >> sanitizer_common/sanitizer_syscall_linux_arm.inc was added on Nov 8th, >> and also checks for retval >= -4095, hence handling the clone() error >> gracefully. So... can we merge again our sanitizer sources to at least >> http://llvm.org/viewvc/llvm-project?view=revision&revision=317640 ? > > > Could you check whether attached patch helps you? >
Yes, it does. Thanks! Christophe > Thanks, > -Maxim > > >> >> Thanks, >> >> Christophe >> >> >>> -Maxim >>> >>> >>>> Thanks, >>>> >>>> Christophe >>>> >>>>> Thanks, >>>>> -Maxim >>>>> >>>>>> Thanks, >>>>>> >>>>>> Christophe >>>>>> >>>>>> >>>>>>>> 1) We have a bunch of GCC local patches, did you include them into a >>>>>>>> cumulative patch (I guess yes)? >>>>>>> >>>>>>> I have done some verification today, diff from upstream r285547 to >>>>>>> unpatched >>>>>>> GCC (with the LLVM Compiler infrastructure two liners removed), >>>>>>> attached P1, >>>>>>> and diff from upstream r315899 to patched GCC (again, two liners >>>>>>> removed), >>>>>>> attached P2 and went through the changes in P1 and verified that >>>>>>> except >>>>>>> for >>>>>>> the ubsan backwards compatibility we had that can't work anymore >>>>>>> everything >>>>>>> else was upstreamed, or remained in P2. So P2 is the current diff >>>>>>> from >>>>>>> upstream, with the >>>>>>> sanitizer_common/sanitizer_symbolizer_libbacktrace.cc >>>>>>> changes now filed upstream. >>>>>>> >>>>>>>> 2) Upstream has enabled LSan for x86 and ARM, is it worth to enable >>>>>>>> them in >>>>>>>> GCC too? >>>>>>> >>>>>>> Maybe, feel free to post patches. For LSan we need to split off >>>>>>> lsan_preinit >>>>>>> out of liblsan and link it into executables, will handle it next >>>>>>> (there >>>>>>> is a >>>>>>> PR about it, just wanted to wait until the merge is in). >>>>>>> >>>>>>> Jakub >>>>>> >>>>>> >>>>>> >>>> >>>> >>>> >> >> >