On Wed, Sep 13, 2023 at 06:52:08PM -0300, Antonio Terceiro wrote:
> ohcount segfaults (and sometimes aborts with a Bus error) on arm64,
> almost 90% of the time. I tried this on an up to date arm64 Debian

Running ohcount under gdb traps the segfault, but I can't get a
backtrace because the stack is corrupted. So I recompiled ohcount with
the address sanitiser, which traps on the segfault with the following:

=================================================================
==14540==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0xffffeab309b4 at pc 0xaaaaacf8bd38 bp 0xffffeab30960 sp 0xffffeab30978
WRITE of size 1 at 0xffffeab309b4 thread T0
    #0 0xaaaaacf8bd34 in disambiguate_aspx src/detector.c:241
    #1 0xaaaaacf8ba80 in ohcount_detect_language src/detector.c:221
    #2 0xaaaaacf87304 in ohcount_sourcefile_get_language src/sourcefile.c:128
    #3 0xaaaaad1fb5d0 in ohcount_parse src/parser.c:16
    #4 0xaaaaacf879cc in ohcount_sourcefile_parse src/sourcefile.c:195
    #5 0xaaaaacf87be0 in ohcount_sourcefile_get_loc_list src/sourcefile.c:239
    #6 0xaaaaacf88f48 in ohcount_sourcefile_list_analyze_languages src/sourcefile.c:404
    #7 0xaaaaacf8582c in summary src/ohcount.c:210
    #8 0xaaaaacf86394 in main src/ohcount.c:302
    #9 0xffffa95f777c in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #10 0xffffa95f7854 in __libc_start_main_impl ../csu/libc-start.c:360
    #11 0xaaaaacf840ac in _start (/home/mjc/debian/ohcount/ohcount-4.0.0/bin/ohcount+0x240ac)

Address 0xffffeab309b4 is located in stack of thread T0
SUMMARY: AddressSanitizer: dynamic-stack-buffer-overflow src/detector.c:241 in disambiguate_aspx
Shadow bytes around the buggy address:
  0x200ffd5660e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd5660f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x200ffd566130: ca ca ca ca 00 00[04]cb cb cb cb cb 00 00 00 00
  0x200ffd566140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x200ffd566170: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 00 00 00
  0x200ffd566180: 00 00 01 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==14540==ABORTING

The code for disambiguate_aspx(), where the segfault occurs, is:

    const char *disambiguate_aspx(SourceFile *sourcefile) {
      char *p = ohcount_sourcefile_get_contents(sourcefile);
      char *eof = p + ohcount_sourcefile_get_contents_size(sourcefile);
      for (; p < eof; p++) {
        // /<%@\s*Page[^>]+Language="VB"[^>]+%>/
        p = strstr(p, "<%@");
        if (!p)
          break;
        char *pe = strstr(p, "%>");
        if (p && pe) {
          p += 3;
          const int length = pe - p;
          char buf[length];
          strncpy(buf, p, length);
          buf[length] = '\0';
          char *eol = buf + strlen(buf);
          for (p = buf; p < eol; p++) *p = tolower(*p);
          p = buf;
          while (*p == ' ' || *p == '\t') p++;
          if (strncmp(p, "page", 4) == 0) {
            p += 4;
            if (strstr(p, "language=\"vb\""))
              return LANG_VB_ASPX;
          }
        }
      }
      return LANG_CS_ASPX;
    }

Line 241 is the line with:

    buf[length] = '\0';

We see that buf is declared two lines above as a variable length array.
Being a local variable, I assume that it is allocated on the stack,
which is dangerous if its length turns out to be too large for the
stack. Presumably that is the problem.

Cheers,
Michael.
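
P.S. One further reading of the report, offered as a sketch rather than a tested patch: ASan flags a WRITE of size 1 at line 241, and buf is declared as char buf[length], so buf[length] is one byte past the end of the array regardless of how large length is. The shadow bytes agree with this: two fully addressable granules plus one with 4 addressable bytes (00 00 [04]) make a 20-byte array, and the faulting write lands on the first byte after it. Below is a minimal, self-contained illustration of how the copy could be made safe by allocating length + 1 bytes on the heap. The function name, the string constants, and the choice to take a NUL-terminated string instead of a SourceFile are all hypothetical stand-ins, not ohcount's actual API. The sketch also resumes scanning after the closing "%>", which is an assumption about the intended behaviour.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for ohcount's LANG_VB_ASPX / LANG_CS_ASPX. */
#define SKETCH_LANG_VB_ASPX "vb_aspx"
#define SKETCH_LANG_CS_ASPX "cs_aspx"

/* Hypothetical variant of disambiguate_aspx() that takes the file
 * contents as a NUL-terminated string. */
const char *sketch_disambiguate_aspx(const char *contents) {
  const char *p = contents;

  /* Look for each <%@ ... %> directive in turn. */
  while ((p = strstr(p, "<%@")) != NULL) {
    p += 3;
    const char *pe = strstr(p, "%>");
    if (!pe)
      break;

    size_t length = (size_t)(pe - p);

    /* Allocate length + 1 bytes so the terminator fits; a heap
     * allocation also avoids an unbounded stack array. */
    char *buf = malloc(length + 1);
    if (!buf)
      break;

    /* Copy and lower-case the directive body in one pass. */
    for (size_t i = 0; i < length; i++)
      buf[i] = (char)tolower((unsigned char)p[i]);
    buf[length] = '\0'; /* in bounds: buf holds length + 1 bytes */

    /* Skip leading blanks, then test for: page ... language="vb" */
    const char *q = buf;
    while (*q == ' ' || *q == '\t')
      q++;
    int is_vb = strncmp(q, "page", 4) == 0 &&
                strstr(q + 4, "language=\"vb\"") != NULL;

    free(buf);
    if (is_vb)
      return SKETCH_LANG_VB_ASPX;

    p = pe + 2; /* continue after the closing "%>" */
  }
  return SKETCH_LANG_CS_ASPX;
}
```

If keeping the variable length array were preferred, declaring it as char buf[length + 1] would address the one-byte overflow, though the stack-size concern above would remain.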