I have found another ABI compliance bug in the AArch64 backend (arm64-gen.c).

According to AAPCS64, when an argument requires 16-byte alignment and
is passed on the stack, the stack address must be rounded up to the
next 16-byte boundary.
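
For reference, the rounding rule above is ordinary power-of-two
arithmetic. A minimal sketch; round_up is a hypothetical helper name
used only for illustration, not a function in TCC:

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
   Hypothetical helper; align must be a nonzero power of two. */
static uint64_t round_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

For align == 16 this is the familiar (addr + 15) & -16, so an address
that ends in 8 moves forward by 8 bytes and an already aligned address
stays where it is.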

Currently, TCC does not apply this alignment for non-HFA types in
gen_va_arg. It reads the argument directly from the current __stack
pointer of the va_list, ignoring the required padding. This results in
data corruption when a 16-byte aligned argument follows an 8-byte
argument on the stack.
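
To make the failure mode concrete, here is a small host-side model of
the stack walk. It is only a sketch: Cursor, fetch and second_lo are
hypothetical names, the argument area is assumed to start on a 16-byte
boundary, and the padding slot is set to 0 although its real contents
are unspecified.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the va_list stack cursor; not TCC's data structure.
   Offsets are relative to a base assumed to be 16-byte aligned. */
typedef struct { const unsigned char *base; size_t off; } Cursor;

/* Copy one argument out of the area. With honor_align == 0 the padding
   is not skipped, which models the behaviour described above. */
static void fetch(Cursor *c, void *dst, size_t size, size_t align,
                  int honor_align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (honor_align)
        c->off = (c->off + align - 1) & ~(align - 1);
    memcpy(dst, c->base + c->off, size);
    c->off += (size + 7) & ~(size_t)7;  /* slots are multiples of 8 bytes */
}

/* Returns the 'lo' half that the second fetch sees. */
static uint64_t second_lo(int honor_align)
{
    /* Caller-side layout: 8-byte value, 8 bytes of padding, 16-byte struct. */
    uint64_t area[4] = { 0x1122334455667788ULL, 0 /* padding */,
                         0xaaaaaaaaaaaaaaaaULL, 0xbbbbbbbbbbbbbbbbULL };
    Cursor c = { (const unsigned char *)area, 0 };
    uint64_t first, second[2];

    fetch(&c, &first, sizeof first, 8, honor_align);
    fetch(&c, second, sizeof second, 16, honor_align);
    return second[0];
}
```

In this model second_lo(0) reads the padding slot at offset 8, while
second_lo(1) reads the struct at offset 16.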

I have checked x86_64-gen.c and riscv64-gen.c, and I did not observe
similar issues in those backends.
----------------------------------
Reproduction Code:
#include <stdarg.h>
#include <stdio.h>
#include <stdint.h>

typedef struct __attribute__((aligned(16))) A16 {
    uint64_t lo;
    uint64_t hi;
} A16;

static int check(int dummy, ...)
{
    va_list ap;
    uint64_t first;
    A16 second;

    va_start(ap, dummy);
    // The first argument takes 8 bytes on the stack
    // (if registers are exhausted).
    first = va_arg(ap, uint64_t);

    // The second argument requires 16-byte alignment.
    // TCC currently reads from offset 8 instead of offset 16 (padding ignored).
    second = va_arg(ap, A16);
    va_end(ap);

    if (first != 0x1122334455667788ULL)
        return 1;
    if (second.lo != 0xaaaaaaaaaaaaaaaaULL ||
        second.hi != 0xbbbbbbbbbbbbbbbbULL)
        return 2;
    return 0;
}

int main(void)
{
    // On Mach-O/Apple Silicon, variadic args are passed entirely on the
    // stack, so this call reaches the stack path of va_arg directly.
    // Other AArch64 targets pass the first variadic args in registers.
    A16 v = { 0xaaaaaaaaaaaaaaaaULL, 0xbbbbbbbbbbbbbbbbULL };
    if (check(0, 0x1122334455667788ULL, v) != 0) {
        puts("FAIL");
        return 1;
    }
    puts("OK");
    return 0;
}
------------------------
The patch:
diff --git a/arm64-gen.c b/arm64-gen.c
index 2038aeba..bbe63fa6 100644
--- a/arm64-gen.c
+++ b/arm64-gen.c
@@ -1355,6 +1355,10 @@ ST_FUNC void gen_va_arg(CType *t)
         o(0x540000ad); // b.le .+20
 #endif
         o(0xf9400000 | r1 | r0 << 5); // ldr x(r1),[x(r0)] // __stack
+        if (align == 16) {
+            o(0x91003c00 | r1 | r1 << 5); // add x(r1),x(r1),#15
+            o(0x927cec00 | r1 | r1 << 5); // and x(r1),x(r1),#-16
+        }
         o(0x9100001e | r1 << 5 | n << 10); // add x30,x(r1),#(n)
         o(0xf900001e | r0 << 5); // str x30,[x(r0)] // __stack
 #if !defined(TCC_TARGET_MACHO)


With this change applied, the reproduction code above produces the
correct result.
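
For anyone who wants to double-check the two opcode constants in the
patch, the following sketch rebuilds the ADD encoding and decodes the
AND bitmask on the host. enc_add_imm and and_imm_mask are hypothetical
helpers written for this mail, and the decoder only handles the
64-bit (N == 1) form of the logical immediate.

```c
#include <assert.h>
#include <stdint.h>

/* A64 "ADD Xd, Xn, #imm12" (64-bit, unshifted): base opcode 0x91000000. */
static uint32_t enc_add_imm(unsigned rd, unsigned rn, unsigned imm12)
{
    assert(rd < 32 && rn < 32 && imm12 < 4096);
    return 0x91000000u | imm12 << 10 | rn << 5 | rd;
}

/* Decode the mask of a 64-bit "AND Xd, Xn, #imm" with N == 1:
   (imms + 1) low one bits, rotated right by immr. */
static uint64_t and_imm_mask(uint32_t insn)
{
    unsigned immr = (insn >> 16) & 63;
    unsigned imms = (insn >> 10) & 63;
    uint64_t ones;

    assert((insn >> 22) & 1);  /* other element sizes are not modelled */
    ones = imms == 63 ? ~0ULL : (1ULL << (imms + 1)) - 1;
    return immr ? (ones >> immr) | (ones << (64 - immr)) : ones;
}
```

With register fields of zero, enc_add_imm(0, 0, 15) gives 0x91003c00,
and and_imm_mask(0x927cec00) gives 0xfffffffffffffff0, i.e. -16,
matching the comments in the patch.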

_______________________________________________
Tinycc-devel mailing list
[email protected]
https://lists.nongnu.org/mailman/listinfo/tinycc-devel
