Hi Janneke,

I just found the bug in tinycc that caused our ARM bootstrap to fail.
I use the following reproducer:

    int main() { double f = 1.0; return 0; }

and then invoke "tcc a.c" on 32-bit ARM using your patched tcc (it's also broken in the unpatched tcc). A tcc cross compiler is not enough: tcc itself has to be an ARM EABI executable.

I get a bus error here:

    0x24698 <init_putv+1688>  vstr d0, [r0]

Debugging some more, I find:

tccgen.c:

    /* store a value or an expression directly in global data or in local array */
    static void init_putv(CType *type, Section *sec, unsigned long c)
    {
        [...]
        size = type_size(type, &align);
        section_reserve(sec, c + size); // c == 0, size == 8
        ptr = sec->data + c; // sec->data == 0x24b01e, c == 0

        switch(bt) {
            /* XXX: when cross-compiling we assume that each type has the
               same representation on host and target, which is likely to
               be wrong in the case of long double */
        case VT_BOOL:
            vtop->c.i = vtop->c.i != 0;
        case VT_BYTE:
            *(char *)ptr = vtop->c.i;
            break;
        case VT_SHORT:
            *(short *)ptr = vtop->c.i;
            break;
        case VT_FLOAT:
            *(float*)ptr = vtop->c.f;
            break;
        case VT_DOUBLE:
            *(double *)ptr = vtop->c.d;
            break;
        [... and so on]

tccelf.c:

    /* reserve at least 'size' bytes from section start */
    ST_FUNC void section_reserve(Section *sec, unsigned long size)
    {
        if (size > sec->data_allocated) // both 8
            section_realloc(sec, size);
        if (size > sec->data_offset) // both 8
            sec->data_offset = size;
    }

Nothing here makes sure that the VFP double is aligned to 8 bytes. And indeed, (0x24b01e % 8) == 6, not 0.

Alignment checking could be disabled on the CPU

https://developer.arm.com/documentation/ddi0464/f/system-control/register-descriptions/system-control-register

but I don't think EABI wants that.
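To make the alignment arithmetic above concrete, here is a minimal sketch. misalignment() is a hypothetical helper written for this mail, not something that exists in tinycc; it only reproduces the "% 8" check for power-of-two alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of tinycc): number of bytes by which
   'addr' lies past the previous 'align'-byte boundary.
   A result of 0 means the address is aligned.
   'align' must be a non-zero power of two. */
static size_t misalignment(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)(addr & (align - 1));
}
```

With the value of sec->data seen in the debugger, misalignment(0x24b01e, 8) is 6, so the pointer handed to the vstr is not 8-byte aligned (and not even 4-byte aligned).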
tinycc does have:

    /* reserve at least 'size' bytes aligned per 'align' in section
       'sec' from current offset, and return the aligned offset */
    ST_FUNC size_t section_add(Section *sec, addr_t size, int align)
    {
        size_t offset, offset1;

        offset = (sec->data_offset + align - 1) & -align;
        offset1 = offset + size;
        if (sec->sh_type != SHT_NOBITS && offset1 > sec->data_allocated)
            section_realloc(sec, offset1);
        sec->data_offset = offset1;
        if (align > sec->sh_addralign)
            sec->sh_addralign = align;
        return offset;
    }

But that's not used for init_putv. And section_reserve, which is used, doesn't care about alignment at all.

(It seems there's a reserved part and a data part in each section, and the elements of the data part are aligned, but the elements of the reserved part are NOT. I don't see how sec->data would be aligned by the dynamic memory allocator either.)

Other notes:

tccgen.c even has this:

> /* XXX: when cross-compiling we assume that each type has the
>    same representation on host and target, which is likely to
>    be wrong in the case of long double */

Yeah, and even when NOT cross-compiling, the alignment is wrong, which means it sometimes won't work at all on ARM, depending on luck.

As a workaround, we can patch tcc to do the assignments to elements on the stack and then copy those over, instead of doing

    *(double *)ptr = vtop->c.d

(which emits VFP instructions that expect double-aligned pointers).
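A minimal sketch of what that workaround could look like. The helper names put_double_unaligned() and get_double_unaligned() are made up for illustration and are not existing tinycc functions, and this is not the actual patch. It assumes the compiler building tcc implements memcpy to a void pointer as a copy with no alignment requirement.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of tinycc): store a double at 'ptr'
   even when 'ptr' is not 8-byte aligned. The value first lives in an
   aligned stack temporary and is then copied over bytewise. */
static void put_double_unaligned(void *ptr, double value)
{
    double tmp = value;

    assert(ptr != NULL);
    memcpy(ptr, &tmp, sizeof tmp);
}

/* Hypothetical counterpart: read a double back from a possibly
   unaligned address, again going through a stack temporary. */
static double get_double_unaligned(const void *ptr)
{
    double tmp;

    assert(ptr != NULL);
    memcpy(&tmp, ptr, sizeof tmp);
    return tmp;
}
```

In init_putv the VT_DOUBLE case would then call something like put_double_unaligned(ptr, vtop->c.d) instead of storing through a cast pointer. The same pattern would presumably be needed for the other cases wider than one byte.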