https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230857
Bjoern A. Zeeb <b...@freebsd.org> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |d...@freebsd.org Keywords| |toolchain --- Comment #5 from Bjoern A. Zeeb <b...@freebsd.org> --- I'll start describing the problem from a reduced piece of code, which is not as big as carp, replicating carpstats, assuming VIMAGE is on: -------- #include <sys/param.h> #include <sys/kernel.h> #include <sys/systm.h> #include <sys/types.h> #include <sys/mbuf.h> #include <sys/counter.h> #include <net/vnet.h> struct xstats { uint64_t foo1; uint64_t bar1; uint64_t baz1; uint64_t mad1; uint64_t foo2; uint64_t bar2; uint64_t baz2; uint64_t mad2; uint64_t foo3; uint64_t bar3; uint64_t baz3; uint64_t mad3; uint64_t foo4; uint64_t bar4; uint64_t baz4; uint64_t mad4; }; VNET_PCPUSTAT_DEFINE(struct xstats, xstats); VNET_PCPUSTAT_SYSINIT(xstats); -------- This unrolls into: 1 struct _hack; 2 counter_u64_t vnet_entry_xstats[sizeof(struct xstats) / sizeof(uint64_t)] __attribute__((__section__("set_vnet"))) __attribute__((__used__)); 3 4 5 static void 6 vnet_xstats_init(const void *unused) 7 { 8 do { 9 for (int i = 0; i < (sizeof((*(__typeof(vnet_entry_xstats) *) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) & vnet_entry_xstats))) / sizeof(counter_u64_t)); i++) 10 ((*(__typeof(vnet_entry_xstats) *) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) & vnet_entry_xstats)))[i] = counter_u64_alloc((0x0002)); 11 } while (0); 12 } In essance what looks so complicated is (on a per vnet base): void *array[16]; for (i=0 ; i<16; i++) array[i] = alloc(); The above (with the vnet bits), on i386, is translated into: 00000340 <vnet_xstats_init>: 340: 55 push %ebp 341: 89 e5 mov %esp,%ebp 343: 56 push %esi 344: 50 push %eax 345: be c0 ff ff ff mov $0xffffffc0,%esi 34a: 90 nop 34b: 90 nop 34c: 90 nop 34d: 90 nop 34e: 90 nop 34f: 90 nop 350: c7 04 24 02 00 00 00 movl $0x2,(%esp) 357: e8 fc ff ff ff call 358 <vnet_xstats_init+0x18> 358: R_386_PC32 counter_u64_alloc 35c: 64 8b 0d 00 00 00 00 mov %fs:0x0,%ecx 363: 8b 89 1c 03 00 00 mov 0x31c(%ecx),%ecx 369: 8b 49 1c mov 0x1c(%ecx),%ecx 36c: 89 84 31 88 14 00 00 mov %eax,0x1488(%ecx,%esi,1) 36f: R_386_RELATIVE *ABS* 373: 83 c6 04 add $0x4,%esi 376: 75 d8 jne 350 <vnet_xstats_init+0x10> 378: 83 c4 04 add $0x4,%esp 37b: 5e pop %esi 37c: 5d pop %ebp 37d: c3 ret Now here's the problem: __start_set_vnet is 0x1448 __stop_set_vnet is 0x1488 The problem is that the code generated goes like this: %esi = -64 repeat: %eax = alloc() We do all the curthread->td_vnet->vnet_data_base in %ecx and then do mov %eax,0x1488(%ecx,%esi,1) Which is: move the alloc() result into curthread->td_vnet->vnet_data_base + 0x1488 + (1 * -64) Now 0x1488 - 64 gets us to the beginning of the array[] or array[0]. %esi += 4 So next iteration it'll be 0x1488 - 60 or array[1] ... and so on. while %esi != 0 goto repeat; It's an easy way to to the for(i=0; i<16; i++) loop. The problem is that 0x1488 was not relocated. When we are going over the relocations and calling into elf_relocaddr() the check for VIMAGE is: if (x >= ef->vnet_start && x < ef->vnet_stop) { In our case we have an *ABS* R_386_RELATIVE of 0x1488, which is == ef->vnet_stop but not < ef->vnet_stop. The real problem is that with non-simple-types the code generated with an absolute relocation might be just outside the range. We cannot adjust the check as there might be a simple-type following in the next section which would then be relocated. For CARP this showed up because the VNET_PCPUSTAT_DEFINE() went into the VNET section the last and hence the problem showed up. If there was any other, say int V_Foo after it, we'd never have noticed. We cannot fully control the order in which symbols go into the section, or at least not to the extend we'd like to, so we cannot make sure there's always a char at the end. The only solution arichardson and I agreed to would be to add 1 byte of padding to the end of the section. Using BYTE(1) in a linker script however would always create a set_vnet section in all kernel modules even if they do not have any virtualized objects. Using a https://sourceware.org/binutils/docs-2.31/ld/Output-Section-Discarding.html#Output-Section-Discarding . = . + <0|sym>; should not create the section. This leads to a multi-stage solution: (1) add a symbol based on SIZEOF(set_vnet) which is either 0 or 1. SIZEOF() does not create a section if it does not exist. (2) pass the *(set_vnet) through and then increment "." by the amount of the symbol from (1); which if there is a section it will increase the section by 1 byte and any non-simple-type objects resulting in absolute relocations on the boundary should also be relocated. If there is no section . = . + 0; will not create the empty section. The problem is that the symbol from (1) relies on the section size to be known which we don't seem to in the first pass. The second problem is that the symbol from (1) is not immediately visible (despite it should me; XXX sourceware reference for that) So we really have to do multiple linker invocations with different linker scripts. With ld.bfd an empty set_vnet section will not show up, only the symbol from (1) will be left behind and that's fine. With ld.lld 6 it will also leave an empty set_vnet section behind; emaste points me to https://reviews.llvm.org/rL325887 where this had been under discussion already with dim, him, and lld developers. It's not beautiful, but should not harm either (XXX to be tested). We need to exclude "firmware" from the dance and for now make it i386 specific. The entire problem described above for VNET also applies to DPCPU (set_pcpu). I'll attached a preliminary patch which seems to hack this together and once I added a bit more description etc to the files, I'll put it into Phab as well (if you have comments earlier let me know). I hope this all kind-of makes any sense ;-) -- You are receiving this mail because: You are on the CC list for the bug. _______________________________________________ freebsd-toolchain@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"