https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230857

Bjoern A. Zeeb <b...@freebsd.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |d...@freebsd.org
           Keywords|                            |toolchain

--- Comment #5 from Bjoern A. Zeeb <b...@freebsd.org> ---
I'll start describing the problem from a reduced piece of code, which is not as
big as carp, replicating carpstats, assuming VIMAGE is on:

--------
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/systm.h>
#include <sys/types.h>
#include <sys/mbuf.h>
#include <sys/counter.h>
#include <net/vnet.h>

struct xstats {
        uint64_t        foo1;
        uint64_t        bar1;
        uint64_t        baz1;
        uint64_t        mad1;

        uint64_t        foo2;
        uint64_t        bar2;
        uint64_t        baz2;
        uint64_t        mad2;

        uint64_t        foo3;
        uint64_t        bar3;
        uint64_t        baz3;
        uint64_t        mad3;

        uint64_t        foo4;
        uint64_t        bar4;
        uint64_t        baz4;
        uint64_t        mad4;
};

VNET_PCPUSTAT_DEFINE(struct xstats, xstats);
VNET_PCPUSTAT_SYSINIT(xstats);
--------

This unrolls into:

 1 struct _hack;
 2 counter_u64_t   vnet_entry_xstats[sizeof(struct xstats) / sizeof(uint64_t)]
__attribute__((__section__("set_vnet"))) __attribute__((__used__));
 3
 4
 5 static void
 6 vnet_xstats_init(const void *unused)
 7 {
 8         do {
 9                 for (int i = 0; i < (sizeof((*(__typeof(vnet_entry_xstats)
*) (((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) &
vnet_entry_xstats))) / sizeof(counter_u64_t)); i++)
10                         ((*(__typeof(vnet_entry_xstats) *)
(((((__curthread())->td_vnet))->vnet_data_base) + (uintptr_t) &
vnet_entry_xstats)))[i] = counter_u64_alloc((0x0002));
11         } while (0);
12 }


In essance what looks so complicated is (on a per vnet base):
        void *array[16];
        for (i=0 ; i<16; i++)
                array[i] = alloc();


The above (with the vnet bits), on i386, is translated into:

00000340 <vnet_xstats_init>:
 340:   55                      push   %ebp
 341:   89 e5                   mov    %esp,%ebp
 343:   56                      push   %esi
 344:   50                      push   %eax
 345:   be c0 ff ff ff          mov    $0xffffffc0,%esi
 34a:   90                      nop
 34b:   90                      nop
 34c:   90                      nop
 34d:   90                      nop
 34e:   90                      nop
 34f:   90                      nop
 350:   c7 04 24 02 00 00 00    movl   $0x2,(%esp)
 357:   e8 fc ff ff ff          call   358 <vnet_xstats_init+0x18>
                        358: R_386_PC32 counter_u64_alloc
 35c:   64 8b 0d 00 00 00 00    mov    %fs:0x0,%ecx
 363:   8b 89 1c 03 00 00       mov    0x31c(%ecx),%ecx
 369:   8b 49 1c                mov    0x1c(%ecx),%ecx
 36c:   89 84 31 88 14 00 00    mov    %eax,0x1488(%ecx,%esi,1)
                        36f: R_386_RELATIVE     *ABS*
 373:   83 c6 04                add    $0x4,%esi
 376:   75 d8                   jne    350 <vnet_xstats_init+0x10>
 378:   83 c4 04                add    $0x4,%esp
 37b:   5e                      pop    %esi
 37c:   5d                      pop    %ebp
 37d:   c3                      ret


Now here's the problem:

__start_set_vnet is 0x1448
__stop_set_vnet is 0x1488


The problem is that the code generated goes like this:

%esi = -64
repeat:
%eax = alloc()
We do all the curthread->td_vnet->vnet_data_base in %ecx and then do
    mov    %eax,0x1488(%ecx,%esi,1)   
    Which is:
    move the alloc() result into curthread->td_vnet->vnet_data_base + 0x1488 +
(1 * -64)
    Now 0x1488 - 64 gets us to the beginning of the array[] or array[0].

%esi += 4
    So next iteration it'll be 0x1488 - 60 or array[1] ... and so on.
while %esi != 0 goto repeat;


It's an easy way to to the for(i=0; i<16; i++) loop.

The problem is that 0x1488 was not relocated.

When we are going over the relocations and calling into elf_relocaddr() the
check for VIMAGE is:

      if (x >= ef->vnet_start && x < ef->vnet_stop) {

In our case we have an *ABS* R_386_RELATIVE of 0x1488, which is ==
ef->vnet_stop but not < ef->vnet_stop.


The real problem is that with non-simple-types the code generated with an
absolute relocation might be just outside the range.  We cannot adjust the
check as there might be a simple-type following in the next section which would
then be relocated.



For CARP this showed up because the VNET_PCPUSTAT_DEFINE() went into the VNET
section the last and hence the problem showed up.  If there was any other, say
int V_Foo after it, we'd never have noticed.   We cannot fully control the
order in which symbols go into the section, or at least not to the extend we'd
like to, so we cannot make sure there's always a char at the end.

The only solution arichardson and I agreed to would be to add 1 byte of padding
to the end of the section.


Using BYTE(1) in a linker script however would always create a set_vnet section
in all kernel modules even if they do not have any virtualized objects.

Using a
https://sourceware.org/binutils/docs-2.31/ld/Output-Section-Discarding.html#Output-Section-Discarding
. = . + <0|sym>; should not create the section.

This leads to a multi-stage solution:

(1) add a symbol based on SIZEOF(set_vnet) which is either 0 or 1.   SIZEOF()
does not create a section if it does not exist.
(2) pass the *(set_vnet) through and then increment "." by the amount of the
symbol from (1); which if there is a section it will increase the section by 1
byte and any non-simple-type objects resulting in absolute relocations on the
boundary should also be relocated.  If there is no section . = . + 0; will not
create the empty section.

The problem is that the symbol from (1) relies on the section size to be known
which we don't seem to in the first pass.
The second problem is that the symbol from (1) is not immediately visible
(despite it should me; XXX sourceware reference for that)

So we really have to do multiple linker invocations with different linker
scripts.

With ld.bfd an empty set_vnet section will not show up, only the symbol from
(1) will be left behind and that's fine.
With ld.lld 6 it will also leave an empty set_vnet section behind;  emaste
points me to https://reviews.llvm.org/rL325887 where this had been under
discussion already with dim, him, and lld developers.   It's not beautiful, but
should not harm either (XXX to be tested).


We need to exclude "firmware" from the dance and for now make it i386 specific.

The entire problem described above for VNET also applies to DPCPU (set_pcpu).

I'll attached a preliminary patch which seems to hack this together and once I
added a bit more description etc to the files, I'll put it into Phab as well
(if you have comments earlier let me know).



I hope this all kind-of makes any sense ;-)

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
freebsd-toolchain@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"

Reply via email to