Hi Andreas, On Mon, 13 May 2013 10:35:12 +0200, "Andreas Bießmann" <andreas.de...@googlemail.com> wrote:
> Hi Albert, > > On 05/10/2013 05:09 PM, Albert ARIBAUD wrote: > > Hi Andreas, > > > > On Fri, 10 May 2013 12:57:21 +0200, "Andreas Bießmann" > > <andreas.de...@googlemail.com> wrote: > > > >> Hi Albert, > >> > >> On 05/10/2013 11:24 AM, Albert ARIBAUD wrote: > >>> Hi Andreas, > >>> > >>> On Wed, 8 May 2013 11:25:17 +0200, Andreas Bießmann > >>> <andreas.de...@googlemail.com> wrote: > >>> > >>>> Commit 1865286466a5d0c7f2e3c37632da56556c838e9e (Introduce generic link > >>>> section.h symbol files) changed the __bss_end symbol type from char[] to > >>>> ulong. This led to wrong relocation parameters which ended up in a not > >>>> working > >>>> u-boot. Unfortunately this is not clear to see cause due to RAM aliasing > >>>> we > >>>> may get a 'half-working' u-boot then. > >>>> > >>>> Fix this by dereferencing the __bss_end symbol where needed. > >>> > >>> (cc:ing Simon and Tom) > >>> > >>> The dereferencing is correct, so this patch seems good per se (it could > >>> actually have applied when __bss_end was still a char[]). > >> > >> well, as I understood this the __bss_end being a char[] did implicitly > >> take the address when accessing __bss_end (as we do when we have a > >> definition of char foo[2] and we take just 'foo'). But you say here we > >> should reference the address of __bss_end while it was still of type > >> char[]. Sorry, I do not understand that, can you please clarify? > > > > There are several concepts here, some pertaining to the compiler, some > > to the linker. > > > > From the linker viewpoint, a symbol is *always* and *only* an address, > > the first address of the object corresponding to the symbol, and an > > object is just some area in the addressable space. > > > > From the compiler viewpoint, an object has a C type, possibly with an > > initial value, and a name, which is the symbol. The compiler considers > > the name/symbol to be value, not the address of the corresponding > > object... at least most of the time: as you indicate, when the symbol > > denotes a C array, then the C compiler understand the symbol as the > > address of the array. > > > > The __bss_end symbol does not actually correspond to an object in the > > usual sense, since the BSS contains all sorts of data: any C global, > > either uninitialized or initialized with zeroes, whatever its type, > > ends up in BSS. The most general way one can assign a type to BSS > > itself is by considering it as a shapeless array of bytes -- hence the > > char[] definition. > > > > Thus, the C compiler considered the name __bss_end to denote the > > address of the BSS "object", and the C code for AVR32 was correct as it > > was actually referring to the BSS "object"'s address. > > > > When the __bss_end symbol's C type was changed to 'ulong', this changed > > the way the compiler understood the symbol: it now thinks __bss_end is > > the BSS' "value", which has no true sense, and certainly does not mean > > 'the first 4 bytes of BSS considered as a 32-bit value'. > > > > To compensate this, the AVR32 code has to add an & to find the address > > of __bss_end, but the original error is to have changed the "type" of > > the BSS. > > > > IOW, we should *always* take the address of __bss_end, since this is > > the only thing it was defined for. We should never give it a chance to > > even *have* a value at the C level, because we don't want to read, or > > worse, write, the BSS itself; we only want to access C globals in the > > BSS. > > thank you for your detailed explanation. So now its clear why referring > the address of an object of type char[] will also work. > Another question, wouldn't it make sense to declare these C globals as > const then? Indeed, const may help prevent these symbols from being accidentally written into, at least in the most expected cases such as passing &__bss_end to a function expecting a non-const char*. There is, however, a much better way of preventing this and more: just give these symbols a C type of 'struct {}' (empty struct). Since this type has absolutely no field which could be written into or read from, it is completely protected from direct write but also from direct read; and a pointer to it has type "struct {} *" which is incompatible with any other pointer, so any inconsiderate use of it is detected at compile time. I had thought of the 'struct {}' method for linker lists refactoring, when I needed a zero-size type; I finally turned to char[0] instead (and comment on this at line 150 of include/linker_lists.h) because the struct method would cause gcc 4.4 and earlier, such as eldk42, to throw diagnostics like "warning: dereferencing type-punned pointer will break strict-aliasing rules" -- that is the incompatibility I am talking about. Note that the diagnostics did not stem from the empty struct variable declarations as such, but type-casting the address of an empty struct into a pointer to a known, non-empty, struct; I just checked now, and doing an intermediate cast to char* or void* prevents the warnings. Why I failed to find this when I was refactoring linker lists, I'll never know. Of course, there is always a possibility that the area around BSS end be accidentally read or written into if &__bss_end is force-cast into, say, a char*, but that's true whatever type you give __bss_end. Aside from this force-casting, the empty struct approach is IMO the most protective. Thus if one wants to protect symbols in sections.h, I suggest giving them a type of 'struct {}', then building U-Boot with e.g. ELDK42, and fixing all type-punning diagnostics. (and I will 'fix' include/linker_lists.h myself to use empty structs) > Best regards > > Andreas Bießmann Amicalement, -- Albert. _______________________________________________ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot