[I am again breaking off another section of older material.]

Mixed news, I'm afraid.

The couple of ports that I attempted did build. They are the same ones that 
originally got the Bus Error in ar (indirectly, via _fseeko and memset) that I 
reported. So I expect that you fixed one error.

But when I tried buildworld, clang++ 3.7, while processing 
usr/src/lib/clang/libllvmtablegen/ materials, quickly got a Bus Error at nearly 
the same type of instruction (it has a "!" below that the earlier one did not), 
but with r4 holding the misaligned address this time:

> --- _bootstrap-tools-lib/clang/libllvmsupport ---
> --- APFloat.o ---
> clang++: error: unable to execute command: Bus error (core dumped)
> . . .
> # gdb clang++ usr/src/lib/clang/libllvmtablegen/clang++.core
> . . .
> Core was generated by `clang++'.
> Program terminated with signal 10, Bus error.
> #0  0x00c3bb9c in clang::DependentTemplateSpecializationType::DependentTemplateSpecializationType ()
> [New Thread 22a18000 (LWP 100128/<unknown>)]
> (gdb) x/40i 0x00c3bb60
> . . .
> 0xc3bb9c <_ZN5clang35DependentTemplateSpecializationTypeC2ENS_21ElaboratedTypeKeywordEPNS_19NestedNameSpecifierEPKNS_14IdentifierInfoEjPKNS_16TemplateArgumentENS_8QualTypeE+356>:    vst1.64   {d16-d17}, [r4]!
> . . .
> (gdb) info all-registers
> r0             0xbfbf81a8     -1077968472
> r1             0x22f07e14     586186260
> r2             0xc416bc       12850876
> r3             0x2    2
> r4             0x22f07dfc     586186236
> . . .


Thus it appears that there is more code around that likely produces pointers 
that are not aligned strictly enough for the code generation that is in use 
for what they point to.

At this point I have no clue if the issue is just inside clang itself vs. if it 
is in something that clang is layered on top of. Nor if there is just one bad 
thing or many.

Note: I had not yet tried buildworld/buildkernel for the context of the "-f" 
option that I was experimenting with earlier. So I do not have a direct 
comparison at this point.



Older material:

On 2015-Dec-25, at 5:21 PM, Mark Millard <mar...@dsl-only.net> wrote:

> On 2015-Dec-25, at 3:42 PM, Warner Losh <i...@bsdimp.com> wrote:
> 
> 
>> On Dec 25, 2015, at 3:14 PM, Mark Millard <mar...@dsl-only.net> wrote:
>> 
>> [I'm going to move much of the earlier "original material" text to the tail 
>> of the message.]
>> 
>>> On 2015-Dec-25, at 11:53 AM, Warner Losh <i...@bsdimp.com> wrote:
>>> 
>>> So what happens if we actually fix the underlying bug?
>>> 
>>> I see two ways of doing this. In findfp.c, we allocate an array of FILE *
>>> today like:
>>>     g = (struct glue *)malloc(sizeof(*g) + ALIGNBYTES + n * sizeof(FILE));
>>> but that assumes that FILE just has normal pointer alignment requirements.
>>> However, due to the mbstate having int64_t alignment requirements, this is
>>> wrong. Maybe we need to do something like
>>>     g = (struct glue *)malloc(sizeof(*g) + max(sizeof(int64_t),ALIGNBYTES)
>>>         + n * sizeof(FILE));
>>> which wouldn’t change anything on LP64 systems, but would result in proper
>>> alignment for ILP32 systems. We’d have to fix the loop that uses ALIGN
>>> afterwards to use roundup. Instead, we’d need to round up to the nearest
>>> 8-byte aligned offset (or technically, the max of ALIGNBYTES and 8, but
>>> that’s always 8 on today’s systems). If we do this, we can make sure that
>>> each file is 8-byte aligned or better. We may need to round up sizeof(FILE)
>>> to a multiple of 8 as well. I believe that since it has the 8-byte alignment
>>> for a member, its size must be a multiple of 8, but I’ve not chased that
>>> belief to ground. If not, we may need another decorator (__aligned(8), I
>>> think, spelled with the ugly max expression above). That way, the contract
>>> we’re making with the compiler will always be true. ALIGNBYTES is 4 on Arm
>>> anyway, so that bit is clearly wrong.
>>> 
>>> This wouldn’t be an ABI change, since you can only get a valid FILE * from
>>> fopen (and friends), plus stdin, stdout, and stderr. Those addresses aren’t
>>> hard coded into binaries, so even if we have to tweak the last three and
>>> deal with some ‘fake’ FILE abuse in libc (which I don’t think suffers from
>>> this issue, btw, given the alignment requirements that would naturally
>>> follow from something on the stack), we’d still be ahead. At least for all
>>> CONFORMING implementations[*]...
>>> 
>>> TL;DR: Why not make FILE * always 8-byte aligned? The compiler options are
>>> a band-aid.
>>> 
>>> Warner
>>> 
>>> [*] There’s at least one popular package that has a copy of the FILE
>>> structure in one of its .h files and uses that to do unnatural optimization
>>> things, but even that’s cool, I think, since it never allocates a new one.
>>> 
>> 
>> The ARM documentation mentions cases of 16 byte alignment requirements. I've 
>> no clue if the clang code generation ever creates such code. There might be 
>> wider requirements possible in arm code as well. (I'm not an arm expert.) As 
>> an example of an implication: "The malloc() function returns a pointer to a 
>> block of at least size bytes suitably aligned for any use." In other words: 
>> aligned to some figure that is a multiple of *every* alignment requirement 
>> that the code generator can produce, possibly being the least common 
>> multiple.
>> 
>> "-fmax-type-align=. . ." is a means of controlling/limiting the range of 
>> potential alignments to no more than a fixed, predefined value. Above that 
>> value the code generation has to work in smaller accesses and 
>> build up/split up bigger values. Using "-fmax-type-align=. . ." allows 
>> defining a figure as part of an ABI that is then not subject to code 
>> generator updates that could increase the maximum alignment figure and break 
>> things: It turns off such new capabilities. Other options need not work that 
>> way to preserve the ABI.
> 
> That’s true, as far as it goes… But I’m not sure it goes far enough. The 
> premise here is that the problem is wide-spread, when in fact I think it is 
> quite narrow.
> 
>> But in the most fundamental terms process wise as far as I can tell. . .
>> 
>> While the FILE case that occurred is a specific example, every 
>> memory-allocation-like operation is a potential issue for all such 
>> "allocated" objects where the related code generation requires alignment to 
>> avoid Bus Error (given the SCTLR bit[1] in use).
> 
> The problem isn’t general. The problem isn’t malloc. Malloc will generally 
> return the right thing on arm (and if it doesn’t,
> then we need to make sure it does).
> 
> The problem is we get a boatload of FILEs from the system all at once, and 
> those are misaligned because of a bug in the code. One that’s fixed, I 
> believe, in https://reviews.freebsd.org/D4708.
> 
> 
>> How many other places in FreeBSD might sometimes return mis-aligned pointers 
>> for the existing code generation and ABI combination?
> 
> It isn’t an ABI thing, just a code bug thing. The only reason it was an issue 
> was due to the optimizing nature of clang.
> 
> We’ve had to deal with the arm alignment issues for years. I wager there are 
> very few indeed. The only reason this was brought to light was better 
> code-gen from clang.
> 
>> How many other places are subject to breakage when "internal" 
>> structs/unions/fields involved are changed to be of a different size because 
>> the code is not fully auto-adjusting to match the code generation yet --even 
>> if right now "it works"? How fragile will things be for future work?
> 
> If there are others, I’ll bet they could be counted on one hand since very 
> few things do the ‘slab’ allocator that FILE does.
> 
>> What would it take to find out and deal with them all? (I do not have the 
>> background knowledge to span much.)
>> 
>> My experiment avoided potentially changing parts of the ABI and also avoided 
>> dealing with such a "lots of code to investigate" issue. It may not be the 
>> long term 11.0-RELEASE solution. Even if not, it may be appropriate for 
>> various temporary purposes that need to avoid Bus Errors in the process. For 
>> example if Ian has a good reason to use clang 3.7 instead of gcc 4.2.1.
> 
> The review above doesn’t change the ABI either.
> 
>> Other notes:
>> 
>>> I believe that since it has the 8-byte alignment
>>> for a member, its size must be a multiple of 8
>> 
>> There are some C/C++ language rules about the address of a structure 
>> equalling the address of the first field, uniformity of the offsets, and the 
>> like. But. . .
>> 
>> The C and C++ languages specify no specific numerical alignment figures, not 
>> even relative to specific sizeof(...) expressions. To use an old example: a 
>> 68010 only needs alignment for >= 2 byte things and even alignment is all 
>> that is then required. Some other contexts take a lot more to meet the 
>> specifications. There are some implications of the modern memory model(s) 
>> created to cover concurrency explicitly, such as avoiding interactions that 
>> can happen via, for example, separate objects (in part) sharing a cache 
>> line. (I've only looked at C++ for this, and only to a degree.)
>> 
>> The detailed alignment rules are more "implementation defined" than 
>> "predefined by the standard". But the definition is trying to meet language 
>> criteria. It is not a fully independent choice.
> 
> Many of them are actually defined by a combination of the standard language 
> definition, as well as the ABI standard. This is why we know that mbstate_t 
> must be 8 byte aligned.
> 
>> Maybe some other standards that FreeBSD is tied to specify more specifics, 
>> such as an N byte integer always aligning to some multiple of N (a waste on 
>> the 68010), with the alignment of any union or struct that it is a part of 
>> tracking that. But such rules force padding that may or may not be required 
>> to meet the language's more abstract criteria and such rules may not match 
>> the existing/in-use ABI.
> 
> It is all spelled out in the ARM EABI docs.
> 
>> So far as I can tell explicitly declared alignments may well be necessary. 
>> If that one "popular package", say, formed an array of FILE copies then the 
>> resultant alignments need not all match the ones produced by your example 
>> code unless the FILE declaration forces the compiler to match, causing 
>> sizeof(FILE) to track as well. FILE need not be the only such issue.
> 
> Arrays of FILEs isn’t an issue (except that it encodes the size of FILE into 
> the app). It’s the specifically quirky way that libc does it that’s the 
> problem.
> 
>> My background and reference material are mostly tied to the languages --and 
>> so my notes tend to be limited to that much context.
> 
> Understood. While there may be issues with alignment still, tossing a big 
> hammer at the problem because they might exist will likely mean they will 
> persist far longer than fixing them one at a time. When we first ported to 
> arm, there were maybe half a dozen places that needed fixing. I doubt there’s 
> more now.
> 
> Can you try the patch in the above code review w/o the -f switch and let me 
> know if it works for you?
> 
> Warner

buildworld/buildkernel has been started on amd64 for a rpi2 target. That and 
install kernel/world and starting up a port rebuild on the rpi2 and waiting for 
it means it will be a few hours even if I start the next thing just as each 
prior thing finishes. I may give up and go to sleep first.

As for presumptions: I'll take your word on expected status of things. I've no 
clue. But absent even the hearsay status information at the time I did not 
presume that what was in front of me was all there is to worry about --nor did 
I try to go figure it all out on my own. I took a path to cover both 
possibilities for local-only vs. more-wide-spread (so long as that path did not 
force a split-up of some larger form of atomic action).

In my view "-mno-unaligned-access" is an even bigger hammer than I used. I find 
no clang statement about what its ABI consequences would be, unlike for what I 
did: What mix of more padding for alignment vs. more but smaller accesses? But 
as I remember I've seen "-mno-unaligned-access" in use in ports and the like so 
its consequences may be familiar material for some folks.

Absent any questions about ABI consequences "-mno-unaligned-access" does well 
mark the expected SCTLR bit[1] status, far better than what I did. Again: I was 
covering my ignorance while making any significant investigation/debugging as 
unlikely as I could.


> Original material:
> 
>> On Dec 25, 2015, at 7:24 AM, Mark Millard <mar...@dsl-only.net> wrote:
>> 
>> [Good News Summary: Rebuilding buildworld/buildkernel for rpi2 11.0-CURRENT 
>> 292413 from amd64 based on adding -fmax-type-align=4 has so far removed the 
>> crashes during the toolchain activity: no more misaligned accesses in libc's 
>> _fseeko or elsewhere.]
>> 
>> On 2015-Dec-25, at 12:31 AM, Mark Millard <mar...@dsl-only.net> wrote:
>> 
>>> On 2015-Dec-24, at 10:39 PM, Mark Millard <mar...@dsl-only.net> wrote:
>>> 
>>>> [I do not know if this partial crash analysis related to on-arm 
>>>> clang-associated activity is good enough and appropriate to submit or not.]
>>>> 
>>>> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came 
>>>> from pkg install activity instead of port building. Used as-is.
>>>> 
>>>> When I just tried my first from-rpi2b builds (ports for a rpi2b), 
>>>> /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the 
>>>> following suggests an alignment error for the type of instructions that 
>>>> memset for 128 bytes was translated to (sizeof(mbstate_t)) in the code 
>>>> used by /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to 
>>>> check SCTLR bit[1] to be directly sure that alignment was being enforced.)
>>>> 
>>>> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
>>>> 
>>>>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru 
>>>>> .libs/libgnuintl.a  bindtextdom.o dcgettext.o dgettext.o gettext.o 
>>>>> finddomain.o hash-string.o loadmsgcat.o localealias.o textdomain.o 
>>>>> l10nflist.o explodename.o dcigettext.o dcngettext.o dngettext.o 
>>>>> ngettext.o pluralx.o plural-exp.o localcharset.o threadlib.o lock.o 
>>>>> relocatable.o langprefs.o localename.o log.o printf.o setlocale.o 
>>>>> version.o xsize.o osdep.o intl-compat.o
>>>>> Bus error (core dumped)
>>>>> *** [libgnuintl.la] Error code 138
>>>> 
>>>> It failed in _fseeko doing a memset that turned into uses of "vst1.64      
>>>> {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 
>>>> 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn 
>>>> (multiple n-element structures)" that have .64 require 8 byte alignment. 
>>>> The evidence of the code and register value follow.
>>>> 
>>>>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar 
>>>>> /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>>>>> . . .
>>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
>>>>> whence=<value optimized out>, ltest=<value optimized out>) at 
>>>>> /usr/src/lib/libc/stdio/fseek.c:299
>>>>> 299               memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>>>>> . . .
>>>>> (gdb) x/24i 0x2033adb0
>>>>> 0x2033adb0 <_fseeko+836>: vmov.i32        q8, #0  ; 0x00000000
>>>>> 0x2033adb4 <_fseeko+840>: movw    r1, #65503      ; 0xffdf
>>>>> 0x2033adb8 <_fseeko+844>: stm     r4, {r0, r7}
>>>>> 0x2033adbc <_fseeko+848>: ldrh    r0, [r4, #12]
>>>>> 0x2033adc0 <_fseeko+852>: and     r0, r0, r1
>>>>> 0x2033adc4 <_fseeko+856>: strh    r0, [r4, #12]
>>>>> 0x2033adc8 <_fseeko+860>: add     r0, r4, #216    ; 0xd8
>>>>> 0x2033adcc <_fseeko+864>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add0 <_fseeko+868>: add     r0, r4, #200    ; 0xc8
>>>>> 0x2033add4 <_fseeko+872>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033add8 <_fseeko+876>: add     r0, r4, #184    ; 0xb8
>>>>> 0x2033addc <_fseeko+880>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade0 <_fseeko+884>: add     r0, r4, #168    ; 0xa8
>>>>> 0x2033ade4 <_fseeko+888>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ade8 <_fseeko+892>: add     r0, r4, #152    ; 0x98
>>>>> 0x2033adec <_fseeko+896>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf0 <_fseeko+900>: add     r0, r4, #136    ; 0x88
>>>>> 0x2033adf4 <_fseeko+904>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033adf8 <_fseeko+908>: add     r0, r4, #120    ; 0x78
>>>>> 0x2033adfc <_fseeko+912>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae00 <_fseeko+916>: add     r0, r4, #104    ; 0x68
>>>>> 0x2033ae04 <_fseeko+920>: vst1.64 {d16-d17}, [r0]
>>>>> 0x2033ae08 <_fseeko+924>: b       0x2033b070 <_fseeko+1540>
>>>>> 0x2033ae0c <_fseeko+928>: cmp     r5, #0  ; 0x0
>>>>> (gdb) info all-registers
>>>>> r0             0x20651ea4 543497892
>>>>> r1             0xffdf     65503
>>>>> r2             0x0        0
>>>>> r3             0x0        0
>>>>> r4             0x20651dcc 543497676
>>>>> r5             0x0        0
>>>>> r6             0x0        0
>>>>> r7             0x0        0
>>>>> r8             0x20359df4 540384756
>>>>> r9             0x0        0
>>>>> r10            0x0        0
>>>>> r11            0xbfbfb948 -1077954232
>>>>> r12            0x2037b208 540520968
>>>>> sp             0xbfbfb898 -1077954408
>>>>> lr             0x2035a004 540385284
>>>>> pc             0x2033adcc 540257740
>>>>> f0             0  (raw 0x000000000000000000000000)
>>>>> f1             0  (raw 0x000000000000000000000000)
>>>>> f2             0  (raw 0x000000000000000000000000)
>>>>> f3             0  (raw 0x000000000000000000000000)
>>>>> f4             0  (raw 0x000000000000000000000000)
>>>>> f5             0  (raw 0x000000000000000000000000)
>>>>> f6             0  (raw 0x000000000000000000000000)
>>>>> f7             0  (raw 0x000000000000000000000000)
>>>>> fps            0x0        0
>>>>> cpsr           0x60000010 1610612752
>>>> 
>>>> The syntax in use for vst1.64 instructions does not explicitly have the 
>>>> alignment notation. Presuming that the decoding is correct then from what 
>>>> I read the following applies:
>>>> 
>>>>> Home > NEON and VFP Programming > NEON load and store element and 
>>>>> structure instructions > Alignment restrictions in load and store, 
>>>>> element and structure instructions
>>>>> 
>>>>> . . . When the alignment is not specified in the instruction, the 
>>>>> alignment restriction is controlled by the A bit (SCTLR bit[1]):
>>>>>   •       if the A bit is 0, there are no alignment restrictions (except 
>>>>> for strongly ordered or device memory, where accesses must be element 
>>>>> aligned or the result is unpredictable)
>>>>>   •       if the A bit is 1, accesses must be element aligned.
>>>>> If an address is not correctly aligned, an alignment fault occurs.
>>>> 
>>>> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would 
>>>> have the context to happen because of the mis-alignment.
>>>> 
>>>> The following shows the make.conf context that explains how 
>>>> /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
>>>> 
>>>>> # more /etc/make.conf
>>>>> WRKDIRPREFIX=/usr/obj/portswork
>>>>> WITH_DEBUG=
>>>>> WITH_DEBUG_FILES=
>>>>> MALLOC_PRODUCTION=
>>>>> #
>>>>> TO_TYPE=armv6
>>>>> TOOLS_TO_TYPE=arm-gnueabi
>>>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>>>> .if ${.MAKE.LEVEL} == 0
>>>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>>>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>>>> -march=armv7a
>>>>> .export CC
>>>>> .export CXX
>>>>> .export CPP
>>>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>>>> .export AS
>>>>> .export AR
>>>>> .export LD
>>>>> .export NM
>>>>> .export OBJCOPY
>>>>> .export OBJDUMP
>>>>> .export RANLIB
>>>>> .export SIZE
>>>>> .export STRINGS
>>>>> .endif
>>>> 
>>>> 
>>>> Other context:
>>>> 
>>>>> # freebsd-version -ku; uname -aKU
>>>>> 11.0-CURRENT
>>>>> 11.0-CURRENT
>>>>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 
>>>>> 22:02:21 PST 2015     
>>>>> root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 
>>>>> 1100091 1100091
>>>> 
>>>> 
>>>> 
>>>> I will note that world and kernel are my own build of -r292413 (earlier 
>>>> experiment) --a build made from an amd64 host context and put in place via 
>>>> DESTDIR=. My expectation would be that the amd64 context would not be 
>>>> likely to have similar alignment restrictions involved in its ar activity 
>>>> (or other activity). That would explain how I got this far using such a 
>>>> clang 3.7 related toolchain for targeting an rpi2 before finding such a 
>>>> problem.
>>> 
>>> 
>>> I realized on re-reading all of the above that it seems to suggest that the 
>>> _fseeko code involved is from /usr/local/arm-gnueabi-freebsd/bin/ar but 
>>> that was not my intent.
>>> 
>>> libc.so.7 is from my buildworld, including the fseeko implementation:
>>> 
>>> Reading symbols from /lib/libc.so.7...Reading symbols from 
>>> /usr/lib/debug//lib/libc.so.7.debug...done.
>>> done.
>>> Loaded symbols for /lib/libc.so.7
>>> 
>>> 
>>> head/sys/sys/_types.h has:
>>> 
>>> /*
>>> * mbstate_t is an opaque object to keep conversion state during multibyte
>>> * stream conversions.
>>> */
>>> typedef union {
>>>   char            __mbstate8[128];
>>>   __int64_t       _mbstateL;      /* for alignment */
>>> } __mbstate_t;
>>> 
>>> suggesting an implicit alignment of the union to whatever the 
>>> implementation defines for __int64_t --which need not be 8 byte alignment 
>>> (in the abstract, general case). But 8 byte alignment is a possibility as 
>>> well (in the abstract).
>>> 
>>> But printing *fp in gdb for the fp argument to _fseeko reports the same 
>>> not-8-byte aligned address for __mbstate8 that was in r0:
>>> 
>>>> (gdb) bt
>>>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
>>>> whence=<value optimized out>, ltest=<value optimized out>) at 
>>>> /usr/src/lib/libc/stdio/fseek.c:299
>>>> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) 
>>>> at /usr/src/lib/libc/stdio/fseek.c:82
>>>> #2  0x00016138 in ?? ()
>>>> (gdb) print fp
>>>> $2 = (FILE *) 0x20651dcc
>>>> (gdb) print *fp
>>>> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = 
>>>> {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 
>>>> 0x20651dcc, _close = 0x20359dfc <__sclose>,
>>>> _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 
>>>> 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, 
>>>> _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>>>> _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, 
>>>> _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 
>>>> 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}
>>> 
>>> The overall FILE struct containing the _mbstate field is also not 8-byte 
>>> aligned. But the offset from the start of the FILE struct to __mbstate8 is 
>>> a multiple of 8 bytes.
>>> 
>>> It is my interpretation that there is nothing here to justify the memset 
>>> implementation combination:
>>> 
>>> SCTLR bit[1]==1
>>> 
>>> mixed with
>>> 
>>> vst1.64 instructions
>>> 
>>> I.e.: one or both needs to change unless some way of forcing 8-byte 
>>> alignment is introduced.
>>> 
>>> I have not managed to track down anything that would indicate FreeBSD's 
>>> intent for SCTLR bit[1]. I do not even know if it is required by the design 
>>> to be constant (once initialized).
>> 
>> 
>> I have (so far) removed the build tool crashes based on adding 
>> -fmax-type-align=4 to avoid the misaligned accesses. Details follow.
>> 
>> src.conf on amd64 for the rpi2 targeting buildworld/buildkernel now looks 
>> like:
>> 
>>> # more ~/src.configs/src.conf.rpi2-clang.amd64-host
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> FROM_TYPE=amd64
>>> TOOLS_FROM_TYPE=x86_64
>>> VERSION_CONTEXT=11.0
>>> #
>>> KERNCONF=RPI2-NODBG
>>> TARGET=arm
>>> .if ${.MAKE.LEVEL} == 0
>>> TARGET_ARCH=${TO_TYPE}
>>> .export TARGET_ARCH
>>> .endif
>>> #
>>> WITHOUT_CROSS_COMPILER=
>>> #
>>> # For WITH_BOOT= . . .
>>> # arm-gnueabi-freebsd/bin/ld reports bootinfo.o: relocation 
>>> R_ARM_MOVW_ABS_NC against `a local symbol' can not be used when making a 
>>> shared object; recompile with -fPIC
>>> WITHOUT_BOOT=
>>> #
>>> WITH_FAST_DEPEND=
>>> WITH_LIBCPLUSPLUS=
>>> WITH_CLANG=
>>> WITH_CLANG_IS_CC=
>>> WITH_CLANG_FULL=
>>> WITH_LLDB=
>>> WITH_CLANG_EXTRAS=
>>> #
>>> WITHOUT_LIB32=
>>> WITHOUT_GCC=
>>> WITHOUT_GNUCXX=
>>> #
>>> NO_WERROR=
>>> MALLOC_PRODUCTION=
>>> #CFLAGS+= -DELF_VERBOSE
>>> #
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> #
>>> # TOOLS_TO_TYPE based on ${TO_TYPE}-xtoolchain-gcc related binutils...
>>> #
>>> #CROSS_TOOLCHAIN=${TO_TYPE}-gcc
>>> X_COMPILER_TYPE=clang
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> XCC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>> -fmax-type-align=4
>>> XCXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>> -fmax-type-align=4
>>> XCPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>> -march=armv7a -fmax-type-align=4
>>> .export XCC
>>> .export XCXX
>>> .export XCPP
>>> XAS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> XAR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> XLD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> XNM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> XOBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> XOBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> XRANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> XSIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: XSTRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> XSTRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export XAS
>>> .export XAR
>>> .export XLD
>>> .export XNM
>>> .export XOBJCOPY
>>> .export XOBJDUMP
>>> .export XRANLIB
>>> .export XSIZE
>>> .export XSTRINGS
>>> .endif
>>> #
>>> # Host compiler stuff:
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CXX=/usr/bin/clang++ -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> CPP=/usr/bin/clang-cpp -B/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_FROM_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_FROM_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>> 
>> make.conf for during the on-rpi2 port builds now looks like:
>> 
>>> $ more /etc/make.conf
>>> WRKDIRPREFIX=/usr/obj/portswork
>>> WITH_DEBUG=
>>> WITH_DEBUG_FILES=
>>> MALLOC_PRODUCTION=
>>> #
>>> TO_TYPE=armv6
>>> TOOLS_TO_TYPE=arm-gnueabi
>>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>>> .if ${.MAKE.LEVEL} == 0
>>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>> -fmax-type-align=4
>>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a 
>>> -fmax-type-align=4
>>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi 
>>> -march=armv7a -fmax-type-align=4
>>> .export CC
>>> .export CXX
>>> .export CPP
>>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>>> .export AS
>>> .export AR
>>> .export LD
>>> .export NM
>>> .export OBJCOPY
>>> .export OBJDUMP
>>> .export RANLIB
>>> .export SIZE
>>> .export STRINGS
>>> .endif
>> 
>> 
>> 
>> ===
>> Mark Millard
>> markmi at dsl-only.net
>> 
>> 
>> 
>> _______________________________________________
>> freebsd-toolchain@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
>> To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"


