On 2015-Dec-24, at 10:39 PM, Mark Millard <mar...@dsl-only.net> wrote:

> [I do not know if this partial crash analysis related to on-arm 
> clang-associated activity is good enough and appropriate to submit or not.]
> 
> The /usr/local/arm-gnueabi-freebsd/bin/ar on the rpi2b involved below came 
> from pkg install activity instead of port building. Used as-is.
> 
> When I just tried my first from-rpi2b builds (ports for a rpi2b), 
> /usr/local/arm-gnueabi-freebsd/bin/ar crashed. I believe that the following 
> suggests an alignment error for the type of instructions that memset for 128 
> bytes was translated to (sizeof(mbstate_t)) in the code used by 
> /usr/local/arm-gnueabi-freebsd/bin/ar. (But I do not know how to check SCTLR 
> bit[1] to be directly sure that alignment was being enforced.)
> 
> The crash was a Bus error in /usr/local/arm-gnueabi-freebsd/bin/ar :
> 
>> libtool: link: /usr/local/arm-gnueabi-freebsd/bin/ar cru .libs/libgnuintl.a  
>> bindtextdom.o dcgettext.o dgettext.o gettext.o finddomain.o hash-string.o 
>> loadmsgcat.o localealias.o textdomain.o l10nflist.o explodename.o 
>> dcigettext.o dcngettext.o dngettext.o ngettext.o pluralx.o plural-exp.o 
>> localcharset.o threadlib.o lock.o relocatable.o langprefs.o localename.o 
>> log.o printf.o setlocale.o version.o xsize.o osdep.o intl-compat.o
>> Bus error (core dumped)
>> *** [libgnuintl.la] Error code 138
> 
> It failed in _fseeko doing a memset that turned into uses of "vst1.64 
> {d16-d17}, [r0]" instructions, for an address in register r0 that ended in 
> 0xa4, so was not aligned to 8 byte boundaries. From what I read such "VSTn 
> (multiple n-element structures)" that have .64 require 8 byte alignment. The 
> evidence of the code and register value follow.
> 
>> # gdb /usr/local/arm-gnueabi-freebsd/bin/ar 
>> /usr/obj/portswork/usr/ports/devel/gettext-tools/work/gettext-0.19.6/gettext-tools/intl/ar.core
>> . . .
>> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
>> whence=<value optimized out>, ltest=<value optimized out>) at 
>> /usr/src/lib/libc/stdio/fseek.c:299
>> 299          memset(&fp->_mbstate, 0, sizeof(mbstate_t));
>> . . .
>> (gdb) x/24i 0x2033adb0
>> 0x2033adb0 <_fseeko+836>:    vmov.i32        q8, #0  ; 0x00000000
>> 0x2033adb4 <_fseeko+840>:    movw    r1, #65503      ; 0xffdf
>> 0x2033adb8 <_fseeko+844>:    stm     r4, {r0, r7}
>> 0x2033adbc <_fseeko+848>:    ldrh    r0, [r4, #12]
>> 0x2033adc0 <_fseeko+852>:    and     r0, r0, r1
>> 0x2033adc4 <_fseeko+856>:    strh    r0, [r4, #12]
>> 0x2033adc8 <_fseeko+860>:    add     r0, r4, #216    ; 0xd8
>> 0x2033adcc <_fseeko+864>:    vst1.64 {d16-d17}, [r0]
>> 0x2033add0 <_fseeko+868>:    add     r0, r4, #200    ; 0xc8
>> 0x2033add4 <_fseeko+872>:    vst1.64 {d16-d17}, [r0]
>> 0x2033add8 <_fseeko+876>:    add     r0, r4, #184    ; 0xb8
>> 0x2033addc <_fseeko+880>:    vst1.64 {d16-d17}, [r0]
>> 0x2033ade0 <_fseeko+884>:    add     r0, r4, #168    ; 0xa8
>> 0x2033ade4 <_fseeko+888>:    vst1.64 {d16-d17}, [r0]
>> 0x2033ade8 <_fseeko+892>:    add     r0, r4, #152    ; 0x98
>> 0x2033adec <_fseeko+896>:    vst1.64 {d16-d17}, [r0]
>> 0x2033adf0 <_fseeko+900>:    add     r0, r4, #136    ; 0x88
>> 0x2033adf4 <_fseeko+904>:    vst1.64 {d16-d17}, [r0]
>> 0x2033adf8 <_fseeko+908>:    add     r0, r4, #120    ; 0x78
>> 0x2033adfc <_fseeko+912>:    vst1.64 {d16-d17}, [r0]
>> 0x2033ae00 <_fseeko+916>:    add     r0, r4, #104    ; 0x68
>> 0x2033ae04 <_fseeko+920>:    vst1.64 {d16-d17}, [r0]
>> 0x2033ae08 <_fseeko+924>:    b       0x2033b070 <_fseeko+1540>
>> 0x2033ae0c <_fseeko+928>:    cmp     r5, #0  ; 0x0
>> (gdb) info all-registers
>> r0             0x20651ea4    543497892
>> r1             0xffdf        65503
>> r2             0x0   0
>> r3             0x0   0
>> r4             0x20651dcc    543497676
>> r5             0x0   0
>> r6             0x0   0
>> r7             0x0   0
>> r8             0x20359df4    540384756
>> r9             0x0   0
>> r10            0x0   0
>> r11            0xbfbfb948    -1077954232
>> r12            0x2037b208    540520968
>> sp             0xbfbfb898    -1077954408
>> lr             0x2035a004    540385284
>> pc             0x2033adcc    540257740
>> f0             0     (raw 0x000000000000000000000000)
>> f1             0     (raw 0x000000000000000000000000)
>> f2             0     (raw 0x000000000000000000000000)
>> f3             0     (raw 0x000000000000000000000000)
>> f4             0     (raw 0x000000000000000000000000)
>> f5             0     (raw 0x000000000000000000000000)
>> f6             0     (raw 0x000000000000000000000000)
>> f7             0     (raw 0x000000000000000000000000)
>> fps            0x0   0
>> cpsr           0x60000010    1610612752
> 
> The syntax in use for vst1.64 instructions does not explicitly have the 
> alignment notation. Presuming that the decoding is correct then from what I 
> read the following applies:
> 
>> Home > NEON and VFP Programming > NEON load and store element and structure 
>> instructions > Alignment restrictions in load and store, element and 
>> structure instructions
>> 
>> . . . When the alignment is not specified in the instruction, the alignment 
>> restriction is controlled by the A bit (SCTLR bit[1]):
>>      •       if the A bit is 0, there are no alignment restrictions (except 
>> for strongly ordered or device memory, where accesses must be element 
>> aligned or the result is unpredictable)
>>      •       if the A bit is 1, accesses must be element aligned.
>> If an address is not correctly aligned, an alignment fault occurs.
> 
> So if at the time the "A bit" (SCTLR bit[1]) is 1 then the Bus error would 
> have the context to happen because of the mis-alignment.
> 
> The following shows the make.conf context that explains how 
> /usr/local/arm-gnueabi-freebsd/bin/ar came to be invoked:
> 
>> # more /etc/make.conf 
>> WRKDIRPREFIX=/usr/obj/portswork
>> WITH_DEBUG=
>> WITH_DEBUG_FILES=
>> MALLOC_PRODUCTION=
>> #
>> TO_TYPE=armv6
>> TOOLS_TO_TYPE=arm-gnueabi
>> CROSS_BINUTILS_PREFIX=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/
>> .if ${.MAKE.LEVEL} == 0
>> CC=/usr/bin/clang -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CXX=/usr/bin/clang++ -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> CPP=/usr/bin/clang-cpp -target ${TO_TYPE}--freebsd11.0-gnueabi -march=armv7a
>> .export CC
>> .export CXX
>> .export CPP
>> AS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/as
>> AR=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ar
>> LD=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ld
>> NM=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/nm
>> OBJCOPY=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objcopy
>> OBJDUMP=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/objdump
>> RANLIB=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/ranlib
>> SIZE=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/size
>> #NO-SUCH: STRINGS=/usr/local/${TOOLS_TO_TYPE}-freebsd/bin/strings
>> STRINGS=/usr/local/bin/${TOOLS_TO_TYPE}-freebsd-strings
>> .export AS
>> .export AR
>> .export LD
>> .export NM
>> .export OBJCOPY
>> .export OBJDUMP
>> .export RANLIB
>> .export SIZE
>> .export STRINGS
>> .endif
> 
> 
> Other context:
> 
>> # freebsd-version -ku; uname -aKU
>> 11.0-CURRENT
>> 11.0-CURRENT
>> FreeBSD rpi2 11.0-CURRENT FreeBSD 11.0-CURRENT #0 r292413M: Tue Dec 22 
>> 22:02:21 PST 2015     
>> root@FreeBSDx64:/usr/obj/clang/arm.armv6/usr/src/sys/RPI2-NODBG  arm 1100091 
>> 1100091
> 
> 
> 
> I will note that world and kernel are my own build of -r292413 (earlier 
> experiment) --a build made from an amd64 host context and put in place via 
> DESTDIR=. My expectation would be that the amd64 context would not be likely 
> to have similar alignment restrictions involved in its ar activity (or other 
> activity). That would explain how I got this far using such a clang 3.7 
> related toolchain for targeting an rpi2 before finding such a problem.


Re-reading all of the above, I realized that it seems to suggest that the 
_fseeko code involved is part of /usr/local/arm-gnueabi-freebsd/bin/ar, but 
that was not my intent.

The _fseeko implementation is in libc.so.7, which is from my own buildworld:

Reading symbols from /lib/libc.so.7...Reading symbols from 
/usr/lib/debug//lib/libc.so.7.debug...done.
done.
Loaded symbols for /lib/libc.so.7


head/sys/sys/_types.h has:

/*
 * mbstate_t is an opaque object to keep conversion state during multibyte
 * stream conversions.
 */
typedef union {
        char            __mbstate8[128];
        __int64_t       _mbstateL;      /* for alignment */
} __mbstate_t;

which suggests that the union is implicitly aligned to whatever the 
implementation requires for __int64_t. In the abstract, general case that need 
not be 8-byte alignment, although 8-byte alignment is one possibility.
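
A minimal sketch of how that implicit alignment could be checked on a given 
target, assuming a C11 compiler. The names demo_mbstate_t, 
demo_mbstate_alignment and demo_int64_alignment are hypothetical stand-ins, and 
int64_t from <stdint.h> is used in place of __int64_t; this is not the FreeBSD 
header itself.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for __mbstate_t; int64_t replaces __int64_t. */
typedef union {
    char    mbstate8[128];
    int64_t mbstateL;          /* for alignment */
} demo_mbstate_t;

/* Alignment the compiler requires for the whole union on this target. */
size_t demo_mbstate_alignment(void)
{
    return alignof(demo_mbstate_t);
}

/* Alignment the compiler requires for the 64-bit member alone. */
size_t demo_int64_alignment(void)
{
    return alignof(int64_t);
}

/* The char array has alignment 1, so the 64-bit member sets the union's. */
static_assert(alignof(demo_mbstate_t) == alignof(int64_t),
              "union alignment follows its most strictly aligned member");
```

Printing the two return values on the target in question shows what the 
compiler assumes there. Note that this only reports the alignment the type 
requires; it says nothing about whether a particular object was actually placed 
at such an address.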

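But printing *fp in gdb for the fp argument to _fseeko reports a start address 
for __mbstate8 (0x20651e34) that is not 8-byte aligned either. The faulting 
address in r0 (0x20651ea4) is that start address plus 112, i.e., the last 16 
bytes of the 128, which the generated code stores first: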

> (gdb) bt
> #0  0x2033adcc in _fseeko (fp=0x20651dcc, offset=<value optimized out>, 
> whence=<value optimized out>, ltest=<value optimized out>) at 
> /usr/src/lib/libc/stdio/fseek.c:299
> #1  0x2033b108 in fseeko (fp=0x20651dcc, offset=18571438587904, whence=0) at 
> /usr/src/lib/libc/stdio/fseek.c:82
> #2  0x00016138 in ?? ()
> (gdb) print fp
> $2 = (FILE *) 0x20651dcc
> (gdb) print *fp
> $3 = {_p = 0x2069a240 "", _r = 0, _w = 0, _flags = 5264, _file = 36, _bf = 
> {_base = 0x2069a240 "", _size = 32768}, _lbfsize = 0, _cookie = 0x20651dcc, 
> _close = 0x20359dfc <__sclose>, 
>   _read = 0x20359de4 <__sread>, _seek = 0x20359df4 <__sseek>, _write = 
> 0x20359dec <__swrite>, _ub = {_base = 0x0, _size = 0}, _up = 0x0, _ur = 0, 
> _ubuf = 0x20651e0c "", _nbuf = 0x20651e0f "", _lb = {
>     _base = 0x0, _size = 0}, _blksize = 32768, _offset = 0, _fl_mutex = 0x0, 
> _fl_owner = 0x0, _fl_count = 0, _orientation = 0, _mbstate = {__mbstate8 = 
> 0x20651e34 "", _mbstateL = 0}, _flags2 = 0}

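The overall FILE struct containing the _mbstate field is also not 8-byte 
aligned (0x20651dcc). But the offset from the start of the FILE struct to 
__mbstate8 (0x68, i.e., 104) is a multiple of 8 bytes, so the misalignment of 
__mbstate8 comes from where the FILE struct itself was placed.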
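
The address arithmetic can be checked mechanically. The sketch below only 
restates values copied from the gdb session; the macro and function names 
(FP_ADDR, MBSTATE_ADDR, FAULT_ADDR, misalignment, offset_from) are hypothetical 
and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses copied from the gdb session (32-bit ARM). */
#define FP_ADDR      UINT32_C(0x20651dcc)  /* fp, also in r4 */
#define MBSTATE_ADDR UINT32_C(0x20651e34)  /* fp->_mbstate.__mbstate8 */
#define FAULT_ADDR   UINT32_C(0x20651ea4)  /* r0 at the faulting vst1.64 */

/* Bytes past the previous multiple of align (a power of two); 0 = aligned. */
uint32_t misalignment(uint32_t addr, uint32_t align)
{
    return addr & (align - 1u);
}

/* Offset of one address from a base address. */
uint32_t offset_from(uint32_t base, uint32_t addr)
{
    return addr - base;
}
```

With these, misalignment(FP_ADDR, 8), misalignment(MBSTATE_ADDR, 8) and 
misalignment(FAULT_ADDR, 8) are all 4, while offset_from(FP_ADDR, MBSTATE_ADDR) 
is 104 (0x68) and offset_from(FP_ADDR, FAULT_ADDR) is 216 (0xd8), matching the 
"add r0, r4, #216" in the disassembly. All three addresses are 4-byte aligned.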

My interpretation is that nothing here justifies the following combination in 
the memset implementation:

SCTLR bit[1]==1

mixed with

vst1.64 instructions

That is, one or both need to change unless some way of forcing 8-byte alignment 
is introduced.
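
As an illustration only of what "forcing 8-byte alignment" could look like in 
C11, and not a description of FreeBSD's code or of any actual fix: an explicit 
alignas(8) covers objects the compiler places itself, while storage carved out 
by hand from a larger buffer needs its address rounded up. All names below 
(demo_mbstate8_t, struct demo_file, demo_align_up, demo_clear_static) are 
hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __mbstate_t with an explicit 8-byte request. */
typedef union {
    alignas(8) char mbstate8[128];
    int64_t         mbstateL;
} demo_mbstate8_t;

/* Hypothetical stand-in for FILE: a small field, then the state. */
struct demo_file {
    int             flags;
    demo_mbstate8_t mbstate;
};

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t demo_align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + (align - 1u)) & ~(align - 1u);
}

/* Clear the state of a compiler-placed object; return its address mod 8. */
unsigned demo_clear_static(void)
{
    static struct demo_file f;

    memset(&f.mbstate, 0, sizeof f.mbstate);
    return (unsigned)((uintptr_t)&f.mbstate % 8u);
}
```

Here demo_clear_static() returns 0, and demo_align_up(0x20651dcc, 8) gives 
0x20651dd0, the first 8-byte boundary at or after the fp value seen in the 
crash. The alignas request does not help if an object of the type is placed by 
hand at an address that ignores the type's alignment, which is why the 
rounding helper is shown as well.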

I have not managed to track down anything that would indicate FreeBSD's intent 
for SCTLR bit[1]. I do not even know if it is required by the design to be 
constant (once initialized).


===
Mark Millard
markmi at dsl-only.net


_______________________________________________
freebsd-toolchain@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain
To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"
