>Synopsis:  <alignment fault on armv7 (omap) using carp(4)>
>Category:  arm
>Environment:
    System      : OpenBSD 5.9
    Details     : OpenBSD 5.9 (DBGGENERIC) #0: Sat Feb  6 12:22:27 EST 2016
             r...@beagle2.mit.edu:/usr/src/sys/arch/armv7/compile/DBGGENERIC

    Architecture: OpenBSD.armv7
    Machine     : armv7
>Description:
    With two BeagleBone Blacks running -current, an alignment fault is
    encountered at ip_input.c:262 in ipv4_input() when they are
    configured to use carp(4) to share the same IP address.

    Source context from ip_input.c (alignment fault occurs when
    ip->ip_dst.s_addr is loaded at line 262):

258:            ip = mtod(m, struct ip *);
259:    }
260:
261:    /* 127/8 must not appear on wire - RFC1122 */
262:    if ((ntohl(ip->ip_dst.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET ||
263:       (ntohl(ip->ip_src.s_addr) >> IN_CLASSA_NSHIFT) == IN_LOOPBACKNET) {
264:            if ((ifp->if_flags & IFF_LOOPBACK) == 0) {
265:                    ipstat.ips_badaddr++;
266:                    goto bad;

    ddb(4) output:

$ Fatal kernel mode data abort: 'Alignment Fault 1'
trapframe: 0xcb2d8e40
DFSR=00000001, DFAR=c4cb401e, spsr=80000013
r0 =c924d400, r1 =00000003, r2 =00000045, r3 =00000038
r4 =c4cb400e, r5 =c06f2ca4, r6 =00000014, r7 =c4d65800
r8 =c0710e50, r9 =c069294c, r10=c0692918, r11=cb2d8eb8
r12=60000093, ssp=cb2d8e8c, slr=c040bc88, pc =c04616ec

Stopped at      ipv4_input+0x9c:        ldrls   r3, [r4, #0x010]
ddb> trace
ipv4_input+0xc
        scp=0xc046165c rlv=0xc0461ab4 (ipintr+0x24)
        rsp=0xcb2d8ebc rfp=0xcb2d8ecc
        r10=0xc0692918 r8=0xc0710e50 r7=0xc06edd88 r6=0xc06edd88
        r5=0x00000000 r4=0x00000004
ipintr+0xc
        scp=0xc0461a9c rlv=0xc041b290 (netintr+0xa0)
        rsp=0xcb2d8ed0 rfp=0xcb2d8ef0
netintr+0xc
        scp=0xc041b1fc rlv=0xc053f3d0 (softintr_dispatch+0x84)
        rsp=0xcb2d8ef4 rfp=0xcb2d8f10
        r7=0x00000000 r6=0xc0710eb4 r5=0xc0710ec0 r4=0xc89e13a0
softintr_dispatch+0x18
        scp=0xc053f364 rlv=0xc053eef8 (arm_do_pending_intr+0x110)
        rsp=0xcb2d8f14 rfp=0xcb2d8f40
        r6=0xc0710190 r5=0x20000013 r4=0x00000004
arm_do_pending_intr+0x10
        scp=0xc053edf8 rlv=0xc040d9a8 (if_input_process+0xcc)
        rsp=0xcb2d8f44 rfp=0xcb2d8f78
        r10=0xc0692918 r9=0x00000000 r8=0x00000000 r7=0xcb2d8f44
        r6=0x00000000 r5=0xc4d65800 r4=0xc4d57480
if_input_process+0xc
        scp=0xc040d8e8 rlv=0xc03b5c2c (taskq_thread+0x90)
        rsp=0xcb2d8f7c rfp=0xcb2d8fb0
        r10=0xc06e643c r8=0xc06e65d8 r7=0xcb2d8f7c r6=0x00000001
        r5=0xc89e2040 r4=0xc03b5b04
taskq_thread+0xc
        scp=0xc03b5ba8 rlv=0xc0536c10 (proc_trampoline+0x18)
        rsp=0xcb2d8fb4 rfp=0xc07f3edc
        r7=0x00000000 r6=0x00000000 r5=0xc89e2040 r4=0xc03b5b9c
Bad frame pointer: 0xc07f3edc

    This problem has also been encountered with both boards running -stable.
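
    As a rough cross-check of the register dump above, the sketch below
    redoes the address arithmetic. The struct is a hypothetical stand-in
    for the fixed IPv4 header layout (not the kernel's struct ip), and it
    assumes r4 holds the ip pointer, as the faulting instruction
    "ldrls r3, [r4, #0x010]" suggests.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of the fixed 20-byte IPv4 header; only the field
 * offsets matter here, so the names are illustrative.
 */
struct ip_hdr_layout {
    uint8_t  vhl;    /* version and header length */
    uint8_t  tos;
    uint16_t len;
    uint16_t id;
    uint16_t off;
    uint8_t  ttl;
    uint8_t  proto;
    uint16_t sum;
    uint32_t src;    /* offset 12 */
    uint32_t dst;    /* offset 16 */
};

/* Values copied from the ddb(4) register dump. */
#define FAULT_DFAR 0xc4cb401eU   /* faulting data address */
#define FAULT_R4   0xc4cb400eU   /* base register of the faulting load */

/* Byte offset of the faulting access relative to the base register. */
static uint32_t
fault_offset(void)
{
    return FAULT_DFAR - FAULT_R4;
}

/* Address modulo 4; nonzero means misaligned for a 32-bit load. */
static uint32_t
misalignment(uint32_t addr)
{
    return addr & 3U;
}
```

    Under those assumptions the faulting address is r4 + 0x10, which is
    the offset of the destination address field, and both r4 and DFAR sit
    2 bytes past a 4-byte boundary.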

>How-To-Repeat:
    Install either -current or -stable on two BeagleBone Blacks, with names
    beagle1 and beagle2. On a LAN 192.168.123.0/24 with default
    gateway 192.168.123.2, set /etc/mygate to 192.168.123.2 on beagle1 and
    beagle2, then set /etc/hostname.cpsw0 on beagle1 to be

inet 192.168.123.201 255.255.255.0 NONE

    and on beagle2

inet 192.168.123.202 255.255.255.0 NONE

    then run the following commands on both to use carp(4):

doas ifconfig carp0 create
doas ifconfig carp0 vhid 1 pass tyrell carpdev cpsw0 192.168.123.222 netmask 255.255.255.0

    Shortly thereafter, one of the boards encounters an alignment fault.

>Fix:
    The cause of this problem is unknown to me. I speculate that the
    issue lies in m_pullup mishandling alignment: networking on the
    BeagleBone Black otherwise functions normally, and there are
    branches before the crash in which m_pullup is used to derive the
    pointer to ip, which is apparently misaligned when carp(4) is in use.
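
    For background on how a misaligned ip pointer can arise at all: an
    Ethernet header is 14 bytes long, so the IP header that follows it is
    4-byte aligned only when the frame itself starts 2 bytes past a word
    boundary (BSD kernels call this pad ETHER_ALIGN). The sketch below
    illustrates the arithmetic. The frame start address 0xc4cb4000 is an
    assumption derived from r4 - 14, not something the trace confirms,
    and the macro and function names are illustrative.

```c
#include <stdint.h>

#define ETH_HDR_LEN 14U   /* dst MAC (6) + src MAC (6) + ethertype (2) */
#define ETH_PAD      2U   /* conventional 2-byte pad before the frame */

/* Address of the IP header for a frame starting at frame_start. */
static uint32_t
ip_hdr_addr(uint32_t frame_start)
{
    return frame_start + ETH_HDR_LEN;
}

/* Nonzero if addr is suitable for an aligned 32-bit load. */
static int
is_word_aligned(uint32_t addr)
{
    return (addr & 3U) == 0;
}
```

    With a word-aligned frame start such as the assumed 0xc4cb4000, the
    IP header lands at 0xc4cb400e, matching r4 in the trace; with the
    2-byte pad applied it would land on a word boundary instead. Whether
    the missing pad comes from m_pullup, the driver, or the carp(4) input
    path is not established here.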

    In investigating this issue further, I replaced offending 32-bit loads
    in the kernel with calls to get_unaligned_le32(), defined as (from
    linux/unaligned/packed_struct.h):

struct __una_u32 { u32 x; } __packed;
static inline u32 get_unaligned_le32(const void *p) {
    const struct __una_u32 *ptr = (const struct __una_u32 *)p;
    return ptr->x;
}
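
    For readers who want to try the technique outside the kernel, here is
    a self-contained sketch of the same packed-struct load, next to a
    memcpy-based variant that does not rely on compiler extensions. The
    names load_u32_packed and load_u32_memcpy are illustrative, uint32_t
    stands in for the kernel's u32, and the packed attribute assumes GCC
    or Clang. Neither variant swaps bytes, so the result is in host byte
    order.

```c
#include <stdint.h>
#include <string.h>

/* Packed wrapper: tells the compiler the member may sit at any address. */
struct una_u32 {
    uint32_t x;
} __attribute__((packed));

/* Load a 32-bit value from a possibly unaligned address (packed struct). */
static inline uint32_t
load_u32_packed(const void *p)
{
    const struct una_u32 *ptr = (const struct una_u32 *)p;
    return ptr->x;
}

/* Same load via memcpy; compilers typically emit byte-safe accesses. */
static inline uint32_t
load_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}
```

    Both helpers should return the same value for any readable address,
    including one that is offset by 1, 2, or 3 bytes from a word boundary.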

    In addition to the replacements in ip_input.c, I changed udp_usrreq.c
    and the macros IN6_IS_ADDR_UNSPECIFIED, IN6_IS_ADDR_LOOPBACK,
    IN6_IS_ADDR_V4COMPAT, and IN6_IS_ADDR_V4MAPPED in in6.h.

    With these changes carp(4) appeared to function normally, but beagle1
    and beagle2 repeatedly lost networking for short periods, and
    recurrent 'device timeout' messages appeared in dmesg (along with
    carp(4) messages reporting state changes from master to backup and
    vice versa).

    To me, that behavior suggests the problem may run deeper than a
    bookkeeping mistake in mbuf alignment.

dmesg:
OpenBSD 5.9 (DBGGENERIC) #0: Sat Feb  6 12:22:27 EST 2016
    r...@beagle2.mit.edu:/usr/src/sys/arch/armv7/compile/DBGGENERIC
real mem  = 536870912 (512MB)
avail mem = 518074368 (494MB)
warning: no entropy supplied by boot loader
mainbus0 at root
cpu0 at mainbus0: ARM Cortex A8 R3 rev 2 (ARMv7 core)
cpu0: DC enabled IC enabled WB disabled EABT branch prediction enabled
cpu0: 32KB(64b/l,4way) I-cache, 32KB(64b/l,4way) wr-back D-cache
omap0 at mainbus0: TI AM335x BeagleBone
prcm0 at omap0 rev 0.2
sitaracm0 at omap0: control module, rev 1.0
intc0 at omap0 rev 5.0
edma0 at omap0 rev 0.0
dmtimer0 at omap0 rev 3.1
dmtimer1 at omap0 rev 3.1
omdog0 at omap0 rev 0.1
omgpio0 at omap0: rev 0.1
gpio0 at omgpio0: 32 pins
omgpio1 at omap0: rev 0.1
gpio1 at omgpio1: 32 pins
omgpio2 at omap0: rev 0.1
gpio2 at omgpio2: 32 pins
omgpio3 at omap0: rev 0.1
gpio3 at omgpio3: 32 pins
omap0: device tiiic unit 0 not configured
omap0: device tiiic unit 1 not configured
omap0: device tiiic unit 2 not configured
ommmc0 at omap0
sdmmc0 at ommmc0
ommmc1 at omap0
sdmmc1 at ommmc1
com0 at omap0: ti16750, 64 byte fifo
com0: console
cpsw0 at omap0: version 1.12 (0), address 84:eb:18:e4:61:3a
ukphy0 at cpsw0 phy 0: Generic IEEE 802.3u media interface, rev. 1: OUI 0x0001f0, model 0x000f
scsibus0 at sdmmc0: 2 targets, initiator 0
sd0 at scsibus0 targ 1 lun 0: <SD/MMC, Drive #01, > SCSI2 0/direct fixed
sd0: 30436MB, 512 bytes/sector, 62333952 sectors
scsibus1 at sdmmc1: 2 targets, initiator 0
sd1 at scsibus1 targ 1 lun 0: <SD/MMC, Drive #01, > SCSI2 0/direct fixed
sd1: 3648MB, 512 bytes/sector, 7471104 sectors
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
boot device: sd0
root on sd0a (c38fd352429a26ad.a) swap on sd0b dump on sd0b
WARNING: / was not properly unmounted
WARNING: CHECK AND RESET THE DATE!
