On Mon, 1 Jun 2020 at 23:52, Nick Desaulniers <ndesaulni...@google.com> wrote: > > On Mon, Jun 1, 2020 at 12:05 AM Will Deacon <w...@kernel.org> wrote: > > > > On Wed, May 27, 2020 at 02:41:04PM +0100, Lorenzo Pieralisi wrote: > > > On Tue, May 26, 2020 at 09:21:57PM +0100, Will Deacon wrote: > > > > Hi Lorenzo, Hanjun, [+Nick] > > > > > > > > On Thu, May 21, 2020 at 06:37:38PM +0100, Lorenzo Pieralisi wrote: > > > > > On Thu, May 21, 2020 at 11:09:53AM +0100, Will Deacon wrote: > > > > > > Hi folks, > > > > > > > > > > > > I just tried booting the arm64 for-kernelci branch under QEMU > > > > > > (version > > > > > > 4.2.50 (v4.2.0-779-g4354edb6dcc7)) with UBSAN enabled, and I see a > > > > > > couple of NULL pointer dereferences reported at boot. I think > > > > > > they're > > > > > > both GIC related (log below). I don't see a panic with UBSAN > > > > > > disabled, > > > > > > so something's fishy here. > > > > > > > > > > May I ask you the QEMU command line please - just to make sure I can > > > > > replicate it. > > > > > > > > As it turns out, I'm only able to reproduce this when building with > > > > Clang, > > > > but I don't know whether that's because GCC is missing something of > > > > Clang > > > > is signalling a false positive. You also don't need all of those whacky > > > > fuzzing options enabled. > > > > > > > > Anyway, to reproduce: > > > > > > > > $ git checkout for-next/kernelci > > > > $ make ARCH=arm64 CC=clang CROSS_COMPILE=aarch64-linux-gnu- defconfig > > > > <then do a menuconfig and enable UBSAN> > > > > $ make ARCH=arm64 CC=clang CROSS_COMPILE=aarch64-linux-gnu- Image > > > > > > > > I throw that at QEMU using: > > > > > > > > qemu-system-aarch64 -M virt -machine virtualization=true \ > > > > -machine virt,gic-version=3 \ > > > > -cpu max,sve=off -smp 2 -m 4096 \ > > > > -drive if=pflash,format=raw,file=efi.img,readonly \ > > > > -drive if=pflash,format=raw,file=varstore.img \ > > > > -drive if=virtio,format=raw,file=disk.img \ > > > > -device virtio-scsi-pci,id=scsi0 \ > > > > -device virtio-rng-pci \ > > > > -device virtio-net-pci,netdev=net0 \ > > > > -netdev user,id=net0,hostfwd=tcp::8222-:22 \ > > > > -nographic \ > > > > -kernel ~/work/linux/arch/arm64/boot/Image \ > > > > -append "earlycon root=/dev/vda2" > > > > > > > > I built QEMU a while ago according to: > > > > > > > > https://mirrors.edge.kernel.org/pub/linux/kernel/people/will/docs/qemu/qemu-arm64-howto.html > > > > > > > > and its version 4.2.50 (v4.2.0-779-g4354edb6dcc7). > > > > > > > > My clang is version 11.0.1. > > > > > > Thanks a lot Will. > > > > > > I *think* I was right - it is the ACPI_OFFSET() macro: > > > > > > #define ACPI_OFFSET(d, f) ACPI_PTR_DIFF (&(((d *) 0)->f), (void *) 0) > > > > > > that triggers the warnings (I suspected it because at least in one of > > > the warnings I could not see any dereference of any dynamically > > > allocated data). > > > > Cheers, Lorenzo. > > > > > Now on what to do with it - thoughts welcome. > > > > Nick -- any idea what to do about the above? The '#define' pasted by > > Lorenzo is causing a couple of spurious UBSAN splats when compiling with > > clang 11. > > If there's undefined behavior from that macro soup, we should be able > to reproduce it outside of the kernel and regardless of target > architecture, right? The macros aren't too much to throw into a file: > > ```foo.c > #define acpi_uintptr_t void * > #define ACPI_CAST_PTR(t, p) ((t *) (acpi_uintptr_t) (p)) > typedef unsigned char u8; > typedef unsigned long u64; > typedef u64 acpi_size; > #define ACPI_PTR_DIFF(a, b) ((acpi_size) (ACPI_CAST_PTR (u8, (a)) - > ACPI_CAST_PTR (u8, (b)))) > #define ACPI_OFFSET(d, f) ACPI_PTR_DIFF (&(((d *) 0)->f), (void *) 0) > > struct foo { > unsigned char bar; > int baz; > }; > > int main() { > return ACPI_OFFSET(struct foo, baz); > } > ``` > I think looks right. If we run that through -E, and clean that up > further, we get: > ```bar.c > typedef unsigned char u8; > typedef unsigned long u64; > > struct foo { > unsigned char bar; > int baz; > }; > > int main() { > return ((u64) (((u8 *) (void *) ((&(((struct foo *) 0)->baz)))) - ((u8 > *) (void *) (((void *) 0))))); > } > ``` > I may be miscounting my parentheses, but how do you take the address > of `type`->`member`? What does that even mean? > > + some more sanitizer folks and Ard for ACPI. > > anyways, running foo.c through a compiler: > $ clang -O2 foo.c -fsanitize=undefined > $ ./a.out > foo.c:15:12: runtime error: member access within null pointer of type > 'struct foo' > SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior foo.c:15:12 in > > (msg looks truncated, wtf) > > Anyways, it looks like the address of member from NULL subexpression > looks problematic. I wonder if offsetof can be used here? > > #define ACPI_OFFSET(d, f) ACPI_PTR_DIFF (offsetof(d, f), (void *) 0) > > Seems to work in my basic test case. Untested in the kernel. > > IIUC, ACPI_OFFSET is trying to calculate the difference between the > offset of a member of a struct and 0? Isn't that the tautology `x - 0 > == x`?
No. ACPI_OFFSET() is just a poor person's version of offsetof(). (Note that it calculates the difference between &(((d *) 0)->f) and (void *)0x0, so the 0x0 term is there on both sides)