On Fri Jul 5, 2024 at 3:12 PM AEST, David Gibson wrote: > On Fri, Jul 05, 2024 at 02:40:19PM +1000, Nicholas Piggin wrote: > > On Fri Jul 5, 2024 at 11:41 AM AEST, David Gibson wrote: > > > On Fri, Jul 05, 2024 at 11:18:47AM +1000, Nicholas Piggin wrote: > > > > On Thu Jul 4, 2024 at 10:15 PM AEST, Peter Maydell wrote: > > > > > On Sat, 29 Jun 2024 at 04:17, David Gibson > > > > > <da...@gibson.dropbear.id.au> wrote: > > > > > > > > > > > > On Fri, Jun 28, 2024 at 04:20:02PM +0100, Peter Maydell wrote: > > > > > > > On Thu, 27 Jun 2024 at 14:39, Akihiko Odaki > > > > > > > <akihiko.od...@daynix.com> wrote: > > > > > > > > > > > > > > > > FDT properties are aligned by 4 bytes, not 8 bytes. > > > > > > > > > > > > > > > > Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com> > > > > > > > > --- > > > > > > > > hw/ppc/vof.c | 2 +- > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/hw/ppc/vof.c b/hw/ppc/vof.c > > > > > > > > index e3b430a81f4f..b5b6514d79fc 100644 > > > > > > > > --- a/hw/ppc/vof.c > > > > > > > > +++ b/hw/ppc/vof.c > > > > > > > > @@ -646,7 +646,7 @@ static void vof_dt_memory_available(void > > > > > > > > *fdt, GArray *claimed, uint64_t base) > > > > > > > > mem0_reg = fdt_getprop(fdt, offset, "reg", &proplen); > > > > > > > > g_assert(mem0_reg && proplen == sizeof(uint32_t) * (ac + > > > > > > > > sc)); > > > > > > > > if (sc == 2) { > > > > > > > > - mem0_end = be64_to_cpu(*(uint64_t *)(mem0_reg + > > > > > > > > sizeof(uint32_t) * ac)); > > > > > > > > + mem0_end = ldq_be_p(mem0_reg + sizeof(uint32_t) * ac); > > > > > > > > } else { > > > > > > > > mem0_end = be32_to_cpu(*(uint32_t *)(mem0_reg + > > > > > > > > sizeof(uint32_t) * ac)); > > > > > > > > } > > > > > > > > > > > > > > I did wonder if there was a better way to do what this is doing, > > > > > > > but neither we (in system/device_tree.c) nor libfdt seem to > > > > > > > provide one. > > > > > > > > > > > > libfdt does provide unaligned access helpers (fdt32_ld() etc.), but > > > > > > not an automatic aligned-or-unaligned helper. Maybe we should add > > > > > > that? > > > > > > > > > > fdt32_ld() and friends only do the "load from this bit of memory" > > > > > part, which we already have QEMU utility functions for (and which > > > > > are this patch uses). > > > > > > > > > > This particular bit of code is dealing with an fdt property ("memory") > > > > > that is an array of (address, size) tuples where address and size > > > > > can independently be either 32 or 64 bits, and it wants the > > > > > size value of tuple 0. So the missing functionality is something at > > > > > a higher level than fdt32_ld() which would let you say "give me > > > > > tuple N field X" with some way to specify the tuple layout. (Which > > > > > is an awkward kind of API to write in C.) > > > > > > > > > > Slightly less general, but for this case we could perhaps have > > > > > something like the getprop equivalent of > > > > > qemu_fdt_setprop_sized_cells(): > > > > > > > > > > uint64_t value_array[2]; > > > > > qemu_fdt_getprop_sized_cells(fdt, nodename, "memory", &value_array, > > > > > ac, sc); > > > > > /* > > > > > * fills in value_array[0] with address, value_array[1] with size, > > > > > * probably barfs if the varargs-list of cell-sizes doesn't > > > > > * cover the whole property, similar to the current assert on > > > > > * proplen. > > > > > */ > > > > > mem0_end = value_array[0]; > > > > > > > > Since 4/8 byte cells are most common and size is probably > > > > normally known, what about something simpler to start with? > > > > > > Hrm, I don't think this helps much. As Peter points out the actual > > > load isn't really the issue, it's locating the right spot for it. > > > > I don't really see why that's a problem, it's just a pointer > > addition - base + fdt_address_cells * 4. The problem was in > > This is harder if #address-cells and #size-cells are different, or if > you're parsing ranges and #address-cells is different between parent > and child node. > > > the memory access (yes it's fixed with the patch but you could > > add a general libfdt way to do it). > > Huh.. well I'm getting different impressions of what the problem > actually is from what I initially read versus Peter Maydell's > comments, so I don't really know what to think.
If I'm not mistaken, the sanitizer caught an unaligned 64-bit load which is the bug. The tuple address calculation itself I think is not buggy. I suppose Peter was thinking of an accessor that takes care of addressing and alignment. I don't think we're at the point it warrants it here, but could be convinced (maybe a bunch of other code would use it). I think the API is a little dangerous for overflows though, hard to static check. sscanf() style could be checked by the compiler but seems overkill to implement. > If it's just the load then fdt32_ld() etc. already exist. Or is it > really such a hot path that unconditionally handling unaligned > accesses isn't tenable? Yeah that's true, hardly any point to adding the faster variant. It could just be fixed like this then? The original patch is a fix too, but I do prefer using the same style for both, and I think using the fdt accessor is nicer to read. Thanks, Nick --- diff --git a/hw/ppc/vof.c b/hw/ppc/vof.c index e3b430a81f..a666a133d7 100644 --- a/hw/ppc/vof.c +++ b/hw/ppc/vof.c @@ -646,9 +646,9 @@ static void vof_dt_memory_available(void *fdt, GArray *claimed, uint64_t base) mem0_reg = fdt_getprop(fdt, offset, "reg", &proplen); g_assert(mem0_reg && proplen == sizeof(uint32_t) * (ac + sc)); if (sc == 2) { - mem0_end = be64_to_cpu(*(uint64_t *)(mem0_reg + sizeof(uint32_t) * ac)); + mem0_end = fdt64_ld((fdt64_t *)(mem0_reg + sizeof(uint32_t) * ac)); } else { - mem0_end = be32_to_cpu(*(uint32_t *)(mem0_reg + sizeof(uint32_t) * ac)); + mem0_end = fdt32_ld((fdt32_t *)(mem0_reg + sizeof(uint32_t) * ac)); } g_array_sort(claimed, of_claimed_compare_func);