On Thu, 1 Feb 2024 at 08:15, Tomoyuki HIROSE <tomoyuki.hir...@igel.co.jp> wrote:
>
> The previous code ignored 'impl.unaligned' and handled unaligned accesses
> as is. But this implementation cannot emulate specific registers of some
> devices that allow unaligned access such as xHCI Host Controller Capability
> Registers.
> This commit checks 'impl.unaligned' and if it is false, QEMU emulates
> unaligned access with multiple aligned access.
>
> Signed-off-by: Tomoyuki HIROSE <tomoyuki.hir...@igel.co.jp>
> ---
>  system/memory.c | 38 +++++++++++++++++++++++++-------------
>  1 file changed, 25 insertions(+), 13 deletions(-)
>
> diff --git a/system/memory.c b/system/memory.c
> index a229a79988..a7ca0c9f54 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -535,10 +535,17 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>                                        MemTxAttrs attrs)
>  {
>      uint64_t access_mask;
> +    unsigned access_mask_shift;
> +    unsigned access_mask_start_offset;
> +    unsigned access_mask_end_offset;
>      unsigned access_size;
> -    unsigned i;
>      MemTxResult r = MEMTX_OK;
>      bool reentrancy_guard_applied = false;
> +    bool is_big_endian = memory_region_big_endian(mr);
> +    signed start_diff;
> +    signed current_offset;
> +    signed access_shift;

"signed foo" is a weird way to specify this type, which we use almost
nowhere else in the codebase -- this is equivalent to "int foo".

> +    hwaddr current_addr;
>
>      if (!access_size_min) {
>          access_size_min = 1;
> @@ -560,19 +567,24 @@ static MemTxResult access_with_adjusted_size(hwaddr addr,
>          reentrancy_guard_applied = true;
>      }
>
> -    /* FIXME: support unaligned access? */
>      access_size = MAX(MIN(size, access_size_max), access_size_min);

This still has a problem I noted for the v1 patch:
we compute the access_size without thinking about the alignment,
so for an access like:
 * addr = 2, size = 4, access_size_min = 2, access_size_max = 8
we will calculate access_size = 4 and do two 4-byte accesses
(at addresses 0 and 4) when we should do two 2-byte accesses
(at addresses 2 and 4).
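
To make that concrete, here is a rough standalone model (not QEMU
code) of how the patch splits an access when impl.unaligned is false,
next to one possible alignment-aware alternative. The struct and all
function names are made up for illustration, power-of-two sizes are
assumed, and the "shrink until aligned" rule is only one way the
access size could be chosen, not a proposal for the final code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone model; none of these names exist in QEMU. */
#define MAX_ACCESSES 16

struct access_list {
    unsigned count;
    uint64_t addr[MAX_ACCESSES];
    unsigned size[MAX_ACCESSES];
};

/* Same clamp as MAX(MIN(size, access_size_max), access_size_min). */
static unsigned clamp_access_size(unsigned size, unsigned min, unsigned max)
{
    unsigned s = size < max ? size : max;
    return s > min ? s : min;
}

/* Record the aligned device accesses covering [addr, addr + size). */
static void emit_accesses(uint64_t addr, unsigned size, unsigned access_size,
                          struct access_list *out)
{
    /* Assumption: access_size is a non-zero power of two. */
    assert(access_size && !(access_size & (access_size - 1)));

    int start_diff = (int)(addr & (access_size - 1));
    uint64_t cur = addr - start_diff;

    out->count = 0;
    for (int off = -start_diff; off < (int)size;
         off += (int)access_size, cur += access_size) {
        assert(out->count < MAX_ACCESSES);
        out->addr[out->count] = cur;
        out->size[out->count] = access_size;
        out->count++;
    }
}

/* What the patch does: pick access_size without looking at addr. */
static void split_as_patch(uint64_t addr, unsigned size, unsigned min,
                           unsigned max, struct access_list *out)
{
    emit_accesses(addr, size, clamp_access_size(size, min, max), out);
}

/*
 * One possible alternative: shrink access_size (never below min)
 * until addr is aligned to it, then split.
 */
static void split_alignment_aware(uint64_t addr, unsigned size, unsigned min,
                                  unsigned max, struct access_list *out)
{
    unsigned access_size = clamp_access_size(size, min, max);

    while (access_size > min && (addr & (access_size - 1))) {
        access_size /= 2;
    }
    emit_accesses(addr, size, access_size, out);
}
```

With addr = 2, size = 4, min = 2, max = 8, split_as_patch() records
4-byte accesses at 0 and 4, while split_alignment_aware() records
2-byte accesses at 2 and 4.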

> -    access_mask = MAKE_64BIT_MASK(0, access_size * 8);
> -    if (memory_region_big_endian(mr)) {
> -        for (i = 0; i < size; i += access_size) {
> -            r |= access_fn(mr, addr + i, value, access_size,
> -                        (size - access_size - i) * 8, access_mask, attrs);
> -        }
> -    } else {
> -        for (i = 0; i < size; i += access_size) {
> -            r |= access_fn(mr, addr + i, value, access_size, i * 8,
> -                        access_mask, attrs);
> -        }
> +    start_diff = mr->ops->impl.unaligned ? 0 : addr & (access_size - 1);
> +    current_addr = addr - start_diff;
> +    for (current_offset = -start_diff; current_offset < (signed)size;
> +         current_offset += access_size, current_addr += access_size) {
> +        access_shift = is_big_endian
> +                          ? (signed)size - (signed)access_size - current_offset
> +                          : current_offset;
> +        access_mask_shift = current_offset > 0 ? 0 : -current_offset;
> +        access_mask_start_offset = current_offset > 0 ? current_offset : 0;
> +        access_mask_end_offset = current_offset + access_size > size
> +                                     ? size
> +                                     : current_offset + access_size;
> +        access_mask = MAKE_64BIT_MASK(access_mask_shift * 8,
> +            (access_mask_end_offset - access_mask_start_offset) * 8);

I don't understand here why the access_mask_shift and the
access_mask_start_offset are different. Aren't we trying to create
a mask value with 1s from start through to end ?
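
For reference, this is a standalone sketch of what the mask arithmetic
in the patch evaluates to for the addr = 2, size = 4 case, where the
patch picks access_size = 4 and so iterates with current_offset = -2
and then 2. The function name is made up; the macro is written out
here to match how I understand MAKE_64BIT_MASK to be defined, so treat
that as an assumption too.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match QEMU's MAKE_64BIT_MASK(shift, length). */
#define MAKE_64BIT_MASK(shift, length) \
    (((~0ULL) >> (64 - (length))) << (shift))

/* Hypothetical helper: the mask the patch builds for one iteration. */
static uint64_t patch_access_mask(int current_offset, unsigned access_size,
                                  unsigned size)
{
    unsigned shift = current_offset > 0 ? 0 : -current_offset;
    unsigned start = current_offset > 0 ? current_offset : 0;
    /* int + unsigned is evaluated as unsigned, as in the patch. */
    unsigned end = current_offset + access_size > size
                       ? size
                       : current_offset + access_size;

    /* A zero-length mask would make the macro shift by 64. */
    assert(end > start);
    return MAKE_64BIT_MASK(shift * 8, (end - start) * 8);
}
```

This gives 0xffff0000 for current_offset = -2 and 0x0000ffff for
current_offset = 2. If I am reading it right, the shift is a byte
position within the device-side access, while the start and end
offsets are byte positions within the guest's access, which would be
why they differ. That is only my reading of the arithmetic; if it is
the intent, a comment in the code saying so would help.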

> +
> +        r |= access_fn(mr, current_addr, value, access_size, access_shift * 8,
> +                       access_mask, attrs);
>      }
>      if (mr->dev && reentrancy_guard_applied) {
>          mr->dev->mem_reentrancy_guard.engaged_in_io = false;

I agree with Philippe that we could be a lot more confident in
this change if we had some unit tests that tested whether
various combinations of unaligned accesses turned into the
right sequence of accesses to the underlying device.

thanks
-- PMM
