On Mon, 11 Oct 2021, Michael S. Tsirkin wrote:
On Mon, Oct 11, 2021 at 03:27:44PM +0200, BALATON Zoltan wrote:
On Mon, 11 Oct 2021, Michael S. Tsirkin wrote:
On Mon, Oct 11, 2021 at 12:13:55PM +0200, BALATON Zoltan wrote:
On Mon, 11 Oct 2021, Philippe Mathieu-Daudé wrote:
On 10/10/21 15:24, BALATON Zoltan wrote:
Hello,

I'm trying to fix shutdown and reboot on pegasos2 which uses ACPI as
part of the VIA VT8231 (similar to and modelled in hw/isa/vt82c686b.c)
and found that the guest writes to ACPI PM1aCNT register come out with
wrong endianness but not shure why. I have this:

$ qemu-system-ppc -M pegasos2 -monitor stdio
(qemu) info mtree
[...]
memory-region: pci1-io
  0000000000000000-000000000000ffff (prio 0, i/o): pci1-io
[...]
    0000000000000f00-0000000000000f7f (prio 0, i/o): via-pm
      0000000000000f00-0000000000000f03 (prio 0, i/o): acpi-evt
      0000000000000f04-0000000000000f05 (prio 0, i/o): acpi-cnt
      0000000000000f08-0000000000000f0b (prio 0, i/o): acpi-tmr

memory-region: system
  0000000000000000-ffffffffffffffff (prio 0, i/o): system
    0000000000000000-000000001fffffff (prio 0, ram): pegasos2.ram
    0000000080000000-00000000bfffffff (prio 0, i/o): alias pci1-mem0-win
@pci1-mem 0000000080000000-00000000bfffffff
    00000000c0000000-00000000dfffffff (prio 0, i/o): alias pci0-mem0-win
@pci0-mem 00000000c0000000-00000000dfffffff
    00000000f1000000-00000000f100ffff (prio 0, i/o): mv64361
    00000000f8000000-00000000f8ffffff (prio 0, i/o): alias pci0-io-win
@pci0-io 0000000000000000-0000000000ffffff
    00000000f9000000-00000000f9ffffff (prio 0, i/o): alias pci0-mem1-win
@pci0-mem 0000000000000000-0000000000ffffff
    00000000fd000000-00000000fdffffff (prio 0, i/o): alias pci1-mem1-win
@pci1-mem 0000000000000000-0000000000ffffff
    00000000fe000000-00000000feffffff (prio 0, i/o): alias pci1-io-win
@pci1-io 0000000000000000-0000000000ffffff
    00000000ff800000-00000000ffffffff (prio 0, i/o): alias pci1-mem3-win
@pci1-mem 00000000ff800000-00000000ffffffff
    00000000fff00000-00000000fff7ffff (prio 0, rom): pegasos2.rom

The guest (which is big endian PPC and I think wotks on real hardware)
writes to 0xf05 in the io region which should be the high byte of the
little endian register but in the acpi code it comes out wrong, instead
of 0x2800 I get in acpi_pm1_cnt_write: val=0x28

Looks like a northbridge issue (MV64340).
Does Pegasos2 enables bus swapping?
See hw/pci-host/mv64361.c:585:

static void warn_swap_bit(uint64_t val)
{
   if ((val & 0x3000000ULL) >> 24 != 1) {
       qemu_log_mask(LOG_UNIMP, "%s: Data swap not implemented", __func__);
   }
}

No, guests don't use this feature just byteswap themselves and write little
endian values to the mapped io region. (Or in this case just write one byte
of the 16 bit value, it specifically writes 0x28 to 0xf05.) That's why I
think the device model should not byteswap itself so the acpi regions should
be declared native endian and let the guest handle it. Some mentions I've
found say that native endian means don't byteswap, little endian means
byteswap if vcpu is big endian and big endian means byteswap if vcpu is
little endian. Since guests already account for this and write little endian
values to these regions there's no need to byteswap in device model and
changing to native endian should not affect little endian machines so unless
there's a better argument I'll try sending a patch.

Regards,
BALATON Zoltan

native endian means endian-ness is open-coded in the device handler
itself.  I think you just found a bug in memory core.

static const MemoryRegionOps acpi_pm_cnt_ops = {
   .read = acpi_pm_cnt_read,
   .write = acpi_pm_cnt_write,
   .impl.min_access_size = 2,
   .valid.min_access_size = 1,
   .valid.max_access_size = 2,
   .endianness = DEVICE_LITTLE_ENDIAN,
};


Because of that     .impl.min_access_size = 2,
the 1 byte write should be converted to a 2 byte
read+2 byte write.

docs/devel/memory.rst just says:
- .impl.min_access_size, .impl.max_access_size define the access sizes
 (in bytes) supported by the *implementation*; other access sizes will be
 emulated using the ones available.  For example a 4-byte write will be
 emulated using four 1-byte writes, if .impl.max_access_size = 1.



Instead what we have is:

MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
                                        hwaddr addr,
                                        uint64_t data,
                                        MemOp op,
                                        MemTxAttrs attrs)
{
   unsigned size = memop_size(op);

   if (!memory_region_access_valid(mr, addr, size, true, attrs)) {
       unassigned_mem_write(mr, addr, data, size);
       return MEMTX_DECODE_ERROR;
   }

   adjust_endianness(mr, &data, op);


---


static void adjust_endianness(MemoryRegion *mr, uint64_t *data, MemOp op)
{
   if ((op & MO_BSWAP) != devend_memop(mr->ops->endianness)) {
       switch (op & MO_SIZE) {
       case MO_8:
           break;
       case MO_16:
           *data = bswap16(*data);
           break;
       case MO_32:
           *data = bswap32(*data);
           break;
       case MO_64:
           *data = bswap64(*data);
           break;
       default:
           g_assert_not_reached();
       }
   }
}

so the byte swap is based on size before extending it to
.impl.min_access_size, not after.

OK thanks for the analysis, I'll wait what comes out of this.

Also, no read happens which I suspect is also a bug,
but could be harmless ...

I did not check if a read happens or not but generally I think that may be
another source of bugs if the memory core changes writes to read + write
because for some devices a read might have side effects


IMHO if reads have side effects then devices should not set 
.impl.min_access_size

If all this was documented somewhere then you could expect people know how to use it. I could not find any info on how to correctly use these so had to guess most of the time. Maybe it's somewhere there but not an obvious place. The code implementing this is not something that most people read just expect it to work. Could be useful to make a section about it in the docs. Maybe some words about it might help in here: https://www.qemu.org/docs/master/devel/memory.html

... but given we did not previously do the read, maybe we should keep
it that way at least for the time being.

How do you know there was no read before this write? Did you check it? I've only added a printf in the write method and saw the value was swapped but did not check if there was a read before that. There are no traces in these methods so maybe I would not see it unless adding a printf there too.

Regards,
BALATON Zoltan

Reply via email to