I'm having another look at the SET_FEATURE Solaris 10 error. I've enabled tracing and I see the following. The pci_cfg_read that shows up at the end continues a few thousand times(?) but I've omitted it. This appears to time out or something and then Solaris gives up on the device.
ide_ioport_read 41.468 pid=147030 addr=0x7 reg=b'Status' val=0x0 bus=0x55b77f922d10 s=0x55b77f 923168 ide_ioport_write 19.677 pid=147030 addr=0x6 reg=b'Device/Head' val=0xe0 bus=0x55b77f922d10 s=0 x55b77f923168 ide_ioport_write 19.417 pid=147030 addr=0x1 reg=b'Features' val=0x3 bus=0x55b77f922d10 s=0x55b 77f922d98 ide_ioport_write 9.027 pid=147030 addr=0x2 reg=b'Sector Count' val=0x42 bus=0x55b77f922d10 s=0 x55b77f922d98 ide_ioport_write 8.025 pid=147030 addr=0x7 reg=b'Command' val=0xef bus=0x55b77f922d10 s=0x55b7 7f922d98 ide_exec_cmd 0.461 pid=147030 bus=0x55b77f922d10 state=0x55b77f922d98 cmd=0xef pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 ide_ioport_read 35.577 pid=147030 addr=0x7 reg=b'Status' val=0x50 bus=0x55b77f922d10 s=0x55b77 f922d98 ide_ioport_read 29.095 pid=147030 addr=0x7 reg=b'Status' val=0x50 bus=0x55b77f922d10 s=0x55b77 f922d98 ide_ioport_write 19.146 pid=147030 addr=0x6 reg=b'Device/Head' val=0xe0 bus=0x55b77f922d10 s=0 x55b77f922d98 pci_cfg_read 9.468 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 127.712 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.942 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.793 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.852 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.803 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.762 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 105.219 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 102.634 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.943 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.883 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 pci_cfg_read 101.792 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x0 It looks like I've made mistakes in previous comments about the error and what the problem might be. Excuse my inexperience. Rather than spinning on ARTTIM23_INTR_CH1 it might be the case that Solaris 10 is spinning on CFR_INTR_CH0. I think this because of the following trace: pci_cfg_read 53.231 pid=147030 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 The two reads on the io status register show that DRDY (drive ready indicator bit) and DSC (drive seek complete bit). This doesn't look unusual to me. The error bit is also not set which is reassuring. I read through some of ftp://ftp.seagate.com/pub/acrobat/reference/111-1c.pdf and I'm confused by the discussion regarding interrupts and the status register. > INTRQ is cleared when the host reads the status register. My understand is that INTRQ is the signal from pin 31 on the drive and that the status register is on the drive. I understand the quoted statement as when the host (CMD646) reads the status register of the drive, the drive will lower the interrupt on this pin. The CMD646 has CFR_INTR_CH0 and ARTTIM23_INTR_CH1 in it's PCI configuration space. This is necessary to determine the source of an interrupt when the CMD646 ports are in PCI IDE Native Mode. Are we saying that when the drive lowers the interrupt, the CMD646 sees this and then clears CFR_INTR_CH0 and ARTTIM23_INTR_CH1 automatically? If this were the case then I don't know why there is an interface to clear them by writing to them. Also, if reading the ioport status register was enough to clear CFR_INTR_CH0 and ARTTIM23_INTR_CH1 (specific to CMD646) I can't reason why Linux, Solaris, and OpenBSD would have specific routines to clear them (following the CMD646 documentation) rather than just reading the ioport status register. With the patch, the tracing output changes to this: ide_ioport_read 128.512 pid=162907 addr=0x7 reg=b'Status' val=0x0 bus=0x55909512bd10 s=0x55909512c168 ide_ioport_write 22.622 pid=162907 addr=0x6 reg=b'Device/Head' val=0xe0 bus=0x55909512bd10 s=0x55909512c168 ide_ioport_write 21.330 pid=162907 addr=0x1 reg=b'Features' val=0x3 bus=0x55909512bd10 s=0x55909512bd98 ide_ioport_write 13.926 pid=162907 addr=0x2 reg=b'Sector Count' val=0x42 bus=0x55909512bd10 s=0x55909512bd98 ide_ioport_write 9.278 pid=162907 addr=0x7 reg=b'Command' val=0xef bus=0x55909512bd10 s=0x55909512bd98 ide_exec_cmd 0.921 pid=162907 bus=0x55909512bd10 state=0x55909512bd98 cmd=0xef pci_cfg_read 40.647 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 ide_ioport_read 40.445 pid=162907 addr=0x7 reg=b'Status' val=0x50 bus=0x55909512bd10 s=0x55909512bd98 ide_ioport_read 31.580 pid=162907 addr=0x7 reg=b'Status' val=0x50 bus=0x55909512bd10 s=0x55909512bd98 ide_ioport_write 17.923 pid=162907 addr=0x6 reg=b'Device/Head' val=0xe0 bus=0x55909512bd10 s=0x55909512bd98 pci_cfg_read 10.931 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 pci_cfg_read 19.136 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 pci_cfg_write 26.650 pid=162907 dev=b'cmd646-ide' devid=0x3 fnid=0x0 offs=0x50 val=0x4 Now there is no fatal error and Solaris does the expected by setting CFR_INTR_CH0 to clear it. QEMU then updates the interrupt status in cmd646_pci_config_write. I'm still unconvinced that we should be lowering interrupts when the ide ioport status register is read. I might be misunderstanding the documentation though (definitely possible). If I am misunderstanding the documentation, could you or Mark clarify what I'm getting wrong here. Thanks, Jasper Lowell. On Sat, 2020-02-22 at 21:05 +0100, BALATON Zoltan wrote: > On Sat, 22 Feb 2020, BALATON Zoltan wrote: > > On Sat, 22 Feb 2020, Mark Cave-Ayland wrote: > > > On 21/02/2020 06:50, jasper.low...@bt.com wrote: > > > > The Linux libATA API documentation mentions that on some > > > > hardware, > > > > reading the status register has the side effect of clearing the > > > > interrupt condition. When emulating the generic Sun4u machine > > > > running > > > > Solaris 10, the Solaris 10 CMD646 driver exits fatally because > > > > of this > > > > emulated side effect. This side effect is likely to not exist > > > > on real > > > > CMD646 hardware. > > > > > > > > Signed-off-by: Jasper Lowell <jasper.low...@bt.com> > > > > --- > > > > hw/ide/core.c | 1 - > > > > 1 file changed, 1 deletion(-) > > > > > > > > diff --git a/hw/ide/core.c b/hw/ide/core.c > > > > index 80000eb766..82fd0632ac 100644 > > > > --- a/hw/ide/core.c > > > > +++ b/hw/ide/core.c > > > > @@ -2210,7 +2210,6 @@ uint32_t ide_ioport_read(void *opaque, > > > > uint32_t > > > > addr) > > > > } else { > > > > ret = s->status; > > > > } > > > > - qemu_irq_lower(bus->irq); > > > > break; > > > > } > > > > > > I don't think that this is correct: from memory when I last > > > looked at this, > > > there > > > were 2 IDE status registers: the one from the original > > > specification which > > > clears the > > > IRQ upon read, and another one in subsequent revisions which > > > allows you to > > > read the > > > value without clearing any pending IRQ. My guess would be that > > > changing > > > this would > > > not only cause QEMU to deviate from the specification, but causes > > > problems > > > in other OSs. > > > > You're right, legacy ide has two status registers as described > > here: > > > > ftp://ftp.seagate.com/pub/acrobat/reference/111-1c.pdf > > > > Now question is which of these the above is emulating? Looks like > > CMD646 > > We have both ide_status_read() which does not clear irq and > ide_ioport_read() which does. pci_ide_cmd_read() which PCI ide should > use > calls ide_status_read() so I wonder why did reading status cleared > irq on > CMD646? So maybe it's cleared from somewhere else and above change > should > not be needed. > > Regards, > BALATON Zoltan