On Mon, 28 Dec 2020, Mark Cave-Ayland wrote:
On 24/12/2020 08:11, BALATON Zoltan via wrote:

On Wed, 23 Dec 2020, Guenter Roeck wrote:
On Thu, Dec 24, 2020 at 02:34:07AM +0100, BALATON Zoltan wrote:
[ ... ]

If we need legacy mode then we may be able to emulate that by setting BARs
to legacy ports, ignoring what values are written to them if legacy mode
config is set (which may be what the real chip does), and we already have
IRQs hard wired to legacy values, so that would give us legacy and
half-native mode, which is enough for both fuloong2e and pegasos2. But I'm
not sure how we can fix BARs in QEMU because that's also handled by generic
PCI code, which I also don't want to break.

The code below works for booting Linux while at the same time not affecting
any other emulation. I don't claim it to be a perfect fix, and overloading
the existing property is a bit hackish, but it does work.

Yes, maybe combining it with my original patch 1, which changes secondary to flags to make it a bit cleaner, would work for me. Then we would only emulate either legacy or half-native mode, which is sufficient for the two machines we have. If Mark or others do not object this time, I can update my patch and resubmit it with this one to fix this issue; otherwise let's wait to see what ideas they have, because I hate to spend time on something only to have it discarded again. I think we don't need more complete emulation of this chip than this for now, but if somebody wants to attempt that I don't mind as long as it does not break pegasos2.

I had a play with your patches this afternoon, and spent some time performing some experiments and also reading various PCI bus master specifications and datasheets: this helped me understand a lot more about the theory of IRQ routing and compatible vs. legacy mode.

From reading all the documentation (including the VIA and other datasheets) I cannot find any reference to a half-native mode which makes me think

The half-native mode is my simpler term for Linux's "non 100% native mode". This may not exist in hardware but exists as a concept in some Linux (and maybe other) drivers so emulating it just means we do what these drivers expect to work correctly.

How this maps to hardware and what interactions there are with firmware may be interesting, but I'm not interested in finding out as long as all guests we care about work, because adding more complexity just for the sake of correctly modeling hardware seems like a waste of time in this case. Thanks for taking the time to find and document these though, it may be useful if someone wants to clean this up further. I'm satisfied with getting it in good enough shape for fuloong2e and pegasos2 to boot the guests we want, because I'd rather spend time on other, more interesting stuff such as writing replacement firmware for pegasos2 to avoid needing an undistributable ROM, implementing missing sound support, improving ati-vga or getting the Mac ROM to work with g3beige, and also FPU emulation on PPC (and these are just the QEMU-related things, I can think of others too). All of those seem time better spent than beating this via-ide model further now just for the sake of perfection without any gain, because guests will not work better even after spending more time on this. That's why I call it a waste of time. I know you prefer perfect patches but as they say, "Perfect is the enemy of good." (I could think of better uses of your time too, such as finishing your screamer patches or improving OpenBIOS or your original sparc interest, but that's for you to decide.)

I also try to improve these models and add missing stuff as needed, but my goal is not perfection because I don't have that much time, just reaching good enough. It can always be improved later (or corrected if that turns out to be needed, as in this case), but if we always hold back until getting it perfect we won't get anywhere. If your level of perfection was a requirement in QEMU, a lot of devices would not be there because they could not have got in in the first place, which means other people could not improve them as there would be nothing there to start with. So I think something that is good enough is at least a good start towards perfection.

We can argue about what level is good enough. I think it is good enough if it makes guests work, which seems to be the general approach in QEMU, as a lot of devices don't actually model real hardware correctly but only well enough that guests run with them. Of course we should make it clean and follow hardware where possible, but a lot of models don't do that (maybe actually very few are anywhere near perfect).

something else is wrong here. At the simplest level it could simply be that the VIA doesn't tri-state its legacy IRQ lines whilst the device is in native mode (the SI controller has an option for this), or it could indicate there is a PCI IRQ routing problem somewhere else that hasn't been picked up yet.

All of the datasheets suggest that legacy vs. native mode is selected by setting the correct bits in PCI_CLASS_PROG, and Linux reads this byte and configures itself to use legacy or native mode accordingly. Since the current default for the VIA is 0x8a then it should default to legacy mode, but we're immediately hitting some issues here: I've summarised my notes below for those interested.
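
As a rough illustration of how a driver can interpret that byte, the sketch below decodes the programming-interface bits commonly defined for PCI IDE controllers. The macro and function names are made up for this example and are not taken from QEMU or Linux. With 0x8a, both channels are switchable but currently in compatibility (legacy) mode, and bus mastering is advertised.

```c
#include <stdbool.h>
#include <stdint.h>

/* Programming-interface bits of a PCI IDE controller (illustrative names). */
#define PROG_IF_PRI_NATIVE  0x01  /* primary channel is in native-PCI mode */
#define PROG_IF_PRI_SWITCH  0x02  /* primary channel mode can be changed */
#define PROG_IF_SEC_NATIVE  0x04  /* secondary channel is in native-PCI mode */
#define PROG_IF_SEC_SWITCH  0x08  /* secondary channel mode can be changed */
#define PROG_IF_BUS_MASTER  0x80  /* bus-master DMA capable */

/* channel: 0 = primary, 1 = secondary */
static bool ide_channel_is_native(uint8_t prog_if, int channel)
{
    uint8_t mask = channel == 0 ? PROG_IF_PRI_NATIVE : PROG_IF_SEC_NATIVE;
    return (prog_if & mask) != 0;
}

static bool ide_channel_is_switchable(uint8_t prog_if, int channel)
{
    uint8_t mask = channel == 0 ? PROG_IF_PRI_SWITCH : PROG_IF_SEC_SWITCH;
    return (prog_if & mask) != 0;
}

static bool ide_has_bus_master(uint8_t prog_if)
{
    return (prog_if & PROG_IF_BUS_MASTER) != 0;
}
```

A value of 0x8f would instead describe a controller with both channels in native mode.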


1) PCI bus reset loses the default BAR addresses

The first problem we find is that the initialisation of the PCI bus erases the default BAR addresses: that's to say lines 133-137 in hw/ide/via.c will in effect do nothing:

133     pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x000001f0);
134     pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x000003f4);
135     pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x00000170);
136     pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x00000374);
137     pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0x0000cc01); /* BMIBA: 20-23h */

The lifecycle of the VIA IDE device goes like this: init() -> realize() -> reset() but then the PCI bus reset in pci_do_device_reset() immediately wipes the BAR addresses. This is why the legacy IDE ports currently don't appear at startup. Note I do see that other devices do try this e.g. gt64120_pci_realize() so it's an easy mistake to make.
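
The ordering problem can be shown with a toy model. This is not QEMU code: the `toy_*` names are hypothetical, and the reset step only assumes that a bus reset clears the address bits of each BAR while keeping the I/O-space indicator bit. What the real pci_do_device_reset() does in detail may differ.

```c
#include <stdint.h>

#define TOY_BAR0      0x10  /* config-space offset of the first BAR */
#define TOY_BAR_COUNT 5     /* BARs 0-4, as used by the IDE function */

struct toy_pci_dev {
    uint8_t config[256];
};

/* Store a 32-bit value in little-endian order, as PCI config space does. */
static void toy_set_long(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

static uint32_t toy_get_long(const uint8_t *p)
{
    uint32_t v = 0;
    for (int i = 0; i < 4; i++) {
        v |= (uint32_t)p[i] << (8 * i);
    }
    return v;
}

/* Device code presets the legacy port addresses... */
static void toy_realize(struct toy_pci_dev *d)
{
    toy_set_long(d->config + TOY_BAR0 + 0,  0x000001f0);
    toy_set_long(d->config + TOY_BAR0 + 4,  0x000003f4);
    toy_set_long(d->config + TOY_BAR0 + 8,  0x00000170);
    toy_set_long(d->config + TOY_BAR0 + 12, 0x00000374);
    toy_set_long(d->config + TOY_BAR0 + 16, 0x0000cc01);
}

/* ...but a later bus reset (assumed behaviour) drops the address bits. */
static void toy_bus_reset(struct toy_pci_dev *d)
{
    for (int i = 0; i < TOY_BAR_COUNT; i++) {
        uint8_t *bar = d->config + TOY_BAR0 + 4 * i;
        toy_set_long(bar, toy_get_long(bar) & 0x1);
    }
}
```

Calling `toy_realize()` followed by `toy_bus_reset()` leaves every BAR without an address, which mirrors why the preset values never become visible to the guest.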

This is from the original commit 10 years ago, so I think QEMU may have worked differently back then; possibly this worked at the time and was just left there because nobody noticed until now. When I started working on this I did notice that PCI config values are reset, which I first worked around in this model and then, on your suggestion, fixed for that one register in the PCI reset code.

2) -kernel doesn't initialise the VIA device

If you take a look at the PMON source it is possible to see that the firmware explicitly sets the PCI_CLASS_PROG to compatibility mode and disables the native PCI interrupt (https://github.com/loongson-community/pmon-2ef/blob/master/sys/dev/pci/vt82c686.c#L82).

Since Linux reads this byte on startup then this is why the kernel switches to compatibility mode by default. However the point here is that booting a kernel directly without firmware means the VIA IDE device isn't initialised as it would be in real life, and that's why there are attempts to pre-configure the device accordingly in via_ide_realize()/via_ide_reset().

Isn't this worked around by setting the mode to legacy at startup? Maybe you could emulate the firmware in load_kernel(), but I'll leave that exercise to somebody who is interested in running Linux on fuloong2e.

3) QEMU doesn't (easily) enable a BAR to be disabled

The ideal situation would be for QEMU's VIA IDE device to check PCI_CLASS_PROG and configure itself dynamically: with PCI_CLASS_PROG set for legacy mode by default, the device can disable its BARs until they are explicitly enabled.

According to the PCI bus master specification the recommended behaviour for a device in compatible mode is to ignore all writes to the BARs, and for all BAR reads to return 0. This fits nicely with Guenter's finding that the BMDMA BAR should not return a value in order for Linux to boot correctly in legacy mode.

Unfortunately there is no existing functionality for this in QEMU which means you would have to do this manually by overriding the PCI config read/write functions. This is trickier than it sounds because the reads/writes don't necessarily have to be aligned to the BAR addresses in configuration space.
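
To make the alignment issue concrete, here is a minimal sketch of the masking such a config read override would need. It assumes BARs 0-4 occupy config offsets 0x10-0x23 and that an access is 1 to 4 bytes, little-endian. The function name and constants are hypothetical and not part of QEMU's PCI API.

```c
#include <assert.h>
#include <stdint.h>

#define BAR_FIRST 0x10  /* offset of BAR 0 */
#define BAR_END   0x24  /* one past the last byte of BAR 4 */

/*
 * Zero every byte of a config read that falls inside [BAR_FIRST, BAR_END),
 * leaving bytes that belong to neighbouring registers untouched.
 */
static uint32_t mask_bar_bytes(uint32_t addr, int len, uint32_t val)
{
    assert(len >= 1 && len <= 4);
    for (int i = 0; i < len; i++) {
        uint32_t byte_addr = addr + (uint32_t)i;
        if (byte_addr >= BAR_FIRST && byte_addr < BAR_END) {
            val &= ~(0xffu << (8 * i));
        }
    }
    return val;
}
```

A 4-byte read at offset 0x0e, for instance, straddles the registers before BAR 0 and the low half of BAR 0, so only its upper two bytes are cleared. Writes would need the same per-byte treatment.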

I went through this too when I prepared my original patches and came to the same conclusion.

In summary whilst I'm not keen on the series in its current form, it seems the best solution for now. I've got a few comments on the latest version of the series which I will send along shortly.

Glad you've got to this at last. It would probably have saved some time if you had accepted it back in March, but that's gone now.

Regards,
BALATON Zoltan
