On Mon, 28 Dec 2020, Mark Cave-Ayland wrote:
On 24/12/2020 08:11, BALATON Zoltan via wrote:
On Wed, 23 Dec 2020, Guenter Roeck wrote:
On Thu, Dec 24, 2020 at 02:34:07AM +0100, BALATON Zoltan wrote:
[ ... ]
If we need legacy mode then we may be able to emulate that by setting BARs
to legacy ports, ignoring what values are written to them if legacy mode
config is set (which may be what the real chip does), and we already have
IRQs hard wired to legacy values, so that would give us legacy and
half-native mode, which is enough for both fuloong2e and pegasos2. But I'm
not sure how we can fix BARs in QEMU because that's also handled by generic
PCI code, which I also don't want to break.
The code below works for booting Linux while at the same time not affecting
any other emulation. I don't claim it to be a perfect fix, and overloading
the existing property is a bit hackish, but it does work.
Yes, maybe combining it with my original patch 1 to change secondary to
flags to make it a bit cleaner would work for me. Then we would either only
emulate legacy or half-native mode which is sufficient for these two
machines we have. If Mark or others do not object this time, I can
update my patch and resubmit it with this one to fix this issue; otherwise
let's wait to see what ideas they have, because I hate to spend time on
something only for it to be discarded again. I think we don't need more complete
emulation of this chip than this for now but if somebody wants to attempt
that I don't mind as long as it does not break pegasos2.
I had a play with your patches this afternoon, and spent some time performing
some experiments and also reading various PCI bus master specifications and
datasheets: this helped me understand a lot more about the theory of IRQ
routing and compatible vs. legacy mode.
From reading all the documentation (including the VIA and other datasheets) I
cannot find any reference to a half-native mode which makes me think
The half-native mode is my simpler term for Linux's "non 100% native
mode". This may not exist in hardware but exists as a concept in some
Linux (and maybe other) drivers so emulating it just means we do what
these drivers expect to work correctly.
How this maps to hardware and what interactions there are with firmware
may be interesting but I'm not interested in finding out as long as all
guests we care about work because adding more complexity just for the sake
of correctly modeling hardware seems like a waste of time in this case.
Thanks for taking the time to find and document these though, it may be
useful if someone wants to clean this up further. I'm satisfied with
getting it in good enough shape for fuloong2e and pegasos2 to boot the
guests we want, because I'd rather spend time on other, more interesting
stuff such as writing replacement firmware for pegasos2 to avoid needing
an undistributable ROM, implementing missing sound support, improving
ati-vga or getting the Mac ROM to work with g3beige, and also FPU emulation
on PPC (and this is just the QEMU-related stuff, I can think of others
too). All of those seem time better spent than beating this via-ide model
further now just for the sake of perfection without any gain, because
guests will not work better even after spending more time with this.
That's why I call it a waste of time. I know you prefer perfect patches
but as they say "Perfect is the enemy of good." (I could think of better
use of your time too such as finishing your screamer patches or improving
OpenBIOS or your original sparc interest but that's for you to decide what
you do.)
I also try to improve these models and add missing stuff as needed but my
goal is not perfection because I don't have that much time, just reaching
good enough. It can always be improved later (or corrected if it turns out
to be needed as in this case) but if we always hold back until getting it
perfect we won't get anywhere. If your level of perfection was a
requirement in QEMU a lot of devices would not be there as they could not
get in in the first place which means other people cannot improve it as
there's nothing there to start with. So I think something that is good
enough is at least a good start towards perfection.
We can argue what level is good enough. I think it is good enough if it
makes guests work, which seems to be the general approach in QEMU, as a lot
of devices don't actually model real hardware correctly but just well enough
that guests run with it. Of course we should make it clean and follow hardware where possible
but a lot of models don't do that (maybe actually very few are anywhere
near perfect).
something else is wrong here. At the simplest level it could simply be that
the VIA doesn't tri-state its legacy IRQ lines whilst the device is in native
mode (the SI controller has an option for this), or it could indicate there
is a PCI IRQ routing problem somewhere else that hasn't been picked up yet.
All of the datasheets suggest that legacy vs. native mode is selected by
setting the correct bits in PCI_CLASS_PROG, and Linux reads this byte and
configures itself to use legacy or native mode accordingly. Since the current
default for the VIA is 0x8a then it should default to legacy mode, but we're
immediately hitting some issues here: I've summarised my notes below for
those interested.
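To make the PCI_CLASS_PROG point concrete, here is a minimal sketch of how that byte decodes according to the standard PCI IDE programming-interface layout. The macro and function names are made up for illustration and are not taken from QEMU or Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits of the PCI IDE programming-interface byte (PCI_CLASS_PROG).
 * Names are hypothetical; the bit positions follow the PCI IDE spec. */
#define PROG_IF_PRI_NATIVE  0x01 /* primary channel operates in native mode */
#define PROG_IF_PRI_SWITCH  0x02 /* primary channel mode is programmable */
#define PROG_IF_SEC_NATIVE  0x04 /* secondary channel operates in native mode */
#define PROG_IF_SEC_SWITCH  0x08 /* secondary channel mode is programmable */
#define PROG_IF_BUS_MASTER  0x80 /* device supports bus master DMA */

/* Return true if the given channel (0 = primary, 1 = secondary) is in
 * native mode, false if it is in legacy/compatibility mode. */
static bool ide_channel_is_native(uint8_t prog_if, int channel)
{
    assert(channel == 0 || channel == 1);
    uint8_t mask = (channel == 0) ? PROG_IF_PRI_NATIVE : PROG_IF_SEC_NATIVE;
    return (prog_if & mask) != 0;
}

/* Return true if the device advertises bus master DMA support. */
static bool ide_is_bus_master(uint8_t prog_if)
{
    return (prog_if & PROG_IF_BUS_MASTER) != 0;
}
```

With the current default of 0x8a both channels decode as legacy (bits 0 and 2 clear) but switchable (bits 1 and 3 set), whereas 0x8f would decode as both channels in native mode.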
1) PCI bus reset loses the default BAR addresses
The first problem we find is that the initialisation of the PCI bus erases
the default BAR addresses: that's to say lines 133-137 in hw/ide/via.c will
in effect do nothing:
133 pci_set_long(pci_conf + PCI_BASE_ADDRESS_0, 0x000001f0);
134 pci_set_long(pci_conf + PCI_BASE_ADDRESS_1, 0x000003f4);
135 pci_set_long(pci_conf + PCI_BASE_ADDRESS_2, 0x00000170);
136 pci_set_long(pci_conf + PCI_BASE_ADDRESS_3, 0x00000374);
137 pci_set_long(pci_conf + PCI_BASE_ADDRESS_4, 0x0000cc01); /* BMIBA: 20-23h */
The lifecycle of the VIA IDE device goes like this: init() -> realize() ->
reset() but then the PCI bus reset in pci_do_device_reset() immediately wipes
the BAR addresses. This is why the legacy IDE ports currently don't appear at
startup. Note I do see that other devices do try this e.g.
gt64120_pci_realize() so it's an easy mistake to make.
This is from the original commit 10 years ago, so I think QEMU may have
worked differently back then; possibly this worked at the time and was just
left there because nobody noticed until now. I did notice that PCI config
values are reset when I started to work on this, and on your suggestion I
fixed the problem in the PCI reset code for that one register, which I had
first worked around in this model.
2) -kernel doesn't initialise the VIA device
If you take a look at the PMON source it is possible to see that the firmware
explicitly sets the PCI_CLASS_PROG to compatibility mode and disables the
native PCI interrupt
(https://github.com/loongson-community/pmon-2ef/blob/master/sys/dev/pci/vt82c686.c#L82).
Since Linux reads this byte on startup then this is why the kernel switches
to compatibility mode by default. However the point here is that booting a
kernel directly without firmware means the VIA IDE device isn't initialised
as it would be in real life, and that's why there are attempts to
pre-configure the device accordingly in via_ide_realize()/via_ide_reset().
Isn't this worked around by setting the mode to legacy at start up? Maybe
you could emulate firmware in load_kernel() but I leave that exercise to
somebody who is interested in running Linux on fuloong2e.
3) QEMU doesn't (easily) enable a BAR to be disabled
The ideal situation would be for QEMU's VIA IDE device to check
PCI_CLASS_PROG and configure itself dynamically: with PCI_CLASS_PROG set for
legacy mode by default, the device can disable its BARs until they are
explicitly enabled.
According to the PCI bus master specification the recommended behaviour for a
device in compatible mode is to ignore all writes to the BARs, and for all
BAR reads to return 0. This fits nicely with Guenter's finding that the BMDMA
BAR should not return a value in order for Linux to boot correctly in legacy
mode.
Unfortunately there is no existing functionality for this in QEMU which means
you would have to do this manually by overriding the PCI config read/write
functions. This is trickier than it sounds because the reads/writes don't
necessarily have to be aligned to the BAR addresses in configuration space.
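As a rough illustration of why the unaligned case matters, the sketch below filters config space accesses byte by byte, so that an access straddling the start of the BAR area is only partly hidden. It is a toy model under stated assumptions: ToyIdeDev, toy_config_read() and toy_config_write() are hypothetical names, not QEMU's config read/write hooks, and write masks are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CFG_SIZE   256
#define BAR_START  0x10 /* offset of PCI_BASE_ADDRESS_0 */
#define BAR_END    0x24 /* one past PCI_BASE_ADDRESS_4 (20-23h) */

/* Toy device: raw config space plus a flag for legacy (compatible) mode. */
typedef struct {
    uint8_t config[CFG_SIZE];
    bool legacy_mode;
} ToyIdeDev;

/* True if this config byte belongs to a BAR that must be hidden. */
static bool in_hidden_bar(const ToyIdeDev *d, uint32_t addr)
{
    return d->legacy_mode && addr >= BAR_START && addr < BAR_END;
}

/* Little-endian read of 1..4 bytes; hidden BAR bytes read as 0. */
static uint32_t toy_config_read(const ToyIdeDev *d, uint32_t addr,
                                unsigned len)
{
    uint32_t val = 0;
    assert(len >= 1 && len <= 4 && addr + len <= CFG_SIZE);
    for (unsigned i = 0; i < len; i++) {
        uint8_t byte = in_hidden_bar(d, addr + i) ? 0 : d->config[addr + i];
        val |= (uint32_t)byte << (8 * i);
    }
    return val;
}

/* Little-endian write of 1..4 bytes; writes to hidden BAR bytes are dropped. */
static void toy_config_write(ToyIdeDev *d, uint32_t addr, uint32_t val,
                             unsigned len)
{
    assert(len >= 1 && len <= 4 && addr + len <= CFG_SIZE);
    for (unsigned i = 0; i < len; i++) {
        if (!in_hidden_bar(d, addr + i)) {
            d->config[addr + i] = (val >> (8 * i)) & 0xff;
        }
    }
}
```

A 4-byte access at offset 0x0e, for example, covers two ordinary bytes and the first two bytes of BAR0, so only the upper half of the value is masked. A real implementation would also have to respect the per-register write masks that the generic PCI code applies.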
I went through this too when I prepared my original patches and came
to the same conclusion.
In summary whilst I'm not keen on the series in its current form, it
seems the best solution for now. I've got a few comments on the latest
version of the series which I will send along shortly.
Glad you've got to this at last. It would probably have saved some time if
you had accepted it back in March but that's gone now.
Regards,
BALATON Zoltan