Hi Klaus,
I am blocked on my MSI issues for the moment, so I went back to look at this
SecBus issue. I think I have figured out what is going wrong, and I would not
be surprised if it's also a factor in some of the other troubles I am having
(discussed in other threads). But I am not sure what to do about it and would
really appreciate some advice.
My device constructor prepares the PCIDevice structure and then calls
PDMDevHlpPCIRegister() to register the device. That in turn calls
ich9pciRegisterInternal(), which adds my device to the set of bridges, here:
    pBus->papBridgesR3[pBus->cBridges] = pPciDev;
    pBus->cBridges++;
The problem comes later, when VBox calls ich9pciInitBridgeTopology() recursively
to enumerate the buses. As it walks the list of bridges, it assumes that
achInstanceData in the device instance is of type ICH9PCIBUS, but for my device
it contains my own device property structure instead. This confuses the next
deeper recursion layer and, as you correctly guessed, it enters a recursion
loop that overflows the stack.
    for (uint32_t iBridge = 0; iBridge < pBus->cBridges; iBridge++)
    {
        PPCIDEVICE pBridge = pBus->papBridgesR3[iBridge];
        AssertMsg(pBridge && pciDevIsPci2PciBridge(pBridge),
                  ("Device is not a PCI bridge but on the list of PCI bridges\n"));
        PICH9PCIBUS pChildBus = PDMINS_2_DATA(pBridge->pDevIns, PICH9PCIBUS);
        pGlobals->uBus++;
        ich9pciInitBridgeTopology(pGlobals, pChildBus, uBusSecondary,
                                  pGlobals->uBus);
    }
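For readers following along, here is a minimal sketch of the kind of depth-first bus numbering that the loop above performs. It is not the VirtualBox implementation: SketchBus and sketchInitTopology() are hypothetical stand-ins, and the numbering rule (each bridge takes the next free bus number as its secondary bus, and its subordinate bus is the highest number below it) is the generic PCI scheme, assumed here for illustration.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a bus node; NOT the VirtualBox ICH9PCIBUS type.
struct SketchBus
{
    uint8_t primary = 0;      // bus on the upstream side (config offset 0x18)
    uint8_t secondary = 0;    // bus directly behind the bridge (config offset 0x19)
    uint8_t subordinate = 0;  // highest bus number below the bridge (config offset 0x1a)
    std::vector<std::unique_ptr<SketchBus>> bridges;  // buses behind child bridges
};

// Depth-first numbering. nextBus holds the last bus number handed out so far.
// Returns the highest bus number assigned within this subtree.
inline uint8_t sketchInitTopology(SketchBus &bus, uint8_t primary, uint8_t &nextBus)
{
    bus.primary = primary;
    bus.secondary = nextBus;
    bus.subordinate = nextBus;
    for (auto &child : bus.bridges)
    {
        ++nextBus;  // allocate the next free bus number for this child
        bus.subordinate = sketchInitTopology(*child, bus.secondary, nextBus);
    }
    return bus.subordinate;
}
```

The recursion terminates only because every entry in bridges really is a bus node. If an entry pointed at some other structure, the bridge count and child pointers read from it would be arbitrary, which fits the runaway recursion described above.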
It seems that if I want to register my device as a bridge, then I need my
achInstanceData buffer to contain an ICH9PCIBUS struct. I thought I could just
add this to the top of my own device properties structure, but it's declared in
DevPciIch9.cpp and therefore not available to me.
Clearly I am doing something wrong in the way I register my bridge, so what is
the right way to register a bridge such that its achInstanceData buffer
contains both an ICH9PCIBUS struct and my own device properties?
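To make the layout requirement concrete, here is a small sketch using stand-in types. SketchBusState plays the role of ICH9PCIBUS and SketchSwitchState the role of my device properties; both are hypothetical, since the real struct is private to DevPciIch9.cpp. It shows only why the bus state would have to sit at offset 0 for a cast of the instance data to remain valid. It is not a statement of how VirtualBox intends bridges to be registered.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for the bus state that the topology code expects.
struct SketchBusState
{
    uint32_t cBridges;
    uint8_t  iBus;
};

// Hypothetical device state: bus state first, device-specific fields after it.
struct SketchSwitchState
{
    SketchBusState bus;          // must stay the first member
    uint32_t       uMyProperty;  // device-specific data
};

// Compile-time checks that the "bus state first" assumption really holds.
static_assert(std::is_standard_layout<SketchSwitchState>::value,
              "layout guarantees below require a standard-layout type");
static_assert(offsetof(SketchSwitchState, bus) == 0,
              "bus state must be at the start of the instance data");

// Mirrors the idea of viewing raw instance data as bus state.
inline SketchBusState *sketchInstanceDataAsBus(void *pvInstanceData)
{
    return static_cast<SketchBusState *>(pvInstanceData);
}
```

With this layout, code that only knows about SketchBusState sees valid bus fields, while the device code can still reach uMyProperty through the full structure.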
// RicV
From: [email protected] [mailto:[email protected]]
On Behalf Of Vilbig, Ric
Sent: Saturday, April 02, 2016 10:11
To: Klaus Espenlaub; [email protected]
Subject: Re: [vbox-dev] VM crash, NS_ERROR_FAILURE
Thank you, Klaus. I will return to this SecBus issue after I get past the MSI
issue that I am having, and this information will certainly be helpful. At the
moment I have a workaround (hack) for the SecBus problem which seems to be
working, whereas the MSI problem has me blocked. Once I get past that, I
should have time to figure out a proper solution for the SecBus problem.
// RicV
From: Klaus Espenlaub [mailto:[email protected]]
Sent: Friday, April 01, 2016 08:51
To: [email protected]
Subject: Re: [vbox-dev] VM crash, NS_ERROR_FAILURE
Hi Ric,
On 30.03.2016 02:02, Vilbig, Ric wrote:
Hi,
I obviously carried on with my investigation after sending the original email,
and have figured out what is triggering this abort (not really fair to call it
a crash).
VBox.log actually shows that the VM was never fully powered up, so the
crash happens before the CPU starts executing instructions. See below; I know
that this doesn't make much sense to you.
When the BIOS starts initializing the PCI Configuration space for my PCIe
switch, it reads the secondary bus register (PCI CFG 0x19) before it's been
initialized, so the device model is returning 0. This puts the BIOS into a
loop, repeating the following over 5000 times before aborting the VM session.
PCI CFG Root Rd 0x0a L 2 = 0x0604 // Class
PCI CFG Root Rd 0x00 L 2 = 0x14ab // VendID
PCI CFG Root Rd 0x02 L 2 = 0x1000 // DevID
PCI CFG Root Wr 0x1c L 1 = 0xd0 // IOBase
PCI CFG Root Wr 0x20 L 2 = 0xf000 // MemBase
PCI CFG Root Rd 0x19 L 1 = 0x00 // SecBus
If I intercept the secondary bus register read, and return a 3 instead of
reading 0 from RTL, then it carries on with root configuration and my VM boots
and runs correctly. It's not detecting the downstream end point, but that is a
separate issue.
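The workaround described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the plug-in's real code: SketchSwitchConfig and sketchConfigRead() are hypothetical names, the 256-byte array stands in for the values read from RTL, and the fallback value 3 is simply the number mentioned above. Offset 0x19 is the secondary bus number register of a PCI-to-PCI bridge header.

```cpp
#include <cstdint>

// Secondary bus number register in a PCI-to-PCI bridge configuration header.
constexpr uint32_t kPciSecondaryBusOffset = 0x19;

// Hypothetical device model state; abConfig stands in for the backing registers.
struct SketchSwitchConfig
{
    uint8_t abConfig[256] = {};
    uint8_t uFallbackSecondaryBus = 3;  // value taken from the workaround above
};

// Little-endian config read of 1..4 bytes with the SecBus workaround applied.
inline uint32_t sketchConfigRead(const SketchSwitchConfig &dev, uint32_t offset, unsigned cb)
{
    // Out-of-range access: returning all ones is a choice made for this sketch.
    if (cb == 0 || cb > 4 || offset + cb > sizeof(dev.abConfig))
        return 0xffffffffu;

    uint32_t value = 0;
    for (unsigned i = 0; i < cb; ++i)
        value |= static_cast<uint32_t>(dev.abConfig[offset + i]) << (8 * i);

    // Workaround: an uninitialized (zero) secondary bus is reported as the fallback.
    if (offset == kPciSecondaryBusOffset && cb == 1 && value == 0)
        return dev.uFallbackSecondaryBus;
    return value;
}
```

Once the register holds a non-zero value, the read passes it through unchanged, so the hack only masks the uninitialized state.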
Meanwhile, does it make sense for the BIOS to read the secondary bus register
before it's been initialized? It seems like that register should be set up as
the BIOS proceeds through the enumeration. That is what the VM with PIIX3
chipset does.
It does, but for a non-obvious reason. VirtualBox pre-configures its PCI
devices, especially the bus numbers, before it starts the BIOS. It looks like
for some reason this isn't done properly (or isn't reaching your PCIe switch
correctly). This confuses the code, most likely causing endless recursion and
thus a stack overflow. You should be able to use a debugger on the VM process
to find out the details, because this is all normal userland code on the host,
which wouldn't work if it were BIOS code running inside the VM.
The motivation for moving the PCI bus configuration out of the BIOS is to some
extent historic (in the old days we always fought with the BIOS size
restriction, due to the extremely bad code quality of the BCC compiler), and to
some extent an optimization (it's far easier and more efficient to do the hairy
stuff in 32-bit code on the host than in the actual BIOS, which is
annoying 16-bit real-mode code).
Klaus
_____________________________________________
Ric Vilbig
Mentor Graphics, Emulation Division
46871 Bayside Parkway, Fremont CA, 94538
Phone: 510-354-7360
Mobile: 408-529-2365
email: [email protected]
From: Vilbig, Ric
Sent: Tuesday, March 29, 2016 11:40
To: [email protected]
Cc: Vilbig, Ric
Subject: VM crash, NS_ERROR_FAILURE
Hi experts,
I would like to ask for some help to figure out why a certain VM crashes on
start-up. Although the problem is evidently induced by my PDM plug-in, the
crash does not appear to be happening therein. I need some help to root cause
where VBox is aborting the VM session.
> VBoxManage startvm "U14_ICH9_2"
Waiting for VM "U14_ICH9_2" to power on...
VBoxManage: error: The VM session was aborted
VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component
SessionMachine, interface ISession
I created this VM from the VirtualBox GUI, v5.0.16, which I built from the
tarball at https://www.virtualbox.org/wiki/Downloads and am running on an
Ubuntu 14 host. I then switched the chipset to ICH9 and installed Ubuntu
14 as the guest. The VM runs well until I plug my virtual device model into
PDM (it's a PCIe switch with a downstream endpoint). After plugging in my
virtual device, the VM crashes as shown above.
I tracked down everywhere NS_ERROR_FAILURE is mentioned in the sources. I
found that DirectoryServiceProvider::GetFile() returns that error twice, right
away, but that is also true in the working case when my device is unplugged.
In no other place is that specific error ever returned or asserted. However, I
found that E_FAIL is #defined to NS_ERROR_FAILURE, and there are hundreds of
references to E_FAIL, so I gave up trying to instrument them all.
I need some help to root cause this problem. Log files show that it is getting
as far as BIOS starting to initialize the switch, apparently stuck in a loop
doing that, but then lights out with no trail that I can follow.
Log files are attached. Lines bearing the "RicV" prefix were instrumented by
me to investigate this problem. Lines bearing the "RemDev" prefix are coming
from my PDM plug-in.
Thanks,