Hi there,

I have a box running b81 which, when using AHCI mode for the SATA
controller, boots and installs fine from the installation CD and
boots fine when I choose the xVM option in GRUB, but not when I
choose the regular Solaris kernel in GRUB. (Everything works in IDE
emulation mode.)

I've taught myself a little about kmdb and have reached the limit
of the debugging I'm capable of, so I'd appreciate any pointers on
where I should be looking next, or any ideas on a fix.

Motherboard: ASUS M2A-VM, with latest (non-beta) BIOS
BIOS SATA controller mode set to "AHCI"
Nevada b81 (SXCE), 64-bit

I want to use AHCI mode for the performance benefit; my experience
with IDE emulation (cmdk driver) has been bad. In IDE emulation
mode, the system boots fine.

In AHCI mode, the installer boots and installs correctly, but once
installed the system hangs during boot for a minute or two and
eventually panics because it can't mount the root filesystem, with
no other indications as to why.

However, if I boot into the xVM kernel, the system boots normally
and works like a charm, and uses the AHCI driver for drive access.

Wondering what's going on in the regular kernel versus xVM kernel, I
started doing some research into kmdb, and with my extremely limited
newfound knowledge, I've gone through a couple of debugging sessions
and determined the following:

After enabling moddebug and examining ::msgbuf after the panic, I
see that the long delay during the boot process is the result of
waiting for a SATA command to return, which eventually times out:

  installing ahci, module id 41.
  installing sata, module id 42.

  NOTICE: hba AHCI version = 1.10
  pcplusmp: pciclass,010601 (ahci) instance 0 vector 0x16 ioapic 0x4
  intin 0x16 is bound to cpu 0
  NOTICE: ahci watchdog: port 0 satapkt 0xffffff0204a6be08 timed out
  NOTICE: ahci watchdog: port 0 satapkt 0xffffff0204a6be08 timed out

  ... panic, unable to mount root fs ...
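
In case it helps anyone sifting through a longer ::msgbuf dump, here
is a minimal sketch in C for pulling the port number and sata_pkt
address out of those watchdog lines. The function name
parse_watchdog_line is hypothetical (it is not part of any Solaris
source), and it assumes the message format is exactly what is shown
above.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the ahci or sata modules.
 * Assumes a line of the form seen in the ::msgbuf output above:
 *   NOTICE: ahci watchdog: port <n> satapkt 0x<hex> timed out
 * Returns 1 and fills in *port and *pkt_addr on a match, else 0.
 */
static int
parse_watchdog_line(const char *line, int *port, uint64_t *pkt_addr)
{
	unsigned long long addr = 0;
	int p = 0;
	int end = 0;	/* stays 0 unless the whole pattern matched */

	/* Leading blank in the format skips any indentation. */
	if (sscanf(line,
	    " NOTICE: ahci watchdog: port %d satapkt 0x%llx timed out%n",
	    &p, &addr, &end) != 2 || end == 0)
		return (0);

	*port = p;
	*pkt_addr = (uint64_t)addr;
	return (1);
}
```

The extracted address is the one to examine in the debugger, for
example with something like "<addr>::print sata_pkt_t".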

With some more digging (and lots of wishing I knew a lot more about
how to effectively use the debugger!), it looks like what's timing
out is the first sata_pkt sent from sata_fetch_device_identify_data
in sata.c. Looking at the sata_pkt structure at the address
referenced in the timeout message, it's just sending the SATA
command SATAC_ID_DEVICE (0xEC). After the long delay, that sata_pkt
does end up with satapkt_reason set to the value indicating a
timeout.

So... where to go from here? I suspect the device_identify timeout
I drilled down into is just a symptom of a larger issue
communicating with the SATA/AHCI controller, but why does it work
a) during installation, and b) when booting the xVM kernel, but not
when using the regular Solaris kernel?

Any input is appreciated!

Thank you,
Matthew
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org