When trying to boot a Solaris Dom0 kernel on a Tecra S1 laptop (Pentium-M CPU;
no PAE available; the hardware design seems to force use of the legacy
8259 PICs) with a Xen hypervisor compiled with PAE disabled, the Solaris
x86 kernel crashes somewhere inside mach_init() with a BAD TRAP
(#pf page fault) with pc=0 and address 0xf000e6f2.

The crash happens in usr/src/uts/i86pc/os/mp_machdep.c at line 617,
because "pops->psm_softinit == NULL".

   612          if (pops->psm_notify_error) {
   613                  psm_notify_error = mach_notify_error;
   614                  notify_error = pops->psm_notify_error;
   615          }
   616
   617          (*pops->psm_softinit)();


The cause appears to be in mach_construct_info(), which does not
initialize mach_set[PSM_OWN_SYS_DEFAULT].  When we call
mach_get_platform(PSM_OWN_SYS_DEFAULT) at line 541, the mach_set[]
array contains

        Index 0:   pointer to mach_ops
        Index 1-3: NULL

(that is, mach_set[PSM_OWN_SYS_DEFAULT] was NULL)


   522  static void
   523  mach_construct_info()
   524  {
   525          struct psm_sw *swp;
   526          int     mach_cnt[PSM_OWN_OVERRIDE+1] = {0};
   527          int     conflict_owner = 0;
   528
   529          if (psmsw->psw_forw == psmsw)
   530                  panic("No valid PSM modules found");
   531          mutex_enter(&psmsw_lock);
   532          for (swp = psmsw->psw_forw; swp != psmsw; swp = swp->psw_forw) {
   533                  if (!(swp->psw_flag & PSM_MOD_IDENTIFY))
   534                          continue;
   535                  mach_set[swp->psw_infop->p_owner] = swp->psw_infop->p_ops;
   536                  mach_ver[swp->psw_infop->p_owner] = swp->psw_infop->p_version;
   537                  mach_cnt[swp->psw_infop->p_owner]++;
   538          }
   539          mutex_exit(&psmsw_lock);
   540
   541          mach_get_platform(PSM_OWN_SYS_DEFAULT);

Apparently the PSM_MOD_IDENTIFY bit was not set in swp->psw_flag for the
Xen platform module, so the code at lines 535 - 537 was skipped.
(swp->psw_flag for the Xen platform module had a value of 1
== PSM_MOD_INSTALL.)

When mach_get_platform(PSM_OWN_SYS_DEFAULT) is called, random data is
copied from address 0 into the mach_ops structure (on the Tecra S1,
mach_ops.psm_softinit remains set to NULL).
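
To make the failure mode concrete, here is a minimal user-space sketch of
the selection loop.  It is not the kernel code: the struct layout, the
names (construct_info, MOD_*, OWN_*) and the numeric values other than
"install == 1" are assumptions made for illustration.  It only shows that
a module whose probe never set the identify bit leaves its slot in the
table untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values: only "install == 1" comes from the observation above. */
enum { MOD_INSTALL = 0x1, MOD_IDENTIFY = 0x2 };
enum { OWN_NATIVE = 0, OWN_SYS_DEFAULT = 1 };

struct ops { void (*softinit)(void); };

struct module {
    int flag;          /* MOD_* bits */
    int owner;         /* index into the selection table */
    struct ops *ops;   /* operations vector offered by the module */
};

/*
 * Simplified model of the loop at lines 532 - 538: only modules that
 * carry the identify bit are entered into the table.  Returns the
 * number of modules recorded.
 */
static int
construct_info(const struct module *mods, size_t n,
    struct ops *set[], size_t nset)
{
    int recorded = 0;

    for (size_t i = 0; i < n; i++) {
        if (!(mods[i].flag & MOD_IDENTIFY))
            continue;  /* probe failed: the slot keeps its old value */
        if (mods[i].owner < 0 || (size_t)mods[i].owner >= nset)
            continue;  /* ignore out-of-range owners in this sketch */
        set[mods[i].owner] = mods[i].ops;
        recorded++;
    }
    return recorded;
}
```

With a single module whose flag is just MOD_INSTALL, the function returns 0
and set[OWN_SYS_DEFAULT] is still NULL afterwards, which matches what I
saw in the debugger.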



Problem #1:  the code shouldn't crash like this; I would expect some kind
of error message explaining why the Xen platform module failed.

    ==> Apparently the code assumes a "PSM_OWN_SYS_DEFAULT" psm module
    never fails the probe, that is, the "PSM_OWN_SYS_DEFAULT" psm module
    is always usable.  This isn't the case for xpv_psm, it fails psm_probe()
    on uppc machines.

    ==> shouldn't we have an xpv_uppc_psm (PSM_OWN_SYS_DEFAULT) and an
    xpv_pcplusmp_psm (PSM_OWN_EXCLUSIVE) module, where
    xpv_uppc_psm is always available, and xpv_pcplusmp_psm only on machines
    with an APIC?
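
As an illustration of the kind of check I have in mind for Problem #1,
here is a small user-space sketch.  All names (run_softinit, struct ops,
stub_softinit) are hypothetical; in the kernel the diagnostic would
presumably go through panic() or cmn_err() rather than into a buffer.
The point is only that a missing operations vector or a NULL softinit
entry is reported instead of being called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct ops { void (*softinit)(void); };

/* Hypothetical stand-in for a real softinit routine; counts its calls. */
static int softinit_calls;
static void stub_softinit(void) { softinit_calls++; }

/*
 * Call the softinit entry of the chosen module if it is usable.
 * Returns 0 on success.  On failure, returns -1 and writes a short
 * diagnostic into msg instead of jumping through a NULL pointer.
 */
static int
run_softinit(const struct ops *chosen, char *msg, size_t msglen)
{
    if (chosen == NULL) {
        snprintf(msg, msglen, "no default PSM module passed its probe");
        return -1;
    }
    if (chosen->softinit == NULL) {
        snprintf(msg, msglen, "selected PSM module has no softinit entry");
        return -1;
    }
    chosen->softinit();
    return 0;
}
```

A check along these lines before line 617 would turn the pc=0 page fault
into a readable message.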

Problem #2: why did the xen platform module fail to probe?

usr/src/uts/i86pc/os/mp_implfuncs.c: line 409

   397  void
   398  psm_install(void)
   399  {
   400          struct psm_sw *swp, *cswp;
   401          struct psm_ops *opsp;
   402          char machstring[15];
   403          int err;
   404
   405          mutex_enter(&psmsw_lock);
   406          for (swp = psmsw->psw_forw; swp != psmsw; ) {
   407                  opsp = swp->psw_infop->p_ops;
   408                  if (opsp->psm_probe) {
   409                          if ((*opsp->psm_probe)() == PSM_SUCCESS) {
   410                                  swp->psw_flag |= PSM_MOD_IDENTIFY;
   411                                  swp = swp->psw_forw;
   412                                  continue;
   413                          }
   414                  }


The root cause is that xpv_psm`xen_psm_probe() tries to probe the APIC
using apic_probe_common(), but the Tecra S1 doesn't have the APIC
enabled, so apic_probe_common() returns -1.

(When booting standard Solaris-x86, the uppc psm module is used)
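
A probe that falls back to the legacy PIC when the APIC probe fails could
look roughly like the sketch below.  This is not taken from xpv_psm; the
names (choose_intr_mode, probe_fn, INTR_*, PROBE_*) are made up, and the
two probe callbacks stand in for whatever the real APIC and 8259 probe
routines would be.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes and modes for this sketch. */
#define PROBE_OK    0
#define PROBE_FAIL  (-1)

enum intr_mode { INTR_NONE, INTR_APIC, INTR_PIC };

typedef int (*probe_fn)(void);

/*
 * Prefer the APIC when its probe succeeds; otherwise try the legacy
 * PIC probe; report INTR_NONE only when both fail.
 */
static enum intr_mode
choose_intr_mode(probe_fn apic_probe, probe_fn pic_probe)
{
    if (apic_probe != NULL && apic_probe() == PROBE_OK)
        return INTR_APIC;
    if (pic_probe != NULL && pic_probe() == PROBE_OK)
        return INTR_PIC;
    return INTR_NONE;
}

/* Trivial probe stubs used to exercise the selection logic. */
static int probe_ok(void) { return PROBE_OK; }
static int probe_fail(void) { return PROBE_FAIL; }
```

On a machine like the Tecra S1 this corresponds to
choose_intr_mode(probe_fail, probe_ok), which selects INTR_PIC instead of
failing the whole probe.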


         =================

The hypervisor seems to have partial(?) / full(?) support for such uppc
systems.  Shouldn't the Solaris i86xpv platform code support such a
system, too?
 
 
This message posted from opensolaris.org
_______________________________________________
xen-discuss mailing list