> > - boot with kmdb, use F1-A to break into kmdb while the startup attempt
> >   for the second core is hanging, and simply ":c" continue:
> >   system startup continues and both cpu cores are online
> >
> > (why does dropping into kmdb fix this problem?)
>
> What does ::cpuinfo -v show when you drop in kmdb in
> this case?

It boots like this (btw, an snv_40 system, bfu'ed to opensolaris-20060626):

SunOS Release 5.11 Version wos_b44 32-bit
Copyright 1983-2006 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
features: 1167fdf<cpuid,cmp,sse3,nx,sse2,sse,sep,pat,cx8,pae,mca,mmx,cmov,pge,mtrr,msr,tsc,lgpg>
cpuid 0: initialized cpumod: cpu.generic
mem = 2088252K (0x7f74f000)
root nexus = i86pc
...
8042 device: [EMAIL PROTECTED], kb8042 # 0
kb80420 is /isa/[EMAIL PROTECTED],60/[EMAIL PROTECTED]
8042 device: [EMAIL PROTECTED], mouse8042 # 0
mouse80420 is /isa/[EMAIL PROTECTED],60/[EMAIL PROTECTED]
NOTICE: Kernel debugger present: disabling console power management.
pcplusmp: pciclass,0c0320 (ehci) instance 0 vector 0x17 ioapic 0x2 intin 0x17 is bound to cpu 1
PCI Express-device: pci8086,[EMAIL PROTECTED],7, ehci0
ehci0 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],7
PCI Express-device: pci8086,[EMAIL PROTECTED], uhci0
uhci0 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]
pcplusmp: pciclass,0c0300 (uhci) instance 1 vector 0x13 ioapic 0x2 intin 0x13 is bound to cpu 0
PCI Express-device: pci8086,[EMAIL PROTECTED],1, uhci1
uhci1 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],1
pcplusmp: pciclass,0c0300 (uhci) instance #2 vector 0x12 ioapic 0x2 intin 0x12 is bound to cpu 0
PCI Express-device: pci8086,[EMAIL PROTECTED],2, uhci2
uhci2 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],2
pcplusmp: pciclass,0c0300 (uhci) instance #3 vector 0x10 ioapic 0x2 intin 0x10 is bound to cpu 0
PCI Express-device: pci8086,[EMAIL PROTECTED],3, uhci3
uhci3 is /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED],3
cpu0: x86 (chipid 0x0 GenuineIntel family 6 model 14 step 8 clock 2000 MHz)
cpu0: Intel(r) CPU T2500 @ 2.00GHz

That's the last message on the screen.

I think cpu0 is now looping here, waiting up to 20 seconds for cpu1 to be
added to the "procset" variable:

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/i86pc/os/mp_startup.c#982

When I type "F1-A" on the PS/2 keyboard before the 20 second wait has
completed, it drops into kmdb and the output on the screen looks something
like this:

cpu0: x86 (chipid 0x0 GenuineIntel family 6 model 14 step 8 clock 2000 MHz)
cpu0: Intel(r) CPU T2500 @ 2.00GHz
{ I type F1-A here }
{ .... some kmdb stuff about new kernel modules .... }
[0]> cpu1: x86 (chipid 0x0 GenuineIntel family 6 model 14 step 8 clock 2000 MHz)
cpu1: Intel(r) CPU T2500 @ 2.00GHz

Note the cpu1 messages, *after* the [0]> kmdb prompt. That is, as soon as I
drop into kmdb, the two log messages printed by the cmn_err(CE_CONT, ...)
call in init_cpu_info()...

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/i86pc/os/mp_startup.c#init_cpu_info

... appear on the console screen, after the kmdb prompt was printed!

cpuinfo output, manually copied from the console screen:

> ::cpuinfo -v
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  0 fec20b0c  1b    0    0 104   no    no    t-0 d0594de0
               |    |
    RUNNING <--+    +-- .....
      READY
     EXISTS
     ENABLE

 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD   PROC
  1 d0c86800   0    0   10  99   no    no      - d0e25de0

When I print the contents of the variable "procset", I see the value "3".

It seems as if the initialization code for cpu1 is waiting/hanging in the
"init_cpu_info()" call,

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/i86pc/os/mp_startup.c#1133

before cpu1 is added to the procset, at line 1136:

   1133         init_cpu_info(cp);                <<<<< hangs inside this function
   1134
   1135         mutex_enter(&cpu_lock);
   1136         CPUSET_ADD(procset, cp->cpu_id);  <<<<< this is what start_other_cpus() is waiting for
   1137         mutex_exit(&cpu_lock);

Somehow, dropping into kmdb seems to "unblock" the hanging cmn_err call, and
initialization for cpu core 1 completes.
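
To make the handshake I'm describing concrete, here is a minimal user-space sketch of it. This is not the code from mp_startup.c: the 32-bit cpuset_t, the helpers cpuset_add() and cpu_in_set(), struct secondary, secondary_step() and wait_for_cpu() are all names I made up for illustration, and the 20 second wall-clock wait is replaced by a simple tick counter so the example is deterministic. The only things taken from the observations above are the idea that the boot cpu polls a cpu set until the new cpu adds its own id, and that the set is treated as one bit per cpu id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cpu set: one bit per cpu id. */
typedef uint32_t cpuset_t;

/* Mark cpu `id` as started (the role CPUSET_ADD plays in the listing). */
static void cpuset_add(cpuset_t *set, unsigned id)
{
    assert(id < 32);
    *set |= (cpuset_t)1 << id;
}

/* Test whether cpu `id` has been added to the set. */
static bool cpu_in_set(cpuset_t set, unsigned id)
{
    assert(id < 32);
    return ((set >> id) & 1u) != 0;
}

/* Simulated secondary cpu: finishes its init work at tick `ready_at`. */
struct secondary {
    unsigned id;
    unsigned ready_at;
};

/* One step of the secondary cpu; it joins the set once its init is done. */
static void secondary_step(cpuset_t *set, unsigned tick,
    const struct secondary *s)
{
    if (tick == s->ready_at)
        cpuset_add(set, s->id);
}

/*
 * Boot cpu side: poll the set for at most `max_ticks` ticks.
 * Returns true if the secondary showed up in time, false on timeout.
 */
static bool wait_for_cpu(cpuset_t *set, const struct secondary *s,
    unsigned max_ticks)
{
    for (unsigned tick = 0; tick < max_ticks; tick++) {
        if (cpu_in_set(*set, s->id))
            return true;
        secondary_step(set, tick, s);
    }
    return cpu_in_set(*set, s->id);
}
```

Under that one-bit-per-cpu assumption, a set value of 3 has bits 0 and 1 set, i.e. both cpu 0 and cpu 1 are present. A secondary that never reaches its "add myself to the set" step (here: ready_at larger than max_ticks) leaves the set at 1 and the waiter times out, which is the situation I suspect on the real machine when I don't break into kmdb.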

Hmm, I guess that since we're still in the middle of starting the other cpus
in the system, kmdb does not yet know about the new cpu #1 and so doesn't
stop cpu 1 when I drop into kmdb. While I'm sitting at the kmdb prompt,
cpu 1 initialization completes in the background.


This message posted from opensolaris.org
