Re: PROBLEM: linux 3.16 APIC and bhyve won't boot
Hello,

On Fri, Aug 29, 2014 at 03:44:20PM +0200, Chloé Desoutter wrote:
> [1.] One line summary of the problem:
> On Linux 3.16 a custom-built kernel with bhyve won't boot and will
> hang on the APIC timer calibration.

Wow, I haven't seen a report in this style in a while...

> [2.] Full description of the problem/report:
> I'm booting a 3.16 in bhyve (FreeBSD hypervisor) and according to my tests
> lapic_cal_handler never gets called. This prevents the lapic_cal_loops
> counter from being incremented and therefore the APIC calibration never
> finishes. It is supposed to be called 25 times (APIC_CAL_LOOPS
> constant). I added some apic_printk's to check for these infos (not a
> best practice but I don't have access to a debugger in this specific
> context).

> [7.7.] Other information that might be relevant to the problem
> A stock Debian 3.2 kernel will boot. I don't have the possibility to
> build such a bloated kernel to see what's missing. My goal is to
> identify what is the minimal set of functionalities needed to have a
> kernel start up in a bhyve context. I have several other Linux VMs
> able to run on this hypervisor, none with such a recent kernel.

Without being able to email specific people in addition to the general
LKML there's a good chance this will be overlooked. I can only guess the
problem has something to do with arch/x86/kernel/apic/apic.c but I don't
really know...

What will help a lot is if you can take your cut-down 3.16 config, adapt
it to an old 3.2 kernel tree using make oldconfig, and see if that boots.
If it does, then you can use git bisect start v3.16 v3.2 (bad revision
first, good revision second) to narrow down the exact commit that
introduced the problem.

-- 
Sitsofe | http://sucs.org/~sits/
PROBLEM: linux 3.16 APIC and bhyve won't boot
Hello,

[1.] One line summary of the problem:
On Linux 3.16 a custom-built kernel with bhyve won't boot and will
hang on the APIC timer calibration.

[2.] Full description of the problem/report:
I'm booting a 3.16 in bhyve (FreeBSD hypervisor) and according to my tests
lapic_cal_handler never gets called. This prevents the lapic_cal_loops
counter from being incremented and therefore the APIC calibration never
finishes. It is supposed to be called 25 times (APIC_CAL_LOOPS
constant). I added some apic_printk's to check for these infos (not a
best practice but I don't have access to a debugger in this specific
context).

[3.] Keywords (i.e., modules, networking, kernel):
kernel, apic

[4.] Kernel version (from /proc/version):
n/a (won't boot)
Linux version 3.16.1-bhyve (root@localhost) (gcc version 4.8.2 (Funtoo 4.8.2-r2) ) #22 SMP Fri Aug 29 12:23:32 Local

[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)
N/A

[6.] A small shell script or example program which triggers the
problem (if possible)
N/A

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux localhost 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz GenuineIntel GNU/Linux

Gnu C                  4.8.2
Gnu make               4.0
binutils               2.23.2 1.0 2.23.2
util-linux             scripts/ver_linux: line 23: fdformat: command not found
mount                  assert
module-init-tools      15
e2fsprogs              1.42.10
xfsprogs               3.1.11
Linux C Library        2.18
Dynamic linker (ldd)   2.18
Procps                 3.3.9
Net-tools              1.60_p20130513023548
Kbd                    2.0.1
Sh-utils               8.21
Modules Loaded         ext4 crc16 jbd2 mbcache sg sr_mod cdrom virtio_blk virtio_net ahci libahci libata scsi_mod virtio_pci virtio_ring virtio

[7.2.] Processor information (from /proc/cpuinfo):
[snip]
processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz
stepping        : 3
cpu MHz         : 3388.196
cache size      : 0 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae cx8 apic sep pge cmov pat pse36 clflush mmx fxsr sse sse2 ss pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl nonstop_tsc pni pclmulqdq dtes64 ds_cpl smx ssse3 fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm xsaveopt fsgsbase erms
bogomips        : 6776.74
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual

[7.3.] Module information (from /proc/modules):
N/A

[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)

/proc/ioports
-0cf7 : PCI Bus :00
-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-0060 : keyboard
0064-0064 : keyboard
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0220-0223 : pnp 00:01
0224-0227 : pnp 00:01
02f8-02ff : serial
03f8-03ff : serial
0400-0407 : pnp 00:01
0400-0403 : ACPI PM1a_EVT_BLK
0404-0405 : ACPI PM1a_CNT_BLK
0408-040b : ACPI PM_TMR
04d0-04d1 : pnp 00:01
0cf8-0cff : PCI conf1
0d00-1fff : PCI Bus :00
2000-209f : PCI Bus :00
2000-203f : :00:01.0
2000-203f : virtio-pci
2040-205f : :00:02.0
2040-205f : virtio-pci
2060-207f : :00:03.0
2060-207f : virtio-pci

/proc/iomem
- : reserved
0001-0009fffe : System RAM
000f-000f : System ROM
0010-bfff : System RAM
0100-01359d65 : Kernel code
01359d66-01694f7f : Kernel data
0172a000-01807fff : Kernel bss
c000-c01f : PCI Bus :00
c000-c0001fff : :00:01.0
c000-c0001fff : virtio-pci
c0002000-c0003fff : :00:02.0
c0002000-c0003fff : virtio-pci
c0004000-c0005fff : :00:03.0
c0004000-c0005fff : virtio-pci
c0006000-c00063ff : :00:04.0
c0006000-c00063ff : ahci
c0006800-c0006fff : :00:01.0
c0007000-c00077ff : :00:02.0
c0007800-c0007fff : :00:03.0
c0008000-c00087ff : :00:04.0
c0008800-c0008fff : :00:0f.0
e000-efff : PCI MMCONFIG [bus 00-ff]
e000-efff : pnp 00:01
fec0-fec003ff : IOAPIC 0
fed0-fed003ff : HPET 0
fee0-fee00fff : Local APIC
1-13fff : System RAM

[7.5.] PCI information ('lspci -vvv' as root)
root@localhost:/# lspci -vvv
00:00.0 Host bridge: Network Appliance Corporation Device 1275
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz+ UDF+ FastB2B+ ParErr+ DEVSEL=?? >TAbort+ SERR+ TAbort-
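
The hang described in [2.] can be modelled in userspace. What follows is a minimal sketch, not kernel code. The names lapic_cal_handler and lapic_cal_loops and the count of 25 come from the report; CAL_LOOPS, tick_fn, wait_for_calibration and the max_spins bound are hypothetical and exist only for illustration. It shows why a timer interrupt that is never delivered leaves the wait spinning: the counter only advances inside the handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value: the report says the handler should run 25 times. */
#define CAL_LOOPS 25

/* Counter bumped on every timer tick (name taken from the report). */
volatile int lapic_cal_loops;

/* Stand-in for the real handler; this model only counts ticks. */
void lapic_cal_handler(void)
{
    lapic_cal_loops++;
}

/*
 * Hypothetical tick source. Passing NULL models a hypervisor that
 * never delivers the timer interrupt to the guest.
 */
typedef void (*tick_fn)(void);

/*
 * Spin until the counter passes CAL_LOOPS, giving up after max_spins
 * iterations. The bound is an addition of this sketch: the report
 * implies the real wait has none, which is why a missing interrupt
 * shows up as a hang and not as an error message.
 * Returns true if calibration completed.
 */
bool wait_for_calibration(tick_fn tick, long max_spins)
{
    lapic_cal_loops = 0;
    for (long spin = 0; spin < max_spins; spin++) {
        if (lapic_cal_loops > CAL_LOOPS)
            return true;
        if (tick != NULL)
            tick(); /* models one timer interrupt arriving */
    }
    return lapic_cal_loops > CAL_LOOPS;
}
```

Calling wait_for_calibration(lapic_cal_handler, 1000) returns true once the counter passes 25, while wait_for_calibration(NULL, 1000) returns false with the counter still at zero, which corresponds to the behaviour reported under bhyve.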