Re: PROBLEM: linux 3.16 APIC and bhyve won't boot

2014-08-29 Thread Sitsofe Wheeler
Hello,

On Fri, Aug 29, 2014 at 03:44:20PM +0200, Chloé Desoutter wrote:
> 
> [1.] One line summary of the problem:
> On Linux 3.16 a custom-built kernel with bhyve won't boot and will
> hang on the APIC timer calibration.

Wow I haven't seen a report in this style in a while...

> [2.] Full description of the problem/report:
> I'm booting a 3.16 in bhyve (FreeBSD hypervisor) and according to my tests
> lapic_cal_handler never gets called. This prevents the lapic_cal_loops
> counter from being incremented and therefore the APIC calibration never
> finishes. It is supposed to be called 25 times (the LAPIC_CAL_LOOPS
> constant). I added some apic_printk calls to check this (not a
> best practice, but I don't have access to a debugger in this specific
> context).

<snip>

> [7.7.] Other information that might be relevant to the problem
> A stock Debian 3.2 kernel will boot. I don't have the possibility to
> build such a bloated kernel to see what's missing. My goal is to
> identify what is the minimal set of functionalities needed to have a
> kernel start up in a bhyve context. I have several other Linux VMs
> able to run on this hypervisor, none with such a recent kernel.

Unless specific people are emailed directly in addition to the general
LKML, there's a good chance this will be overlooked. I can only guess the
problem has something to do with arch/x86/kernel/apic/apic.c but I don't
really know...

What would help a lot is to take your cut-down 3.16 config, adapt it to
a 3.2 kernel tree using make oldconfig, and see if that boots. If it
does, then you can use git bisect start v3.16 v3.2 to narrow down the
exact commit that introduced the problem.
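
The workflow above could look roughly like the sketch below. It assumes a
git checkout of the kernel and a saved copy of the cut-down config; the
function names and all paths are illustrative only and do not come from
the report. Bisecting is only meaningful if the 3.2 build with the
cut-down config really does boot.

```shell
#!/bin/sh
# Sketch only: start_bisect and build_step are hypothetical helper names,
# and every path passed to them is a placeholder.

# Start a bisect between a known-bad and a known-good tag.
# Usage: start_bisect <kernel-tree> <bad-tag> <good-tag>
start_bisect() {
    tree=$1
    bad=$2
    good=$3
    [ -d "$tree/.git" ] || { echo "not a git tree: $tree" >&2; return 1; }
    # "git bisect start <bad> <good>" marks both endpoints in one step.
    git -C "$tree" bisect start "$bad" "$good"
}

# Build the commit currently checked out by git bisect, reusing the
# cut-down config. Usage: build_step <kernel-tree> <config-file>
build_step() {
    tree=$1
    config=$2
    cp "$config" "$tree/.config" || return 1
    # Accept the default answer for every option this version adds.
    yes "" | make -C "$tree" oldconfig >/dev/null || return 1
    make -C "$tree" -j"$(getconf _NPROCESSORS_ONLN)" bzImage
}

# Typical use (placeholders):
#   start_bisect ~/linux v3.16 v3.2
#   build_step ~/linux ~/bhyve-minimal.config
# Boot the result in bhyve, then record the outcome by hand:
#   git -C ~/linux bisect good   # APIC calibration completes
#   git -C ~/linux bisect bad    # hangs in APIC calibration
# and repeat build_step until git names the first bad commit, then:
#   git -C ~/linux bisect reset
```

Each step has to be booted and judged manually here, since the failure is
a hang during early boot rather than something a script can easily detect.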

-- 
Sitsofe | http://sucs.org/~sits/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


PROBLEM: linux 3.16 APIC and bhyve won't boot

2014-08-29 Thread Chloé Desoutter

Hello,

[1.] One line summary of the problem:
On Linux 3.16 a custom-built kernel with bhyve won't boot and will hang on the 
APIC timer calibration.
[2.] Full description of the problem/report:
I'm booting a 3.16 in bhyve (FreeBSD hypervisor) and according to my tests
lapic_cal_handler never gets called. This prevents the lapic_cal_loops
counter from being incremented and therefore the APIC calibration never
finishes. It is supposed to be called 25 times (the LAPIC_CAL_LOOPS
constant). I added some apic_printk calls to check this (not a
best practice, but I don't have access to a debugger in this specific
context).

[3.] Keywords (i.e., modules, networking, kernel):
kernel, apic
[4.] Kernel version (from /proc/version):
n/a (won't boot)
Linux version 3.16.1-bhyve (root@localhost) (gcc version 4.8.2 (Funtoo
4.8.2-r2) ) #22 SMP Fri Aug 29 12:23:32 Local
[5.] Output of Oops.. message (if applicable) with symbolic information
 resolved (see Documentation/oops-tracing.txt)
N/A
[6.] A small shell script or example program which triggers the
 problem (if possible)
N/A
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux localhost 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u3 x86_64 Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz GenuineIntel GNU/Linux

Gnu C                  4.8.2
Gnu make               4.0
binutils               2.23.2
1.0
2.23.2
util-linux             scripts/ver_linux: line 23: fdformat: command not found
mount                  assert
module-init-tools      15
e2fsprogs              1.42.10
xfsprogs               3.1.11
Linux C Library        2.18
Dynamic linker (ldd)   2.18
Procps                 3.3.9
Net-tools              1.60_p20130513023548
Kbd                    2.0.1
Sh-utils               8.21
Modules Loaded         ext4 crc16 jbd2 mbcache sg sr_mod cdrom virtio_blk
virtio_net ahci libahci libata scsi_mod virtio_pci virtio_ring virtio

[7.2.] Processor information (from /proc/cpuinfo):
[snip]
processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Xeon(R) CPU E3-1240 v3 @ 3.40GHz
stepping        : 3
cpu MHz         : 3388.196
cache size      : 0 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae cx8 apic sep pge cmov pat pse36
clflush mmx fxsr sse sse2 ss pbe syscall nx pdpe1gb lm constant_tsc rep_good
nopl nonstop_tsc pni pclmulqdq dtes64 ds_cpl smx ssse3 fma cx16 xtpr pcid
sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm
xsaveopt fsgsbase erms
bogomips        : 6776.74
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual

[7.3.] Module information (from /proc/modules):
N/A
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
/proc/ioports
-0cf7 : PCI Bus :00
  -001f : dma1
  0020-0021 : pic1
  0040-0043 : timer0
  0050-0053 : timer1
  0060-0060 : keyboard
  0064-0064 : keyboard
  0080-008f : dma page reg
  00a0-00a1 : pic2
  00c0-00df : dma2
  00f0-00ff : fpu
  0220-0223 : pnp 00:01
  0224-0227 : pnp 00:01
  02f8-02ff : serial
  03f8-03ff : serial
  0400-0407 : pnp 00:01
    0400-0403 : ACPI PM1a_EVT_BLK
    0404-0405 : ACPI PM1a_CNT_BLK
  0408-040b : ACPI PM_TMR
  04d0-04d1 : pnp 00:01
0cf8-0cff : PCI conf1
0d00-1fff : PCI Bus :00
2000-209f : PCI Bus :00
  2000-203f : :00:01.0
    2000-203f : virtio-pci
  2040-205f : :00:02.0
    2040-205f : virtio-pci
  2060-207f : :00:03.0
    2060-207f : virtio-pci

/proc/iomem
- : reserved
0001-0009fffe : System RAM
000f-000f : System ROM
0010-bfff : System RAM
  0100-01359d65 : Kernel code
  01359d66-01694f7f : Kernel data
  0172a000-01807fff : Kernel bss
c000-c01f : PCI Bus :00
  c000-c0001fff : :00:01.0
    c000-c0001fff : virtio-pci
  c0002000-c0003fff : :00:02.0
    c0002000-c0003fff : virtio-pci
  c0004000-c0005fff : :00:03.0
    c0004000-c0005fff : virtio-pci
  c0006000-c00063ff : :00:04.0
    c0006000-c00063ff : ahci
  c0006800-c0006fff : :00:01.0
  c0007000-c00077ff : :00:02.0
  c0007800-c0007fff : :00:03.0
  c0008000-c00087ff : :00:04.0
  c0008800-c0008fff : :00:0f.0
e000-efff : PCI MMCONFIG  [bus 00-ff]
  e000-efff : pnp 00:01
fec0-fec003ff : IOAPIC 0
fed0-fed003ff : HPET 0
fee0-fee00fff : Local APIC
1-13fff : System RAM

[7.5.] PCI information ('lspci -vvv' as root)
root@localhost:/# lspci -vvv
00:00.0 Host bridge: Network Appliance Corporation Device 1275
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF+ FastB2B+ ParErr+ DEVSEL=?? >TAbort+ SERR+ TAbort- 
