Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-11 Thread Richard Laysell


Thanks Mike,

I have upgraded to the test build of VirtualBox (6.1.35) and I am no
longer seeing Firefox crash in the virtual guest.  Both host and guest
are running 5.10.0-16-amd64 (5.10.127-1).  So I guess this is not a
kernel bug so much as an 'incompatibility' between these kernels and
the older versions of VirtualBox.

Regards,

Richard

On Fri, 2022-07-08 at 17:31 -0700, Mike Kupfer wrote:
> Bastian Blank wrote:
> 
> > As VirtualBox uses its own kernel module, we can't provide any
> > help.  You need to ask them for a fixed version to work with that
> > kernel version.
> 
> FWIW, I'm running a test build[1] of VirtualBox with the -15 kernel
> on the host.  I, too, had the problem of Firefox crashing regularly
> with VirtualBox 6.1.34, but things seem stable with VirtualBox 6.1.35
> r151864.
> 
> mike
> [1] https://www.virtualbox.org/wiki/Testbuilds
> 



Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-08 Thread Mike Kupfer
Bastian Blank wrote:

> As VirtualBox uses its own kernel module, we can't provide any help.
> You need to ask them for a fixed version to work with that kernel
> version.

FWIW, I'm running a test build[1] of VirtualBox with the -15 kernel on
the host.  I, too, had the problem of Firefox crashing regularly with
VirtualBox 6.1.34, but things seem stable with VirtualBox 6.1.35
r151864.

mike
[1] https://www.virtualbox.org/wiki/Testbuilds



Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-08 Thread Bastian Blank
Control: tags -1 unreproducible

On Tue, Jul 05, 2022 at 12:09:00PM +0200, Michael wrote:
> i am running virtualbox 6.1.34 from the virtualbox.org repo on a debian 11.3
> host. the guest also runs debian 11.3.

As VirtualBox uses its own kernel module, we can't provide any help.
You need to ask them for a fixed version to work with that kernel
version.

Or you can use the included kvm support.

Bastian

-- 
Yes, it is written.  Good shall always destroy evil.
-- Sirah the Yang, "The Omega Glory", stardate unknown



Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-08 Thread Walter Schweizer
Package: src:linux
Version: 5.10.120-1
Followup-For: Bug #1014394
X-Debbugs-Cc: s...@users.sourceforge.net

Dear Maintainer,

I am also affected by this issue.

I cannot use my VM images anymore on this kernel version.
The Windows10 guest randomly crashes with:
00:35:44.190732 GIM: HyperV: Guest indicates a fatal condition! P0=0x1e
P1=0xc005 P2=0xf80732849020 P3=0x0 P4=0x
00:35:44.190795 GIMHv: BugCheck 1e {c005, f80732849020, 0,
}
00:35:44.190795 KMODE_EXCEPTION_NOT_HANDLED
00:35:44.190796 P1: c005 - exception code - STATUS_ACCESS_VIOLATION
00:35:44.190796 P2: f80732849020 - EIP/RIP
00:35:44.190796 P3:  - Xcpt param #0
00:35:44.190796 P4:  - Xcpt param #1
00:35:45.515885 AHCI#0: Reset the HBA
00:35:45.515902 VD#0: Cancelling all active requests
00:35:45.516175 AHCI#0: Port 0 reset
00:35:45.517213 VD#0: Cancelling all active requests
00:35:47.317356 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be
unresponsive. Last heartbeat received 4 seconds ago
00:35:55.668680 VMMDev: Guest Log: VBoxGuest: BugCheck! P0=0x1e
P1=0xc005 P2=0xf80732849020 P3=0x0 P4=0x

When I downgrade the kernel to linux-image-5.10.0-14-amd64 (5.10.113-1)
the VM runs well!



-- Package-specific info:
** Kernel log: boot messages should be attached

** Model information
sys_vendor: FUJITSU
product_name: CELSIUS H760
product_version: 10601186099
chassis_vendor: FUJITSU
chassis_version: CELSIUS H760
bios_vendor: FUJITSU // Insyde Software Corp.
bios_version: Version 1.32
board_vendor: FUJITSU
board_name: FJNB29A
board_version: D4

** Network interface configuration:
*** /etc/network/interfaces:

source /etc/network/interfaces.d/*

auto lo
iface lo inet loopback

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th 
Gen Core Processor Host Bridge/DRAM Registers [8086:1910] (rev 07)
Subsystem: Fujitsu Limited. Xeon E3-1200 v5/E3-1500 v5/6th Gen Core 
Processor Host Bridge/DRAM Registers [10cf:1937]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: skl_uncore

00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe 
Controller (x16) [8086:1901] (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: 
Kernel driver in use: pcieport

00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset 
Family USB 3.0 xHCI Controller [8086:a12f] (rev 31) (prog-if 30 [XHCI])
Subsystem: Fujitsu Limited. 100 Series/C230 Series Chipset Family USB 
3.0 xHCI Controller [10cf:1937]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
SERR- 
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci

00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 
Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
Subsystem: Fujitsu Limited. 100 Series/C230 Series Chipset Family MEI 
Controller [10cf:1937]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: mei_me
Kernel modules: mei_me

00:16.3 Serial controller [0700]: Intel Corporation 100 Series/C230 Series 
Chipset Family KT Redirection [8086:a13d] (rev 31) (prog-if 02 [16550])
Subsystem: Fujitsu Limited. 100 Series/C230 Series Chipset Family KT 
Redirection [10cf:1937]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 
Kernel driver in use: serial

00:17.0 SATA controller [0106]: Intel Corporation HM170/QM170 Chipset SATA 
Controller [AHCI Mode] [8086:a103] (rev 31) (prog-if 01 [AHCI 1.0])
Subsystem: Fujitsu Limited. HM170/QM170 Chipset SATA Controller [AHCI 
Mode] [10cf:1937]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- 

Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-05 Thread Diederik de Haas
On Tuesday, 5 July 2022 21:55:54 CEST Michael wrote:
> and for your request:

Sorry I wasn't clear in my request.

> > and eventually guest processes start to randomly crash, e.g.:
> > 
> > Jun 30 20:17:01 vmguest kernel: traps: sh[250946] general protection fault
> > ip:7fb0c341708e sp:7ffec3154378 error:0 in
> > libc-2.31.so[7fb0c33a6000+14b000]
> > Jul 01 00:00:02 vmguest kernel: traps: hostname[253617] general protection
> > fault ip:7f905f2b24a6 sp:7fff44a30e30 error:0 in
> > libc-2.31.so[7f905f299000+14b000]
> > Jul 01 00:53:01 vmguest kernel: traps: wget[254290] general protection
> > fault ip:7f934bc23fda sp:7ffd716954d0 error:0 in
> > libtasn1.so.6.6.0[7f934bc1a000+c000]

I assumed these errors occurred in one boot session, and having the FULL
dmesg/kern.log from *that* session would be useful.
Don't 'grep' for the messages you think are relevant. Often there are
other messages that seem irrelevant or harmless to you and me but give
important clues to experts (which I am not).
I'm guessing this bug should be sent upstream, and they will very likely
ask for a full dmesg/kernel log.
(I'm merely trying to collect as much useful information as possible.)
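
One way to follow this advice is sketched below: instead of grepping, split the log into boot sessions and keep every line of the session that contains a fault. This is only a sketch. It assumes rsyslog-style kern.log lines that carry the kernel uptime in square brackets, it treats a drop in uptime as a reboot, and the function names are hypothetical.

```python
import re

# Matches the kernel uptime stamp, e.g. "kernel: [1669500.364257] ..."
UPTIME_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def split_boot_sessions(lines):
    """Group log lines into boot sessions.

    Assumption: a new boot starts whenever the uptime stamp goes down.
    Lines without a stamp stay in the current session.
    """
    sessions, current, last_uptime = [], [], None
    for line in lines:
        match = UPTIME_RE.search(line)
        if match:
            uptime = float(match.group(1))
            if last_uptime is not None and uptime < last_uptime and current:
                sessions.append(current)
                current = []
            last_uptime = uptime
        current.append(line)
    if current:
        sessions.append(current)
    return sessions


def sessions_with(sessions, needle):
    """Return the complete sessions in which any line contains needle."""
    return [s for s in sessions if any(needle in line for line in s)]
```

Calling `sessions_with(split_boot_sessions(lines), "traps:")` on the lines of a kern.log would return the unfiltered sessions that contain a fault, which can then be attached in full.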

HTH,
  Diederik



Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-05 Thread Michael

on the host's /var/log/apt/term.log:

Log started: 2022-06-12  10:57:30
...
Preparing to unpack .../08-linux-image-5.10.0-15-amd64_5.10.120-1_amd64.deb ...
Unpacking linux-image-5.10.0-15-amd64 (5.10.120-1) ...
Preparing to unpack .../09-linux-image-amd64_5.10.120-1_amd64.deb ...
Unpacking linux-image-amd64 (5.10.120-1) over (5.10.113-1) ...
...
Setting up linux-image-5.10.0-15-amd64 (5.10.120-1) ...
...
I: /vmlinuz.old is now a symlink to boot/vmlinuz-5.10.0-14-amd64
I: /initrd.img.old is now a symlink to boot/initrd.img-5.10.0-14-amd64
I: /vmlinuz is now a symlink to boot/vmlinuz-5.10.0-15-amd64
I: /initrd.img is now a symlink to boot/initrd.img-5.10.0-15-amd64
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-5.10.0-15-amd64
/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.10.0-15-amd64
Found initrd image: /boot/initrd.img-5.10.0-15-amd64
Found linux image: /boot/vmlinuz-5.10.0-14-amd64
Found initrd image: /boot/initrd.img-5.10.0-14-amd64
Adding boot menu entry for EFI firmware configuration
done
...
Setting up linux-image-amd64 (5.10.120-1) ...
...
Log ended: 2022-06-12  10:58:50


so, the kernel was installed on 2022-06-12.
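
The same lookup can be scripted. The following is a minimal sketch that assumes the term.log layout shown in the excerpt above ("Log started:" followed by "Setting up PACKAGE (VERSION) ..." lines); the function name and the `prefix` parameter are hypothetical.

```python
import re
from datetime import datetime

START_RE = re.compile(r"^Log started: (\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2})")
SETUP_RE = re.compile(r"^Setting up (\S+) \((\S+)\) \.\.\.")


def setup_events(lines, prefix="linux-image-"):
    """Return (log start time, package, version) for matching packages.

    The time is the start of the apt run that configured the package,
    not the exact moment the package was set up.
    """
    events, started = [], None
    for line in lines:
        match = START_RE.match(line)
        if match:
            started = datetime.strptime(" ".join(match.groups()),
                                        "%Y-%m-%d %H:%M:%S")
            continue
        match = SETUP_RE.match(line)
        if match and match.group(1).startswith(prefix):
            events.append((started, match.group(1), match.group(2)))
    return events
```

Passing the lines of /var/log/apt/term.log yields one tuple per kernel image package that was set up, together with the start time of the apt run.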


and for your request:

# cat <(gunzip -c /var/log/kern.log.{4,3,2}.gz) /var/log/kern.log{.1,} | 
grep -B5 -E "hrtimer|clocksource|traps:|Code:|general|segfault"


Apr 18 22:24:41 vmguest kernel: [1669500.364257] clocksource: timekeeping 
watchdog on CPU1: kvm-clock wd-wd read-back delay of 64914ns
Apr 18 22:24:41 vmguest kernel: [1669500.364328] clocksource: wd-tsc-wd 
read-back delay of 314448ns, clock-skew test skipped!

--
Apr 19 13:42:46 vmguest kernel: [0.00] DMI: innotek GmbH 
VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006

Apr 19 13:42:46 vmguest kernel: [0.00] Hypervisor detected: KVM
Apr 19 13:42:46 vmguest kernel: [0.00] kvm-clock: Using msrs 
4b564d01 and 4b564d00
Apr 19 13:42:46 vmguest kernel: [0.00] kvm-clock: cpu 0, msr 
236b8001, primary cpu clock
Apr 19 13:42:46 vmguest kernel: [0.02] kvm-clock: using sched 
offset of 5303526299 cycles
Apr 19 13:42:46 vmguest kernel: [0.07] clocksource: kvm-clock: 
mask: 0x max_cycles: 0x1cd42e4dffb, max_idle_ns: 
881590591483 ns

--
Apr 19 13:42:46 vmguest kernel: [0.180275] PM: hibernation: Registered 
nosave memory: [mem 0x0009f000-0x0009]
Apr 19 13:42:46 vmguest kernel: [0.180276] PM: hibernation: Registered 
nosave memory: [mem 0x000a-0x000e]
Apr 19 13:42:46 vmguest kernel: [0.180277] PM: hibernation: Registered 
nosave memory: [mem 0x000f-0x000f]
Apr 19 13:42:46 vmguest kernel: [0.180280] [mem 0x4000-0xfebf] 
available for PCI devices
Apr 19 13:42:46 vmguest kernel: [0.180282] Booting paravirtualized 
kernel on KVM
Apr 19 13:42:46 vmguest kernel: [0.180288] clocksource: 
refined-jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 
7645519600211568 ns

--
Apr 19 13:42:46 vmguest kernel: [0.216877] ACPI: Core revision 20200925
Apr 19 13:42:46 vmguest kernel: [0.217175] APIC: Switch to symmetric 
I/O mode setup

Apr 19 13:42:46 vmguest kernel: [0.217795] x2apic enabled
Apr 19 13:42:46 vmguest kernel: [0.218367] Switched APIC routing to 
physical x2apic.
Apr 19 13:42:46 vmguest kernel: [0.221130] ..TIMER: vector=0x30 apic1=0 
pin1=2 apic2=-1 pin2=-1
Apr 19 13:42:46 vmguest kernel: [0.221204] clocksource: tsc-early: 
mask: 0x max_cycles: 0x1598c7dcb76, max_idle_ns: 
440795222846 ns

--
Apr 19 13:42:46 vmguest kernel: [0.349407] smpboot: Max logical 
packages: 1
Apr 19 13:42:46 vmguest kernel: [0.349411] smpboot: Total of 4 
processors activated (11986.22 BogoMIPS)
Apr 19 13:42:46 vmguest kernel: [0.355058] node 0 deferred pages 
initialised in 4ms

Apr 19 13:42:46 vmguest kernel: [0.355163] devtmpfs: initialized
Apr 19 13:42:46 vmguest kernel: [0.355163] x86/mm: Memory block size: 
128MB
Apr 19 13:42:46 vmguest kernel: [0.357525] clocksource: jiffies: mask: 
0x max_cycles: 0x, max_idle_ns: 764504178510 ns

--
Apr 19 13:42:46 vmguest kernel: [0.709892] NetLabel:  unlabeled traffic 
allowed by default
Apr 19 13:42:46 vmguest kernel: [0.709892] PCI: Using ACPI for IRQ 
routing
Apr 19 13:42:46 vmguest kernel: [0.709892] PCI: pci_cache_line_size set 
to 64 bytes
Apr 19 13:42:46 vmguest kernel: [0.710019] e820: reserve RAM buffer 
[mem 0x0009fc00-0x0009]
Apr 19 13:42:46 vmguest kernel: [0.710030] e820: reserve RAM buffer 
[mem 0x3fff-0x3fff]
Apr 19 13:42:46 vmguest kernel: [0.715223] clocksource: Switched to 
clocksource kvm-clock

--
Apr 19 13:42:46 vmguest kernel: [0.736766] AppArmor: AppArmor 
Filesystem Enabled

Apr 19 13:42:46 vmguest kernel: [0.736806] pnp: PnP ACPI init
Apr 19 13:42:46 vmguest kernel: [0.736937] pnp 00:00: Plug and Play 
ACPI device, IDs PNP0303 

Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-05 Thread Diederik de Haas
Control: reassign -1 src:linux 5.10.120-1
Control: tag -1 upstream moreinfo

On Tuesday, 5 July 2022 12:09:00 CEST Michael wrote:
> package: linux-image-5.10.0-15-arm64
> version: 5.10.120-1
> 
> i am running virtualbox 6.1.34 from the virtualbox.org repo on a debian
> 11.3 host. the guest also runs debian 11.3.
> 
> when both host and guest run the latest stable kernel 5.10.0-15
> (5.10.120-1) i get random process crashes in the guest when having
> significant i/o for longer than a few seconds on the host.
> 
> and eventually guest processes start to randomly crash, e.g.:
> 
> Jun 30 20:17:01 vmguest kernel: traps: sh[250946] general protection fault
> ip:7fb0c341708e sp:7ffec3154378 error:0 in
> libc-2.31.so[7fb0c33a6000+14b000]
> Jul 01 00:00:02 vmguest kernel: traps: hostname[253617] general protection
> fault ip:7f905f2b24a6 sp:7fff44a30e30 error:0 in
> libc-2.31.so[7f905f299000+14b000]
> Jul 01 00:53:01 vmguest kernel: traps: wget[254290] general protection
> fault ip:7f934bc23fda sp:7ffd716954d0 error:0 in
> libtasn1.so.6.6.0[7f934bc1a000+c000]
> 
> if i switch to kernel 5.10.0-14 (5.10.113-1) on the host (the guest kernel
> remains 5.10.0-15), then the random process crashes in the guest disappear,
> although the complaints from hrtimer and clocksource still remain, but
> significantly less often.

This indeed looks like an upstream regression between 5.10.113 and 5.10.120.

What surprises me are the time gaps between those GPF messages, ~3.5h and 50m, 
but not within the same second/minute. What happens between those time stamps?
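
For reference, the exact gaps can be computed from the three quoted timestamps. This is a small sketch; syslog lines omit the year, so 2022 is an assumption here.

```python
from datetime import datetime

# Timestamps of the three GPF messages quoted above. The year is not
# part of the log lines; 2022 is assumed.
stamps = ["Jun 30 20:17:01", "Jul 01 00:00:02", "Jul 01 00:53:01"]
times = [datetime.strptime(f"2022 {s}", "%Y %b %d %H:%M:%S") for s in stamps]

# Difference between each message and the one before it.
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
for gap in gaps:
    print(gap)  # prints 3:43:01, then 0:52:59
```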

If there are more messages around those GPF messages, it would be useful to 
share those as well.

In Stable-Proposed-Updates there is a 5.10.127-1 version and it would be 
useful to test whether the issue happens with that version too.



Bug#1014394: linux kernel 5.10.0-15 on virtualbox host causes random process crashes in guests

2022-07-05 Thread Michael

package: linux-image-5.10.0-15-arm64
version: 5.10.120-1

i am running virtualbox 6.1.34 from the virtualbox.org repo on a debian 
11.3 host. the guest also runs debian 11.3.


when both host and guest run the latest stable kernel 5.10.0-15 
(5.10.120-1) i get random process crashes in the guest when having 
significant i/o for longer than a few seconds on the host.


i.e., when i do a

 # mv  

or even a

 # md5sum 

or anything else that causes significant i/o traffic on the host, the guest 
first starts complaining:


Jul 01 20:44:49 vmguest kernel: hrtimer: interrupt took 21312191 ns
Jul 01 20:45:20 vmguest kernel: clocksource: timekeeping watchdog on CPU0: 
kvm-clock wd-wd read-back delay of 816732ns
Jul 01 20:45:20 vmguest kernel: clocksource: wd-tsc-wd read-back delay of 
2540783ns, clock-skew test skipped!


and eventually guest processes start to randomly crash, e.g.:

Jun 30 20:17:01 vmguest kernel: traps: sh[250946] general protection fault 
ip:7fb0c341708e sp:7ffec3154378 error:0 in 
libc-2.31.so[7fb0c33a6000+14b000]
Jul 01 00:00:02 vmguest kernel: traps: hostname[253617] general protection 
fault ip:7f905f2b24a6 sp:7fff44a30e30 error:0 in 
libc-2.31.so[7f905f299000+14b000]
Jul 01 00:53:01 vmguest kernel: traps: wget[254290] general protection 
fault ip:7f934bc23fda sp:7ffd716954d0 error:0 in 
libtasn1.so.6.6.0[7f934bc1a000+c000]
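
To compare such faults, it can help to express each instruction pointer as an offset into the mapping named at the end of the line. The sketch below assumes the line format shown above, with each log line joined back into a single string; the function name is hypothetical. The offset is relative to the start of the mapped region that the kernel reports, and resolving it to a symbol would need further steps that are not shown.

```python
import re

# Format of the "traps:" line as it appears in this report:
#   traps: NAME[PID] general protection fault ip:IP sp:SP error:N in LIB[BASE+SIZE]
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] general protection fault "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>[0-9a-f]+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (process, library, offset, inside_mapping), or None.

    offset is ip - base; inside_mapping tells whether it falls within
    the reported size of the mapping.
    """
    match = TRAP_RE.search(line)
    if match is None:
        return None
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    size = int(match["size"], 16)
    offset = ip - base
    return match["comm"], match["lib"], offset, 0 <= offset < size
```

For the first line quoted above, this gives `("sh", "libc-2.31.so", 0x7108e, True)`.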


if i switch to kernel 5.10.0-14 (5.10.113-1) on the host (the guest kernel 
remains 5.10.0-15), the random process crashes in the guest disappear. the 
complaints from hrtimer and clocksource remain, but are significantly less 
frequent.


i started a thread in the 'debian-user' mailing list:
https://lists.debian.org/debian-user/2022/07/msg00043.html

thank you for looking into this!

greetings...