[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2021-08-17 Thread Marcelo Cerri
** Changed in: linux-azure (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1904632

Title:
  Ubuntu 18.04 Azure VM host kernel panic

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1904632/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Dexuan Cui
Sure, will do. But AFAICT, there is no ETA yet. Even if the fix was made
today, it would take quite some time (at least a few months?) to deploy
the fix to the whole Azure fleet. :-(

Re: [Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Kaveh Moe
Yes, I am clear on this, Dexuan. I had found the same info tonight,
including the table of causes of VM exits. The issue is not the VM exit
itself; it is that Hyper-V adversely affects the 32-bit program's state.
Please keep posting progress on the fix as this goes along, if you
could.

Regards,K.

On Wednesday, December 16, 2020, 7:55:44 PM PST, Dexuan Cui 
<1904...@bugs.launchpad.net> wrote:  
 
 VM exits are pretty frequent and normal. "VM exits occur in response to
certain instructions and events in VMX non-root operation" (see Chapter 27,
"VM Exits", of the Intel SDM, Volume 3C):
https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-sdm-volume-3c-system-programming-guide-part-3.html

-- 
You received this bug notification because you are subscribed to the bug
report.
https://bugs.launchpad.net/bugs/1904632

Title:
  Ubuntu 18.04 Azure VM host kernel panic

Status in linux-azure package in Ubuntu:
  New

Bug description:
  When a container is started on a DV3 Standard_D8_v3 Azure host, the
  Azure host VM kernel panics as the container comes up, per the logs
  below.

  Isolated the issue to a process in the container which uses the
  virtual NICs available on the Azure host. The container also is
  running Ubuntu 18.04 based packages. The problem happens every single
  time the container is started, unless its NIC access process is not
  started.

  Has this sort of kernel panic on Azure been seen before, and what are
  the root cause and remedy?

  Also, the kernel logs on the Azure host show that it is vulnerable to
  the following CVE. Other VMs and containers can run on the Azure host
  without causing a kernel panic, but I am providing this info in case
  there is some tie-in to the panic.

  https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3646

  Kernel panic from the Azure Host console:

  
Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux_1.13.33_e857c609-bc35-4b66-9a8b-e86fd8707e82.scope
  2020-11-17T00:50:11.537914Z INFO MonitorHandler ExtHandler Stopped tracking cgroup: Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux-1.13.33, path: /sys/fs/cgroup/memory/system.slice/Microsoft.EnterpriseCloud.Monitoring.OmsAgentForLinux_1.13.33_e857c609-bc35-4b66-9a8b-e86fd8707e82.scope
  2020-11-17T00:50:23.291433Z INFO ExtHandler ExtHandler Checking for agent updates (family: Prod)
  2020-11-17T00:51:11.677191Z INFO ExtHandler ExtHandler [HEARTBEAT] Agent WALinuxAgent-2.2.52 is running as the goal state agent [DEBUG HeartbeatCounter: 7;HeartbeatId: 8A2DD5B7-02E5-46E2-9EDB-F8CCBA274479;DroppedPackets: 0;UpdateGSErrors: 0;AutoUpdate: 1]
  [11218.537937] PANIC: double fault, error_code: 0x0
  [11218.541423] Kernel panic - not syncing: Machine halted.
  [11218.541423] CPU: 0 PID: 9281 Comm: vmxt Not tainted 4.15.18+test #1
  [11218.541423] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
  [11218.541423] Call Trace:
  [11218.541423]  <#DF>
  [11218.541423]  dump_stack+0x63/0x8b
  [11218.541423]  panic+0xe4/0x244
  [11218.541423]  df_debug+0x2d/0x30
  [11218.541423]  do_double_fault+0x9a/0x130
  [11218.541423]  double_fault+0x1e/0x30
  [11218.541423] RIP: 0010:0x1a80
  [11218.541423] RSP: 0018:2200 EFLAGS: 00010096
  [11218.541423] RAX: 0102 RBX: f7a40768 RCX: 002f
  [11218.541423] RDX: f7ee9970 RSI: f7a40700 RDI: f7c3a000
  [11218.541423] RBP: fffd6430 R08:  R09: 
  [11218.541423] R10:  R11:  R12: 
  [11218.541423] R13:  R14:  R15: 
  [11218.541423]  
  [11218.541423] Kernel Offset: 0x2a40 from 0x8100 (relocation range: 0x8000-0xbfff)
  [11218.541423] ---[ end Kernel panic - not syncing: Machine halted.
  [11218.636804] [ cut here ]
  [11218.640802] sched: Unexpected reschedule of offline CPU#2!
  [11218.640802] WARNING: CPU: 0 PID: 9281 at arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x3f/0x50
  [11218.640802] Modules linked in: xt_nat xt_u32 vxlan ip6_udp_tunnel 
udp_tunnel veth nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype 
br_netfilter xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
iptable_nat ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter 
ebtables ip6table_filter ip6_tables iptable_filter aufs xt_owner 
iptable_security xt_conntrack overlay openvswitch nsh nf_conntrack_ipv6 
nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_defrag_ipv6 nf_nat 
nf_conntrack nls_iso8859_1 joydev input_leds mac_hid kvm_intel hv_balloon kvm 
serio_raw irqbypass intel_rapl_perf sch_fq_codel ib_iser rdma_cm iw_cm ib_cm 
ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables 
autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov
  [11218.640802]  async_memcpy async_pq 

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Dexuan Cui
VM exits are pretty frequent and normal. "VM exits occur in response to
certain instructions and events in VMX non-root operation" (see Chapter 27,
"VM Exits", of the Intel SDM, Volume 3C):
https://software.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-sdm-volume-3c-system-programming-guide-part-3.html

Re: [Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Kaveh Moe
Thanks Dexuan. I realized later and clarified my Q on the thread,
suggesting the experiment I mentioned. I have not tried it, but anyway
you guys are now zeroing in on the issue.

On Wednesday, December 16, 2020, 6:30:48 PM PST, Dexuan Cui 
<1904...@bugs.launchpad.net> wrote:  
 
 VM Exit is a term in the Intel CPU's Virtualization support (VMX). It
means the execution of the guest CPU is interrupted and the execution
"jumps" to some function in the hypervisor; the hypervisor analyzes the
reason of the VM Exit, and handles the VM exit properly, and then the
execution "jumps" back to wherever the guest CPU was interrupted. Here
the issue is: when the Level-2 guest CPU's VM Exit happens, somehow the
hypervisor messes up the Level-1 guest's 32-bit related state (i.e. the
SYSENTER instruction related state), so later when the 32-bit program
starts to run, the Level-1 guest kernel crashes due to double-fault. The
investigation is still ongoing.

Re: [Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Kaveh Moe
Hi Dexuan,

To clarify my question: what is the cause of the VM exit (i.e. the trap to
Hyper-V) in this scenario?

It might be worth doing an experiment that reverses the order of
operations, i.e. first start the hello program in a loop and THEN start
the nested VM via KVM. Then it would be the nested KVM operation causing
an expected VM exit, which is not handled properly because of the running
32-bit program.

Regards,K.
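
The reversed order proposed above could be sketched roughly as follows. This is an untested sketch, not something run in this thread: run_repeatedly is a hypothetical helper name, the loop count is arbitrary, and the commented-out usage assumes the ./hello binary and the qemu-system-x86_64 command from the reproduction steps. It should only be tried on a disposable test VM, since an affected host is expected to panic.

```shell
#!/bin/sh
# Untested sketch of the proposed order of operations. run_repeatedly is a
# hypothetical helper name, not something from this thread.

# run_repeatedly COUNT CMD [ARGS...]: run CMD COUNT times, stopping at the
# first failure.
run_repeatedly() {
    count=$1
    shift
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
}

# Intended use (left commented out on purpose):
#   run_repeatedly 100000 ./hello > /dev/null &
#   then start the nested VM with the qemu-system-x86_64 command from the
#   reproduction steps.
```

If the panic still occurs with this ordering, that would suggest the order of the two steps does not matter, only that a 32-bit program runs while the nested VM is active.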

On Wednesday, December 16, 2020, 5:52:20 PM PST, kaveh moezzi 
 wrote:  
 
  Thank you, Dexuan. What does VMexit mean here, please? I am trying to see how the VM 
exit triggers a kernel panic, since we were just running a 32-bit hello world 
program, which should not cause the host VM to "exit". 

Regards.

On Wednesday, December 16, 2020, 5:11:37 PM PST, Dexuan Cui 
<1904...@bugs.launchpad.net> wrote:  
 
 Hyper-V team just identified a bug where the Hyper-V hypervisor can
truncate the host SYSENTER_ESP/EIP to 16 bits on VMexit for some reason.
A further investigation is ongoing.

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Dexuan Cui
VM Exit is a term in the Intel CPU's Virtualization support (VMX). It
means the execution of the guest CPU is interrupted and the execution
"jumps" to some function in the hypervisor; the hypervisor analyzes the
reason of the VM Exit, and handles the VM exit properly, and then the
execution "jumps" back to wherever the guest CPU was interrupted. Here
the issue is: when the Level-2 guest CPU's VM Exit happens, somehow the
hypervisor messes up the Level-1 guest's 32-bit related state (i.e. the
SYSENTER instruction related state), so later when the 32-bit program
starts to run, the Level-1 guest kernel crashes due to double-fault. The
investigation is still ongoing.

Re: [Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Kaveh Moe
Thank you, Dexuan. What does VMexit mean here, please? I am trying to see
how the VM exit triggers a kernel panic, since we were just running a 32-bit
hello world program, which should not cause the host VM to "exit".

Regards.

On Wednesday, December 16, 2020, 5:11:37 PM PST, Dexuan Cui 
<1904...@bugs.launchpad.net> wrote:  
 
 Hyper-V team just identified a bug where the Hyper-V hypervisor can
truncate the host SYSENTER_ESP/EIP to 16 bits on VMexit for some reason.
A further investigation is ongoing.

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-16 Thread Dexuan Cui
Hyper-V team just identified a bug where the Hyper-V hypervisor can
truncate the host SYSENTER_ESP/EIP to 16 bits on VMexit for some reason.
A further investigation is ongoing.
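
To make the truncation concrete, here is a small sketch of what keeping only the low 16 bits of a 64-bit address looks like. It is an illustration only: the 64-bit address is a made-up example (chosen to end in the same digits as the RIP printed in the panic logs in this thread), and low16 and fits_in_16_bits are hypothetical helper names.

```shell
#!/bin/sh
# Illustration only. The 64-bit address used below is a made-up example, and
# low16/fits_in_16_bits are hypothetical helper names.

# low16 HEX: print the low 16 bits (the last four hex digits) of a
# 0x-prefixed hex value that has at least four hex digits.
low16() {
    hex=${1#0x}
    printf '0x%s\n' "${hex#"${hex%????}"}"
}

# fits_in_16_bits HEX: succeed if the value has no bits set above bit 15.
fits_in_16_bits() {
    hex=${1#0x}
    # Strip leading zeros, then check that at most four digits remain.
    while [ "${hex#0}" != "$hex" ]; do hex=${hex#0}; done
    [ "${#hex}" -le 4 ]
}

low16 0xffffffff81a01a80        # prints 0x1a80
```

The RIP value 0x1a80 shown in the panic logs fits in 16 bits, which appears consistent with a truncated SYSENTER entry point, although that reading is an inference and not something confirmed in this thread. On a test machine, the current values could presumably be read with rdmsr from msr-tools (IA32_SYSENTER_ESP is MSR 0x175 and IA32_SYSENTER_EIP is MSR 0x176), assuming that tool is installed and the msr kernel module is loaded.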

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-14 Thread Kaveh Moe
Hi Marcelo,
This issue can also be seen on the latest Ubuntu 18.04 image. It is also seen 
in the 5.10-rc7 Linux kernel as well as 5.4. 

The issue can be reproduced 100% of the time using a simpler setup, as 
described below. A basic "hello world" C program compiled for 32-bit crashes 
the Azure Linux host if a standard VM is first started using qemu-kvm.
// Reproduction of the Azure host kernel panic with a 32-bit hello world. Includes 
the kernel panic traceback at the end. 
// Install a standard Ubuntu image on the host as shown below (if the 11/25/20 image is 
no longer available, the current one will do).
root@jnpronevm1:~# wget 
https://cloud-images.ubuntu.com/bionic/20201125/bionic-server-cloudimg-amd64.img
--2020-11-30 19:43:20--  
https://cloud-images.ubuntu.com/bionic/20201125/bionic-server-cloudimg-amd64.img
Resolving cloud-images.ubuntu.com (cloud-images.ubuntu.com)... 91.189.88.247, 
91.189.88.248, 2001:67c:1360:8001::33, ...
Connecting to cloud-images.ubuntu.com 
(cloud-images.ubuntu.com)|91.189.88.247|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 360775680 (344M) [application/octet-stream]
Saving to: ‘bionic-server-cloudimg-amd64.img.1’

bionic-server-cloudim 100%[==>] 344.06M  20.4MB/s
in 18s

2020-11-30 19:43:39 (19.2 MB/s) - ‘bionic-server-cloudimg-amd64.img.1’
saved [360775680/360775680]

// make sure qemu-kvm is installed and kvm acceleration can be used. 
root@jnpronevm1:~#
root@jnpronevm1:~# kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

root@jnpronevm1:~# cat hello.c
#include <stdio.h>
int main() 
{ 
    printf("Hello world\n"); 
    return 0;
}

// compile the above in 32-bit mode; a 64-bit build does not show the issue
root@jnpronevm1:~# gcc -m32 hello.c -o hello

// start the VM from the above downloaded image. 
root@jnpronevm1:~# /usr/bin/qemu-system-x86_64 -daemonize -name u1 \
    -machine pc-i440fx-bionic,accel=kvm -m 1024 -smp 2,sockets=1 \
    -uuid 772495e6-8658-4306-ba5f-f59c15f42f69 \
    -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -drive file=bionic-server-cloudimg-amd64.img,if=none,id=drive-virtio-disk0,format=raw,cache=writeback \
    -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
    -chardev socket,id=charserial0,host=127.0.0.1,port=8601,telnet,server,nowait,logfile=/var/log/console.log \
    -vnc 127.0.0.1:0 -vga cirrus
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]
qemu-system-x86_64: warning: host doesn't support requested feature: CPUID.80000001H:ECX.svm [bit 2]

root@jnpronevm1:~# ps aux | grep qemu
root  2966 45.6  0.1 1655976 41064 ?   Sl   20:02   0:02 
/usr/bin/qemu-system-x86_64 -daemonize -name u1 -machine 
pc-i440fx-bionic,accel=kvm -m 1024 -smp 2,sockets=1 -uuid 
772495e6-8658-4306-ba5f-f59c15f42f69 -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
file=bionic-server-cloudimg-amd64.img,if=none,id=drive-virtio-disk0,format=raw,cache=writeback
 -device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
 -chardev 
socket,id=charserial0,host=127.0.0.1,port=8601,telnet,server,nowait,logfile=/var/log/console.log
 -vnc 127.0.0.1:0 -vga cirrus
root  2979  0.0  0.0  14864  1064 pts/0    S+   20:02   0:00 grep qemu

root@jnpronevm1:~# ./hello
The host VM hung due to a kernel panic after the above step.

### the host vm hit kernel panic after running hello program.

[ 7858.522920] PANIC: double fault, error_code: 0x0
[ 7858.525679] Kernel panic - not syncing: Machine halted.
[ 7858.525679] CPU: 5 PID: 4746 Comm: hello Not tainted 4.15.18+test #1
[ 7858.525679] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
[ 7858.525679] Call Trace:
[ 7858.525679]  <#DF>
[ 7858.525679]  dump_stack+0x63/0x8b
[ 7858.525679]  panic+0xe4/0x244
[ 7858.525679]  df_debug+0x2d/0x30
[ 7858.525679]  do_double_fault+0x9a/0x130
[ 7858.525679]  double_fault+0x1e/0x30
[ 7858.525679] RIP: 0010:0x1a80
[ 7858.525679] RSP: 0018:e200 EFLAGS: 00010086
[ 7858.525679] RAX: 00c5 RBX: 0001 RCX: ffc98f2c
[ 7858.525679] RDX: 07d4 RSI: f7ef2d80 RDI: f7ef2000
[ 7858.525679] RBP: ffc98ed8 R08:  R09: 
[ 7858.525679] R10:  R11:  R12: 
[ 7858.525679] R13:  R14:  R15: 
[ 7858.525679]  
[ 7858.525679] Kernel Offset: 0xf00 from 0x8100 (relocation range: 0x8000-0xbfff)
[ 7858.525679] ---[ end Kernel panic - not syncing: Machine halted.
[ 7858.525679] WARNING: CPU: 5 PID: 4746 at kernel/sched/core.c:1192 set_task_cpu+0x162/0x170
[ 7858.525679] Modules linked in: xt_owner iptable_security 
nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype xt_CHECKSUM 
iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 
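
Since the reproduction above depends on the test program being a 32-bit build, it may help to confirm that before running it. The following is a minimal sketch under that assumption: elf_class is a hypothetical helper name, it reads only the class byte of the ELF header, and it does not validate the ELF magic.

```shell
#!/bin/sh
# Sketch: report whether a binary is a 32-bit or 64-bit ELF file.
# elf_class is a hypothetical helper name.
elf_class() {
    # Byte 4 of the ELF header (EI_CLASS) is 1 for 32-bit and 2 for 64-bit.
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown" ;;
    esac
}

# Check the test program only if it has been built in the current directory.
if [ -f ./hello ]; then
    elf_class ./hello
fi
```

A binary built with gcc -m32 as in the steps above would be expected to report 32-bit; a default 64-bit build, which reportedly does not show the issue, would report 64-bit.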

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-12-10 Thread Marcelo Cerri
Thanks for reporting the issue, Kaveh.

Do you have more information about the image you are using to create
this Azure host? By your last comment, it doesn't seem the host is
running one of the Ubuntu kernels.

[Bug 1904632] Re: Ubuntu 18.04 Azure VM host kernel panic

2020-11-17 Thread Kaveh Moe
Version info about the Azure Ubuntu Host VM:

root@vm-5:~# uname -a
Linux vm-5 4.15.18+test #1 SMP Mon Oct 29 03:40:39 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

root@vm-5:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.5 LTS
Release:        18.04
Codename:       bionic
root@vm-5:~#
