Re: Crash when writing to x86 hardware debug registers

2022-09-15 Thread Neil Sikka
Update: I rebuilt the hypervisor binary in debug mode and get the following
output in xl dmesg after the crash.

(XEN) HVM9 restore: CPU 0
(XEN) HVM9 restore: PIC 0
(XEN) HVM9 restore: PIC 1
(XEN) HVM9 restore: IOAPIC 0
(XEN) HVM9 restore: LAPIC 0
(XEN) HVM9 restore: LAPIC_REGS 0
(XEN) HVM9 restore: PCI_IRQ 0
(XEN) HVM9 restore: ISA_IRQ 0
(XEN) HVM9 restore: PCI_LINK 0
(XEN) HVM9 restore: PIT 0
(XEN) HVM9 restore: RTC 0
(XEN) HVM9 restore: HPET 0
(XEN) HVM9 restore: PMTIMER 0
(XEN) HVM9 restore: MTRR 0
(XEN) HVM9 restore: VIRIDIAN_DOMAIN 0
(XEN) HVM9 restore: CPU_XSAVE 0
(XEN) HVM9 restore: VIRIDIAN_VCPU 0
(XEN) HVM9 restore: VMCE_VCPU 0
(XEN) HVM9 restore: TSC_ADJUST 0
(XEN) HVM9 restore: CPU_MSR 0
(XEN) d9: VIRIDIAN MSR_TIME_REF_COUNT: accessed
(XEN) vmx.c:3295:d9v0 RDMSR 0x unimplemented
(XEN) d9v0 VIRIDIAN CRASH: 1e c096 f80575bc362c 0 0

On Thu, Sep 15, 2022 at 12:33 PM Neil Sikka  wrote:

> Hi All,
> I am running a userland debugger in Windows 10 HVM on Xen 4.16 on an Intel
> chip. I noticed when I set a hardware breakpoint (which writes to the DR0
> register), Windows 10 crashes. This crash reproduces both with and without
> viridian enabled in the DomU cfg file.
>
> (XEN) Xen version 4.16.1 (neil@) (gcc (Debian 10.2.1-6) 10.2.1 20210110)
> debug=n Tue Apr 19 11:20:04 EDT 2022
> (XEN) d13v0 VIRIDIAN CRASH: 1e c096 f8007f85562c 0 0
>
> This output from xl dmesg shows that I am not running a debug hypervisor,
> and that there's a viridian crash. I've gotten the following stop codes in
> the BSOD from Windows: KMODE_EXCEPTION_NOT_HANDLED and
> SYSTEM_SERVICE_EXCEPTION.
>
> I see this code in xen/xen/arch/x86/msr.c inside guest_wrmsr():
> case MSR_AMD64_DR0_ADDRESS_MASK:
> case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
>     if ( !cp->extd.dbext )
>         goto gp_fault;
>
> I assumed AMD64 here refers to a 64-bit CPU rather than an AMD CPU. This
> is one of the few references to DR0 that I found, and it deliberately
> raises a fault if dbext is not set. However, I'm told that dbext is
> unrelated, is set by default, and does not need to be set at hypervisor
> compile time.
>
> Any ideas why I'm getting this crash?
>
> Thanks in Advance,
> Neil
>
> --
> My Blog: http://www.neilscomputerblog.blogspot.com/
> Twitter: @neilsikka
>


-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Crash when writing to x86 hardware debug registers

2022-09-15 Thread Neil Sikka
Hi All,
I am running a userland debugger in Windows 10 HVM on Xen 4.16 on an Intel
chip. I noticed when I set a hardware breakpoint (which writes to the DR0
register), Windows 10 crashes. This crash reproduces both with and without
viridian enabled in the DomU cfg file.
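
For readers less familiar with hardware breakpoints: the address goes into one of DR0 to DR3, and the breakpoint only takes effect once the matching enable, type, and length bits are set in DR7. The sketch below computes a DR7 value from the architectural bit layout. The function and constant names are hypothetical and for illustration only; this is not code from Xen, Windows, or any debugger.

```python
# Sketch of the x86 DR7 control bits for one hardware breakpoint slot.
# Names are hypothetical; only the bit layout comes from the architecture.

RW_EXECUTE = 0b00    # break on instruction execution
RW_WRITE = 0b01      # break on data writes
RW_READWRITE = 0b11  # break on data reads or writes

# DR7 LENn field encoding, keyed by breakpoint length in bytes.
LEN_ENCODING = {1: 0b00, 2: 0b01, 8: 0b10, 4: 0b11}


def dr7_for_breakpoint(slot, rw, length):
    """Return the DR7 bits that locally enable breakpoint `slot` (0..3)."""
    if slot not in range(4):
        raise ValueError("slot must be 0..3 (DR0..DR3)")
    if length not in LEN_ENCODING:
        raise ValueError("length must be 1, 2, 4 or 8 bytes")
    if rw == RW_EXECUTE and length != 1:
        raise ValueError("execution breakpoints must use length 1")
    local_enable = 1 << (slot * 2)                       # L0..L3 bits
    rw_bits = rw << (16 + slot * 4)                      # R/W0..R/W3 fields
    len_bits = LEN_ENCODING[length] << (18 + slot * 4)   # LEN0..LEN3 fields
    return local_enable | rw_bits | len_bits


if __name__ == "__main__":
    # A 4-byte write watchpoint in slot 0 (address would go in DR0).
    print(hex(dr7_for_breakpoint(0, RW_WRITE, 4)))
```

So a single breakpoint normally involves writes to both the address register (DR0 here) and DR7.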

(XEN) Xen version 4.16.1 (neil@) (gcc (Debian 10.2.1-6) 10.2.1 20210110)
debug=n Tue Apr 19 11:20:04 EDT 2022
(XEN) d13v0 VIRIDIAN CRASH: 1e c096 f8007f85562c 0 0

This output from xl dmesg shows that I am not running a debug hypervisor,
and that there's a viridian crash. I've gotten the following stop codes in
the BSOD from Windows: KMODE_EXCEPTION_NOT_HANDLED and
SYSTEM_SERVICE_EXCEPTION.
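
As a rough aid for reading the log line above, here is a minimal sketch that splits a VIRIDIAN CRASH line into its five values and looks up the first one as a Windows bug check code. Treating the first value as the bug check code is my assumption (it matches the KMODE_EXCEPTION_NOT_HANDLED stop code, which is 0x1E); the function name and the small lookup table are hypothetical, and the other values are left uninterpreted exactly as they appear in the log.

```python
import re

# Only the two stop codes mentioned in this thread; not a complete table.
BUGCHECK_NAMES = {
    0x1E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
}

CRASH_RE = re.compile(r"d(\d+)v(\d+) VIRIDIAN CRASH: (.*)$")


def parse_viridian_crash(line):
    """Parse one 'VIRIDIAN CRASH' line from xl dmesg; return None if no match."""
    match = CRASH_RE.search(line.strip())
    if match is None:
        return None
    tokens = match.group(3).split()
    if len(tokens) != 5:
        return None
    params = [int(tok, 16) for tok in tokens]
    return {
        "domain": int(match.group(1)),
        "vcpu": int(match.group(2)),
        "params": params,
        # Assumption: the first value is the Windows bug check code.
        "bugcheck_name": BUGCHECK_NAMES.get(params[0], "unknown"),
    }


if __name__ == "__main__":
    line = "(XEN) d13v0 VIRIDIAN CRASH: 1e c096 f8007f85562c 0 0"
    print(parse_viridian_crash(line))
```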

I see this code in xen/xen/arch/x86/msr.c inside guest_wrmsr():
case MSR_AMD64_DR0_ADDRESS_MASK:
case MSR_AMD64_DR1_ADDRESS_MASK ... MSR_AMD64_DR3_ADDRESS_MASK:
    if ( !cp->extd.dbext )
        goto gp_fault;

I assumed AMD64 here refers to a 64-bit CPU rather than an AMD CPU. This
is one of the few references to DR0 that I found, and it deliberately
raises a fault if dbext is not set. However, I'm told that dbext is
unrelated, is set by default, and does not need to be set at hypervisor
compile time.

Any ideas why I'm getting this crash?

Thanks in Advance,
Neil

-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


OXenstored performance

2022-01-31 Thread Neil Sikka
Hello,
I see oxenstored running at 100% CPU on a busy Xen host. This article (
https://xenproject.org/2014/05/01/9124/) says that the maximum number of
VMs supported by oxenstored is 160. What is the technical reason for this
limit? Doesn't the Xen hypervisor support more than 160 concurrent DomUs?

-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Xen DomUs with empty state

2021-07-21 Thread Neil Sikka
Hello,
I am running Xen 4.13.0, hosting many DomUs started in a short amount of time,
but not all of them are accounted for by xentop:

183 domains: 2 running, 0 blocked, 92 paused, 0 crashed, 0 dying, 0 shutdown

Only 94 of the 183 VMs are accounted for, and the STATE column for many of the 
VMs shows "--". I have 56 physical CPUs. Why am I seeing this discrepancy 
and empty status columns?
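
To make the arithmetic explicit, here is a small sketch that parses the xentop summary line quoted above and reports how many domains fall outside every state counter. The function name is hypothetical; it only does string parsing and does not talk to Xen.

```python
import re

STATE_NAMES = ("running", "blocked", "paused", "crashed", "dying", "shutdown")


def parse_xentop_summary(line):
    """Return (total, per-state counts, unaccounted) for a xentop summary line."""
    total_match = re.match(r"\s*(\d+) domains:", line)
    if total_match is None:
        raise ValueError("not a xentop summary line")
    total = int(total_match.group(1))
    pattern = r"(\d+) (" + "|".join(STATE_NAMES) + r")"
    states = {name: int(count) for count, name in re.findall(pattern, line)}
    accounted = sum(states.values())
    return total, states, total - accounted


if __name__ == "__main__":
    summary = ("183 domains: 2 running, 0 blocked, 92 paused, "
               "0 crashed, 0 dying, 0 shutdown")
    total, states, unaccounted = parse_xentop_summary(summary)
    print(total, sum(states.values()), unaccounted)
```

For the line above this gives 94 domains in some state and 89 with no state at all.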

Thanks,
Neil

Re: Windows 10 Kernel Debugging on Xen

2021-06-22 Thread Neil Sikka
I figured it out. Microsoft did not document that testsigning needs to be
enabled for kdnet to work.

On Tue, Jun 22, 2021 at 2:12 PM Tamas K Lengyel 
wrote:

> Make sure windbg is already waiting for the connection from the
> debuggee by the time Windows starts booting. If you try to attach
> windbg later it won't work. It worked for me but obviously YMMV.
>
> Tamas
>
> On Tue, Jun 22, 2021 at 2:07 PM Neil Sikka  wrote:
> >
> > I tried that, but it seems like I'm getting an interrupt storm on the
> debugger VM (CPU spends all its time in the kernel) when I try to attach
> the debugger. This observation furthers my suspicion that there is
> communication, but there is something wrong with the protocol...
> >
> > On Tue, Jun 22, 2021 at 12:43 PM Tamas K Lengyel <
> tamas.k.leng...@gmail.com> wrote:
> >>
> >> I used Xen 4.15 and a pretty new version of Windows 10. It is a bit
> >> finicky, you have to run the debug commands on the debuggee and then
> >> reboot. When the VM is rebooting the domain ID changes so you have to
> >> start the serial bridge then. Windbg will attach afterwards. Just make
> >> sure both VMs have serial='pty' set in their config file.
> >>
> >> Tamas
> >>
> >> On Tue, Jun 22, 2021 at 12:33 PM Neil Sikka 
> wrote:
> >> >
> >> > Thanks for the quick response, Tamas. I tried what you said and
> windbg waits and the debuggee hangs when I click the break button in windbg,
> but I don't see any output in windbg. This means that there is SOME
> communication over the serial port that causes the debuggee to hang when I
> click break. Could it be a debugger protocol issue? I also tried the
> guidance here by running the crlf program:
> >> > https://www.qubes-os.org/doc/windows-debugging/
> >> > But windbg waits and the debuggee hangs in a similar manner.
> >> >
> >> > What versions of Windows and Xen are you using?
> >> >
> >> > On Tue, Jun 22, 2021 at 12:10 PM Tamas K Lengyel <
> tamas.k.leng...@gmail.com> wrote:
> >> >>
> >> >> I have managed to get windbg working with a serial bridge between two
> >> >> Win10 VMs using the following script:
> >> >>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/serial-bridge.sh
> .
> >> >> The debuggee has to enable a couple options so that windbg can attach:
> >> >>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/debug.cmd
> .
> >> >>
> >> >> Tamas
> >> >>
> >> >> On Tue, Jun 22, 2021 at 12:01 PM Neil Sikka 
> wrote:
> >> >> >
> >> >> > Hello,
> >> >> > Has anyone gotten a Windows 10 (Version 1709 or later) kernel
> debugger attached when running the Windows 10 debugger VM and the Windows 10
> debuggee VM on Xen 4.13.0 hypervisor? I am getting a "NIC hardware
> initialization failed" error. I tried the suggestions in the discussion
> here (https://bugzilla.redhat.com/show_bug.cgi?id=1947015):
> >> >> > -cpu
> Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,
> \
> >> >> >
> skip-l1dfl-vmentry=on,mpx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vendor-id=KVMKVMKVM
> >> >> >
> >> >> > Note: I had to remove the following 2 arguments due to errors from
> QEMU:
> >> >> > pschange-mc-no=on
> >> >> > hv_vpindex
> >> >> >
> >> >> > Here was the error:
> >> >> > C:\Users\user\Desktop\oldDebuggers\x64>kdnet.exe
> >> >> >
> >> >> > Network debugging is supported on the following NICs:
> >> >> > busparams=0.4.0, Intel(R) PRO/1000 MT Network Connection, Plugged
> in.
> >> >> > The Microsoft hypervisor running this VM does not support KDNET.
> >> >> > Please upgrade to the hypervisor shipped in Windows 8 or WS2012 or
> later.
> >> >> >
> >> >> > KDNET initialization failed.  Status = 0xC182.
> >> >> > NIC hardware initialization failed.
> >> >> >
> >> >> > I am using an Intel e1000 NIC emulated through QEMU because it's
> supposedly a supported NIC for Windows kernel NET debugging.
> >> >> >
> >> >> > Thanks in Advance!
> >> >> >
> >> >> > --
> >> >> > My Blog: http://www.neilscomputerblog.blogspot.com/
> >> >> > Twitter: @neilsikka
> >> >
> >> >
> >> >
> >> > --
> >> > My Blog: http://www.neilscomputerblog.blogspot.com/
> >> > Twitter: @neilsikka
> >
> >
> >
> > --
> > My Blog: http://www.neilscomputerblog.blogspot.com/
> > Twitter: @neilsikka
>


-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Re: Windows 10 Kernel Debugging on Xen

2021-06-22 Thread Neil Sikka
I tried that, but it seems like I'm getting an interrupt storm on the
debugger VM (CPU spends all its time in the kernel) when I try to attach
the debugger. This observation furthers my suspicion that there is
communication, but there is something wrong with the protocol...

On Tue, Jun 22, 2021 at 12:43 PM Tamas K Lengyel 
wrote:

> I used Xen 4.15 and a pretty new version of Windows 10. It is a bit
> finicky, you have to run the debug commands on the debuggee and then
> reboot. When the VM is rebooting the domain ID changes so you have to
> start the serial bridge then. Windbg will attach afterwards. Just make
> sure both VMs have serial='pty' set in their config file.
>
> Tamas
>
> On Tue, Jun 22, 2021 at 12:33 PM Neil Sikka  wrote:
> >
> > Thanks for the quick response, Tamas. I tried what you said and windbg
> waits and the debuggee hangs when I click the break button in windbg, but I
> don't see any output in windbg. This means that there is SOME communication
> over the serial port that causes the debuggee to hang when I click break.
> Could it be a debugger protocol issue? I also tried the guidance here by
> running the crlf program:
> > https://www.qubes-os.org/doc/windows-debugging/
> > But windbg waits and the debuggee hangs in a similar manner.
> >
> > What versions of Windows and Xen are you using?
> >
> > On Tue, Jun 22, 2021 at 12:10 PM Tamas K Lengyel <
> tamas.k.leng...@gmail.com> wrote:
> >>
> >> I have managed to get windbg working with a serial bridge between two
> >> Win10 VMs using the following script:
> >>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/serial-bridge.sh
> .
> >> The debuggee has to enable a couple options so that windbg can attach:
> >>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/debug.cmd
> .
> >>
> >> Tamas
> >>
> >> On Tue, Jun 22, 2021 at 12:01 PM Neil Sikka 
> wrote:
> >> >
> >> > Hello,
> >> > Has anyone gotten a Windows 10 (Version 1709 or later) kernel debugger
> attached when running the Windows 10 debugger VM and the Windows 10 debuggee
> VM on Xen 4.13.0 hypervisor? I am getting a "NIC hardware initialization
> failed" error. I tried the suggestions in the discussion here (
> https://bugzilla.redhat.com/show_bug.cgi?id=1947015):
> >> > -cpu
> Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,
> \
> >> >
> skip-l1dfl-vmentry=on,mpx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vendor-id=KVMKVMKVM
> >> >
> >> > Note: I had to remove the following 2 arguments due to errors from
> QEMU:
> >> > pschange-mc-no=on
> >> > hv_vpindex
> >> >
> >> > Here was the error:
> >> > C:\Users\user\Desktop\oldDebuggers\x64>kdnet.exe
> >> >
> >> > Network debugging is supported on the following NICs:
> >> > busparams=0.4.0, Intel(R) PRO/1000 MT Network Connection, Plugged in.
> >> > The Microsoft hypervisor running this VM does not support KDNET.
> >> > Please upgrade to the hypervisor shipped in Windows 8 or WS2012 or
> later.
> >> >
> >> > KDNET initialization failed.  Status = 0xC182.
> >> > NIC hardware initialization failed.
> >> >
> >> > I am using an Intel e1000 NIC emulated through QEMU because it's
> supposedly a supported NIC for Windows kernel NET debugging.
> >> >
> >> > Thanks in Advance!
> >> >
> >> > --
> >> > My Blog: http://www.neilscomputerblog.blogspot.com/
> >> > Twitter: @neilsikka
> >
> >
> >
> > --
> > My Blog: http://www.neilscomputerblog.blogspot.com/
> > Twitter: @neilsikka
>


-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Re: Windows 10 Kernel Debugging on Xen

2021-06-22 Thread Neil Sikka
Thanks for the quick response, Tamas. I tried what you said and windbg
waits and the debuggee hangs when I click the break button in windbg, but I
don't see any output in windbg. This means that there is SOME communication
over the serial port that causes the debuggee to hang when I click break.
Could it be a debugger protocol issue? I also tried the guidance here by
running the crlf program:
https://www.qubes-os.org/doc/windows-debugging/
But windbg waits and the debuggee hangs in a similar manner.

What versions of Windows and Xen are you using?

On Tue, Jun 22, 2021 at 12:10 PM Tamas K Lengyel 
wrote:

> I have managed to get windbg working with a serial bridge between two
> Win10 VMs using the following script:
>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/serial-bridge.sh
> .
> The debuggee has to enable a couple options so that windbg can attach:
>
> https://github.com/intel/kernel-fuzzer-for-xen-project/blob/master/scripts/debug.cmd
> .
>
> Tamas
>
> On Tue, Jun 22, 2021 at 12:01 PM Neil Sikka  wrote:
> >
> > Hello,
> > Has anyone gotten a Windows 10 (Version 1709 or later) kernel debugger
> attached when running the Windows 10 debugger VM and the Windows 10 debuggee
> VM on Xen 4.13.0 hypervisor? I am getting a "NIC hardware initialization
> failed" error. I tried the suggestions in the discussion here (
> https://bugzilla.redhat.com/show_bug.cgi?id=1947015):
> > -cpu
> Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,
> \
> >
> skip-l1dfl-vmentry=on,mpx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vendor-id=KVMKVMKVM
> >
> > Note: I had to remove the following 2 arguments due to errors from QEMU:
> > pschange-mc-no=on
> > hv_vpindex
> >
> > Here was the error:
> > C:\Users\user\Desktop\oldDebuggers\x64>kdnet.exe
> >
> > Network debugging is supported on the following NICs:
> > busparams=0.4.0, Intel(R) PRO/1000 MT Network Connection, Plugged in.
> > The Microsoft hypervisor running this VM does not support KDNET.
> > Please upgrade to the hypervisor shipped in Windows 8 or WS2012 or later.
> >
> > KDNET initialization failed.  Status = 0xC182.
> > NIC hardware initialization failed.
> >
> > I am using an Intel e1000 NIC emulated through QEMU because it's
> supposedly a supported NIC for Windows kernel NET debugging.
> >
> > Thanks in Advance!
> >
> > --
> > My Blog: http://www.neilscomputerblog.blogspot.com/
> > Twitter: @neilsikka
>


-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Windows 10 Kernel Debugging on Xen

2021-06-22 Thread Neil Sikka
Hello,
Has anyone gotten a Windows 10 (Version 1709 or later) kernel debugger
attached when running the Windows 10 debugger VM and the Windows 10 debuggee
VM on Xen 4.13.0 hypervisor? I am getting a "NIC hardware initialization
failed" error. I tried the suggestions in the discussion here (
https://bugzilla.redhat.com/show_bug.cgi?id=1947015):
-cpu
Skylake-Server-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,ibpb=on,amd-ssbd=on,
\
skip-l1dfl-vmentry=on,mpx=off,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vendor-id=KVMKVMKVM

Note: I had to remove the following 2 arguments due to errors from QEMU:
pschange-mc-no=on
hv_vpindex

Here was the error:
C:\Users\user\Desktop\oldDebuggers\x64>kdnet.exe

Network debugging is supported on the following NICs:
busparams=0.4.0, Intel(R) PRO/1000 MT Network Connection, Plugged in.
The Microsoft hypervisor running this VM does not support KDNET.
Please upgrade to the hypervisor shipped in Windows 8 or WS2012 or later.

KDNET initialization failed.  Status = 0xC182.
NIC hardware initialization failed.

I am using an Intel e1000 NIC emulated through QEMU because it's supposedly
a supported NIC for Windows kernel NET debugging.

Thanks in Advance!

-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka


Locking in xl

2020-04-23 Thread Neil Sikka
Hello,
I see that in the xl binary in Xen 4.13.0, the acquire_lock() and
release_lock() functions are only called from create_domain() in
xl_vmcontrol.c, so I assume the lock provides inter-PROCESS
synchronization in the case that multiple instances of xl are running
and creating multiple domains concurrently.

However, this lock causes a bottleneck in the case that an xl restore
process is restoring a DomU with a lot of memory. While the large
amount of memory is being copied from the checkpoint file on disk to
the physical machine's RAM, all other VM creation requests on the
system are starved, leading to a performance loss. When I removed the
lock, concurrent creation of 2 DomUs worked in my testing and I did not
see any issues.

Does anyone know what shared resource these locks are guarding? Maybe
we should be making the lock more granular.
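
To illustrate the kind of inter-process serialization described above, here is a minimal sketch using an exclusive advisory lock on a file. It is not xl's implementation: I have not confirmed which locking primitive or lock file path xl uses, so flock() on a temporary file named xl.lock is purely an assumption for the demo, and the helper names are hypothetical.

```python
import fcntl
import os
import tempfile


def try_acquire(path):
    """Try to take an exclusive lock without blocking; return the fd or None."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


def demo():
    """Show that a second holder is refused until the first releases."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "xl.lock")  # hypothetical lock file name
        first = try_acquire(path)            # stands in for a long restore
        second = try_acquire(path)           # a second creator is refused
        release(first)
        third = try_acquire(path)            # succeeds once the lock is free
        got_third = third is not None
        if third is not None:
            release(third)
        return first is not None, second is None, got_third


if __name__ == "__main__":
    print(demo())
```

With a blocking acquire instead of LOCK_NB, the second caller would simply wait for as long as the first holds the lock, which is the starvation pattern described above when one restore holds it for a long memory copy.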

-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka



xen-network-common.sh MAC address assignment

2020-04-08 Thread Neil Sikka
Hello,
Why does git commit f400f2993b52e820d0da24a2e49a8fdfab0d2827 make
xen-network-common.sh set a static MAC address for the virtual device
as shown below?

ip link set dev ${dev} address fe:ff:ff:ff:ff:ff || true

I see the first octet is 0xFE, whose least significant bit (the
multicast/group bit) is unset, signifying a unicast MAC address (as
stated in the comment right before this line). But shouldn't the device
have a random MAC address? Is there any reason why a hardcoded MAC is
assigned? FYI, after rewriting this script to use a randomized MAC, my VM
cannot communicate with my host, so this might have something to do with
my problem.
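
For reference, the sketch below checks the two flag bits in the first octet of a MAC address and generates a random address that is unicast and locally administered. The helper names are hypothetical; this only illustrates the bit layout and is not what xen-network-common.sh does.

```python
import random


def parse_mac(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into six bytes."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("a MAC address has six octets")
    return octets


def is_unicast(mac):
    """Bit 0 of the first octet is the multicast/group bit; 0 means unicast."""
    return parse_mac(mac)[0] & 0x01 == 0


def is_locally_administered(mac):
    """Bit 1 of the first octet marks a locally administered address."""
    return parse_mac(mac)[0] & 0x02 != 0


def random_unicast_mac(rng=random):
    """Random MAC with the group bit cleared and the local bit set."""
    octets = [rng.randrange(256) for _ in range(6)]
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    print(is_unicast("fe:ff:ff:ff:ff:ff"))
    print(is_locally_administered("fe:ff:ff:ff:ff:ff"))
    print(random_unicast_mac())
```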

Thanks.

-- 
My Blog: http://www.neilscomputerblog.blogspot.com/
Twitter: @neilsikka