Domain Birth Time

2024-05-10 Thread James Dingwall
Hi,

We've added a feature to Xen 4.15 such that `xl uptime -b` reports the birth
time of the domain (i.e. a value preserved across migrations).  If this would
be of wider interest I can try porting this to a more recent release and
submitting it for review.

Regards,
James
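The bookkeeping behind a migration-stable birth time can be sketched in a few lines: record an absolute start timestamp once, keep it with the domain's record rather than with the host, and derive uptime from it. This toy Python model is an illustration only — the key name and store layout are invented, not taken from the patch described above:

```python
import time

class DomainClock:
    """Toy model of a migration-stable birth time: the absolute start
    timestamp is recorded once when the domain is first created and
    travels with the domain's record, so rebuilding the domain on
    another host (a migration) does not reset it."""

    def __init__(self, store):
        # Only record the birth time on first creation, never on resume.
        store.setdefault("birth-time", time.time())
        self.store = store

    def uptime(self, now=None):
        now = time.time() if now is None else now
        return now - self.store["birth-time"]

record = {"birth-time": 100.0}       # stands in for a per-domain store node
clock = DomainClock(record)
migrated = DomainClock(record)       # "migration": same record, new host
assert clock.uptime(now=160.0) == migrated.uptime(now=160.0) == 60.0
```

This contrasts with a per-boot counter, which would restart from zero every time the domain is rebuilt on a new host.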



Re: XSA-446 relevance on Intel

2024-01-05 Thread James Dingwall
On Tue, Dec 12, 2023 at 10:56:48AM +0000, Andrew Cooper wrote:
> On 12/12/2023 9:43 am, James Dingwall wrote:
> > Hi,
> >
> > We were experiencing a crash during PV domU boot on several different models
> > of hardware but all with Intel CPUs.  The Xen version was based on 
> > stable-4.15
> > at 4a4daf6bddbe8a741329df5cc8768f7dec664aed (XSA-444) with some local
> > patches.  Since updating the branch to 
> > b918c4cdc7ab2c1c9e9a9b54fa9d9c595913e028
> > (XSA-446) we have not observed the same crash.
> 
> That range covers:
> 
> 1f5f515da0f6 - iommu/amd-vi: use correct level for quarantine domain
> page tables
> b918c4cdc7ab - x86/spec-ctrl: Remove conditional IRQs-on-ness for INT
> $0x80/0x82 paths
> 
> so yeah - not much in the way of change.
> 
> > The occurrence was on 1-2% of boots and we couldn't determine a particular
> > sequence of events that would trigger it.  The kernel is based on Ubuntu's
> > 5.15.0-91 tag but we also observed the same with -85.  Due to the low
> > frequency it is possible that we simply haven't observed it again since
> > updating our Xen build.
> >
> > If I have followed the early startup this is happening shortly after 
> > detection
> > of possible CPU vulnerabilities and patching in alternative instructions.  
> > As
> > the RIP was native_irq_return_iret and XSA-446 related to interrupt
> > management
> > I wondered if it was possible that despite "Xen is not believed to be 
> > vulnerable
> > in default configurations on CPUs from other hardware vendors." there could
> > be some conditions in which an Intel CPU is affected?
> 
> In short, XSA-446 isn't plausibly related.  It's completely internal to
> Xen, with no alteration on guest state.
> 
> It is an error that Linux has ended up in native_irq_return_iret.  Linux
> cannot return to itself with an IRET instruction, and must use
> HYPERCALL_iret instead.
> 
> In recent versions of Linux, this is fixed up as about the earliest
> action a PV kernel takes, but on older versions of Linux, any
> interrupt/exception early enough on boot was fatal in this way.
> 
> 
> This part of the backtrace is odd:
> 
> [    0.398962]  ? native_iret+0x7/0x7
> [    0.398967]  ? insn_decode+0x79/0x100
> [    0.398975]  ? insn_decode+0xcf/0x100
> [    0.398980]  optimize_nops+0x68/0x150
> 
> as it's not clear how we've ended up in a case wanting to return back to
> the kernel to begin with.  However, it's most likely a pagefault, as
> optimize_nops() is making changes in arbitrary locations.
> 
> It is possible that a change in visible features has altered the
> behaviour enough not to crash, but if everything is still the same as
> far as you can tell, then it's likely just chance that you haven't seen
> it again.
> 
> This is definitely a Linux bug, so I suspect something bad has been
> backported into Ubuntu.
> 
> ~Andrew

Thanks for the response.  I had a look at the more recent kernels and managed
to backport "x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()"
without too much trouble.  It may still be a coincidence that we haven't
encountered the problem but it seems to have gone away for now. 

Regards,
James



XSA-446 relevance on Intel

2023-12-12 Thread James Dingwall
Hi,

We were experiencing a crash during PV domU boot on several different models
of hardware but all with Intel CPUs.  The Xen version was based on stable-4.15
at 4a4daf6bddbe8a741329df5cc8768f7dec664aed (XSA-444) with some local
patches.  Since updating the branch to b918c4cdc7ab2c1c9e9a9b54fa9d9c595913e028
(XSA-446) we have not observed the same crash.

The occurrence was on 1-2% of boots and we couldn't determine a particular
sequence of events that would trigger it.  The kernel is based on Ubuntu's
5.15.0-91 tag but we also observed the same with -85.  Due to the low
frequency it is possible that we simply haven't observed it again since
updating our Xen build.

If I have followed the early startup correctly, this is happening shortly after
detection of possible CPU vulnerabilities and the patching in of alternative
instructions.  As the RIP was native_irq_return_iret and XSA-446 related to
interrupt management, I wondered if it was possible that, despite "Xen is not
believed to be vulnerable in default configurations on CPUs from other hardware
vendors.", there could be some conditions in which an Intel CPU is affected.

Thanks,
James

[0.374957] GDS: Unknown: Dependent on hypervisor status
[0.375007] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point 
registers'
[0.375016] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.375022] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.375027] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
[0.375033] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
[0.375038] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
[0.375047] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.375053] x86/fpu: xstate_offset[5]: 1088, xstate_sizes[5]:   64
[0.375059] x86/fpu: xstate_offset[6]: 1152, xstate_sizes[6]:  512
[0.375064] x86/fpu: xstate_offset[7]: 1664, xstate_sizes[7]: 1024
[0.375070] x86/fpu: Enabled xstate features 0xe7, context size is 2688 
bytes, using 'standard' format.
[0.398765] segment-related general protection fault: e030 [#1] SMP NOPTI
[0.398784] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-91-generic 
#101~20.04.1
[0.398792] RIP: e030:native_irq_return_iret+0x0/0x2
[0.398806] Code: 5b 41 5b 41 5a 41 59 41 58 58 59 5a 5e 5f 48 83 c4 08 eb 
0f 0f 1f 00 90 66 66 2e 0f 1f 84 00 00 00 00 00 f6 44 24 20 04 75 02 <48> cf 57 
0f 01 f8 eb 12 0f 20 df 90 90 90 90 90 48 81 e7 ff e7 ff
[0.398818] RSP: e02b:82e03bd8 EFLAGS: 00010046
[0.398825] RAX:  RBX: 82e03c30 RCX: 
[0.398831] RDX: 000f RSI: 81e011f4 RDI: 82e03ca0
[0.398836] RBP: 82e03c10 R08: 81e011ef R09: 0005
[0.398842] R10: 0006 R11: e8ae0feb75ccff49 R12: 81e011ef
[0.398848] R13: 0006 R14: 81e011f4 R15: 0005
[0.398860] FS:  () GS:88802dc0() 
knlGS:
[0.398866] CS:  1e030 DS:  ES:  CR0: 80050033
[0.398872] CR2:  CR3: 02e1 CR4: 00050660
[0.398880] Call Trace:
[0.398883]  
[0.398887]  ? show_trace_log_lvl+0x1d6/0x2ea
[0.398896]  ? show_trace_log_lvl+0x1d6/0x2ea
[0.398902]  ? optimize_nops+0x68/0x150
[0.398909]  ? show_regs.part.0+0x23/0x29
[0.398914]  ? __die_body.cold+0x8/0xd
[0.398919]  ? die_addr+0x3e/0x60
[0.398925]  ? exc_general_protection+0x1c1/0x350
[0.398933]  ? asm_exc_general_protection+0x27/0x30
[0.398939]  ? restore_regs_and_return_to_kernel+0x20/0x2c
[0.398945]  ? restore_regs_and_return_to_kernel+0x1b/0x2c
[0.398950]  ? restore_regs_and_return_to_kernel+0x1b/0x2c
[0.398956]  ? restore_regs_and_return_to_kernel+0x20/0x2c
[0.398962]  ? native_iret+0x7/0x7
[0.398967]  ? insn_decode+0x79/0x100
[0.398975]  ? insn_decode+0xcf/0x100
[0.398980]  optimize_nops+0x68/0x150
[0.398986]  apply_alternatives+0x181/0x3a0
[0.398991]  ? restore_regs_and_return_to_kernel+0x1b/0x2c
[0.398996]  ? fb_is_primary_device+0x25/0x73
[0.399003]  ? restore_regs_and_return_to_kernel+0x1b/0x2c
[0.399009]  ? apply_alternatives+0x8/0x3a0
[0.399014]  ? fb_is_primary_device+0x6e/0x73
[0.399019]  ? apply_returns+0xfc/0x180
[0.399024]  ? fb_is_primary_device+0x6e/0x73
[0.399029]  ? sanitize_boot_params.constprop.0+0xa/0xef
[0.399035]  ? fb_is_primary_device+0x73/0x73
[0.399040]  
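Andrew's explanation above — a PV kernel runs deprivileged and cannot return to itself with a native IRET, it must use HYPERCALL_iret — can be modelled abstractly. The names below are illustrative, not Xen's or Linux's actual API:

```python
class GPFault(Exception):
    """Stands in for the fault a deprivileged PV kernel takes on IRET."""

def return_to_kernel(path, pv_guest):
    # Toy model: under Xen PV the kernel is deprivileged and cannot
    # execute a native IRET back to itself; it has to route the return
    # through the hypervisor (HYPERCALL_iret) instead.  Bare metal can
    # use the native path directly.
    if path == "native_iret" and pv_guest:
        raise GPFault("IRET from deprivileged PV kernel")
    return "returned"

assert return_to_kernel("hypercall_iret", pv_guest=True) == "returned"
assert return_to_kernel("native_iret", pv_guest=False) == "returned"
```

The crash in the backtrace above corresponds to the faulting combination: a PV guest reaching the native IRET path before the early-boot fixup has replaced it.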

Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2

2023-11-29 Thread James Dingwall
On Mon, Nov 20, 2023 at 10:24:05AM +0100, Roger Pau Monné wrote:
> On Mon, Nov 20, 2023 at 08:27:36AM +0000, James Dingwall wrote:
> > On Fri, Nov 17, 2023 at 10:56:30AM +0100, Jan Beulich wrote:
> > > On 17.11.2023 10:18, James Dingwall wrote:
> > > >> On Thu, Nov 16, 2023 at 04:32:47PM +0000, Andrew Cooper wrote:
> > > >> On 16/11/2023 4:15 pm, James Dingwall wrote:
> > > >>> Hi,
> > > >>>
> > > >>> Per the msr_relaxed documentation:
> > > >>>
> > > >>>"If using this option is necessary to fix an issue, please report 
> > > >>> a bug."
> > > >>>
> > > >>> After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 
> > > >>> we
> > > >>> started experiencing a BSOD at boot with one of our Windows guests.  
> > > >>> We found
> > > >>> that enabling `msr_relaxed = 1` in the guest configuration has 
> > > >>> resolved the
> > > >>> problem.  With a debug build of Xen and `hvm_debug=2048` on the 
> > > >>> command line
> > > >>> the following messages were caught as the BSOD happened:
> > > >>>
> > > >>> (XEN) [HVM:11.0]  ecx=0x1a2
> > > >>> (XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented
> > > >>> (XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0
> > > >>>
> > > >>> I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this 
> > > >>> patch
> > > >>> series from last month:
> > > >>>
> > > >>> https://patchwork.kernel.org/project/xen-devel/list/?series=796550
> > > >>>
> > > >>> Picking out just a small part of that fixes the problem for us. 
> > > >>> Although the patch is against 4.15.5 I think it would be relevant to more 
> > > >>> recent
> > > >>> releases too.
> > > >>
> > > >> Which version of Windows, and what hardware?
> > > >>
> > > >> The Viridian Crash isn't about the RDMSR itself - it's presumably
> > > >> collateral damage shortly thereafter.
> > > >>
> > > >> Does filling in 0 for that MSR also resolve the issue?  It's model
> > > >> specific and we absolutely cannot pass it through from real hardware
> > > >> like that.
> > > >>
> > > > 
> > > > Hi Andrew,
> > > > 
> > > > Thanks for your response.  The guest is running Windows 10 and the crash
> > > > happens in a proprietary hardware driver.  A little bit of knowledge as
> > > > they say was enough to stop the crash but I don't understand the impact
> > > > of what I've actually done...
> > > > 
> > > > To rework the patch I'd need a bit of guidance, if I understand your
> > > > suggestion I set the MSR to 0 with this change in emul-priv-op.c:
> > > 
> > > For the purpose of the experiment suggested by Andrew ...
> > > 
> > > > diff --git a/xen/arch/x86/pv/emul-priv-op.c 
> > > > b/xen/arch/x86/pv/emul-priv-op.c
> > > > index ed97b1d6fcc..66f5e417df6 100644
> > > > --- a/xen/arch/x86/pv/emul-priv-op.c
> > > > +++ b/xen/arch/x86/pv/emul-priv-op.c
> > > > @@ -976,6 +976,10 @@ static int read_msr(unsigned int reg, uint64_t 
> > > > *val,
> > > >  *val = 0;
> > > >  return X86EMUL_OKAY;
> > > >  
> > > > +case MSR_TEMPERATURE_TARGET:
> > > > +*val = 0;
> > > > +return X86EMUL_OKAY;
> > > > +
> > > >  case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
> > > >  case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
> > > >  case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:
> > > 
> > > ... you wouldn't need this (affects PV domains only), and ...
> > > 
> > > > and this in vmx.c:
> > > > 
> > > > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > > > index 54023a92587..bbf37b7f272 100644
> > > > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > > > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > > > @@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int 
> > > > msr, uint64_t *msr_content)
> > > >  if ( !nvmx_msr_read_intercept(msr, msr_content) )
> > > >  goto gp_fault;
> > > >  break;
> > > > +
> > > > +case MSR_TEMPERATURE_TARGET:
> > > > +*msr_content = 0;
> > > > +break;
> 
> I think the preference now is to add such handling directly in
> guest_rdmsr()?  Protected with a:
> 
> if ( !(cp->x86_vendor & (X86_VENDOR_INTEL)) )
> goto gp_fault;
> 

It is possible we can patch the driver which is triggering the BSOD but it
seems unlikely we'd be able to roll that out in advance of doing the Xen
upgrade for dom0.  If the problem we are encountering is specific to our
situation rather than a general case issue then we can easily carry a patch
for that.

Thanks for the help,
James
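Roger's suggestion above — handle the MSR in `guest_rdmsr()` behind a vendor check — has a shape that can be sketched abstractly. This Python toy model is not Xen code; the constants and the strict-by-default policy are stand-ins for illustration:

```python
X86_VENDOR_INTEL = 1
X86_VENDOR_AMD = 2
MSR_TEMPERATURE_TARGET = 0x1A2

class GPFault(Exception):
    pass

def guest_rdmsr(msr, vendor):
    """Toy model of the suggested handling: expose a zeroed
    MSR_TEMPERATURE_TARGET to Intel guests only, and inject #GP for
    any other vendor or any unhandled MSR, as a strict (non-relaxed)
    policy would."""
    if msr == MSR_TEMPERATURE_TARGET:
        if not (vendor & X86_VENDOR_INTEL):
            raise GPFault(hex(msr))
        return 0
    raise GPFault(hex(msr))

assert guest_rdmsr(MSR_TEMPERATURE_TARGET, X86_VENDOR_INTEL) == 0
```

The vendor gate matters because the MSR is Intel-specific: leaking it to guests of other vendors would misrepresent the virtual CPU.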



Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2

2023-11-20 Thread James Dingwall
On Fri, Nov 17, 2023 at 11:17:46AM +0100, Roger Pau Monné wrote:
> On Fri, Nov 17, 2023 at 09:18:39AM +0000, James Dingwall wrote:
> > > On Thu, Nov 16, 2023 at 04:32:47PM +0000, Andrew Cooper wrote:
> > > On 16/11/2023 4:15 pm, James Dingwall wrote:
> > > > Hi,
> > > >
> > > > Per the msr_relaxed documentation:
> > > >
> > > >"If using this option is necessary to fix an issue, please report a 
> > > > bug."
> > > >
> > > > After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 we
> > > > started experiencing a BSOD at boot with one of our Windows guests.  We 
> > > > found
> > > > that enabling `msr_relaxed = 1` in the guest configuration has resolved 
> > > > the
> > > > problem.  With a debug build of Xen and `hvm_debug=2048` on the command 
> > > > line
> > > > the following messages were caught as the BSOD happened:
> > > >
> > > > (XEN) [HVM:11.0]  ecx=0x1a2
> > > > (XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented
> > > > (XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0
> > > >
> > > > I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this 
> > > > patch
> > > > series from last month:
> > > >
> > > > https://patchwork.kernel.org/project/xen-devel/list/?series=796550
> > > >
> > > > Picking out just a small part of that fixes the problem for us. 
> > > > Although the patch is against 4.15.5 I think it would be relevant to more recent
> > > > releases too.
> > > 
> > > Which version of Windows, and what hardware?
> > > 
> > > The Viridian Crash isn't about the RDMSR itself - it's presumably
> > > collateral damage shortly thereafter.
> > > 
> > > Does filling in 0 for that MSR also resolve the issue?  It's model
> > > specific and we absolutely cannot pass it through from real hardware
> > > like that.
> > > 
> > 
> > Hi Andrew,
> > 
> > Thanks for your response.  The guest is running Windows 10 and the crash
> > happens in a proprietary hardware driver.
> 
> When you say proprietary you mean a custom driver made for your
> use-case, or is this some vendor driver widely available?
> 

Hi Roger,

We have emulated some point-of-sale hardware with a custom qemu device.  The
hardware is reasonably common but limited to its particular sector.  As the
physical hardware is all built to the same specification, I assume the driver
has made assumptions about the availability of MSR_TEMPERATURE_TARGET and
doesn't handle the case where it is absent, which leads to the BSOD in the
Windows guest.

Regards,
James
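The failure mode described above — a driver written for fixed hardware that reads an MSR unconditionally instead of probing for it — can be mimicked in a toy model. Everything here is invented for illustration; it is not the actual driver or Xen's fault-injection path:

```python
class BlueScreen(Exception):
    """Stands in for the guest crash when a driver's MSR read faults."""

def rdmsr(msr, implemented):
    # Hypervisor side: reading an unimplemented MSR injects #GP, which
    # this driver does not guard against (no rdmsr_safe-style probe).
    if msr not in implemented:
        raise BlueScreen("unhandled #GP on RDMSR %#x" % msr)
    return implemented[msr]

def driver_init(implemented):
    # The driver was written for fixed physical hardware, so it reads
    # MSR 0x1a2 unconditionally rather than checking it exists first.
    tcc = rdmsr(0x1A2, implemented)
    return "driver loaded, raw MSR value %#x" % tcc

# With the MSR emulated (even as zero), initialisation succeeds.
assert driver_init({0x1A2: 0}).startswith("driver loaded")
```

This is why either `msr_relaxed = 1` or making the hypervisor return a value for 0x1a2 resolves the BSOD: both stop the read from faulting.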



Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2

2023-11-20 Thread James Dingwall
On Fri, Nov 17, 2023 at 10:56:30AM +0100, Jan Beulich wrote:
> On 17.11.2023 10:18, James Dingwall wrote:
> >> On Thu, Nov 16, 2023 at 04:32:47PM +0000, Andrew Cooper wrote:
> >> On 16/11/2023 4:15 pm, James Dingwall wrote:
> >>> Hi,
> >>>
> >>> Per the msr_relaxed documentation:
> >>>
> >>>"If using this option is necessary to fix an issue, please report a 
> >>> bug."
> >>>
> >>> After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 we
> >>> started experiencing a BSOD at boot with one of our Windows guests.  We 
> >>> found
> >>> that enabling `msr_relaxed = 1` in the guest configuration has resolved 
> >>> the
> >>> problem.  With a debug build of Xen and `hvm_debug=2048` on the command 
> >>> line
> >>> the following messages were caught as the BSOD happened:
> >>>
> >>> (XEN) [HVM:11.0]  ecx=0x1a2
> >>> (XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented
> >>> (XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0
> >>>
> >>> I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this patch
> >>> series from last month:
> >>>
> >>> https://patchwork.kernel.org/project/xen-devel/list/?series=796550
> >>>
> >>> Picking out just a small part of that fixes the problem for us. Although 
> > >>> the patch is against 4.15.5 I think it would be relevant to more recent
> >>> releases too.
> >>
> >> Which version of Windows, and what hardware?
> >>
> >> The Viridian Crash isn't about the RDMSR itself - it's presumably
> >> collateral damage shortly thereafter.
> >>
> >> Does filling in 0 for that MSR also resolve the issue?  It's model
> >> specific and we absolutely cannot pass it through from real hardware
> >> like that.
> >>
> > 
> > Hi Andrew,
> > 
> > Thanks for your response.  The guest is running Windows 10 and the crash
> > happens in a proprietary hardware driver.  A little bit of knowledge as
> > they say was enough to stop the crash but I don't understand the impact
> > of what I've actually done...
> > 
> > To rework the patch I'd need a bit of guidance, if I understand your
> > suggestion I set the MSR to 0 with this change in emul-priv-op.c:
> 
> For the purpose of the experiment suggested by Andrew ...
> 
> > diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
> > index ed97b1d6fcc..66f5e417df6 100644
> > --- a/xen/arch/x86/pv/emul-priv-op.c
> > +++ b/xen/arch/x86/pv/emul-priv-op.c
> > @@ -976,6 +976,10 @@ static int read_msr(unsigned int reg, uint64_t *val,
> >  *val = 0;
> >  return X86EMUL_OKAY;
> >  
> > +case MSR_TEMPERATURE_TARGET:
> > +*val = 0;
> > +return X86EMUL_OKAY;
> > +
> >  case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
> >  case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
> >  case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:
> 
> ... you wouldn't need this (affects PV domains only), and ...
> 
> > and this in vmx.c:
> > 
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index 54023a92587..bbf37b7f272 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int msr, 
> > uint64_t *msr_content)
> >  if ( !nvmx_msr_read_intercept(msr, msr_content) )
> >  goto gp_fault;
> >  break;
> > +
> > +case MSR_TEMPERATURE_TARGET:
> > +*msr_content = 0;
> > +break;
> > +
> >  case MSR_IA32_MISC_ENABLE:
> >  rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
> >  /* Debug Trace Store is not supported. */
> 
> ... indeed this ought to do. An eventual real patch may want to look
> different, though.
> 

Thanks Jan, based on that information I've reduced the patch to what seems the
minimal necessary to work around the BSOD.  I assume simply not ending up at
X86EMUL_EXCEPTION is the resolution regardless of what value is set.

Regards,
James

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 54023a92587..bbf37b7f272 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 if ( !nvmx_msr_read_intercept(msr, msr_content) )
 goto gp_fault;
 break;
+
+case MSR_TEMPERATURE_TARGET:
+*msr_content = 0;
+break;
+
 case MSR_IA32_MISC_ENABLE:
 rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
 /* Debug Trace Store is not supported. */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 8b3ad575dbc..34e800fdc01 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -498,6 +498,9 @@
 #define MSR_IA32_MISC_ENABLE_XD_DISABLE	(1ULL << 34)
 
 #define MSR_IA32_TSC_DEADLINE		0x06E0
+
+#define MSR_TEMPERATURE_TARGET		0x01a2
+
 #define MSR_IA32_ENERGY_PERF_BIAS	0x01b0
 
 /* Platform Shared Resource MSRs */
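For context on what the guest actually reads back: per Intel's SDM (an assumption added here, not something the patch above relies on), bits 23:16 of IA32_TEMPERATURE_TARGET hold the TCC activation temperature in degrees Celsius. A quick sketch of decoding that field:

```python
MSR_TEMPERATURE_TARGET = 0x1A2

def tcc_activation_temp(raw):
    # Bits 23:16 of IA32_TEMPERATURE_TARGET are the TCC activation
    # temperature in degrees C (per Intel's SDM -- an assumption for
    # this illustration).
    return (raw >> 16) & 0xFF

# The workaround hands the guest a zeroed MSR, so a reader decodes a
# harmless 0 C; the point is only that the RDMSR no longer faults.
assert tcc_activation_temp(0) == 0
assert tcc_activation_temp(0x00640000) == 100
```

This supports James's conclusion: for this driver, any non-faulting value appears to be enough.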


Re: xen 4.15.5: msr_relaxed required for MSR 0x1a2

2023-11-17 Thread James Dingwall
On Thu, Nov 16, 2023 at 04:32:47PM +0000, Andrew Cooper wrote:
> On 16/11/2023 4:15 pm, James Dingwall wrote:
> > Hi,
> >
> > Per the msr_relaxed documentation:
> >
> >"If using this option is necessary to fix an issue, please report a bug."
> >
> > After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 we
> > started experiencing a BSOD at boot with one of our Windows guests.  We 
> > found
> > that enabling `msr_relaxed = 1` in the guest configuration has resolved the
> > problem.  With a debug build of Xen and `hvm_debug=2048` on the command line
> > the following messages were caught as the BSOD happened:
> >
> > (XEN) [HVM:11.0]  ecx=0x1a2
> > (XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented
> > (XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0
> >
> > I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this patch
> > series from last month:
> >
> > https://patchwork.kernel.org/project/xen-devel/list/?series=796550
> >
> > Picking out just a small part of that fixes the problem for us. Although
> > the patch is against 4.15.5 I think it would be relevant to more recent
> > releases too.
> 
> Which version of Windows, and what hardware?
> 
> The Viridian Crash isn't about the RDMSR itself - it's presumably
> collateral damage shortly thereafter.
> 
> Does filling in 0 for that MSR also resolve the issue?  It's model
> specific and we absolutely cannot pass it through from real hardware
> like that.
> 

Hi Andrew,

Thanks for your response.  The guest is running Windows 10 and the crash
happens in a proprietary hardware driver.  A little bit of knowledge as
they say was enough to stop the crash but I don't understand the impact
of what I've actually done...

To rework the patch I'd need a bit of guidance, if I understand your
suggestion I set the MSR to 0 with this change in emul-priv-op.c:

diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index ed97b1d6fcc..66f5e417df6 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -976,6 +976,10 @@ static int read_msr(unsigned int reg, uint64_t *val,
 *val = 0;
 return X86EMUL_OKAY;
 
+case MSR_TEMPERATURE_TARGET:
+*val = 0;
+return X86EMUL_OKAY;
+
 case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
 case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
 case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:

and this in vmx.c:

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 54023a92587..bbf37b7f272 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3259,6 +3259,11 @@ static int vmx_msr_read_intercept(unsigned int msr, 
uint64_t *msr_content)
 if ( !nvmx_msr_read_intercept(msr, msr_content) )
 goto gp_fault;
 break;
+
+case MSR_TEMPERATURE_TARGET:
+*msr_content = 0;
+break;
+
 case MSR_IA32_MISC_ENABLE:
 rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
 /* Debug Trace Store is not supported. */


Thanks,
James



xen 4.15.5: msr_relaxed required for MSR 0x1a2

2023-11-16 Thread James Dingwall
Hi,

Per the msr_relaxed documentation:

   "If using this option is necessary to fix an issue, please report a bug."

After recently upgrading an environment from Xen 4.14.5 to Xen 4.15.5 we
started experiencing a BSOD at boot with one of our Windows guests.  We found
that enabling `msr_relaxed = 1` in the guest configuration has resolved the
problem.  With a debug build of Xen and `hvm_debug=2048` on the command line
the following messages were caught as the BSOD happened:

(XEN) [HVM:11.0]  ecx=0x1a2
(XEN) vmx.c:3298:d11v0 RDMSR 0x01a2 unimplemented
(XEN) d11v0 VIRIDIAN CRASH: 1e c096 f80b8de81eb5 0 0

I found that MSR 0x1a2 is MSR_TEMPERATURE_TARGET and from that this patch
series from last month:

https://patchwork.kernel.org/project/xen-devel/list/?series=796550

Picking out just a small part of that fixes the problem for us. Although
the patch is against 4.15.5 I think it would be relevant to more recent
releases too.

Thanks,
James
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 54023a92587..3f64471c8a8 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3259,6 +3259,14 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 if ( !nvmx_msr_read_intercept(msr, msr_content) )
 goto gp_fault;
 break;
+
+case MSR_TEMPERATURE_TARGET:
+if ( !rdmsr_safe(msr, *msr_content) )
+break;
+/* RO for guests, MSR_PLATFORM_INFO bits set accordingly in msr.c to indicate lack of write
+ * support. */
+goto gp_fault;
+
 case MSR_IA32_MISC_ENABLE:
 rdmsrl(MSR_IA32_MISC_ENABLE, *msr_content);
 /* Debug Trace Store is not supported. */
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index ed97b1d6fcc..eb9eb45e820 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -976,6 +976,9 @@ static int read_msr(unsigned int reg, uint64_t *val,
 *val = 0;
 return X86EMUL_OKAY;
 
+case MSR_TEMPERATURE_TARGET:
+goto normal;
+
 case MSR_P6_PERFCTR(0) ... MSR_P6_PERFCTR(7):
 case MSR_P6_EVNTSEL(0) ... MSR_P6_EVNTSEL(3):
 case MSR_CORE_PERF_FIXED_CTR0 ... MSR_CORE_PERF_FIXED_CTR2:
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 8b3ad575dbc..34e800fdc01 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -498,6 +498,9 @@
 #define MSR_IA32_MISC_ENABLE_XD_DISABLE	(1ULL << 34)
 
 #define MSR_IA32_TSC_DEADLINE		0x06E0
+
+#define MSR_TEMPERATURE_TARGET		0x01a2
+
 #define MSR_IA32_ENERGY_PERF_BIAS	0x01b0
 
 /* Platform Shared Resource MSRs */


Re: live migration fails: qemu placing pci devices at different locations

2023-11-01 Thread James Dingwall
On Tue, Oct 31, 2023 at 10:07:29AM +, James Dingwall wrote:
> Hi,
> 
> I'm having a bit of trouble performing live migration between hvm guests.  The
> sending side is xen 4.14.5 (qemu 5.0), receiving 4.15.5 (qemu 5.1).  The error
> message recorded in qemu-dm---incoming.log:
> 
> qemu-system-i386: Unknown savevm section or instance ':00:04.0/vga' 0. 
> Make sure that your current VM setup matches your saved VM setup, including 
> any hotplugged devices
> 
> I have patched libxl_dm.c to explicitly assign `addr=xx` values for various
> devices and when these are correct the domain migrates correctly.  However
> the configuration differences between guests mean that the values are not
> consistent.  The domain config file doesn't allow the PCI address to be
> expressed in the configuration for, e.g., `soundhw="DEVICE"`.
> 
> e.g. 
> 
> diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
> index 6e531863ac0..daa7c49846f 100644
> --- a/tools/libs/light/libxl_dm.c
> +++ b/tools/libs/light/libxl_dm.c
> @@ -1441,7 +1441,7 @@ static int libxl__build_device_model_args_new(libxl__gc 
> *gc,
>  flexarray_append(dm_args, "-spice");
>  flexarray_append(dm_args, spiceoptions);
>  if (libxl_defbool_val(b_info->u.hvm.spice.vdagent)) {
> -flexarray_vappend(dm_args, "-device", "virtio-serial",
> +flexarray_vappend(dm_args, "-device", 
> "virtio-serial,addr=04",
>  "-chardev", "spicevmc,id=vdagent,name=vdagent", 
> "-device",
>  "virtserialport,chardev=vdagent,name=com.redhat.spice.0",
>  NULL);
> 
> The order of devices on the qemu command line (below) appears to be the same
> so my assumption is that the internals of qemu have resulted in things being
> connected in a different order.  The output of a Windows `lspci` tool is
> also included.
> 
> Could anyone make any additional suggestions on how I could try to gain
> consistency between the different qemu versions?

After a bit more head scratching we worked out the cause and a solution for
our case.  In xen 4.15.4, commit d65ebacb78901b695bc5e8a075ad1ad865a78928 was
introduced to stop using the deprecated qemu `-soundhw` option.  The qemu
device initialisation code looks like:

...
soundhw_init(); // handles old -soundhw option
...
/* init generic devices */
rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
qemu_opts_foreach(qemu_find_opts("device"),
  device_init_func, NULL, &error_fatal);
...

So with the old -soundhw option the sound card was processed before any -device
options and assigned the next available slot on the bus, and then any further
-device entries were added according to the command line order.  After
that xen change the sound card was added as a -device and depending on the
other emulated hardware would be added at a different point to the equivalent
-soundhw option.  By re-ordering the qemu command line building in libxl_dm.c
we can make the sound card be the first -device which resolves the migration
problem.

I think this would also have been a problem for live migration between 4.15.3
and 4.15.4 for a vm with a sound card and not just the major version jump we
are doing.

James
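The slot-shuffling described above can be reproduced with a toy model of qemu's auto-placement: each device without an explicit `addr=` takes the lowest free slot on the root bus, so changing where the sound card appears in initialisation order shifts every later auto-assigned address. The reserved slots and device names below are stand-ins chosen to mirror the lspci listing earlier in the thread (00:00 host bridge, 00:01 PIIX3, 00:02 Xen platform device); this is not qemu's actual allocator:

```python
def assign_slots(devices, reserved=(0, 1, 2)):
    """Toy model of auto-placement on the root PCI bus: each device
    without an explicit addr= takes the lowest free slot.  Slots 0-2
    stand in for the host bridge, PIIX3 functions and the Xen
    platform device."""
    used = set(reserved)
    placement = {}
    for dev in devices:
        slot = 0
        while slot in used:
            slot += 1
        used.add(slot)
        placement[dev] = slot
    return placement

# -soundhw: the sound card is initialised before any -device option,
# matching the 4.14.5 lspci output (audio 00:03, virtio 00:04, VGA 00:05).
old = assign_slots(["hda", "virtio-serial", "vga"])
# After the change, hda arrives as a late -device instead.
new = assign_slots(["virtio-serial", "vga", "hda"])
assert old == {"hda": 3, "virtio-serial": 4, "vga": 5}
assert new["vga"] != old["vga"]   # auto addresses shifted -> migration fails
```

Reordering the command line so the sound card is the first -device, as described above, restores the old placement without needing explicit `addr=` values everywhere.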



live migration fails: qemu placing pci devices at different locations

2023-10-31 Thread James Dingwall
Hi,

I'm having a bit of trouble performing live migration between hvm guests.  The
sending side is xen 4.14.5 (qemu 5.0), receiving 4.15.5 (qemu 5.1).  The error
message recorded in qemu-dm---incoming.log:

qemu-system-i386: Unknown savevm section or instance ':00:04.0/vga' 0. Make 
sure that your current VM setup matches your saved VM setup, including any 
hotplugged devices

I have patched libxl_dm.c to explicitly assign `addr=xx` values for various
devices and when these are correct the domain migrates correctly.  However
the configuration differences between guests mean that the values are not
consistent.  The domain config file doesn't allow the PCI address to be
expressed in the configuration for, e.g., `soundhw="DEVICE"`.

e.g. 

diff --git a/tools/libs/light/libxl_dm.c b/tools/libs/light/libxl_dm.c
index 6e531863ac0..daa7c49846f 100644
--- a/tools/libs/light/libxl_dm.c
+++ b/tools/libs/light/libxl_dm.c
@@ -1441,7 +1441,7 @@ static int libxl__build_device_model_args_new(libxl__gc 
*gc,
 flexarray_append(dm_args, "-spice");
 flexarray_append(dm_args, spiceoptions);
 if (libxl_defbool_val(b_info->u.hvm.spice.vdagent)) {
-flexarray_vappend(dm_args, "-device", "virtio-serial",
+flexarray_vappend(dm_args, "-device", "virtio-serial,addr=04",
 "-chardev", "spicevmc,id=vdagent,name=vdagent", "-device",
 "virtserialport,chardev=vdagent,name=com.redhat.spice.0",
 NULL);

The order of devices on the qemu command line (below) appears to be the same
so my assumption is that the internals of qemu have resulted in things being
connected in a different order.  The output of a Windows `lspci` tool is
also included.

Could anyone make any additional suggestions on how I could try to gain
consistency between the different qemu versions?

Thanks,
James


xen 4.14.5

/usr/lib/xen/bin/qemu-system-i386 -xen-domid 19 -no-shutdown
  -chardev socket,id=libxl-cmd,fd=19,server,nowait -S 
  -mon chardev=libxl-cmd,mode=control
  -chardev 
socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-19,server,nowait
  -mon chardev=libxenstat-cmd,mode=control
  -nodefaults -no-user-config -name  -vnc 0.0.0.0:93 -display none
  -k en-us
  -spice 
port=35993,tls-port=0,addr=127.0.0.1,disable-ticketing,agent-mouse=on,disable-copy-paste,image-compression=auto_glz
 
  -device virtio-serial -chardev spicevmc,id=vdagent,name=vdagent
  -device virtserialport,chardev=vdagent,name=com.redhat.spice.0
  -device VGA,vgamem_mb=16
  -boot order=cn
  -usb -usbdevice tablet
  -soundhw hda
  -smp 2,maxcpus=2
  -device rtl8139,id=nic0,netdev=net0,mac=00:16:3e:64:c8:68
  -netdev type=tap,id=net0,ifname=vif19.0-emu,script=no,downscript=no
  -object 
tls-creds-x509,id=tls0,endpoint=client,dir=/etc/certificates/usbredir,verify-peer=yes
  -chardev 
socket,id=charredir_serial0,host=127.0.0.1,port=48052,reconnect=2,nodelay,keepalive=on,user-timeout=5
  -device isa-serial,chardev=charredir_serial0
  -chardev 
socket,id=charredir_serial1,host=127.0.0.1,port=48054,reconnect=2,nodelay,keepalive=on,user-timeout=5
  -device isa-serial,chardev=charredir_serial1
  -chardev 
socket,id=charredir_serial2,host=127.0.0.1,port=48055,reconnect=2,nodelay,keepalive=on,user-timeout=5
  -device pci-serial,chardev=charredir_serial2
  -trace events=/etc/xen/qemu-trace-options -machine xenfv -m 2032
  -drive file=/dev/drbd1002,if=ide,index=0,media=disk,format=raw,cache=writeback
  -drive file=/dev/drbd1003,if=ide,index=1,media=disk,format=raw,cache=writeback
  -runas 131091:131072

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] 
(rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
00:03.0 Audio device: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) 
High Definition Audio Controller (rev 01)
00:04.0 Communication controller: Red Hat, Inc Virtio console
00:05.0 VGA compatible controller: Device 1234: (rev 02)
00:07.0 Serial controller: Red Hat, Inc. QEMU PCI 16550A Adapter (rev 01)



xen 4.15.5

/usr/lib/xen/bin/qemu-system-i386 -xen-domid 15 -no-shutdown
  -chardev socket,id=libxl-cmd,fd=19,server=on,wait=off -S
  -mon chardev=libxl-cmd,mode=control
  -chardev 
socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-15,server=on,wait=off
  -mon chardev=libxenstat-cmd,mode=control
  -nodefaults -no-user-config -name  -vnc 0.0.0.0:93 -display none
  -k en-us
  -spice 
port=35993,tls-port=0,addr=127.0.0.1,disable-ticketing=on,agent-mouse=on,disable-copy-paste=on,image-compression=auto_glz
  -device virtio-serial -chardev spicevmc,id=vdagent,name=vdagent
  -device 

Re: [PATCH] fix invalid frontend path for set_mtu

2022-04-27 Thread James Dingwall

On 2022-04-27 10:17, Anthony PERARD wrote:

On Tue, Apr 19, 2022 at 01:04:18PM +0100, James Dingwall wrote:
Thank you for your feedback.  I've updated the patch as suggested.  I've also
incorporated two other changes: one is a simple style change for consistency,
the other is to change the test for a valid mtu from > 0 to >= 68.  I can
resubmit the original patch if either of these is a problem.


The style change is fine, but I'd rather have the change to the
mtu check in a different patch.

Otherwise, the patch looks better, thanks.


Here is a revised version of the patch that removes the mtu change.

Thanks,
James

commit f6ec92717522e74b4cc3aa4160b8ad6884e0b50c
Author: James Dingwall 
Date:   Tue Apr 19 12:45:31 2022 +0100

The set_mtu() function of xen-network-common.sh currently has this code:

if [ ${type_if} = vif ]
then
local dev_=${dev#vif}
local domid=${dev_%.*}
local devid=${dev_#*.}

local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"

xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
fi

This works fine if the device has its default name but if the xen config
defines the vifname parameter the FRONTEND_PATH is incorrectly constructed.
Learn the frontend path by reading the appropriate value from the backend.

Also change use of `...` to $(...) for a consistent style in the script.
    
Signed-off-by: James Dingwall 

diff --git a/tools/hotplug/Linux/xen-network-common.sh b/tools/hotplug/Linux/xen-network-common.sh
index 42fa704e8d..7a63308a9e 100644
--- a/tools/hotplug/Linux/xen-network-common.sh
+++ b/tools/hotplug/Linux/xen-network-common.sh
@@ -171,7 +171,7 @@ set_mtu () {
 local mtu=$(xenstore_read_default "$XENBUS_PATH/mtu" "")
 if [ -z "$mtu" ]
 then
-mtu="`ip link show dev ${bridge}| awk '/mtu/ { print $5 }'`"
+mtu="$(ip link show dev ${bridge}| awk '/mtu/ { print $5 }')"
 if [ -n "$mtu" ]
 then
 log debug "$bridge MTU is $mtu"
@@ -184,11 +184,7 @@ set_mtu () {
 
 if [ ${type_if} = vif ]
 then
-local dev_=${dev#vif}
-local domid=${dev_%.*}
-local devid=${dev_#*.}
-
-local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"
+local FRONTEND_PATH="$(xenstore_read "$XENBUS_PATH/frontend")"
 
 xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
 fi


Re: [PATCH] fix invalid frontend path for set_mtu

2022-04-19 Thread James Dingwall
Hi Anthony,

On Tue, Apr 12, 2022 at 02:03:17PM +0100, Anthony PERARD wrote:
> Hi James,
> 
> On Tue, Mar 01, 2022 at 09:35:13AM +0000, James Dingwall wrote:
> > The set_mtu() function of xen-network-common.sh currently has this code:
> > 
> > if [ ${type_if} = vif ]
> > then
> > local dev_=${dev#vif}
> > local domid=${dev_%.*}
> > local devid=${dev_#*.}
> > 
> > local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"
> > 
> > xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
> > fi
> > 
> > This works fine if the device has its default name but if the xen config
> > defines the vifname parameter the FRONTEND_PATH is incorrectly constructed.
> > Learn the frontend path by reading the appropriate value from the backend.
> 
> The patch looks fine, thanks. It is only missing a line
> "Signed-off-by: your_name " at the end of the description.
> The meaning of this line is described in the file CONTRIBUTING, section
> "Developer's Certificate of Origin".
> 

Thank you for your feedback.  I've updated the patch as suggested.  I've also
incorporated two other changes: one is a simple style change for consistency,
the other is to change the test for a valid mtu from > 0 to >= 68.  I can
resubmit the original patch if either of these is a problem.

Thanks,
James
commit 03ad5670f8a7402e30b288a55d088e87685cd1a1
Author: James Dingwall 
Date:   Tue Apr 19 12:45:31 2022 +0100

The set_mtu() function of xen-network-common.sh currently has this code:

if [ ${type_if} = vif ]
then
local dev_=${dev#vif}
local domid=${dev_%.*}
local devid=${dev_#*.}

local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"

xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
fi

This works fine if the device has its default name but if the xen config
defines the vifname parameter the FRONTEND_PATH is incorrectly constructed.
Learn the frontend path by reading the appropriate value from the backend.

Also change use of `...` to $(...) for a consistent style in the script
and adjust the valid check from `mtu > 0` to `mtu >= 68` per RFC 791.

Signed-off-by: James Dingwall 

diff --git a/tools/hotplug/Linux/xen-network-common.sh b/tools/hotplug/Linux/xen-network-common.sh
index 42fa704e8d..9a382c39f4 100644
--- a/tools/hotplug/Linux/xen-network-common.sh
+++ b/tools/hotplug/Linux/xen-network-common.sh
@@ -171,24 +171,20 @@ set_mtu () {
 local mtu=$(xenstore_read_default "$XENBUS_PATH/mtu" "")
 if [ -z "$mtu" ]
 then
-mtu="`ip link show dev ${bridge}| awk '/mtu/ { print $5 }'`"
+mtu="$(ip link show dev ${bridge}| awk '/mtu/ { print $5 }')"
 if [ -n "$mtu" ]
 then
 log debug "$bridge MTU is $mtu"
 fi
 fi
-if [ -n "$mtu" ] && [ "$mtu" -gt 0 ]
+if [ -n "$mtu" ] && [ "$mtu" -ge 68 ]
 then
 log debug "setting $dev MTU to $mtu"
 ip link set dev ${dev} mtu ${mtu} || :
 
 if [ ${type_if} = vif ]
 then
-local dev_=${dev#vif}
-local domid=${dev_%.*}
-local devid=${dev_#*.}
-
-local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"
+local FRONTEND_PATH="$(xenstore_read "$XENBUS_PATH/frontend")"
 
 xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
 fi
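The revised validity check can be exercised on its own.  RFC 791 requires hosts to accept datagrams of at least 68 octets, so anything smaller is not a usable MTU; a sketch of the check the patch introduces:

```shell
# Sketch of the >= 68 MTU check: non-empty and at least the RFC 791
# minimum datagram size a host must accept.
mtu_valid() {
    [ -n "$1" ] && [ "$1" -ge 68 ]
}
mtu_valid 1500 && echo valid
mtu_valid 67 || echo "rejected: 67"
mtu_valid 0 || echo "rejected: 0"
```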


[PATCH] fix invalid frontend path for set_mtu

2022-03-01 Thread James Dingwall
Hi,

The set_mtu() function of xen-network-common.sh currently has this code:

if [ ${type_if} = vif ]
then
local dev_=${dev#vif}
local domid=${dev_%.*}
local devid=${dev_#*.}

local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"

xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
fi

This works fine if the device has its default name but if the xen config
defines the vifname parameter the FRONTEND_PATH is incorrectly constructed.
Learn the frontend path by reading the appropriate value from the backend.

diff --git a/tools/hotplug/Linux/xen-network-common.sh 
b/tools/hotplug/Linux/xen-network-common.sh
index 02e2388600..cd98f0d486 100644
--- a/tools/hotplug/Linux/xen-network-common.sh
+++ b/tools/hotplug/Linux/xen-network-common.sh
@@ -163,11 +163,7 @@ set_mtu () {
 
 if [ ${type_if} = vif ]
 then
-local dev_=${dev#vif}
-local domid=${dev_%.*}
-local devid=${dev_#*.}
-
-local FRONTEND_PATH="/local/domain/$domid/device/vif/$devid"
+local FRONTEND_PATH=$(xenstore_read "$XENBUS_PATH/frontend")
 
 xenstore_write "$FRONTEND_PATH/mtu" ${mtu}
 fi
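The breakage the patch fixes can be reproduced in plain shell: the old code assumed device names of the form vif<domid>.<devid>, so a custom vifname (here "dom0.0", as a hypothetical example) defeats the prefix stripping and yields a bogus domid:

```shell
# Demonstrates why the removed parsing fails for custom vifnames:
# ${dev#vif} only strips a literal "vif" prefix, which a vifname set in
# the domain config need not have.
parse_old() {
    local dev=$1
    local dev_=${dev#vif}
    echo "domid=${dev_%.*} devid=${dev_#*.}"
}
parse_old vif3.0     # default name: parses correctly -> domid=3 devid=0
parse_old dom0.0     # custom vifname: "domid" is the garbage string "dom0"
```

Reading the frontend path from the backend's own `frontend` key, as the patch does, sidesteps name parsing entirely.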



Thanks,
James



Re: [RFC] kernel: xenfs parameter to hide deprecated files

2022-03-01 Thread James Dingwall
Hi Juergen,

On Fri, Feb 25, 2022 at 03:09:05PM +0100, Juergen Gross wrote:
> On 23.02.22 19:08, James Dingwall wrote:
> > Hi,
> > 
> > I have been investigating a very intermittent issue we have with xenstore
> > access hanging.  Typically it seems to happen when all domains are stopped
> > prior to a system reboot.  xenstore is running in a stubdom and using the
> > hypervisor debug keys indicates the domain is still there.
> 
> Could it be dom0 shutdown handling is unloading some modules which are
> needed for Xenstore communication? E.g. xen-evtchn?
> 
> > 
> > I have come across some old list threads which suggested access via
> > /proc/xen/xenbus could cause problems but it seems patches went in to the
> > kernel for 4.10.  However to eliminate this entirely as a possibility
> > I came up with this kernel patch to hide deprecated entries in xenfs.
> 
> I don't see how this patch could help.
> 
> libxenstore is using /dev/xen/xenbus if it is available. So the only
> case where your patch would avoid accessing /proc/xen/xenbus would be
> if /dev/xen/xenbus isn't there. But this wouldn't make Xenstore more
> reactive, I guess. ;-)
> 
> > I found this old thread for a similar change where the entries were made
> > conditional on kernel config options instead of a module parameter but
> > this was never merged.
> > 
> > https://lkml.org/lkml/2015/11/30/761
> > 
> > If this would be a useful feature I would welcome feedback.
> 
> I'm not sure how helpful it is to let the user specify a boot parameter
> for hiding the files. It will probably not get used a lot.

Thank you for taking the time to look this over.  I did suspect it might
not be relevant for most people.  I'll keep it in our build for now to
see if we improve our xenstore stability.

Thank you also for your suggestions about why we might be having a xenstore
problem.  Next time we encounter that I'll check the status of the loaded
modules.

Regards,
James



[RFC] kernel: xenfs parameter to hide deprecated files

2022-02-23 Thread James Dingwall
Hi,

I have been investigating a very intermittent issue we have with xenstore
access hanging.  Typically it seems to happen when all domains are stopped
prior to a system reboot.  xenstore is running in a stubdom and using the
hypervisor debug keys indicates the domain is still there.

I have come across some old list threads which suggested access via
/proc/xen/xenbus could cause problems but it seems patches went in to the
kernel for 4.10.  However to eliminate this entirely as a possibility
I came up with this kernel patch to hide deprecated entries in xenfs.

I found this old thread for a similar change where the entries were made
conditional on kernel config options instead of a module parameter but
this was never merged.

https://lkml.org/lkml/2015/11/30/761

If this would be a useful feature I would welcome feedback.

Thanks,
James
diff --git a/drivers/xen/xenfs/super.c b/drivers/xen/xenfs/super.c
index d7d64235010d..d02c451f6a4d 100644
--- a/drivers/xen/xenfs/super.c
+++ b/drivers/xen/xenfs/super.c
@@ -3,6 +3,11 @@
  *  xenfs.c - a filesystem for passing info between the a domain and
  *  the hypervisor.
  *
+ * 2022-02-12  James Dingwall   Introduce hide_deprecated module parameter to
+ *  mask:
+ *  - xenbus (deprecated in xen 4.6.0)
+ *  - privcmd (deprecated in xen 4.7.0)
+ *
  * 2008-10-07  Alex ZefferttReplaced /proc/xen/xenbus with xenfs filesystem
  *  and /proc/xen compatibility mount point.
  *  Turned xenfs into a loadable module.
@@ -28,6 +33,13 @@
 MODULE_DESCRIPTION("Xen filesystem");
 MODULE_LICENSE("GPL");
 
+static bool __read_mostly hide_deprecated = false;
+module_param(hide_deprecated, bool, 0444);
+MODULE_PARM_DESC(hide_deprecated,
+	"Allow deprecated files to be hidden in xenfs.\n"
+	"  0 - (default) show deprecated xenfs files.\n"
+	"  1 - hide deprecated xenfs files [xenbus, privcmd].\n");
+
 static ssize_t capabilities_read(struct file *file, char __user *buf,
  size_t size, loff_t *off)
 {
@@ -69,8 +81,32 @@ static int xenfs_fill_super(struct super_block *sb, struct fs_context *fc)
 			xen_initial_domain() ? xenfs_init_files : xenfs_files);
 }
 
+static int xenfs_fill_super_hide_deprecated(struct super_block *sb, struct fs_context *fc)
+{
+	static const struct tree_descr xenfs_files[] = {
+		[2] = { "capabilities", &capabilities_file_ops, S_IRUGO },
+		{""},
+	};
+
+	static const struct tree_descr xenfs_init_files[] = {
+		[2] = { "capabilities", &capabilities_file_ops, S_IRUGO },
+		{ "xsd_kva", &xsd_kva_file_ops, S_IRUSR|S_IWUSR},
+		{ "xsd_port", &xsd_port_file_ops, S_IRUSR|S_IWUSR},
+#ifdef CONFIG_XEN_SYMS
+		{ "xensyms", &xensyms_ops, S_IRUSR},
+#endif
+		{""},
+	};
+
+	return simple_fill_super(sb, XENFS_SUPER_MAGIC,
+			xen_initial_domain() ? xenfs_init_files : xenfs_files);
+}
+
 static int xenfs_get_tree(struct fs_context *fc)
 {
+	if (hide_deprecated)
+		return get_tree_single(fc, xenfs_fill_super_hide_deprecated);
+
 	return get_tree_single(fc, xenfs_fill_super);
 }
 


tools: propogate MTU to vif frontends (backporting)

2022-02-14 Thread James Dingwall
Hi,

I've been backporting this series to xen 4.14 and everything relating to the
backend seems to be working well.  For the frontend I can see the mtu value
published to xenstore but it doesn't appear to be consumed to set the matching
mtu in the guest.

https://lists.xenproject.org/archives/html/xen-devel/2020-08/msg00458.html

Is the expected solution a custom script running in the guest to make the
necessary change or have I missed something in how this is supposed to
operate?
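If a guest-side script is indeed the expected consumer, it might look something like the sketch below.  The `xenstore_read` function is a stand-in for the real `xenstore-read` tool so the logic can run standalone, and the published `mtu` key under the frontend device path is assumed from the backported series; on a real guest the echo would be an actual `ip link set`:

```shell
# Sketch of a guest-side consumer for a published mtu key (assumption:
# the backend writes device/vif/<devid>/mtu as in the series).  The stub
# below pretends the backend published mtu=9000 for vif 0.
xenstore_read() {
    case "$1" in device/vif/0/mtu) echo 9000 ;; *) return 1 ;; esac
}
apply_vif_mtu() {
    local devid=$1 mtu
    mtu=$(xenstore_read "device/vif/$devid/mtu") || return 1
    [ "$mtu" -ge 68 ] || return 1
    echo "ip link set dev eth$devid mtu $mtu"   # would execute for real
}
apply_vif_mtu 0
```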

Thanks,
James



Re: possible kernel/libxl race with xl network-attach

2022-01-24 Thread James Dingwall
On Mon, Jan 24, 2022 at 10:07:54AM +0100, Roger Pau Monné wrote:
> On Fri, Jan 21, 2022 at 03:05:07PM +0000, James Dingwall wrote:
> > On Fri, Jan 21, 2022 at 03:00:29PM +0100, Roger Pau Monné wrote:
> > > On Fri, Jan 21, 2022 at 01:34:54PM +0000, James Dingwall wrote:
> > > > On 2022-01-13 16:11, Roger Pau Monné wrote:
> > > > > On Thu, Jan 13, 2022 at 11:19:46AM +, James Dingwall wrote:
> > > > > > 
> > > > > > I have been trying to debug a problem where a vif with the backend
> > > > > > in a
> > > > > > driver domain is added to dom0.  Intermittently the hotplug script 
> > > > > > is
> > > > > > not invoked by libxl (running as xl devd) in the driver domain.  By
> > > > > > enabling some debug for the driver domain kernel and libxl I have
> > > > > > these
> > > > > > messages:
> > > > > > 
> > > > > > driver domain kernel (Ubuntu 5.4.0-92-generic):
> > > > > > 
> > > > > > [Thu Jan 13 01:39:31 2022] [1408] 564: vif vif-0-0 vif0.0:
> > > > > > Successfully created xenvif
> > > > > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > > > > /local/domain/0/device/vif/0 -> Initialising
> > > > > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > > > > xen_netback:backend_switch_state: backend/vif/0/0 -> InitWait
> > > > > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > > > > /local/domain/0/device/vif/0 -> Connected
> > > > > > [Thu Jan 13 01:39:31 2022] vif vif-0-0 vif0.0: Guest Rx ready
> > > > > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > > > > xen_netback:backend_switch_state: backend/vif/0/0 -> Connected
> > > > > > 
> > > > > > xl devd (Xen 4.14.3):
> > > > > > 
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > > > > wpath=/local/domain/2/backend token=3/0: event
> > > > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac569700:
> > > > > > nested ao, parent 0x5633ac567f90
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > > > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:1055:devstate_callback: backend
> > > > > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > > > > state 4
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > > > > wpath=/local/domain/2/backend token=3/0: event
> > > > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac56a220:
> > > > > > nested ao, parent 0x5633ac567f90
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > > > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > > > libxl_event.c:1055:devstate_callback: backend
> > > > > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > > > > state 4
> > > > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > > > libxl_aoutils.c:88:xswait_timeout_callback: backend
> > > > > > /local/domain/2/backend/vif/0/0/state (hoping for state change to
> > > > > > 2): xswait timeout (path=/local/domain/2/backend/vif/0/0/state)
> > > > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > > > libxl_event.c:850:libxl__ev_xswatch_deregister: watch
> > > > > > w=0x5633ac569180 wpath=/local/domain/

Re: possible kernel/libxl race with xl network-attach

2022-01-21 Thread James Dingwall
On Fri, Jan 21, 2022 at 03:00:29PM +0100, Roger Pau Monné wrote:
> On Fri, Jan 21, 2022 at 01:34:54PM +0000, James Dingwall wrote:
> > On 2022-01-13 16:11, Roger Pau Monné wrote:
> > > On Thu, Jan 13, 2022 at 11:19:46AM +0000, James Dingwall wrote:
> > > > 
> > > > I have been trying to debug a problem where a vif with the backend
> > > > in a
> > > > driver domain is added to dom0.  Intermittently the hotplug script is
> > > > not invoked by libxl (running as xl devd) in the driver domain.  By
> > > > enabling some debug for the driver domain kernel and libxl I have
> > > > these
> > > > messages:
> > > > 
> > > > driver domain kernel (Ubuntu 5.4.0-92-generic):
> > > > 
> > > > [Thu Jan 13 01:39:31 2022] [1408] 564: vif vif-0-0 vif0.0:
> > > > Successfully created xenvif
> > > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > > /local/domain/0/device/vif/0 -> Initialising
> > > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > > xen_netback:backend_switch_state: backend/vif/0/0 -> InitWait
> > > > [Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed:
> > > > /local/domain/0/device/vif/0 -> Connected
> > > > [Thu Jan 13 01:39:31 2022] vif vif-0-0 vif0.0: Guest Rx ready
> > > > [Thu Jan 13 01:39:31 2022] [26] 470:
> > > > xen_netback:backend_switch_state: backend/vif/0/0 -> Connected
> > > > 
> > > > xl devd (Xen 4.14.3):
> > > > 
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > > wpath=/local/domain/2/backend token=3/0: event
> > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac569700:
> > > > nested ao, parent 0x5633ac567f90
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:1055:devstate_callback: backend
> > > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > > state 4
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528
> > > > wpath=/local/domain/2/backend token=3/0: event
> > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac56a220:
> > > > nested ao, parent 0x5633ac567f90
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180
> > > > wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event
> > > > epath=/local/domain/2/backend/vif/0/0/state
> > > > 2022-01-13 01:39:31 UTC libxl: debug:
> > > > libxl_event.c:1055:devstate_callback: backend
> > > > /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting
> > > > state 4
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_aoutils.c:88:xswait_timeout_callback: backend
> > > > /local/domain/2/backend/vif/0/0/state (hoping for state change to
> > > > 2): xswait timeout (path=/local/domain/2/backend/vif/0/0/state)
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_event.c:850:libxl__ev_xswatch_deregister: watch
> > > > w=0x5633ac569180 wpath=/local/domain/2/backend/vif/0/0/state
> > > > token=2/1: deregister slotnum=2
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_event.c:1039:devstate_callback: backend
> > > > /local/domain/2/backend/vif/0/0/state wanted state 2  timed out
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_event.c:864:libxl__ev_xswatch_deregister: watch
> > > > w=0x5633ac569180: deregister unregistered
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_device.c:1092:device_backend_callback: calling
> > > > device_backend_cleanup
> > > > 2022-01-13 01:39:51 UTC libxl: debug:
> > > > libxl_event.c:864:libxl__ev_xswatch_deregister: watch
> > > > w=0x5633

Re: possible kernel/libxl race with xl network-attach

2022-01-21 Thread James Dingwall

On 2022-01-13 16:11, Roger Pau Monné wrote:

On Thu, Jan 13, 2022 at 11:19:46AM +, James Dingwall wrote:


I have been trying to debug a problem where a vif with the backend in a
driver domain is added to dom0.  Intermittently the hotplug script is
not invoked by libxl (running as xl devd) in the driver domain.  By
enabling some debug for the driver domain kernel and libxl I have these
messages:

driver domain kernel (Ubuntu 5.4.0-92-generic):

[Thu Jan 13 01:39:31 2022] [1408] 564: vif vif-0-0 vif0.0: 
Successfully created xenvif
[Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed: 
/local/domain/0/device/vif/0 -> Initialising
[Thu Jan 13 01:39:31 2022] [26] 470: xen_netback:backend_switch_state: 
backend/vif/0/0 -> InitWait
[Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed: 
/local/domain/0/device/vif/0 -> Connected

[Thu Jan 13 01:39:31 2022] vif vif-0-0 vif0.0: Guest Rx ready
[Thu Jan 13 01:39:31 2022] [26] 470: xen_netback:backend_switch_state: 
backend/vif/0/0 -> Connected


xl devd (Xen 4.14.3):

2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528 
wpath=/local/domain/2/backend token=3/0: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac569700: nested 
ao, parent 0x5633ac567f90
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180 
wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:1055:devstate_callback: backend 
/local/domain/2/backend/vif/0/0/state wanted state 2 still waiting 
state 4
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:750:watchfd_callback: watch w=0x7ffd416b0528 
wpath=/local/domain/2/backend token=3/0: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac56a220: nested 
ao, parent 0x5633ac567f90
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:750:watchfd_callback: watch w=0x5633ac569180 
wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:1055:devstate_callback: backend 
/local/domain/2/backend/vif/0/0/state wanted state 2 still waiting 
state 4
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_aoutils.c:88:xswait_timeout_callback: backend 
/local/domain/2/backend/vif/0/0/state (hoping for state change to 2): 
xswait timeout (path=/local/domain/2/backend/vif/0/0/state)
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:850:libxl__ev_xswatch_deregister: watch w=0x5633ac569180 
wpath=/local/domain/2/backend/vif/0/0/state token=2/1: deregister 
slotnum=2
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:1039:devstate_callback: backend 
/local/domain/2/backend/vif/0/0/state wanted state 2  timed out
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch 
w=0x5633ac569180: deregister unregistered
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_device.c:1092:device_backend_callback: calling 
device_backend_cleanup
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch 
w=0x5633ac569180: deregister unregistered
2022-01-13 01:39:51 UTC libxl: error: 
libxl_device.c:1105:device_backend_callback: unable to add device with 
path /local/domain/2/backend/vif/0/0
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch 
w=0x5633ac569280: deregister unregistered
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_device.c:1470:device_complete: device 
/local/domain/2/backend/vif/0/0 add failed
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:2035:libxl__ao__destroy: ao 0x5633ac568f30: destroy


the xenstore content for the backend:

# xenstore-ls /local/domain/2/backend/vif/0
0 = ""
 frontend = "/local/domain/0/device/vif/0"
 frontend-id = "0"
 online = "1"
 state = "4"
 script = "/etc/xen/scripts/vif-zynstra"
 vifname = "dom0.0"
 mac = "00:16:3e:6c:de:82"
 bridge = "cluster"
 handle = "0"
 type = "vif"
 feature-sg = "1"
 feature-gso-tcpv4 = "1"
 feature-gso-tcpv6 = "1"
 feature-ipv6-csum-offload = "1"
 feature-rx-copy = "1"
 feature-rx-flip = "0"
 feature-multicast-control = "1"
 feature-dynamic-multicast-control = "1"
 feature-split-event-channels = "1"
 multi-queue-max-queues = "2"
 feature-ctrl-ring = "1"
 hotplug-status = "connected"

My guess is that the libxl callback is started waiting for the backend
state key to be set to XenbusStateInitWait (2) but the frontend in 
dom0
has already triggered the backend to 

possible kernel/libxl race with xl network-attach

2022-01-13 Thread James Dingwall
Hi,

I have been trying to debug a problem where a vif with the backend in a 
driver domain is added to dom0.  Intermittently the hotplug script is 
not invoked by libxl (running as xl devd) in the driver domain.  By 
enabling some debug for the driver domain kernel and libxl I have these 
messages:

driver domain kernel (Ubuntu 5.4.0-92-generic):

[Thu Jan 13 01:39:31 2022] [1408] 564: vif vif-0-0 vif0.0: Successfully created 
xenvif
[Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed: 
/local/domain/0/device/vif/0 -> Initialising
[Thu Jan 13 01:39:31 2022] [26] 470: xen_netback:backend_switch_state: 
backend/vif/0/0 -> InitWait
[Thu Jan 13 01:39:31 2022] [26] 583: xen_netback:frontend_changed: 
/local/domain/0/device/vif/0 -> Connected
[Thu Jan 13 01:39:31 2022] vif vif-0-0 vif0.0: Guest Rx ready
[Thu Jan 13 01:39:31 2022] [26] 470: xen_netback:backend_switch_state: 
backend/vif/0/0 -> Connected

xl devd (Xen 4.14.3):

2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:750:watchfd_callback: watch 
w=0x7ffd416b0528 wpath=/local/domain/2/backend token=3/0: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac569700: nested ao, 
parent 0x5633ac567f90
2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:750:watchfd_callback: watch 
w=0x5633ac569180 wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:1055:devstate_callback: 
backend /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting 
state 4
2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:750:watchfd_callback: watch 
w=0x7ffd416b0528 wpath=/local/domain/2/backend token=3/0: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: 
libxl_event.c:2445:libxl__nested_ao_create: ao 0x5633ac56a220: nested ao, 
parent 0x5633ac567f90
2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:750:watchfd_callback: watch 
w=0x5633ac569180 wpath=/local/domain/2/backend/vif/0/0/state token=2/1: event 
epath=/local/domain/2/backend/vif/0/0/state
2022-01-13 01:39:31 UTC libxl: debug: libxl_event.c:1055:devstate_callback: 
backend /local/domain/2/backend/vif/0/0/state wanted state 2 still waiting 
state 4
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_aoutils.c:88:xswait_timeout_callback: backend 
/local/domain/2/backend/vif/0/0/state (hoping for state change to 2): xswait 
timeout (path=/local/domain/2/backend/vif/0/0/state)
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:850:libxl__ev_xswatch_deregister: watch w=0x5633ac569180 
wpath=/local/domain/2/backend/vif/0/0/state token=2/1: deregister slotnum=2
2022-01-13 01:39:51 UTC libxl: debug: libxl_event.c:1039:devstate_callback: 
backend /local/domain/2/backend/vif/0/0/state wanted state 2  timed out
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x5633ac569180: 
deregister unregistered
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_device.c:1092:device_backend_callback: calling device_backend_cleanup
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x5633ac569180: 
deregister unregistered
2022-01-13 01:39:51 UTC libxl: error: 
libxl_device.c:1105:device_backend_callback: unable to add device with path 
/local/domain/2/backend/vif/0/0
2022-01-13 01:39:51 UTC libxl: debug: 
libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x5633ac569280: 
deregister unregistered
2022-01-13 01:39:51 UTC libxl: debug: libxl_device.c:1470:device_complete: 
device /local/domain/2/backend/vif/0/0 add failed
2022-01-13 01:39:51 UTC libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 
0x5633ac568f30: destroy

the xenstore content for the backend:

# xenstore-ls /local/domain/2/backend/vif/0
0 = ""
 frontend = "/local/domain/0/device/vif/0"
 frontend-id = "0"
 online = "1"
 state = "4"
 script = "/etc/xen/scripts/vif-zynstra"
 vifname = "dom0.0"
 mac = "00:16:3e:6c:de:82"
 bridge = "cluster"
 handle = "0"
 type = "vif"
 feature-sg = "1"
 feature-gso-tcpv4 = "1"
 feature-gso-tcpv6 = "1"
 feature-ipv6-csum-offload = "1"
 feature-rx-copy = "1"
 feature-rx-flip = "0"
 feature-multicast-control = "1"
 feature-dynamic-multicast-control = "1"
 feature-split-event-channels = "1"
 multi-queue-max-queues = "2"
 feature-ctrl-ring = "1"
 hotplug-status = "connected"

My guess is that the libxl callback starts waiting for the backend 
state key to reach XenbusStateInitWait (2), but the frontend in dom0 
has already triggered the backend to transition to XenbusStateConnected 
(4), so the wait never completes successfully.

Does this seem a reasonable explanation for the problem and what would 
the best approach to try and solve it?
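The suspected race can be illustrated with a stub: the devstate logic times out hoping the state *becomes* 2, while the backend has already advanced to 4.  A wait that treats "state >= wanted" as success would not time out; `read_state` below stands in for `xenstore-read` so the sketch runs standalone:

```shell
# Illustration of the race, with a stubbed xenstore: the backend is
# already Connected (4) by the time the wait for InitWait (2) begins.
# Accepting any state >= wanted (a possible fix, not libxl's current
# behaviour) avoids the timeout.
read_state() { echo 4; }
wait_for_state() {
    local wanted=$1 state
    state=$(read_state)
    if [ "$state" -ge "$wanted" ]; then
        echo "reached (state=$state, wanted=$wanted)"
    else
        echo "still waiting (state=$state)"
    fi
}
wait_for_state 2
```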

Thanks,
James



Re: xen 4.14.3 incorrect (~3x) cpu frequency reported

2022-01-07 Thread James Dingwall


On Fri, Jan 07, 2022 at 12:39:04PM +0100, Jan Beulich wrote:
> On 06.01.2022 16:08, James Dingwall wrote:
> >>> On Wed, Jul 21, 2021 at 12:59:11PM +0200, Jan Beulich wrote:  
> >>>   
> >>>> On 21.07.2021 11:29, James Dingwall wrote:   
> >>>>   
> >>>>> We have a system which intermittently starts up and reports an 
> >>>>> incorrect cpu frequency:   
> > ...
> >>> I'm sorry to ask, but have you got around to actually doing that? Or
> >>> else is resolving this no longer of interest?
> > 
> > We have experienced an occurrence of this issue on 4.14.3 with 'loglvl=all'
> > present on the xen command line.  I have attached the 'xl dmesg' output for
> > the fast MHz boot, the diff from the normal case is small so I've not added
> > that log separately:
> > 
> > --- normal-mhz/xl-dmesg.txt 2022-01-06 14:13:47.231465234 +
> > +++ funny-mhz/xl-dmesg.txt  2022-01-06 13:45:43.825148510 +
> > @@ -211,7 +211,7 @@
> >  (XEN)  cap enforcement granularity: 10ms
> >  (XEN) load tracking window length 1073741824 ns
> >  (XEN) Platform timer is 24.000MHz HPET
> > -(XEN) Detected 2294.639 MHz processor.
> > +(XEN) Detected 7623.412 MHz processor.
> >  (XEN) EFI memory map:
> >  (XEN)  0-07fff type=3 attr=000f
> >  (XEN)  08000-3cfff type=7 attr=000f
> 
> Below is a patch (suitably adjusted for 4.14.3) which I would hope can
> take care of the issue (assuming my vague guess on the reasons wasn't
> entirely off). It has some debugging code intentionally left in, and
> it's also not complete yet (other timer code needing similar
> adjustment). Given the improvements I've observed independent of your
> issue, I may not wait with submission until getting feedback from you,
> since - aiui - it may take some time for you to actually run into a
> case where the change would actually make an observable difference.

I'll get it added to our build and see what we find...

Thanks,
James

> 
> Jan
> 
> x86: improve TSC / CPU freq calibration accuracy
> 
> While the problem report was for extreme errors, even smaller ones would
> better be avoided: The calculated period to run calibration loops over
> can (and usually will) be shorter than the actual time elapsed between
> first and last platform timer and TSC reads. Adjust values returned from
> the init functions accordingly.
> 
> On a Skylake system I've tested this on accuracy (using HPET) went from
> detecting in some cases more than 220kHz too high a value to about
> ±1kHz. On other systems the original error range was much smaller, with
> less (in some cases only very little) improvement.
> 
> Reported-by: James Dingwall 
> Signed-off-by: Jan Beulich 
> ---
> TBD: Do we think we need to guard against the bizarre case of
>  "target + count" overflowing (i.e. wrapping)?
> TBD: Accuracy could be slightly further improved by using a (to be
>  introduced) rounding variant of muldiv64().
> TBD: I'm not entirely sure how useful the conditionals are - there
>  shouldn't be any inaccuracies from the division when count equals
>  target (upon entry to the conditionals), as then the divisor is
>  what the original value was just multiplied by.
> 
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -378,8 +378,9 @@ static u64 read_hpet_count(void)
>  
>  static int64_t __init init_hpet(struct platform_timesource *pts)
>  {
> -uint64_t hpet_rate, start;
> +uint64_t hpet_rate, start, expired;
>  uint32_t count, target;
> +unsigned int i;//temp
>  
>  if ( hpet_address && strcmp(opt_clocksource, pts->id) &&
>   cpuidle_using_deep_cstate() )
> @@ -415,16 +416,35 @@ static int64_t __init init_hpet(struct p
>  
>  pts->frequency = hpet_rate;
>  
> +for(i = 0; i < 16; ++i) {//temp
>  count = hpet_read32(HPET_COUNTER);
>  start = rdtsc_ordered();
>  target = count + CALIBRATE_VALUE(hpet_rate);
>  if ( target < count )
>  while ( hpet_read32(HPET_COUNTER) >= count )
>  continue;
> -while ( hpet_read32(HPET_COUNTER) < target )
> +while ( (count = hpet_read32(HPET_COUNTER)) < target )
>  continue;
>  
> -return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
> +expired = rdtsc_ordered() - start;
> +
> +if ( likely(count > target
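The inaccuracy described in the commit message can be put into numbers with a small sketch (the constants below are assumptions for illustration, not Xen's actual values):

```python
# The TSC is sampled over a window of *at least* CALIBRATE_VALUE platform
# timer ticks; if the polling loop overshoots the target, multiplying the
# TSC delta by CALIBRATE_FRAC overestimates the frequency.
CALIBRATE_FRAC = 20                  # assumed: calibrate over 1/20 s
hpet_rate = 24_000_000               # 24 MHz HPET, as in the boot log
tsc_hz = 2_294_639_000               # the machine's true TSC frequency

target_ticks = hpet_rate // CALIBRATE_FRAC
overshoot = 150                      # assumed extra ticks before the loop exits
actual_ticks = target_ticks + overshoot
elapsed_tsc = tsc_hz * actual_ticks // hpet_rate   # TSC cycles really elapsed

naive = elapsed_tsc * CALIBRATE_FRAC                # pre-patch calculation
adjusted = elapsed_tsc * hpet_rate // actual_ticks  # scale by real tick count

# naive is ~287 kHz high here, the same order as the >220 kHz error the
# commit message reports for the Skylake test; adjusted lands within a
# few tens of Hz of the true frequency.
assert naive - tsc_hz > 200_000
assert abs(adjusted - tsc_hz) < 2_000
```

A loop overshoot of only a few hundred HPET ticks is enough to shift the reported frequency by hundreds of kHz, which is the class of error the patch targets.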

Re: xen 4.14.3 incorrect (~3x) cpu frequency reported

2022-01-06 Thread James Dingwall
Hi Jan,

> > On Wed, Jul 21, 2021 at 12:59:11PM +0200, Jan Beulich wrote:
> > 
> >> On 21.07.2021 11:29, James Dingwall wrote: 
> >> 
> >>> We have a system which intermittently starts up and reports an incorrect 
> >>> cpu frequency:   
...
> > I'm sorry to ask, but have you got around to actually doing that? Or
> > else is resolving this no longer of interest?

We have experienced an occurrence of this issue on 4.14.3 with 'loglvl=all'
present on the xen command line.  I have attached the 'xl dmesg' output for
the fast MHz boot, the diff from the normal case is small so I've not added
that log separately:

--- normal-mhz/xl-dmesg.txt 2022-01-06 14:13:47.231465234 +
+++ funny-mhz/xl-dmesg.txt  2022-01-06 13:45:43.825148510 +
@@ -211,7 +211,7 @@
 (XEN)  cap enforcement granularity: 10ms
 (XEN) load tracking window length 1073741824 ns
 (XEN) Platform timer is 24.000MHz HPET
-(XEN) Detected 2294.639 MHz processor.
+(XEN) Detected 7623.412 MHz processor.
 (XEN) EFI memory map:
 (XEN)  0-07fff type=3 attr=000f
 (XEN)  08000-3cfff type=7 attr=000f
@@ -616,6 +616,7 @@
 (XEN) PCI add device :b7:00.1
 (XEN) PCI add device :b7:00.2
 (XEN) PCI add device :b7:00.3
+(XEN) Platform timer appears to have unexpectedly wrapped 10 or more times.
 (XEN) [VT-D]d0:PCIe: unmap :65:00.2
 (XEN) [VT-D]d32753:PCIe: map :65:00.2
 (XEN) [VT-D]d0:PCIe: unmap :65:00.1
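The new warning line can be sanity-checked with a back-of-envelope calculation (assuming a 32-bit HPET main counter, which is common but hardware-dependent):

```python
# A 32-bit main counter at the logged 24 MHz rate wraps every ~179 s, so
# "wrapped 10 or more times" implies the platform timer went unread for
# roughly half an hour between two consecutive reads.
hpet_rate = 24_000_000
wrap_period_s = 2**32 / hpet_rate        # ~178.96 s per wrap
ten_wraps_min = 10 * wrap_period_s / 60  # ~29.8 minutes

assert 178 < wrap_period_s < 180
assert 29 < ten_wraps_min < 31
```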

I also have the dom0 kernel dmesg available if that would be useful but I've
left it off initially because the log is quite large.  I don't see much in
the diff between boots except where speed/times are reported and where things
are initialised in a slightly different order.

Thanks,
James
(XEN) parameter "basevideo" unknown!
 Xen 4.14.3
(XEN) Xen version 4.14.3 (@) (gcc (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0) 
debug=n  Fri Dec 10 16:11:21 UTC 2021
(XEN) Latest ChangeSet: Fri Dec 10 16:10:15 2021 + git:a598336409-dirty
(XEN) build-id: 7b441504c9977229a3c6779041ea6493
(XEN) Bootloader: EFI
(XEN) Command line: console=vga,com2 com2=115200,8n1 basevideo dom0_max_vcpus=4 
dom0_mem=min:6144,max:65536m iommu=on,required,intpost,verbose,debug 
sched=credit2 flask=enforcing gnttab_max_frames=128 xpti=off smt=on loglvl=all
(XEN) Xen image load base address: 0x5d40
(XEN) Video information:
(XEN)  VGA is graphics mode 1024x768, 32 bpp
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 2 EDD information structures
(XEN) CPU Vendor: Intel, Family 6 (0x6), Model 85 (0x55), Stepping 4 (raw 
00050654)
(XEN) EFI RAM map:
(XEN)  [, 0009] (usable)
(XEN)  [000a, 000f] (reserved)
(XEN)  [0010, 6965efff] (usable)
(XEN)  [6965f000, 6bee5fff] (reserved)
(XEN)  [6bee6000, 6c0a6fff] (usable)
(XEN)  [6c0a7000, 6ca43fff] (ACPI NVS)
(XEN)  [6ca44000, 6ed16fff] (reserved)
(XEN)  [6ed17000, 6fff] (usable)
(XEN)  [7000, 8fff] (reserved)
(XEN)  [fd00, fe7f] (reserved)
(XEN)  [fed2, fed44fff] (reserved)
(XEN)  [ff00, ] (reserved)
(XEN)  [0001, 00207fff] (usable)
(XEN) ACPI: RSDP 6C0A7000, 0024 (r2 SUPERM)
(XEN) ACPI: XSDT 6C0A70C8, 0114 (r1 SUPERM   SUPERM  1072009 AMI 10013)
(XEN) ACPI: FACP 6C0E9D78, 0114 (r6 SUPERM SMCI--MB  1072009 INTL 20091013)
(XEN) ACPI: DSDT 6C0A7278, 42AFC (r2 SUPERM SMCI--MB  1072009 INTL 20091013)
(XEN) ACPI: FACS 6CA42080, 0040
(XEN) ACPI: FPDT 6C0E9E90, 0044 (r1  1072009 AMI 10013)
(XEN) ACPI: FIDT 6C0E9ED8, 009C (r1 SUPERM SMCI--MB  1072009 AMI 10013)
(XEN) ACPI: SPMI 6C0E9F78, 0041 (r5 SUPERM SMCI--MB0 AMI.0)
(XEN) ACPI: UEFI 6C0E9FC0, 0048 (r1 SUPERM SMCI--MB  1072009   113)
(XEN) ACPI: UEFI 6C0EA008, 005C (r1  INTEL RstUefiV0 0)
(XEN) ACPI: MCFG 6C0EA068, 003C (r1 SUPERM SMCI--MB  1072009 MSFT   97)
(XEN) ACPI: HPET 6C0EA0A8, 0038 (r1 SUPERM SMCI--MB1 INTL 20091013)
(XEN) ACPI: APIC 6C0EA0E0, 071E (r3 SUPERM SMCI--MB0 INTL 20091013)
(XEN) ACPI: MIGT 6C0EA800, 0040 (r1 SUPERM SMCI--MB0 INTL 20091013)
(XEN) ACPI: MSCT 6C0EA840, 004E (r1 SUPERM SMCI--MB1 INTL 20091013)
(XEN) ACPI: PCAT 6C0EA890, 0068 (r2 SUPERM SMCI--MB2 INTL 20091013)
(XEN) ACPI: PCCT 6C0EA8F8, 006E (r1 SUPERM SMCI--MB2 INTL 20091013)
(XEN) ACPI: RASF 6C0EA968, 0030 (r1 SUPERM SMCI--MB1 INTL 20091013)
(XEN) ACPI: SLIT 6C0EA998, 002D (r1 SUPERM SMCI--MB1 INTL 20091013)
(XEN) ACPI: SRAT 6C0EA9C8

Re: xen 4.11.4 incorrect (~3x) cpu frequency reported

2021-11-05 Thread James Dingwall
Hi Jan,

On Fri, Nov 05, 2021 at 01:50:04PM +0100, Jan Beulich wrote:
> On 26.07.2021 14:33, James Dingwall wrote:
> > Hi Jan,
> > 
> > Thank you for taking the time to reply.
> > 
> > On Wed, Jul 21, 2021 at 12:59:11PM +0200, Jan Beulich wrote:
> >> On 21.07.2021 11:29, James Dingwall wrote:
> >>> We have a system which intermittently starts up and reports an incorrect 
> >>> cpu frequency:
> >>>
> >>> # grep -i mhz /var/log/kern.log 
> >>> Jul 14 17:47:47 dom0 kernel: [0.000475] tsc: Detected 2194.846 MHz 
> >>> processor
> >>> Jul 14 22:03:37 dom0 kernel: [0.000476] tsc: Detected 2194.878 MHz 
> >>> processor
> >>> Jul 14 23:05:13 dom0 kernel: [0.000478] tsc: Detected 2194.848 MHz 
> >>> processor
> >>> Jul 14 23:20:47 dom0 kernel: [0.000474] tsc: Detected 2194.856 MHz 
> >>> processor
> >>> Jul 14 23:57:39 dom0 kernel: [0.000476] tsc: Detected 2194.906 MHz 
> >>> processor
> >>> Jul 15 01:04:09 dom0 kernel: [0.000476] tsc: Detected 2194.858 MHz 
> >>> processor
> >>> Jul 15 01:27:15 dom0 kernel: [0.000482] tsc: Detected 2194.870 MHz 
> >>> processor
> >>> Jul 15 02:00:13 dom0 kernel: [0.000481] tsc: Detected 2194.924 MHz 
> >>> processor
> >>> Jul 15 03:09:23 dom0 kernel: [0.000475] tsc: Detected 2194.892 MHz 
> >>> processor
> >>> Jul 15 03:32:50 dom0 kernel: [0.000482] tsc: Detected 2194.856 MHz 
> >>> processor
> >>> Jul 15 04:05:27 dom0 kernel: [0.000480] tsc: Detected 2194.886 MHz 
> >>> processor
> >>> Jul 15 05:00:38 dom0 kernel: [0.000473] tsc: Detected 2194.914 MHz 
> >>> processor
> >>> Jul 15 05:59:33 dom0 kernel: [0.000480] tsc: Detected 2194.924 MHz 
> >>> processor
> >>> Jul 15 06:22:31 dom0 kernel: [0.000474] tsc: Detected 2194.910 MHz 
> >>> processor
> >>> Jul 15 17:52:57 dom0 kernel: [0.000474] tsc: Detected 2194.854 MHz 
> >>> processor
> >>> Jul 15 18:51:36 dom0 kernel: [0.000474] tsc: Detected 2194.900 MHz 
> >>> processor
> >>> Jul 15 19:07:26 dom0 kernel: [0.000478] tsc: Detected 2194.902 MHz 
> >>> processor
> >>> Jul 15 19:43:56 dom0 kernel: [0.000154] tsc: Detected 6895.384 MHz 
> >>> processor
> >>
> >> Well, this is output from Dom0. What we'd need to see (in addition)
> >> is the corresponding hypervisor log at maximum verbosity (loglvl=all).
> > 
> > This was just to illustrate that the dom0 usually reports the correct 
> > speed.  I'll update the xen boot options with loglvl=all and try to collect 
> > the boot messages for each case.
> > 
> >>
> >>> The xen 's' debug output:
> >>>
> >>> (XEN) TSC marked as reliable, warp = 0 (count=4)
> >>> (XEN) dom1: mode=0,ofs=0x1d1ac8bf8e,khz=6895385,inc=1
> >>> (XEN) dom2: mode=0,ofs=0x28bc24c746,khz=6895385,inc=1
> >>> (XEN) dom3: mode=0,ofs=0x345696b138,khz=6895385,inc=1
> >>> (XEN) dom4: mode=0,ofs=0x34f2635f31,khz=6895385,inc=1
> >>> (XEN) dom5: mode=0,ofs=0x3581618a7d,khz=6895385,inc=1
> >>> (XEN) dom6: mode=0,ofs=0x3627ca68b2,khz=6895385,inc=1
> >>> (XEN) dom7: mode=0,ofs=0x36dd491860,khz=6895385,inc=1
> >>> (XEN) dom8: mode=0,ofs=0x377a57ea1a,khz=6895385,inc=1
> >>> (XEN) dom9: mode=0,ofs=0x381eb175ce,khz=6895385,inc=1
> >>> (XEN) dom10: mode=0,ofs=0x38cab2e260,khz=6895385,inc=1
> >>> (XEN) dom11: mode=0,ofs=0x397fc47387,khz=6895385,inc=1
> >>> (XEN) dom12: mode=0,ofs=0x3a552762a0,khz=6895385,inc=1
> >>>
> >>> A processor from /proc/cpuinfo in dom0:
> >>>
> >>> processor   : 3
> >>> vendor_id   : GenuineIntel
> >>> cpu family  : 6
> >>> model   : 85
> >>> model name  : Intel(R) Xeon(R) D-2123IT CPU @ 2.20GHz
> >>> stepping: 4
> >>> microcode   : 0x265
> >>> cpu MHz : 6895.384
> >>> [...]
> >>>
> >>> Xen has been built at 310ab79875cb705cc2c7daddff412b5a4899f8c9 from the 
> >>> stable-4.12 branch.
> >>
> >> While this contradicts the title, both 4.11 and 4.12 are out of general
> >> support. Hence it would be more helpful if you could obtain respective
> >> logs with a more modern version of Xen - ideally from the master branch,
> >> or else the most recent stable one (4.15). Provided of course the issue
> >> continues to exist there in the first place.
> > 
> > That was my error, I meant the stable-4.11 branch.  We have a development 
> > environment based around 4.14.2 which I can test.
> 
> I'm sorry to ask, but have you got around to actually doing that? Or
> else is resolving this no longer of interest?

We have recorded a couple of other occurrences on 4.11 but it is happening so
infrequently (probably once every few hundred boots) that further investigation
is low on a long list of tasks.  We are also moving to 4.14.3 and so far have
no occurrences with that version.

Thanks,
James



domain never exits after using 'xl save'

2021-09-23 Thread James Dingwall
Hi,

This is an issue that was observed on 4.11.3 but which I have reproduced on 4.14.3.
After using the `xl save` command the associated `xl create` process exits,
which later results in the domain not being cleaned up when the guest is
shut down.

e.g.:

# xl list -v | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
guest01  15  2048 3 -b1555.9 
d13cc54d-dcb8-4337-9dfe-3b04f671b16a- system_u:system_r:migrate_domU_t

# ps -ef | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
root 18694 1  0 Sep22 ?00:00:00 /usr/sbin/xl create -p 
/etc/xen/config/d13cc54d-dcb8-4337-9dfe-3b04f671b16a.cfg

# xl save -p guest01 /vmsave/guest01.mem
Saving to /vmsave/guest01.mem new xl format (info 0x3/0x0/2900)
xc: info: Saving domain 15, type x86 HVM
xc: Frames: 1044480/1044480  100%
xc: End of stream: 0/00%

# xl list -v | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
guest01  15  2048 3 --p---1558.3 
d13cc54d-dcb8-4337-9dfe-3b04f671b16a- system_u:system_r:migrate_domU_t

# ps -ef | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
- no matches -

# xl unpause guest01

# xl list -v | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
guest01  15  2048 3 -b1559.0 
d13cc54d-dcb8-4337-9dfe-3b04f671b16a- system_u:system_r:migrate_domU_t

# xl shutdown guest01

# xl list -v | grep d13cc54d-dcb8-4337-9dfe-3b04f671b16
guest01  15  2048 3 ---s--1575.8 
d13cc54d-dcb8-4337-9dfe-3b04f671b16a0 system_u:system_r:migrate_domU_t


What we would expect is that the `xl create` process remains running so that
when the domain is later shut down it gets cleaned up without having to
manually run `xl destroy`.

tools/xl/xl_vmcontrol.c handle_domain_death() has (DOMAIN_RESTART_NONE is 0 in
xl.h):

case LIBXL_SHUTDOWN_REASON_SUSPEND:
LOG("Domain has suspended.");
return 0;

The while(1) loop of create_domain() has a switch statement which handles this
return value with:

case DOMAIN_RESTART_NONE:
LOG("Done. Exiting now");
libxl_event_free(ctx, event);
ret = 0;
goto out;
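Taken together, the two excerpts above behave as in this toy model (a sketch, not xl's actual code; the `treat_suspend_as_transient` flag is hypothetical):

```python
DOMAIN_RESTART_NONE = 0
DOMAIN_RESTART_SOFT_RESET = 1   # stand-in for any "keep waiting" return value

def handle_domain_death(reason, treat_suspend_as_transient=False):
    if reason == "suspend" and treat_suspend_as_transient:
        return DOMAIN_RESTART_SOFT_RESET
    return DOMAIN_RESTART_NONE  # "Domain has suspended." / normal shutdown

def create_domain(events, treat_suspend_as_transient=False):
    handled = []
    for reason in events:            # stands in for the while(1) event loop
        handled.append(reason)
        if handle_domain_death(reason, treat_suspend_as_transient) \
                == DOMAIN_RESTART_NONE:
            break                    # "Done. Exiting now"
    return handled

# Today: the save/suspend ends the loop, so no process is left to react
# to the later shutdown and the domain lingers until `xl destroy`.
assert create_domain(["suspend", "shutdown"]) == ["suspend"]
# If suspend mapped to a non-exit return value, the loop would survive
# to handle the eventual shutdown.
assert create_domain(["suspend", "shutdown"], True) == ["suspend", "shutdown"]
```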


Is this the expected behaviour?  Would changing the return value from
handle_domain_death() to one which doesn't trigger the exit be a reasonable
approach to getting the behaviour we want?

Thanks,
James



Re: xen 4.11.4 incorrect (~3x) cpu frequency reported

2021-07-26 Thread James Dingwall
Hi Jan,

Thank you for taking the time to reply.

On Wed, Jul 21, 2021 at 12:59:11PM +0200, Jan Beulich wrote:
> On 21.07.2021 11:29, James Dingwall wrote:
> > We have a system which intermittently starts up and reports an incorrect 
> > cpu frequency:
> > 
> > # grep -i mhz /var/log/kern.log 
> > Jul 14 17:47:47 dom0 kernel: [0.000475] tsc: Detected 2194.846 MHz 
> > processor
> > Jul 14 22:03:37 dom0 kernel: [0.000476] tsc: Detected 2194.878 MHz 
> > processor
> > Jul 14 23:05:13 dom0 kernel: [0.000478] tsc: Detected 2194.848 MHz 
> > processor
> > Jul 14 23:20:47 dom0 kernel: [0.000474] tsc: Detected 2194.856 MHz 
> > processor
> > Jul 14 23:57:39 dom0 kernel: [0.000476] tsc: Detected 2194.906 MHz 
> > processor
> > Jul 15 01:04:09 dom0 kernel: [0.000476] tsc: Detected 2194.858 MHz 
> > processor
> > Jul 15 01:27:15 dom0 kernel: [0.000482] tsc: Detected 2194.870 MHz 
> > processor
> > Jul 15 02:00:13 dom0 kernel: [0.000481] tsc: Detected 2194.924 MHz 
> > processor
> > Jul 15 03:09:23 dom0 kernel: [0.000475] tsc: Detected 2194.892 MHz 
> > processor
> > Jul 15 03:32:50 dom0 kernel: [0.000482] tsc: Detected 2194.856 MHz 
> > processor
> > Jul 15 04:05:27 dom0 kernel: [0.000480] tsc: Detected 2194.886 MHz 
> > processor
> > Jul 15 05:00:38 dom0 kernel: [0.000473] tsc: Detected 2194.914 MHz 
> > processor
> > Jul 15 05:59:33 dom0 kernel: [0.000480] tsc: Detected 2194.924 MHz 
> > processor
> > Jul 15 06:22:31 dom0 kernel: [0.000474] tsc: Detected 2194.910 MHz 
> > processor
> > Jul 15 17:52:57 dom0 kernel: [0.000474] tsc: Detected 2194.854 MHz 
> > processor
> > Jul 15 18:51:36 dom0 kernel: [0.000474] tsc: Detected 2194.900 MHz 
> > processor
> > Jul 15 19:07:26 dom0 kernel: [0.000478] tsc: Detected 2194.902 MHz 
> > processor
> > Jul 15 19:43:56 dom0 kernel: [0.000154] tsc: Detected 6895.384 MHz 
> > processor
> 
> Well, this is output from Dom0. What we'd need to see (in addition)
> is the corresponding hypervisor log at maximum verbosity (loglvl=all).

This was just to illustrate that the dom0 usually reports the correct speed.  
I'll update the xen boot options with loglvl=all and try to collect the boot 
messages for each case.

> 
> > The xen 's' debug output:
> > 
> > (XEN) TSC marked as reliable, warp = 0 (count=4)
> > (XEN) dom1: mode=0,ofs=0x1d1ac8bf8e,khz=6895385,inc=1
> > (XEN) dom2: mode=0,ofs=0x28bc24c746,khz=6895385,inc=1
> > (XEN) dom3: mode=0,ofs=0x345696b138,khz=6895385,inc=1
> > (XEN) dom4: mode=0,ofs=0x34f2635f31,khz=6895385,inc=1
> > (XEN) dom5: mode=0,ofs=0x3581618a7d,khz=6895385,inc=1
> > (XEN) dom6: mode=0,ofs=0x3627ca68b2,khz=6895385,inc=1
> > (XEN) dom7: mode=0,ofs=0x36dd491860,khz=6895385,inc=1
> > (XEN) dom8: mode=0,ofs=0x377a57ea1a,khz=6895385,inc=1
> > (XEN) dom9: mode=0,ofs=0x381eb175ce,khz=6895385,inc=1
> > (XEN) dom10: mode=0,ofs=0x38cab2e260,khz=6895385,inc=1
> > (XEN) dom11: mode=0,ofs=0x397fc47387,khz=6895385,inc=1
> > (XEN) dom12: mode=0,ofs=0x3a552762a0,khz=6895385,inc=1
> > 
> > A processor from /proc/cpuinfo in dom0:
> > 
> > processor   : 3
> > vendor_id   : GenuineIntel
> > cpu family  : 6
> > model   : 85
> > model name  : Intel(R) Xeon(R) D-2123IT CPU @ 2.20GHz
> > stepping: 4
> > microcode   : 0x265
> > cpu MHz : 6895.384
> > [...]
> > 
> > Xen has been built at 310ab79875cb705cc2c7daddff412b5a4899f8c9 from the 
> > stable-4.12 branch.
> 
> While this contradicts the title, both 4.11 and 4.12 are out of general
> support. Hence it would be more helpful if you could obtain respective
> logs with a more modern version of Xen - ideally from the master branch,
> or else the most recent stable one (4.15). Provided of course the issue
> continues to exist there in the first place.

That was my error, I meant the stable-4.11 branch.  We have a development 
environment based around 4.14.2 which I can test.  My assumption had been that 
xen reads or calculates this frequency and provides it to the dom0 since it is 
reported in the hypervisor log before dom0 is started.

Regards,
James



xen 4.11.4 incorrect (~3x) cpu frequency reported

2021-07-21 Thread James Dingwall
Hi,

We have a system which intermittently starts up and reports an incorrect cpu 
frequency:

# grep -i mhz /var/log/kern.log 
Jul 14 17:47:47 dom0 kernel: [0.000475] tsc: Detected 2194.846 MHz processor
Jul 14 22:03:37 dom0 kernel: [0.000476] tsc: Detected 2194.878 MHz processor
Jul 14 23:05:13 dom0 kernel: [0.000478] tsc: Detected 2194.848 MHz processor
Jul 14 23:20:47 dom0 kernel: [0.000474] tsc: Detected 2194.856 MHz processor
Jul 14 23:57:39 dom0 kernel: [0.000476] tsc: Detected 2194.906 MHz processor
Jul 15 01:04:09 dom0 kernel: [0.000476] tsc: Detected 2194.858 MHz processor
Jul 15 01:27:15 dom0 kernel: [0.000482] tsc: Detected 2194.870 MHz processor
Jul 15 02:00:13 dom0 kernel: [0.000481] tsc: Detected 2194.924 MHz processor
Jul 15 03:09:23 dom0 kernel: [0.000475] tsc: Detected 2194.892 MHz processor
Jul 15 03:32:50 dom0 kernel: [0.000482] tsc: Detected 2194.856 MHz processor
Jul 15 04:05:27 dom0 kernel: [0.000480] tsc: Detected 2194.886 MHz processor
Jul 15 05:00:38 dom0 kernel: [0.000473] tsc: Detected 2194.914 MHz processor
Jul 15 05:59:33 dom0 kernel: [0.000480] tsc: Detected 2194.924 MHz processor
Jul 15 06:22:31 dom0 kernel: [0.000474] tsc: Detected 2194.910 MHz processor
Jul 15 17:52:57 dom0 kernel: [0.000474] tsc: Detected 2194.854 MHz processor
Jul 15 18:51:36 dom0 kernel: [0.000474] tsc: Detected 2194.900 MHz processor
Jul 15 19:07:26 dom0 kernel: [0.000478] tsc: Detected 2194.902 MHz processor
Jul 15 19:43:56 dom0 kernel: [0.000154] tsc: Detected 6895.384 MHz processor

The xen 's' debug output:

(XEN) TSC marked as reliable, warp = 0 (count=4)
(XEN) dom1: mode=0,ofs=0x1d1ac8bf8e,khz=6895385,inc=1
(XEN) dom2: mode=0,ofs=0x28bc24c746,khz=6895385,inc=1
(XEN) dom3: mode=0,ofs=0x345696b138,khz=6895385,inc=1
(XEN) dom4: mode=0,ofs=0x34f2635f31,khz=6895385,inc=1
(XEN) dom5: mode=0,ofs=0x3581618a7d,khz=6895385,inc=1
(XEN) dom6: mode=0,ofs=0x3627ca68b2,khz=6895385,inc=1
(XEN) dom7: mode=0,ofs=0x36dd491860,khz=6895385,inc=1
(XEN) dom8: mode=0,ofs=0x377a57ea1a,khz=6895385,inc=1
(XEN) dom9: mode=0,ofs=0x381eb175ce,khz=6895385,inc=1
(XEN) dom10: mode=0,ofs=0x38cab2e260,khz=6895385,inc=1
(XEN) dom11: mode=0,ofs=0x397fc47387,khz=6895385,inc=1
(XEN) dom12: mode=0,ofs=0x3a552762a0,khz=6895385,inc=1

A processor from /proc/cpuinfo in dom0:

processor   : 3
vendor_id   : GenuineIntel
cpu family  : 6
model   : 85
model name  : Intel(R) Xeon(R) D-2123IT CPU @ 2.20GHz
stepping: 4
microcode   : 0x265
cpu MHz : 6895.384
cache size  : 8448 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 1
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi 
mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good nopl nonstop_tsc 
cpuid pni pclmulqdq monitor est ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes 
xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch intel_ppin ssbd ibrs 
ibpb stibp fsgsbase bmi1 hle avx2 bmi2 erms rtm avx512f avx512dq rdseed adx 
clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 md_clear
bugs: null_seg cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass 
l1tf mds swapgs taa itlb_multihit
bogomips: 13790.76
clflush size: 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Xen has been built at 310ab79875cb705cc2c7daddff412b5a4899f8c9 from the 
stable-4.12 branch.  The system is a supermicro server, model X11SDV-4C-TP8F.  
I'm not sure if the incorrect value has been read from hardware or Xen has 
miscalculated the frequency, so any pointers on things to examine would be 
welcome.

Thanks,
James



Re: [PATCH for-4.12 and older] x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL} (again)

2021-02-04 Thread James Dingwall
Hi Jan,

On Thu, Feb 04, 2021 at 10:36:06AM +0100, Jan Beulich wrote:
> X86_VENDOR_* aren't bit masks in the older trees.
> 
> Reported-by: James Dingwall 
> Signed-off-by: Jan Beulich 
> 
> --- a/xen/arch/x86/msr.c
> +++ b/xen/arch/x86/msr.c
> @@ -226,7 +226,8 @@ int guest_rdmsr(const struct vcpu *v, ui
>   */
>  case MSR_IA32_PERF_STATUS:
>  case MSR_IA32_PERF_CTL:
> -if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
> +if ( cp->x86_vendor != X86_VENDOR_INTEL &&
> + cp->x86_vendor != X86_VENDOR_CENTAUR )
>  goto gp_fault;
>  
>  *val = 0;

Thanks for this patch, I've applied it and the Windows guest no longer crashes.

Regards,
James



Re: VIRIDIAN CRASH: 3b c0000096 75b12c5 9e7f1580 0

2021-02-04 Thread James Dingwall
Hi Jan,

Thank you for your reply.

On Wed, Feb 03, 2021 at 03:55:07PM +0100, Jan Beulich wrote:
> On 01.02.2021 16:26, James Dingwall wrote:
> > I am building the xen 4.11 branch at 
> > 310ab79875cb705cc2c7daddff412b5a4899f8c9 which includes commit 
> > 3b5de119f0399cbe745502cb6ebd5e6633cc139c "x86/msr: fix handling of 
> > MSR_IA32_PERF_{STATUS/CTL}".  I think this should address this error 
> > recorded in xen's dmesg:
> > 
> > (XEN) d11v0 VIRIDIAN CRASH: 3b c096 75b12c5 9e7f1580 0
> 
> It seems to me that you imply some information here which might
> better be spelled out. As it stands I do not see the immediate
> connection between the cited commit and the crash. C096 is
> STATUS_PRIVILEGED_INSTRUCTION, which to me ought to be impossible
> for code running in ring 0. Of course I may simply not know enough
> about modern Windows' internals to understand the connection.

Searching for "VIRIDIAN CRASH: 3b" led me to this thread and then to the commit 
based on the commit log message.

https://patchwork.kernel.org/project/xen-devel/patch/20201007102032.98565-1-roger@citrix.com/

I have naively assumed that the RCX register indicated MSR_IA32_PERF_CTL based 
on:

#define MSR_IA32_PERF_CTL 0x0199

I've added this patch:

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 99c848ff41..7a764907d5 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -232,12 +232,16 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, 
uint64_t *val)
  */
 case MSR_IA32_PERF_STATUS:
 case MSR_IA32_PERF_CTL:
-if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) ) {
+printk(KERN_DEBUG "JKD: MSR %#x FAULT1: %#x & %#x\n", msr, 
cp->x86_vendor, (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR));
+
 goto gp_fault;
+}
 
 *val = 0;
 if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
 break;
+printk(KERN_DEBUG "JKD: MSR FAULT2\n");
 goto gp_fault;
 
 /*

and now in the hypervisor log when the domain crashes:

(XEN) JKD: MSR 0x199 FAULT1: 0 & 0x2
(XEN) d11v0 VIRIDIAN CRASH: 3b c096 1146d2c5 6346d580 0
(XEN) avc:  denied  { reset } for domid=11 scontext=system_u:system_r:domU_t 
tcontext=system_u:system_r:domU_t_self tclass=event

I'm not sure what is expected in cp->x86_vendor but this is running on an Intel 
CPU so I would have thought 0x1 based on

#define X86_VENDOR_INTEL (1 << 0)
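The mismatch can be reproduced in miniature.  In the older trees the X86_VENDOR_* values are sequential enumerators rather than bit masks (the exact values below are assumptions for illustration, chosen to match the FAULT1 line: cp->x86_vendor is 0 on Intel and the mask evaluates to 0x2):

```python
# Old-style enumerators: Intel is 0, so any bitwise test against it is
# vacuous, and OR-ing enumerators does not build a meaningful mask.
X86_VENDOR_INTEL = 0      # assumed old-tree value (not (1 << 0))
X86_VENDOR_CENTAUR = 2    # assumed old-tree value

vendor = X86_VENDOR_INTEL  # what this Intel box reports, per the debug print

# The backported check treats the values as bit masks: 0 & 0x2 == 0, so
# a genuine Intel CPU takes the gp_fault path and the guest crashes.
buggy_allows = bool(vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR))
# Comparing the values directly, as in Jan's follow-up patch, works.
fixed_allows = vendor in (X86_VENDOR_INTEL, X86_VENDOR_CENTAUR)

assert not buggy_allows
assert fixed_allows
```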

I have also booted with flask=disabled to eliminate the reported avc denial 
as the cause.

> 
> > I have removed `viridian = [..]` from the xen config but still get this 
> > reliably when launching PassMark Performance Test and it is collecting 
> > CPU information.
> > 
> > This is recorded in the domain qemu-dm log:
> > 
> > 21244@1612191983.279616:xen_platform_log xen platform: XEN|BUGCHECK: >
> > 21244@1612191983.279819:xen_platform_log xen platform: XEN|BUGCHECK: 
> > SYSTEM_SERVICE_EXCEPTION: C096 F800A43C72C5 
> > D0014343D580 
> > 21244@1612191983.279959:xen_platform_log xen platform: XEN|BUGCHECK: 
> > EXCEPTION (F800A43C72C5):
> > 21244@1612191983.280075:xen_platform_log xen platform: XEN|BUGCHECK: - Code 
> > = C148320F
> > 21244@1612191983.280205:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Flags = 0B4820E2
> > 21244@1612191983.280346:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Address = A824948D4800
> > 21244@1612191983.280504:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[0] = 8B0769850F07
> > 21244@1612191983.280633:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[1] = 46B70F4024448906
> > 21244@1612191983.280754:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[2] = 0F2444896604
> > 21244@1612191983.280876:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[3] = E983C88B410646B6
> > 21244@1612191983.281012:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[4] = 0D7401E9831E7401
> > 21244@1612191983.281172:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[5] = 54B70F217502F983
> > 21244@1612191983.281304:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[6] = 54B70F15EBED4024
> > 21244@1612191983.281426:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[7] = EBC0B70FED664024
> > 21244@1612191983.281547:xen_platform_log xen platform: XEN|BUGCHECK: - 
> > Parameter[8] = 0FEC402454B70F09
> > 21244@1612191983.281668:xen_platform_log xen platform: XEN

VIRIDIAN CRASH: 3b c0000096 75b12c5 9e7f1580 0

2021-02-01 Thread James Dingwall
Hi,

I am building the xen 4.11 branch at 
310ab79875cb705cc2c7daddff412b5a4899f8c9 which includes commit 
3b5de119f0399cbe745502cb6ebd5e6633cc139c "x86/msr: fix handling of 
MSR_IA32_PERF_{STATUS/CTL}".  I think this should address this error 
recorded in xen's dmesg:

(XEN) d11v0 VIRIDIAN CRASH: 3b c096 75b12c5 9e7f1580 0

I have removed `viridian = [..]` from the xen config but still get this 
reliably when launching PassMark Performance Test and it is collecting 
CPU information.

This is recorded in the domain qemu-dm log:

21244@1612191983.279616:xen_platform_log xen platform: XEN|BUGCHECK: >
21244@1612191983.279819:xen_platform_log xen platform: XEN|BUGCHECK: 
SYSTEM_SERVICE_EXCEPTION: C096 F800A43C72C5 D0014343D580 

21244@1612191983.279959:xen_platform_log xen platform: XEN|BUGCHECK: EXCEPTION 
(F800A43C72C5):
21244@1612191983.280075:xen_platform_log xen platform: XEN|BUGCHECK: - Code = 
C148320F
21244@1612191983.280205:xen_platform_log xen platform: XEN|BUGCHECK: - Flags = 
0B4820E2
21244@1612191983.280346:xen_platform_log xen platform: XEN|BUGCHECK: - Address 
= A824948D4800
21244@1612191983.280504:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[0] = 8B0769850F07
21244@1612191983.280633:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[1] = 46B70F4024448906
21244@1612191983.280754:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[2] = 0F2444896604
21244@1612191983.280876:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[3] = E983C88B410646B6
21244@1612191983.281012:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[4] = 0D7401E9831E7401
21244@1612191983.281172:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[5] = 54B70F217502F983
21244@1612191983.281304:xen_platform_log xen platform: XEN|BUGCHECK: - 
Parameter[6] = 54B70F15EBED4024
21244@1612191983.281426:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[7] = EBC0B70FED664024
21244@1612191983.281547:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[8] = 0FEC402454B70F09
21244@1612191983.281668:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[9] = 448B42244489C0B6
21244@1612191983.281809:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[10] = 2444B70F06894024
21244@1612191983.281932:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[11] = 4688440446896644
21244@1612191983.282052:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[12] = 073846C74906
21244@1612191983.282185:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[13] = F883070AE900
21244@1612191983.282340:xen_platform_log xen platform: XEN|BUGCHECK: - Parameter[14] = 8B06F9850F07
21244@1612191983.282480:xen_platform_log xen platform: XEN|BUGCHECK: EXCEPTION (A824848948C2):
21244@1612191983.282617:xen_platform_log xen platform: XEN|BUGCHECK: CONTEXT (D0014343D580):
21244@1612191983.282717:xen_platform_log xen platform: XEN|BUGCHECK: - GS = 002B
21244@1612191983.282816:xen_platform_log xen platform: XEN|BUGCHECK: - FS = 0053
21244@1612191983.282914:xen_platform_log xen platform: XEN|BUGCHECK: - ES = 002B
21244@1612191983.283011:xen_platform_log xen platform: XEN|BUGCHECK: - DS = 002B
21244@1612191983.283127:xen_platform_log xen platform: XEN|BUGCHECK: - SS = 0018
21244@1612191983.283226:xen_platform_log xen platform: XEN|BUGCHECK: - CS = 0010
21244@1612191983.283332:xen_platform_log xen platform: XEN|BUGCHECK: - EFLAGS = 0202
21244@1612191983.283444:xen_platform_log xen platform: XEN|BUGCHECK: - RDI = F64D5C20
21244@1612191983.283555:xen_platform_log xen platform: XEN|BUGCHECK: - RSI = F6367280
21244@1612191983.283666:xen_platform_log xen platform: XEN|BUGCHECK: - RBX = 8011E060
21244@1612191983.283810:xen_platform_log xen platform: XEN|BUGCHECK: - RDX = F64D5C20
21244@1612191983.283972:xen_platform_log xen platform: XEN|BUGCHECK: - RCX = 0199
21244@1612191983.284350:xen_platform_log xen platform: XEN|BUGCHECK: - RAX = 0004
21244@1612191983.284523:xen_platform_log xen platform: XEN|BUGCHECK: - RBP = 4343E891
21244@1612191983.284658:xen_platform_log xen platform: XEN|BUGCHECK: - RIP = A43C72C5
21244@1612191983.284842:xen_platform_log xen platform: XEN|BUGCHECK: - RSP = 4343DFA0
21244@1612191983.284959:xen_platform_log xen platform: XEN|BUGCHECK: - R8 = 0008
21244@1612191983.285073:xen_platform_log xen platform: XEN|BUGCHECK: - R9 = 000E
21244@1612191983.285188:xen_platform_log xen platform: XEN|BUGCHECK: - R10 = 0002
21244@1612191983.285304:xen_platform_log xen platform: XEN|BUGCHECK: - R11 = 4343E808
21244@1612191983.285420:xen_platform_log xen platform: XEN|BUGCHECK: - R12 = 
21244@1612191983.285564:xen_platform_log xen platform: XEN|BUGCHECK: - R13 = F7964E50
21244@1612191983.285680:xen_platform_log xen platform: XEN|BUGCHECK: - R14 = 
Re: [Xen-devel] [PATCH] xen/xenbus: fix self-deadlock after killing user process

2019-10-21 Thread James Dingwall
On Tue, Oct 01, 2019 at 05:03:55PM +0200, Juergen Gross wrote:
> In case a user process using xenbus has open transactions and is killed
> e.g. via ctrl-C the following cleanup of the allocated resources might
> result in a deadlock due to trying to end a transaction in the xenbus
> worker thread:
> 
> [ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds.
> [ 2551.492215]   Tainted: P   OE 5.0.0-29-generic #5
> [ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2551.528585] xenbus  D037  2 0x8080
> [ 2551.528590] Call Trace:
> [ 2551.528603]  __schedule+0x2c0/0x870
> [ 2551.528606]  ? _cond_resched+0x19/0x40
> [ 2551.528632]  schedule+0x2c/0x70
> [ 2551.528637]  xs_talkv+0x1ec/0x2b0
> [ 2551.528642]  ? wait_woken+0x80/0x80
> [ 2551.528645]  xs_single+0x53/0x80
> [ 2551.528648]  xenbus_transaction_end+0x3b/0x70
> [ 2551.528651]  xenbus_file_free+0x5a/0x160
> [ 2551.528654]  xenbus_dev_queue_reply+0xc4/0x220
> [ 2551.528657]  xenbus_thread+0x7de/0x880
> [ 2551.528660]  ? wait_woken+0x80/0x80
> [ 2551.528665]  kthread+0x121/0x140
> [ 2551.528667]  ? xb_read+0x1d0/0x1d0
> [ 2551.528670]  ? kthread_park+0x90/0x90
> [ 2551.528673]  ret_from_fork+0x35/0x40
> 
> Fix this by doing the cleanup via a workqueue instead.
> 
> Reported-by: James Dingwall 
> Fixes: fd8aa9095a95c ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
> Cc:  # 4.11
> Signed-off-by: Juergen Gross 
> ---
>  drivers/xen/xenbus/xenbus_dev_frontend.c | 20 ++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/xen/xenbus/xenbus_dev_frontend.c b/drivers/xen/xenbus/xenbus_dev_frontend.c
> index 08adc590f631..597af455a522 100644
> --- a/drivers/xen/xenbus/xenbus_dev_frontend.c
> +++ b/drivers/xen/xenbus/xenbus_dev_frontend.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -116,6 +117,8 @@ struct xenbus_file_priv {
>   wait_queue_head_t read_waitq;
>  
>   struct kref kref;
> +
> + struct work_struct wq;
>  };
>  
>  /* Read out any raw xenbus messages queued up. */
> @@ -300,14 +303,14 @@ static void watch_fired(struct xenbus_watch *watch,
>   mutex_unlock(>dev_data->reply_mutex);
>  }
>  
> -static void xenbus_file_free(struct kref *kref)
> +static void xenbus_worker(struct work_struct *wq)
>  {
>   struct xenbus_file_priv *u;
>   struct xenbus_transaction_holder *trans, *tmp;
>   struct watch_adapter *watch, *tmp_watch;
>   struct read_buffer *rb, *tmp_rb;
>  
> - u = container_of(kref, struct xenbus_file_priv, kref);
> + u = container_of(wq, struct xenbus_file_priv, wq);
>  
>   /*
>* No need for locking here because there are no other users,
> @@ -333,6 +336,18 @@ static void xenbus_file_free(struct kref *kref)
>   kfree(u);
>  }
>  
> +static void xenbus_file_free(struct kref *kref)
> +{
> + struct xenbus_file_priv *u;
> +
> + /*
> +  * We might be called in xenbus_thread().
> +  * Use workqueue to avoid deadlock.
> +  */
> + u = container_of(kref, struct xenbus_file_priv, kref);
> + schedule_work(>wq);
> +}
> +
>  static struct xenbus_transaction_holder *xenbus_get_transaction(
>   struct xenbus_file_priv *u, uint32_t tx_id)
>  {
> @@ -650,6 +665,7 @@ static int xenbus_file_open(struct inode *inode, struct file *filp)
>   INIT_LIST_HEAD(>watches);
>   INIT_LIST_HEAD(>read_buffers);
>   init_waitqueue_head(>read_waitq);
> + INIT_WORK(>wq, xenbus_worker);
>  
>   mutex_init(>reply_mutex);
>   mutex_init(>msgbuffer_mutex);
> -- 
> 2.16.4
> 

We have been having some crashes with an Ubuntu 5.0.0-31 kernel carrying
this patch and, thanks to the pstore fix "x86/xen: Return from panic
notifier", we caught the oops below.  It seems to be in the same area of
code as this patch, but I'm unsure whether it is directly related to this
change or a secondary issue.  From the logs collected I can see this
happened while several parallel `xl create` processes were running, but I
have not been able to reproduce it with a test script; perhaps the trace
will give some clues.

Thanks,
James


<4>[53626.726580] [ cut here ]
<2>[53626.726583] kernel BUG at /build/slowfs/ubuntu-bionic/mm/slub.c:305!
<4>[53626.739554] invalid opcode:  [#1] SMP NOPTI
<4>[53626.751119] CPU: 0 PID: 38 Comm: xenwatch Tainted: P   OE 5.0.0-31-generic #33~18.04.1z1
<4>[53626.763015] Hardw

Re: [Xen-devel] [PATCH] xen/xenbus: fix self-deadlock after killing user process

2019-10-07 Thread James Dingwall
On Tue, Oct 01, 2019 at 01:37:24PM -0400, Boris Ostrovsky wrote:
> On 10/1/19 11:03 AM, Juergen Gross wrote:
> > In case a user process using xenbus has open transactions and is killed
> > e.g. via ctrl-C the following cleanup of the allocated resources might
> > result in a deadlock due to trying to end a transaction in the xenbus
> > worker thread:
> >
> > [ 2551.474706] INFO: task xenbus:37 blocked for more than 120 seconds.
> > [ 2551.492215]   Tainted: P   OE 5.0.0-29-generic #5
> > [ 2551.510263] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 2551.528585] xenbus  D037  2 0x8080
> > [ 2551.528590] Call Trace:
> > [ 2551.528603]  __schedule+0x2c0/0x870
> > [ 2551.528606]  ? _cond_resched+0x19/0x40
> > [ 2551.528632]  schedule+0x2c/0x70
> > [ 2551.528637]  xs_talkv+0x1ec/0x2b0
> > [ 2551.528642]  ? wait_woken+0x80/0x80
> > [ 2551.528645]  xs_single+0x53/0x80
> > [ 2551.528648]  xenbus_transaction_end+0x3b/0x70
> > [ 2551.528651]  xenbus_file_free+0x5a/0x160
> > [ 2551.528654]  xenbus_dev_queue_reply+0xc4/0x220
> > [ 2551.528657]  xenbus_thread+0x7de/0x880
> > [ 2551.528660]  ? wait_woken+0x80/0x80
> > [ 2551.528665]  kthread+0x121/0x140
> > [ 2551.528667]  ? xb_read+0x1d0/0x1d0
> > [ 2551.528670]  ? kthread_park+0x90/0x90
> > [ 2551.528673]  ret_from_fork+0x35/0x40
> >
> > Fix this by doing the cleanup via a workqueue instead.
> >
> > Reported-by: James Dingwall 
> > Fixes: fd8aa9095a95c ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
> > Cc:  # 4.11
> > Signed-off-by: Juergen Gross 
> 
> Reviewed-by: Boris Ostrovsky 
> 

Tested-by: James Dingwall 

This patch does resolve the observed issue, although for my (extreme, and
not representative of our normal workload) test case the worker still gets
blocked for some time if the xenstore-rm is interrupted, during which no
concurrent xenstore commands can run.  I assume the worker completes the
rm and then performs a rollback in the background, rather than being
interrupted early as a result of the userspace program being terminated.
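The shape of the fix — the final kref_put() no longer tears the object down
inline but schedules the real teardown onto a worker — can be sketched in
userspace C.  This is only an illustrative stand-in, not the kernel code:
the pthread-protected list replaces the kernel workqueue, and the names
(`obj_put`, `run_worker`, `torn_down`) are all hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* The thread dropping the last reference must not run the teardown
 * inline (it may be the very thread the teardown would block on), so
 * the release step only queues the object for a worker. */

struct obj {
    atomic_int refs;
    struct obj *next;   /* work-queue linkage */
    bool torn_down;     /* observable side effect for the sketch */
};

static struct obj *work_head;
static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for schedule_work(): push onto a pending list. */
static void sketch_schedule_work(struct obj *o)
{
    pthread_mutex_lock(&work_lock);
    o->next = work_head;
    work_head = o;
    pthread_mutex_unlock(&work_lock);
}

/* The real cleanup (ending transactions, freeing watches, ...) runs
 * here, in the worker's context, never in the caller of the last put. */
static void obj_teardown(struct obj *o)
{
    o->torn_down = true;
}

/* Stand-in for kref_put(..., release): the release path only defers. */
static void obj_put(struct obj *o)
{
    if (atomic_fetch_sub(&o->refs, 1) == 1)
        sketch_schedule_work(o);
}

/* Stand-in for the workqueue thread draining pending work. */
static void run_worker(void)
{
    pthread_mutex_lock(&work_lock);
    struct obj *o = work_head;
    work_head = NULL;
    pthread_mutex_unlock(&work_lock);
    while (o) {
        struct obj *next = o->next;
        obj_teardown(o);
        o = next;
    }
}
```

The point of the shape is visible in the ordering: after the last put the
object is still intact, and only the worker's later pass performs the
teardown, so a put from the worker's own context can no longer deadlock.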

Thanks,
James

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] failed to launch qemu when running de-privileged (xen 4.8)

2018-08-02 Thread James Dingwall
Hi,

I had a bit of a head scratcher while writing a patch for 4.8 which 
allows the qemu-dm process for a stubdom to be executed as an 
unprivileged user.  After a liberal sprinkling of log messages I found 
that my problem was related to the check of the return code from 
getpwnam_r.  In 4.11 the relevant code looks like this:


ret = NAME##_r(spec, resultbuf, buf, buf_size, );   \
if (ret == ERANGE) {\
buf_size += 128;\
continue;   \
}   \
if (ret != 0)   \
return ERROR_FAIL;  \
if (resultp != NULL) {  \
if (out) *out = resultp;\
return 1;   \
}   \
return 0;   \


if (ret != 0)   \
return ERROR_FAIL;  \


However, checking the man page for getpwnam_r (and getpwuid_r, now used in
4.11), it is not only 0 which can indicate that an entry was not found:

   0 or ENOENT or ESRCH or EBADF or EPERM or ...
          The given name or uid was not found.
   EINTR  A signal was caught; see signal(7).
   EIO    I/O error.
   EMFILE The per-process limit on the number of open file descriptors has been reached.
   ENFILE The system-wide limit on the total number of open files has been reached.
   ENOMEM Insufficient memory to allocate passwd structure.
   ERANGE Insufficient buffer space supplied.

In my case the domid-specific qemu user was not present (we just use
xen-qemuuser-shared) and I was getting ENOENT from getpwnam_r.
I'm sure there is a more elegant way to write the check, but
this solved my case.

+ret = getpwnam_r(username, , buf, buf_size, );
+if (ret == ERANGE) {
+buf_size += 128;
+continue;
+}
+if (ret == EINTR || ret == EIO || ret == EMFILE || ret == ENFILE || ret == ENOMEM)
+return ERROR_FAIL;
+if (user != NULL)
+return 1;
+return 0;
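The behaviour behind this — the NULL result pointer, not the return code,
is the only reliable "not found" signal from getpwnam_r — can be checked
with a small userspace sketch.  The helper name `lookup_user` is made up
here; the ERANGE retry with a 128-byte buffer-growth step mirrors the
libxl macro, and the test assumes the system has a root user and no user
named "no-such-user-xl-test".

```c
#include <errno.h>
#include <pwd.h>
#include <stddef.h>
#include <stdlib.h>

/* Look up a user by name, growing the buffer on ERANGE the way the
 * libxl macro does.  Returns the getpwnam_r return code and stores the
 * result pointer in *out.  Per the man page, "not found" may come back
 * as 0, ENOENT, ESRCH, ...; only *out == NULL is a reliable signal. */
static int lookup_user(const char *name, struct passwd *pwd,
                       struct passwd **out)
{
    size_t buf_size = 2048;
    char *buf = malloc(buf_size);
    int ret;

    *out = NULL;
    if (!buf)
        return ENOMEM;

    while ((ret = getpwnam_r(name, pwd, buf, buf_size, out)) == ERANGE) {
        char *nbuf;
        buf_size += 128;
        nbuf = realloc(buf, buf_size);
        if (!nbuf) {
            ret = ENOMEM;
            *out = NULL;
            break;
        }
        buf = nbuf;
    }
    free(buf);  /* invalidates the string fields behind *out; this
                 * sketch only tests presence, so that is acceptable */
    return ret;
}
```

With this shape the caller branches on the result pointer first and treats
the return code only as an error detail, which is the check the snippet
above was aiming for.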

Thanks,
James
