[Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Christoffer Dall
The virt board has an arch timer, which is always on.  Emit the
"always-on" property to indicate to Linux that it can switch off the
periodic timer and reduces the amount of interrupts injected into a
guest.

Signed-off-by: Christoffer Dall 
---
 hw/arm/virt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 05f9087..265fe9a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi, 
int gictype)
 qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
 "arm,armv7-timer");
 }
+qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
 qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, irqflags,
GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, irqflags,
-- 
2.1.2.330.g565301e.dirty




Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Andrew Jones
On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
> The virt board has an arch timer, which is always on.  Emit the
> "always-on" property to indicate to Linux that it can switch off the
> periodic timer and reduces the amount of interrupts injected into a
> guest.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  hw/arm/virt.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 05f9087..265fe9a 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo *vbi, 
> int gictype)
>  qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
>  "arm,armv7-timer");
>  }
> +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
>  qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
> GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, irqflags,
> GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, irqflags,
> -- 
> 2.1.2.330.g565301e.dirty
> 
>

Hi Christoffer,

We should also patch the ACPI generation at the same time. I think
something like

 - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
 + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;

should do it.

Also, having the guest reduce the number of interrupts sounds good. Can
you point me to something to read about how/why a guest may choose to do
that, and what the trade-offs are?

Thanks,
drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Christoffer Dall
On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
> > The virt board has an arch timer, which is always on.  Emit the
> > "always-on" property to indicate to Linux that it can switch off the
> > periodic timer and reduces the amount of interrupts injected into a
> > guest.
> > 
> > Signed-off-by: Christoffer Dall 
> > ---
> >  hw/arm/virt.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 05f9087..265fe9a 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo 
> > *vbi, int gictype)
> >  qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
> >  "arm,armv7-timer");
> >  }
> > +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
> >  qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
> > GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, 
> > irqflags,
> > GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, 
> > irqflags,
> > -- 
> > 2.1.2.330.g565301e.dirty
> > 
> >
> 
> Hi Christoffer,
> 
> We should also patch the ACPI generation at the same time. I think
> something like
> 
>  - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
>  + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;

I'm really not familiar enough with ACPI to be comfortable writing code
for this or testing this.

But if someone can pick this up and add the ACPI bits or can post a
follow-up patch, then I'm all for it :)

> 
> should do it.
> 
> Also, having the guest reduce the number of interrupts sounds good. Can
> you point me to something to read about how/why a guest may choose to do
> that, and what the trade-offs are?
> 
Not really, but you can ask Marc.

-Christoffer



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Andrew Jones
On Tue, Jan 19, 2016 at 01:43:41PM +0100, Christoffer Dall wrote:
> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> > On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
> > > The virt board has an arch timer, which is always on.  Emit the
> > > "always-on" property to indicate to Linux that it can switch off the
> > > periodic timer and reduces the amount of interrupts injected into a
> > > guest.
> > > 
> > > Signed-off-by: Christoffer Dall 
> > > ---
> > >  hw/arm/virt.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index 05f9087..265fe9a 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo 
> > > *vbi, int gictype)
> > >  qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
> > >  "arm,armv7-timer");
> > >  }
> > > +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
> > >  qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
> > > GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, 
> > > irqflags,
> > > GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, 
> > > irqflags,
> > > -- 
> > > 2.1.2.330.g565301e.dirty
> > > 
> > >
> > 
> > Hi Christoffer,
> > 
> > We should also patch the ACPI generation at the same time. I think
> > something like
> > 
> >  - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
> >  + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;
> 
> I'm really not familiar enough with ACPI to be comfortable writing code
> for this or testing this.
> 
> But if someone can pick this up and add the ACPI bits or can post a
> follow-up patch, then I'm all for it :)

I can post a follow-up patch.

> 
> > 
> > should do it.
> > 
> > Also, having the guest reduce the number of interrupts sounds good. Can
> > you point me to something to read about how/why a guest may choose to do
> > that, and what the trade-offs are?
> > 
> Not really, but you can ask Marc.

OK, CCing him. One thing I see is that without this change we're
currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
it's not true. Having that set may disable the oneshot capabilityj
necessary to switch to nohz mode? I'll just stop there with my
speculation though, so Marc won't have to correct too much...

drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Marc Zyngier
On 19/01/16 13:32, Andrew Jones wrote:
> On Tue, Jan 19, 2016 at 01:43:41PM +0100, Christoffer Dall wrote:
>> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
>>> On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
 The virt board has an arch timer, which is always on.  Emit the
 "always-on" property to indicate to Linux that it can switch off the
 periodic timer and reduces the amount of interrupts injected into a
 guest.

 Signed-off-by: Christoffer Dall 
 ---
  hw/arm/virt.c | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/hw/arm/virt.c b/hw/arm/virt.c
 index 05f9087..265fe9a 100644
 --- a/hw/arm/virt.c
 +++ b/hw/arm/virt.c
 @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo 
 *vbi, int gictype)
  qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
  "arm,armv7-timer");
  }
 +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
  qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
 GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, 
 irqflags,
 GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, 
 irqflags,
 -- 
 2.1.2.330.g565301e.dirty


>>>
>>> Hi Christoffer,
>>>
>>> We should also patch the ACPI generation at the same time. I think
>>> something like
>>>
>>>  - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
>>>  + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;
>>
>> I'm really not familiar enough with ACPI to be comfortable writing code
>> for this or testing this.
>>
>> But if someone can pick this up and add the ACPI bits or can post a
>> follow-up patch, then I'm all for it :)
> 
> I can post a follow-up patch.
> 
>>
>>>
>>> should do it.
>>>
>>> Also, having the guest reduce the number of interrupts sounds good. Can
>>> you point me to something to read about how/why a guest may choose to do
>>> that, and what the trade-offs are?
>>>
>> Not really, but you can ask Marc.
> 
> OK, CCing him. One thing I see is that without this change we're
> currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> it's not true. Having that set may disable the oneshot capabilityj
> necessary to switch to nohz mode? I'll just stop there with my
> speculation though, so Marc won't have to correct too much...

You're spot on. See 82a5619 in the kernel tree. When I did a similar
change in kvmtool, I saw a massive reduction in the number of timer
interrupts injected (specially when the number of vcpu is relatively high).

This also have interesting benefits when running on a model, where
you're trying to squeeze the last bits of "performance" from the monster...

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Andrew Jones
On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> On 19/01/16 13:32, Andrew Jones wrote:
> > On Tue, Jan 19, 2016 at 01:43:41PM +0100, Christoffer Dall wrote:
> >> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> >>> On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
>  The virt board has an arch timer, which is always on.  Emit the
>  "always-on" property to indicate to Linux that it can switch off the
>  periodic timer and reduces the amount of interrupts injected into a
>  guest.
> 
>  Signed-off-by: Christoffer Dall 
>  ---
>   hw/arm/virt.c | 1 +
>   1 file changed, 1 insertion(+)
> 
>  diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>  index 05f9087..265fe9a 100644
>  --- a/hw/arm/virt.c
>  +++ b/hw/arm/virt.c
>  @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo 
>  *vbi, int gictype)
>   qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
>   "arm,armv7-timer");
>   }
>  +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
>   qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
>  GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, 
>  irqflags,
>  GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, 
>  irqflags,
>  -- 
>  2.1.2.330.g565301e.dirty
> 
> 
> >>>
> >>> Hi Christoffer,
> >>>
> >>> We should also patch the ACPI generation at the same time. I think
> >>> something like
> >>>
> >>>  - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
> >>>  + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;
> >>
> >> I'm really not familiar enough with ACPI to be comfortable writing code
> >> for this or testing this.
> >>
> >> But if someone can pick this up and add the ACPI bits or can post a
> >> follow-up patch, then I'm all for it :)
> > 
> > I can post a follow-up patch.
> > 
> >>
> >>>
> >>> should do it.
> >>>
> >>> Also, having the guest reduce the number of interrupts sounds good. Can
> >>> you point me to something to read about how/why a guest may choose to do
> >>> that, and what the trade-offs are?
> >>>
> >> Not really, but you can ask Marc.
> > 
> > OK, CCing him. One thing I see is that without this change we're
> > currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> > it's not true. Having that set may disable the oneshot capabilityj
> > necessary to switch to nohz mode? I'll just stop there with my
> > speculation though, so Marc won't have to correct too much...
> 
> You're spot on. See 82a5619 in the kernel tree. When I did a similar
> change in kvmtool, I saw a massive reduction in the number of timer
> interrupts injected (specially when the number of vcpu is relatively high).
> 
> This also have interesting benefits when running on a model, where
> you're trying to squeeze the last bits of "performance" from the monster...
> 
> Thanks,

Thanks Marc! Christoffer, I'll create the ACPI patch, and do some pre-/post-
tracing to confirm the happy reduction in interrupts :-)

Also, as for this patch,

Reviewed-by: Andrew Jones 

drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Peter Maydell
On 19 January 2016 at 11:49, Christoffer Dall
 wrote:
> The virt board has an arch timer, which is always on.  Emit the
> "always-on" property to indicate to Linux that it can switch off the
> periodic timer and reduces the amount of interrupts injected into a
> guest.
>
> Signed-off-by: Christoffer Dall 



Applied to target-arm.next, thanks.

-- PMM



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-19 Thread Andrew Jones
On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> On 19/01/16 13:32, Andrew Jones wrote:
> > On Tue, Jan 19, 2016 at 01:43:41PM +0100, Christoffer Dall wrote:
> >> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> >>> On Tue, Jan 19, 2016 at 12:49:18PM +0100, Christoffer Dall wrote:
>  The virt board has an arch timer, which is always on.  Emit the
>  "always-on" property to indicate to Linux that it can switch off the
>  periodic timer and reduces the amount of interrupts injected into a
>  guest.
> 
>  Signed-off-by: Christoffer Dall 
>  ---
>   hw/arm/virt.c | 1 +
>   1 file changed, 1 insertion(+)
> 
>  diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>  index 05f9087..265fe9a 100644
>  --- a/hw/arm/virt.c
>  +++ b/hw/arm/virt.c
>  @@ -291,6 +291,7 @@ static void fdt_add_timer_nodes(const VirtBoardInfo 
>  *vbi, int gictype)
>   qemu_fdt_setprop_string(vbi->fdt, "/timer", "compatible",
>   "arm,armv7-timer");
>   }
>  +qemu_fdt_setprop(vbi->fdt, "/timer", "always-on", NULL, 0);
>   qemu_fdt_setprop_cells(vbi->fdt, "/timer", "interrupts",
>  GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_S_EL1_IRQ, 
>  irqflags,
>  GIC_FDT_IRQ_TYPE_PPI, ARCH_TIMER_NS_EL1_IRQ, 
>  irqflags,
>  -- 
>  2.1.2.330.g565301e.dirty
> 
> 
> >>>
> >>> Hi Christoffer,
> >>>
> >>> We should also patch the ACPI generation at the same time. I think
> >>> something like
> >>>
> >>>  - gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE;
> >>>  + gtdt->non_secure_el1_flags = ACPI_EDGE_SENSITIVE | ACPI_GTDT_ALWAYS_ON;
> >>
> >> I'm really not familiar enough with ACPI to be comfortable writing code
> >> for this or testing this.
> >>
> >> But if someone can pick this up and add the ACPI bits or can post a
> >> follow-up patch, then I'm all for it :)
> > 
> > I can post a follow-up patch.
> > 
> >>
> >>>
> >>> should do it.
> >>>
> >>> Also, having the guest reduce the number of interrupts sounds good. Can
> >>> you point me to something to read about how/why a guest may choose to do
> >>> that, and what the trade-offs are?
> >>>
> >> Not really, but you can ask Marc.
> > 
> > OK, CCing him. One thing I see is that without this change we're
> > currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> > it's not true. Having that set may disable the oneshot capabilityj
> > necessary to switch to nohz mode? I'll just stop there with my
> > speculation though, so Marc won't have to correct too much...
> 
> You're spot on. See 82a5619 in the kernel tree. When I did a similar
> change in kvmtool, I saw a massive reduction in the number of timer
> interrupts injected (specially when the number of vcpu is relatively high).
> 
> This also have interesting benefits when running on a model, where
> you're trying to squeeze the last bits of "performance" from the monster...
>

Hmm, I'm probably testing this wrong, but I don't see any difference in
the number of injected timer interrupts. My guest, which I boot with
UEFI, has 

CONFIG_ARM_ARCH_TIMER=y
CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
CONFIG_ARM_TIMER_SP804=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HZ_1000=y
CONFIG_HZ=1000

I've boot a guest using DT with and without this patch

---WITHOUT---

# ls /proc/device-tree/timer
compatible  interrupts  name
# cat /proc/interrupts  
   CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6 
  CPU7
  3:   6958   5766   5166   5187   5576   5129 4695 
  4398   GIC  27 Edge  arch_timer
# sleep 120 && cat /proc/interrupts  
   CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6 
  CPU7
  3:   7557   5986   5487   5265   6232   5868 5464 
  4438   GIC  27 Edge  arch_timer

---WITH---

# ls /proc/device-tree/timer
always-on  compatible  interrupts  name
# cat /proc/interrupts 
   CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6 
  CPU7
  3:   7005   6080   4996   5391   5165   5257 4930 
  4844   GIC  27 Edge  arch_timer
# sleep 120 && cat /proc/interrupts 
   CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6 
  CPU7
  3:   7523   6505   5264   6717   5273   5391 5526 
  4901   GIC  27 Edge  arch_timer



And kvm trace data has

---WITHOUT---
$ grep kvm_timer_update_irq trace.out | wc -l
94336
---WITH---
$ grep kvm_timer_update_irq trace.out | wc -l
95838


Any suggestions?

Thanks,
drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Andrew Jones
On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
> On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> > >> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> > > OK, CCing him. One thing I see is that without this change we're
> > > currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> > > it's not true. Having that set may disable the oneshot capabilityj
> > > necessary to switch to nohz mode? I'll just stop there with my
> > > speculation though, so Marc won't have to correct too much...
> > 
> > You're spot on. See 82a5619 in the kernel tree. When I did a similar
> > change in kvmtool, I saw a massive reduction in the number of timer
> > interrupts injected (specially when the number of vcpu is relatively high).
> > 
> > This also have interesting benefits when running on a model, where
> > you're trying to squeeze the last bits of "performance" from the monster...
> >
> 
> Hmm, I'm probably testing this wrong, but I don't see any difference in
> the number of injected timer interrupts. My guest, which I boot with
> UEFI, has 
> 
> CONFIG_ARM_ARCH_TIMER=y
> CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
> CONFIG_ARM_TIMER_SP804=y
> CONFIG_HIGH_RES_TIMERS=y
> CONFIG_TICK_ONESHOT=y
> CONFIG_NO_HZ_COMMON=y
> # CONFIG_HZ_PERIODIC is not set
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_NO_HZ=y
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
> 
> I've boot a guest using DT with and without this patch
> 
> ---WITHOUT---
> 
> # ls /proc/device-tree/timer
> compatible  interrupts  name
> # cat /proc/interrupts  
>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6   
> CPU7
>   3:   6958   5766   5166   5187   5576   5129 4695   
> 4398   GIC  27 Edge  arch_timer
> # sleep 120 && cat /proc/interrupts  
>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6   
> CPU7
>   3:   7557   5986   5487   5265   6232   5868 5464   
> 4438   GIC  27 Edge  arch_timer
> 
> ---WITH---
> 
> # ls /proc/device-tree/timer
> always-on  compatible  interrupts  name
> # cat /proc/interrupts 
>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6   
> CPU7
>   3:   7005   6080   4996   5391   5165   5257 4930   
> 4844   GIC  27 Edge  arch_timer
> # sleep 120 && cat /proc/interrupts 
>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6   
> CPU7
>   3:   7523   6505   5264   6717   5273   5391 5526   
> 4901   GIC  27 Edge  arch_timer
> 
> 
> 
> And kvm trace data has
> 
> ---WITHOUT---
> $ grep kvm_timer_update_irq trace.out | wc -l
> 94336
> ---WITH---
> $ grep kvm_timer_update_irq trace.out | wc -l
> 95838
> 
>

Must be how I'm looking, because I just tried kvmtool with/without
Marc's patch that adds always-on, but don't see any reduction of
interrupts there either. I used a defconfig guest kernel. Also,
not that I think it should matter, but my host kernel is 4.4-rc4
based.

I'd like to be able to see a difference with/without this always-on
patch, not because I don't think we should take it anyway, but because
I need a test case for the ACPI counterpart.

Thanks,
drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Marc Zyngier
On 20/01/16 14:01, Andrew Jones wrote:
> On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
>> On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
 OK, CCing him. One thing I see is that without this change we're
 currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
 it's not true. Having that set may disable the oneshot capabilityj
 necessary to switch to nohz mode? I'll just stop there with my
 speculation though, so Marc won't have to correct too much...
>>>
>>> You're spot on. See 82a5619 in the kernel tree. When I did a similar
>>> change in kvmtool, I saw a massive reduction in the number of timer
>>> interrupts injected (specially when the number of vcpu is relatively high).
>>>
>>> This also have interesting benefits when running on a model, where
>>> you're trying to squeeze the last bits of "performance" from the monster...
>>>
>>
>> Hmm, I'm probably testing this wrong, but I don't see any difference in
>> the number of injected timer interrupts. My guest, which I boot with
>> UEFI, has 
>>
>> CONFIG_ARM_ARCH_TIMER=y
>> CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
>> CONFIG_ARM_TIMER_SP804=y
>> CONFIG_HIGH_RES_TIMERS=y
>> CONFIG_TICK_ONESHOT=y
>> CONFIG_NO_HZ_COMMON=y
>> # CONFIG_HZ_PERIODIC is not set
>> CONFIG_NO_HZ_IDLE=y
>> # CONFIG_NO_HZ_FULL is not set
>> CONFIG_NO_HZ=y
>> CONFIG_HZ_1000=y
>> CONFIG_HZ=1000
>>
>> I've boot a guest using DT with and without this patch
>>
>> ---WITHOUT---
>>
>> # ls /proc/device-tree/timer
>> compatible  interrupts  name
>> # cat /proc/interrupts  
>>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6  
>>  CPU7
>>   3:   6958   5766   5166   5187   5576   5129 4695  
>>  4398   GIC  27 Edge  arch_timer
>> # sleep 120 && cat /proc/interrupts  
>>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6  
>>  CPU7
>>   3:   7557   5986   5487   5265   6232   5868 5464  
>>  4438   GIC  27 Edge  arch_timer
>>
>> ---WITH---
>>
>> # ls /proc/device-tree/timer
>> always-on  compatible  interrupts  name
>> # cat /proc/interrupts 
>>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6  
>>  CPU7
>>   3:   7005   6080   4996   5391   5165   5257 4930  
>>  4844   GIC  27 Edge  arch_timer
>> # sleep 120 && cat /proc/interrupts 
>>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 CPU6  
>>  CPU7
>>   3:   7523   6505   5264   6717   5273   5391 5526  
>>  4901   GIC  27 Edge  arch_timer
>>
>>
>>
>> And kvm trace data has
>>
>> ---WITHOUT---
>> $ grep kvm_timer_update_irq trace.out | wc -l
>> 94336
>> ---WITH---
>> $ grep kvm_timer_update_irq trace.out | wc -l
>> 95838
>>
>>
> 
> Must be how I'm looking, because I just tried kvmtool with/without
> Marc's patch that adds always-on, but don't see any reduction of
> interrupts there either. I used a defconfig guest kernel. Also,
> not that I think it should matter, but my host kernel is 4.4-rc4
> based.
> 
> I'd like to be able to see a difference with/without this always-on
> patch, not because I don't think we should take it anyway, but because
> I need a test case for the ACPI counterpart.

I just run a couple of quick tests, measuring interrupt rate (vmstat 1)
on the host, with one VM (2 vcpus) idling, and I'm seeing the following
thing:

Without "always-on": ~380 interrupts per second
With "always-on": ~40 interrupts per second

This is with kvmtool, 32bit host (but none of that is arch specific anyway).

M.
-- 
Jazz is not dead. It just smells funny...



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Andrew Jones
On Wed, Jan 20, 2016 at 02:28:05PM +, Marc Zyngier wrote:
> On 20/01/16 14:01, Andrew Jones wrote:
> > On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
> >> On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> > On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
>  OK, CCing him. One thing I see is that without this change we're
>  currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
>  it's not true. Having that set may disable the oneshot capabilityj
>  necessary to switch to nohz mode? I'll just stop there with my
>  speculation though, so Marc won't have to correct too much...
> >>>
> >>> You're spot on. See 82a5619 in the kernel tree. When I did a similar
> >>> change in kvmtool, I saw a massive reduction in the number of timer
> >>> interrupts injected (specially when the number of vcpu is relatively 
> >>> high).
> >>>
> >>> This also have interesting benefits when running on a model, where
> >>> you're trying to squeeze the last bits of "performance" from the 
> >>> monster...
> >>>
> >>
> >> Hmm, I'm probably testing this wrong, but I don't see any difference in
> >> the number of injected timer interrupts. My guest, which I boot with
> >> UEFI, has 
> >>
> >> CONFIG_ARM_ARCH_TIMER=y
> >> CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
> >> CONFIG_ARM_TIMER_SP804=y
> >> CONFIG_HIGH_RES_TIMERS=y
> >> CONFIG_TICK_ONESHOT=y
> >> CONFIG_NO_HZ_COMMON=y
> >> # CONFIG_HZ_PERIODIC is not set
> >> CONFIG_NO_HZ_IDLE=y
> >> # CONFIG_NO_HZ_FULL is not set
> >> CONFIG_NO_HZ=y
> >> CONFIG_HZ_1000=y
> >> CONFIG_HZ=1000
> >>
> >> I've boot a guest using DT with and without this patch
> >>
> >> ---WITHOUT---
> >>
> >> # ls /proc/device-tree/timer
> >> compatible  interrupts  name
> >> # cat /proc/interrupts  
> >>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
> >> CPU6   CPU7
> >>   3:   6958   5766   5166   5187   5576   5129 
> >> 4695   4398   GIC  27 Edge  arch_timer
> >> # sleep 120 && cat /proc/interrupts  
> >>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
> >> CPU6   CPU7
> >>   3:   7557   5986   5487   5265   6232   5868 
> >> 5464   4438   GIC  27 Edge  arch_timer
> >>
> >> ---WITH---
> >>
> >> # ls /proc/device-tree/timer
> >> always-on  compatible  interrupts  name
> >> # cat /proc/interrupts 
> >>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
> >> CPU6   CPU7
> >>   3:   7005   6080   4996   5391   5165   5257 
> >> 4930   4844   GIC  27 Edge  arch_timer
> >> # sleep 120 && cat /proc/interrupts 
> >>CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
> >> CPU6   CPU7
> >>   3:   7523   6505   5264   6717   5273   5391 
> >> 5526   4901   GIC  27 Edge  arch_timer
> >>
> >>
> >>
> >> And kvm trace data has
> >>
> >> ---WITHOUT---
> >> $ grep kvm_timer_update_irq trace.out | wc -l
> >> 94336
> >> ---WITH---
> >> $ grep kvm_timer_update_irq trace.out | wc -l
> >> 95838
> >>
> >>
> > 
> > Must be how I'm looking, because I just tried kvmtool with/without
> > Marc's patch that adds always-on, but don't see any reduction of
> > interrupts there either. I used a defconfig guest kernel. Also,
> > not that I think it should matter, but my host kernel is 4.4-rc4
> > based.
> > 
> > I'd like to be able to see a difference with/without this always-on
> > patch, not because I don't think we should take it anyway, but because
> > I need a test case for the ACPI counterpart.
> 
> I just run a couple of quick tests, measuring interrupt rate (vmstat 1)
> on the host, with one VM (2 vcpus) idling, and I'm seeing the following
> thing:
> 
> Without "always-on": ~380 interrupts per second
> With "always-on": ~40 interrupts per second
> 
> This is with kvmtool, 32bit host (but none of that is arch specific anyway).
>

For me (64bit host, one VM (8 vcpus)) of 100 'vmstat 1' samples I have the
following.

Without "always-on": mean=56.370 sd=33.404 min=1 max=244
With "always-on":mean=51.580 sd=33.361 min=1 max=273

I'm also using kvmtool, and my guest is idle.

So a difference between 32 and 64bit hosts? Again, my guest config is
now just a defconfig. My host config is not, but I'm not sure what
options to look for other than what I wrote above, which are the same
for my host.

Thanks,
drew



Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Marc Zyngier
On 20/01/16 15:06, Andrew Jones wrote:
> On Wed, Jan 20, 2016 at 02:28:05PM +, Marc Zyngier wrote:
>> On 20/01/16 14:01, Andrew Jones wrote:
>>> On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
 On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
>>> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
>> OK, CCing him. One thing I see is that without this change we're
>> currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
>> it's not true. Having that set may disable the oneshot capabilityj
>> necessary to switch to nohz mode? I'll just stop there with my
>> speculation though, so Marc won't have to correct too much...
>
> You're spot on. See 82a5619 in the kernel tree. When I did a similar
> change in kvmtool, I saw a massive reduction in the number of timer
> interrupts injected (specially when the number of vcpu is relatively 
> high).
>
> This also have interesting benefits when running on a model, where
> you're trying to squeeze the last bits of "performance" from the 
> monster...
>

 Hmm, I'm probably testing this wrong, but I don't see any difference in
 the number of injected timer interrupts. My guest, which I boot with
 UEFI, has 

 CONFIG_ARM_ARCH_TIMER=y
 CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
 CONFIG_ARM_TIMER_SP804=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_TICK_ONESHOT=y
 CONFIG_NO_HZ_COMMON=y
 # CONFIG_HZ_PERIODIC is not set
 CONFIG_NO_HZ_IDLE=y
 # CONFIG_NO_HZ_FULL is not set
 CONFIG_NO_HZ=y
 CONFIG_HZ_1000=y
 CONFIG_HZ=1000

 I've boot a guest using DT with and without this patch

 ---WITHOUT---

 # ls /proc/device-tree/timer
 compatible  interrupts  name
 # cat /proc/interrupts  
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
 CPU6   CPU7
   3:   6958   5766   5166   5187   5576   5129 
 4695   4398   GIC  27 Edge  arch_timer
 # sleep 120 && cat /proc/interrupts  
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
 CPU6   CPU7
   3:   7557   5986   5487   5265   6232   5868 
 5464   4438   GIC  27 Edge  arch_timer

 ---WITH---

 # ls /proc/device-tree/timer
 always-on  compatible  interrupts  name
 # cat /proc/interrupts 
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
 CPU6   CPU7
   3:   7005   6080   4996   5391   5165   5257 
 4930   4844   GIC  27 Edge  arch_timer
 # sleep 120 && cat /proc/interrupts 
CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
 CPU6   CPU7
   3:   7523   6505   5264   6717   5273   5391 
 5526   4901   GIC  27 Edge  arch_timer



 And kvm trace data has

 ---WITHOUT---
 $ grep kvm_timer_update_irq trace.out | wc -l
 94336
 ---WITH---
 $ grep kvm_timer_update_irq trace.out | wc -l
 95838


>>>
>>> Must be how I'm looking, because I just tried kvmtool with/without
>>> Marc's patch that adds always-on, but don't see any reduction of
>>> interrupts there either. I used a defconfig guest kernel. Also,
>>> not that I think it should matter, but my host kernel is 4.4-rc4
>>> based.
>>>
>>> I'd like to be able to see a difference with/without this always-on
>>> patch, not because I don't think we should take it anyway, but because
>>> I need a test case for the ACPI counterpart.
>>
>> I just run a couple of quick tests, measuring interrupt rate (vmstat 1)
>> on the host, with one VM (2 vcpus) idling, and I'm seeing the following
>> thing:
>>
>> Without "always-on": ~380 interrupts per second
>> With "always-on": ~40 interrupts per second
>>
>> This is with kvmtool, 32bit host (but none of that is arch specific anyway).
>>
> 
> For me (64bit host, one VM (8 vcpus)) of 100 'vmstat 1' samples I have the
> following.
> 
> Without "always-on": mean=56.370 sd=33.404 min=1 max=244
> With "always-on":mean=51.580 sd=33.361 min=1 max=273
> 
> I'm also using kvmtool, and my guest is idle.
> 
> So a difference between 32 and 64bit hosts? Again, my guest config is
> now just a defconfig. My host config is not, but I'm not sure what
> options to look for other than what I wrote above, which are the same
> for my host.

Just tried on Seattle with a 64bit guest, and there is hardly any
difference indeed. Both host and guest are "mostly" defconfig as well.
So there is a kernel configuration difference.

Running my 32bit guest on a 64bit host definitely shows a massive
difference (with 8 vcpus):

Without "always-on": ~1200 interrupts per second
With "always-on": ~50 interrupts per second

[Head scratching, poking Mark]

Right, I now know what 

Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Andrew Jones
On Wed, Jan 20, 2016 at 04:20:03PM +, Marc Zyngier wrote:
> On 20/01/16 15:06, Andrew Jones wrote:
> > On Wed, Jan 20, 2016 at 02:28:05PM +, Marc Zyngier wrote:
> >> On 20/01/16 14:01, Andrew Jones wrote:
> >>> On Tue, Jan 19, 2016 at 07:48:14PM +0100, Andrew Jones wrote:
>  On Tue, Jan 19, 2016 at 01:43:07PM +, Marc Zyngier wrote:
> >>> On Tue, Jan 19, 2016 at 01:37:16PM +0100, Andrew Jones wrote:
> >> OK, CCing him. One thing I see is that without this change we're
> >> currently setting the clock feature CLOCK_EVT_FEAT_C3STOP, even though
> >> it's not true. Having that set may disable the oneshot capabilityj
> >> necessary to switch to nohz mode? I'll just stop there with my
> >> speculation though, so Marc won't have to correct too much...
> >
> > You're spot on. See 82a5619 in the kernel tree. When I did a similar
> > change in kvmtool, I saw a massive reduction in the number of timer
> > interrupts injected (specially when the number of vcpu is relatively 
> > high).
> >
> > This also have interesting benefits when running on a model, where
> > you're trying to squeeze the last bits of "performance" from the 
> > monster...
> >
> 
>  Hmm, I'm probably testing this wrong, but I don't see any difference in
>  the number of injected timer interrupts. My guest, which I boot with
>  UEFI, has 
> 
>  CONFIG_ARM_ARCH_TIMER=y
>  CONFIG_ARM_ARCH_TIMER_EVTSTREAM=y
>  CONFIG_ARM_TIMER_SP804=y
>  CONFIG_HIGH_RES_TIMERS=y
>  CONFIG_TICK_ONESHOT=y
>  CONFIG_NO_HZ_COMMON=y
>  # CONFIG_HZ_PERIODIC is not set
>  CONFIG_NO_HZ_IDLE=y
>  # CONFIG_NO_HZ_FULL is not set
>  CONFIG_NO_HZ=y
>  CONFIG_HZ_1000=y
>  CONFIG_HZ=1000
> 
>  I've boot a guest using DT with and without this patch
> 
>  ---WITHOUT---
> 
>  # ls /proc/device-tree/timer
>  compatible  interrupts  name
>  # cat /proc/interrupts  
> CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
>  CPU6   CPU7
>    3:   6958   5766   5166   5187   5576   5129 
>  4695   4398   GIC  27 Edge  arch_timer
>  # sleep 120 && cat /proc/interrupts  
> CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
>  CPU6   CPU7
>    3:   7557   5986   5487   5265   6232   5868 
>  5464   4438   GIC  27 Edge  arch_timer
> 
>  ---WITH---
> 
>  # ls /proc/device-tree/timer
>  always-on  compatible  interrupts  name
>  # cat /proc/interrupts 
> CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
>  CPU6   CPU7
>    3:   7005   6080   4996   5391   5165   5257 
>  4930   4844   GIC  27 Edge  arch_timer
>  # sleep 120 && cat /proc/interrupts 
> CPU0   CPU1   CPU2   CPU3   CPU4   CPU5 
>  CPU6   CPU7
>    3:   7523   6505   5264   6717   5273   5391 
>  5526   4901   GIC  27 Edge  arch_timer
> 
> 
> 
>  And kvm trace data has
> 
>  ---WITHOUT---
>  $ grep kvm_timer_update_irq trace.out | wc -l
>  94336
>  ---WITH---
>  $ grep kvm_timer_update_irq trace.out | wc -l
>  95838
> 
> 
> >>>
> >>> Must be how I'm looking, because I just tried kvmtool with/without
> >>> Marc's patch that adds always-on, but don't see any reduction of
> >>> interrupts there either. I used a defconfig guest kernel. Also,
> >>> not that I think it should matter, but my host kernel is 4.4-rc4
> >>> based.
> >>>
> >>> I'd like to be able to see a difference with/without this always-on
> >>> patch, not because I don't think we should take it anyway, but because
> >>> I need a test case for the ACPI counterpart.
> >>
> >> I just run a couple of quick tests, measuring interrupt rate (vmstat 1)
> >> on the host, with one VM (2 vcpus) idling, and I'm seeing the following
> >> thing:
> >>
> >> Without "always-on": ~380 interrupts per second
> >> With "always-on": ~40 interrupts per second
> >>
> >> This is with kvmtool, 32bit host (but none of that is arch specific 
> >> anyway).
> >>
> > 
> > For me (64bit host, one VM (8 vcpus)) of 100 'vmstat 1' samples I have the
> > following.
> > 
> > Without "always-on": mean=56.370 sd=33.404 min=1 max=244
> > With "always-on":mean=51.580 sd=33.361 min=1 max=273
> > 
> > I'm also using kvmtool, and my guest is idle.
> > 
> > So a difference between 32 and 64bit hosts? Again, my guest config is
> > now just a defconfig. My host config is not, but I'm not sure what
> > options to look for other than what I wrote above, which are the same
> > for my host.
> 
> Just tried on Seattle with a 64bit guest, and there is hardly any
> difference indeed. Both host and guest are "mostly" defc

Re: [Qemu-devel] [PATCH] hw/arm/virt: Add always-on property to the virt board timer

2016-01-20 Thread Marc Zyngier
On 20/01/16 16:47, Andrew Jones wrote:
> On Wed, Jan 20, 2016 at 04:20:03PM +, Marc Zyngier wrote:
>> Just tried on Seattle with a 64bit guest, and there is hardly any
>> difference indeed. Both host and guest are "mostly" defconfig as well.
>> So there is a kernel configuration difference.
>>
>> Running my 32bit guest on a 64bit host definitely shows a massive
>> difference (with 8 vcpus):
>>
>> Without "always-on": ~1200 interrupts per second
>> With "always-on": ~50 interrupts per second
>>
>> [Head scratching, poking Mark]
>>
>> Right, I now know what is going on: The arm64 kernel uses
>> tick_setup_hrtimer_broadcast() so that it can still use the arch timer
>> as a broadcast timer (forcing one CPU to remain on), while the 32bit
>> kernel relies on the presence of a backup timer (sp804 anyone?) or the
>> guarantee that the timer cannot go away (always-on).
>>
>> This is probably why I'm seeing such a gain with a 32bit guest, and none
>> with a 64bit guest (the kernel already does the right thing). As to why
>> there is such a difference between the two architectures, this is a
>> story for another day...
>>
> 
> Thanks Marc! I just confirmed with an AArch32 guest using QEMU, and the
> patch we've hijacked for this thread. Without the patch I get a megaton
> of interrupts (~14000/s). With the patch, after letting the guest chill
> for a while, I'm getting ~150/s (8 vcpus).
> 
> I guess I don't have a test case for the ACPI code though. afaik we only
> have UEFI for AArch64 guests, and we don't have ACPI boot without UEFI.
> Or maybe I can hack time_init to remove the tick_setup_hrtimer_broadcast
> call?

Yeah, that should work.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...