Hi Stefano,

On 24/05/18 00:47, Stefano Stabellini wrote:
On Tue, 22 May 2018, Julien Grall wrote:
Using current is fairly expensive, so save up into a variable.

Signed-off-by: Julien Grall <julien.gr...@arm.com>

Good idea. I am curious to know actually how much this patch would save
but I am not going to ask you run the tests.

I haven't benchmark it but looked at the resulting assembly code. This reduces by about ~20% the number of instructions in the function.

AFAIU, this is because of the way per-cpu access have been implemented. The per-cpu offset is stored in a system register (TPIDR_EL2), all the read to it cannot be optimized (access using volatile).

So every direct use of "current" will require at least a system register access and then a load from memory.

Cheers,


Reviewed-by: Stefano Stabellini <sstabell...@kernel.org>

---
  xen/arch/arm/traps.c | 14 ++++++++------
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index 020b0b8eef..b1546f6907 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2024,8 +2024,10 @@ static void enter_hypervisor_head(struct cpu_user_regs 
*regs)
  {
      if ( guest_mode(regs) )
      {
+        struct vcpu *v = current;
+
          /* If the guest has disabled the workaround, bring it back on. */
-        if ( needs_ssbd_flip(current) )
+        if ( needs_ssbd_flip(v) )
              arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2_FID, 1, NULL);
/*
@@ -2034,8 +2036,8 @@ static void enter_hypervisor_head(struct cpu_user_regs 
*regs)
           * but the crucial bit is "On taking a vSError interrupt, HCR_EL2.VSE
           * (alias of HCR.VA) is cleared to 0."
           */
-        if ( current->arch.hcr_el2 & HCR_VA )
-            current->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
+        if ( v->arch.hcr_el2 & HCR_VA )
+            v->arch.hcr_el2 = READ_SYSREG(HCR_EL2);
#ifdef CONFIG_NEW_VGIC
          /*
@@ -2045,11 +2047,11 @@ static void enter_hypervisor_head(struct cpu_user_regs 
*regs)
           * TODO: Investigate whether this is necessary to do on every
           * trap and how it can be optimised.
           */
-        vtimer_update_irqs(current);
-        vcpu_update_evtchn_irq(current);
+        vtimer_update_irqs(v);
+        vcpu_update_evtchn_irq(v);
  #endif
- vgic_sync_from_lrs(current);
+        vgic_sync_from_lrs(v);
      }
  }
--
2.11.0


--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Reply via email to