Re: [PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-09-21 Thread Sean Christopherson
On Wed, Sep 15, 2021, Zhu, Lingshan wrote:
> 
> 
> On 8/27/2021 3:59 AM, Sean Christopherson wrote:
> > TL;DR: Please don't merge this patch, it's broken and is also built on a
> > shoddy foundation that I would like to fix.
> Hi Sean, Peter, Paolo,
> 
> I will send out a V11 which drops this patch since it's buggy, and Sean is
> working on fixing this.
> Does this sound good?

Works for me, thanks!



Re: [PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-09-14 Thread Zhu, Lingshan




On 8/27/2021 3:59 AM, Sean Christopherson wrote:
> TL;DR: Please don't merge this patch, it's broken and is also built on a shoddy
> foundation that I would like to fix.

Hi Sean, Peter, Paolo,

I will send out a V11 which drops this patch since it's buggy, and Sean
is working on fixing this.

Does this sound good?

Thanks,
Zhu Lingshan



Re: [PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-08-27 Thread Sean Christopherson
On Fri, Aug 06, 2021, Zhu Lingshan wrote:
> @@ -2944,18 +2966,21 @@ static unsigned long code_segment_base(struct pt_regs *regs)
>  
>  unsigned long perf_instruction_pointer(struct pt_regs *regs)
>  {
> -	if (perf_guest_cbs && perf_guest_cbs->is_in_guest())
> -		return perf_guest_cbs->get_guest_ip();
> +	unsigned long ip = static_call(x86_guest_get_ip)();
> +
> +	if (likely(!ip))

Pivoting on ip==0 isn't correct, it's perfectly legal for a guest to execute
from %rip=0.  Unless there's some static_call() magic that supports this with a
default function:

	if (unlikely(!static_call(x86_guest_get_ip)(&ip)))
		ip = regs->ip + code_segment_base(regs);

	return ip;

The easiest thing is to keep the existing structure:

	if (unlikely(static_call(x86_guest_state)()))
		return static_call(x86_guest_get_ip)();

	return regs->ip + code_segment_base(regs);

It's an extra call for PMIs in guest, but I don't think any of the KVM folks care
_that_ much about the performance in this case.
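
To illustrate the ip==0 point, here is a minimal userspace sketch. It is not
kernel code: struct guest_ctx and the helper names are hypothetical and exist
only to contrast a "0 means no guest" sentinel with an out-parameter that
reports validity separately from the value.

```c
#include <stdbool.h>

/* Hypothetical guest context; both fields are illustrative only. */
struct guest_ctx {
	bool in_guest;
	unsigned long ip;
};

/* Sentinel style: 0 doubles as "not in guest", so a guest at ip 0 is lost. */
static unsigned long ip_sentinel(const struct guest_ctx *g, unsigned long host_ip)
{
	unsigned long ip = g->in_guest ? g->ip : 0;

	return ip ? ip : host_ip;
}

/* Out-parameter style: the return value says whether *ip was written. */
static bool guest_get_ip(const struct guest_ctx *g, unsigned long *ip)
{
	if (!g->in_guest)
		return false;

	*ip = g->ip;
	return true;
}

/* Fall back to the host ip only when the guest did not supply one. */
static unsigned long ip_outparam(const struct guest_ctx *g, unsigned long host_ip)
{
	unsigned long ip;

	if (!guest_get_ip(g, &ip))
		ip = host_ip;

	return ip;
}
```

With a guest that really is executing at address 0, ip_sentinel() hands back
the host ip, while ip_outparam() reports 0 as intended.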

> +		ip = regs->ip + code_segment_base(regs);
>  
> -	return regs->ip + code_segment_base(regs);
> +	return ip;
>  }



Re: [PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-08-26 Thread Like Xu

On 27/8/2021 3:59 am, Sean Christopherson wrote:
> TL;DR: Please don't merge this patch, it's broken and is also built on a shoddy
> foundation that I would like to fix.


Obviously, this patch is not closely related to enabling the guest PEBS feature,
and we can certainly move this issue to another discussion thread [1].

[1] https://lore.kernel.org/kvm/20210827005718.585190-1-sea...@google.com/




Re: [PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-08-26 Thread Sean Christopherson
TL;DR: Please don't merge this patch, it's broken and is also built on a shoddy
   foundation that I would like to fix.

On Fri, Aug 06, 2021, Zhu Lingshan wrote:
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 464917096e73..e466fc8176e1 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6489,9 +6489,18 @@ static void perf_pending_event(struct irq_work *entry)
>   */
>  struct perf_guest_info_callbacks *perf_guest_cbs;
>  
> +/* explicitly use __weak to fix duplicate symbol error */
> +void __weak arch_perf_update_guest_cbs(void)
> +{
> +}
> +
>  int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
>  {
> +	if (WARN_ON_ONCE(perf_guest_cbs))
> +		return -EBUSY;
> +
>  	perf_guest_cbs = cbs;
> +	arch_perf_update_guest_cbs();

This is horribly broken, it fails to clean up the static calls when KVM unregisters
the callbacks, which happens when the vendor module, e.g. kvm_intel, is unloaded.
The explosion doesn't happen until 'kvm' is unloaded because the functions are
implemented in 'kvm', i.e. the use-after-free is deferred a bit.

  BUG: unable to handle page fault for address: a011bb90
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0010) - not-present page
  PGD 6211067 P4D 6211067 PUD 6212063 PMD 102b99067 PTE 0
  Oops: 0010 [#1] PREEMPT SMP
  CPU: 0 PID: 1047 Comm: rmmod Not tainted 5.14.0-rc2+ #460
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:0xa011bb90
  Code: Unable to access opcode bytes at RIP 0xa011bb66.
  Call Trace:
   
   ? perf_misc_flags+0xe/0x50
   ? perf_prepare_sample+0x53/0x6b0
   ? perf_event_output_forward+0x67/0x160
   ? kvm_clock_read+0x14/0x30
   ? kvm_sched_clock_read+0x5/0x10
   ? sched_clock_cpu+0xd/0xd0
   ? __perf_event_overflow+0x52/0xf0
   ? handle_pmi_common+0x1f2/0x2d0
   ? __flush_tlb_all+0x30/0x30
   ? intel_pmu_handle_irq+0xcf/0x410
   ? nmi_handle+0x5/0x260
   ? perf_event_nmi_handler+0x28/0x50
   ? nmi_handle+0xc7/0x260
   ? lock_release+0x2b0/0x2b0
   ? default_do_nmi+0x6b/0x170
   ? exc_nmi+0x103/0x130
   ? end_repeat_nmi+0x16/0x1f
   ? lock_release+0x2b0/0x2b0
   ? lock_release+0x2b0/0x2b0
   ? lock_release+0x2b0/0x2b0
   
  Modules linked in: irqbypass [last unloaded: kvm]
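
The missing cleanup can be modeled in userspace. The sketch below is an
illustration under stated assumptions, not the kernel implementation: a plain
function pointer (guest_state_slot) stands in for the static call, return0()
stands in for __static_call_return0(), and every other name is hypothetical.
The point is that unregistering has to re-point the slot at the stub, otherwise
the slot still targets code that is about to go away.

```c
#include <stddef.h>

struct guest_cbs {
	unsigned int (*state)(void);
};

/* Stand-in for the kernel's __static_call_return0() default target. */
static unsigned int return0(void)
{
	return 0;
}

/* Stand-in for a static call site: one slot that (un)register re-points. */
static unsigned int (*guest_state_slot)(void) = return0;
static struct guest_cbs *guest_cbs;

/* Point the slot at the live callback, or back at the stub if there is none. */
static void update_slot(void)
{
	if (guest_cbs && guest_cbs->state)
		guest_state_slot = guest_cbs->state;
	else
		guest_state_slot = return0;
}

static int register_cbs(struct guest_cbs *cbs)
{
	if (guest_cbs)
		return -1;	/* something is already registered */

	guest_cbs = cbs;
	update_slot();
	return 0;
}

static void unregister_cbs(void)
{
	guest_cbs = NULL;
	update_slot();	/* without this, the slot keeps the old target */
}

/* Pretend this function lives in a module that can be unloaded. */
static unsigned int module_state(void)
{
	return 1;
}

/* What a caller such as a PMI handler would invoke. */
static unsigned int call_guest_state(void)
{
	return guest_state_slot();
}
```

If unregister_cbs() skipped update_slot(), call_guest_state() would keep
jumping to module_state() after the "module" is gone, which is the userspace
analogue of the instruction fetch fault in the trace above.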

Even more fun, the existing perf_guest_cbs framework is also broken, though it's
much harder to get it to fail, and probably impossible to get it to fail without
some help.  The issue is that perf_guest_cbs is global, which means that it can
be nullified by KVM (during module unload) while the callbacks are being accessed
by a PMI handler on a different CPU.

The bug has escaped notice because all dereferences of perf_guest_cbs follow the
same "perf_guest_cbs && perf_guest_cbs->is_in_guest()" pattern, and AFAICT the
compiler never reloads perf_guest_cbs in this sequence.  The compiler does reload
perf_guest_cbs for any future dereferences, but the ->is_in_guest() guard all but
guarantees the PMI handler will win the race, e.g. to nullify perf_guest_cbs,
KVM has to completely exit the guest and tear down all VMs before it can be
unloaded.

But with some help, e.g. READ_ONCE(perf_guest_cbs), unloading kvm_intel can trigger
a NULL pointer dereference, e.g. this tweak

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 1eb45139fcc6..202e5ad97f82 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2954,7 +2954,7 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
 {
 	int misc = 0;
 
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+	if (READ_ONCE(perf_guest_cbs) && READ_ONCE(perf_guest_cbs)->is_in_guest()) {
 		if (perf_guest_cbs->is_user_mode())
 			misc |= PERF_RECORD_MISC_GUEST_USER;
 		else


while spamming module load/unload leads to:

  BUG: kernel NULL pointer dereference, address: 
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x) - not-present page
  PGD 0 P4D 0
  Oops:  [#1] PREEMPT SMP
  CPU: 6 PID: 1825 Comm: stress Not tainted 5.14.0-rc2+ #459
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:perf_misc_flags+0x1c/0x70
  Call Trace:
   perf_prepare_sample+0x53/0x6b0
   perf_event_output_forward+0x67/0x160
   __perf_event_overflow+0x52/0xf0
   handle_pmi_common+0x207/0x300
   intel_pmu_handle_irq+0xcf/0x410
   perf_event_nmi_handler+0x28/0x50
   nmi_handle+0xc7/0x260
   default_do_nmi+0x6b/0x170
   exc_nmi+0x103/0x130
   asm_exc_nmi+0x76/0xbf
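
For illustration only, the reload hazard goes away on the reader side if the
pointer is loaded exactly once and every later use goes through that local
copy. The sketch below is a userspace model using C11 atomics, not the fix
from any posted series; guest_cbs, set_guest_cbs() and the MISC_* values are
hypothetical stand-ins.

```c
#include <stdatomic.h>
#include <stddef.h>

struct guest_cbs {
	int (*is_in_guest)(void);
	int (*is_user_mode)(void);
};

/* Hypothetical userspace stand-in for the global perf_guest_cbs pointer. */
static _Atomic(struct guest_cbs *) guest_cbs;

/* Illustrative flag values, not taken from the thread. */
#define MISC_GUEST_KERNEL 4
#define MISC_GUEST_USER   5

/* Writer side: publish a new set of callbacks, or NULL to unregister. */
static void set_guest_cbs(struct guest_cbs *cbs)
{
	atomic_store_explicit(&guest_cbs, cbs, memory_order_release);
}

/*
 * Reader side: load the pointer once, then use only the local copy, so a
 * concurrent store of NULL cannot land between the check and a dereference.
 */
static int misc_flags_snapshot(void)
{
	struct guest_cbs *cbs =
		atomic_load_explicit(&guest_cbs, memory_order_acquire);

	if (!cbs || !cbs->is_in_guest())
		return 0;

	return cbs->is_user_mode() ? MISC_GUEST_USER : MISC_GUEST_KERNEL;
}

/* Trivial callbacks used to exercise the reader. */
static int always_yes(void)
{
	return 1;
}

static int always_no(void)
{
	return 0;
}
```

The snapshot only removes the NULL dereference. The writer would still have to
wait for in-flight readers before the callback code itself is freed, which
this sketch does not model.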


The good news is that I have a series that should fix both the existing NULL pointer
bug and mostly obviate the need for static calls.  The bad news is that my approach,
making perf_guest_cbs per-CPU, likely complicates turning these into static calls,
though I'm guessing it's still a solvable problem.

Tangentially related, IMO we should make architectures opt-in to getting
perf_guest

[PATCH V10 01/18] perf/core: Use static_call to optimize perf_guest_info_callbacks

2021-08-06 Thread Zhu Lingshan
From: Like Xu 

For "struct perf_guest_info_callbacks", the two fields "is_in_guest"
and "is_user_mode" are replaced with a new multiplexed member named
"state", and the "get_guest_ip" field is renamed to "get_ip".

For arm64, xen and kvm/x86, applying DEFINE_STATIC_CALL_RET0 could make
all that perf_guest_cbs stuff suck less. For arm, csky, nds32, and riscv,
only the corresponding renames are applied.
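
As a rough illustration of the multiplexed "state" member described above, the
sketch below folds the two old booleans into one bitmask and decodes it again.
The bit names and values (GUEST_ACTIVE, GUEST_USER) and the MISC_* constants
are assumptions made for this example; only PERF_GUEST_USER is named in the
diff that follows.

```c
/* Assumed bit layout for the multiplexed state; values are illustrative. */
#define GUEST_ACTIVE 0x01u	/* the sample was taken while in a guest */
#define GUEST_USER   0x02u	/* ...and the guest was in user mode */

/* Illustrative stand-ins for the PERF_RECORD_MISC_GUEST_* flags. */
#define MISC_GUEST_KERNEL 4
#define MISC_GUEST_USER   5

/* Fold the old is_in_guest()/is_user_mode() answers into one value. */
static unsigned int fold_state(int in_guest, int user_mode)
{
	if (!in_guest)
		return 0;

	return GUEST_ACTIVE | (user_mode ? GUEST_USER : 0u);
}

/* Decode the state the way perf_misc_flags() does in the diff below. */
static int misc_from_state(unsigned int state)
{
	if (!state)
		return 0;

	return (state & GUEST_USER) ? MISC_GUEST_USER : MISC_GUEST_KERNEL;
}
```

A state of 0 then naturally means "not in a guest", which is what lets a
return-0 default target serve as the "no callbacks registered" case.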

Cc: Will Deacon 
Cc: Marc Zyngier 
Cc: Guo Ren 
Cc: Nick Hu 
Cc: Paul Walmsley 
Cc: Boris Ostrovsky 
Cc: linux-arm-ker...@lists.infradead.org
Cc: kvm...@lists.cs.columbia.edu
Cc: linux-c...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: xen-devel@lists.xenproject.org
Suggested-by: Peter Zijlstra (Intel) 
Original-by: Peter Zijlstra (Intel) 
Signed-off-by: Like Xu 
Signed-off-by: Zhu Lingshan 
Reviewed-by: Boris Ostrovsky 
Acked-by: Peter Zijlstra (Intel) 
---
 arch/arm/kernel/perf_callchain.c   | 16 +++-
 arch/arm64/kernel/perf_callchain.c | 29 +-
 arch/arm64/kvm/perf.c              | 22 -
 arch/csky/kernel/perf_callchain.c  |  4 +--
 arch/nds32/kernel/perf_event_cpu.c | 16 +++-
 arch/riscv/kernel/perf_callchain.c |  4 +--
 arch/x86/events/core.c             | 39 --
 arch/x86/events/intel/core.c       |  7 +++---
 arch/x86/include/asm/kvm_host.h    |  2 +-
 arch/x86/kvm/pmu.c                 |  2 +-
 arch/x86/kvm/x86.c                 | 37 +++-
 arch/x86/xen/pmu.c                 | 33 ++---
 include/linux/perf_event.h         | 12 ++---
 kernel/events/core.c               |  9 +++
 14 files changed, 144 insertions(+), 88 deletions(-)

diff --git a/arch/arm/kernel/perf_callchain.c b/arch/arm/kernel/perf_callchain.c
index 3b69a76d341e..1ce30f86d6c7 100644
--- a/arch/arm/kernel/perf_callchain.c
+++ b/arch/arm/kernel/perf_callchain.c
@@ -64,7 +64,7 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
 {
 	struct frame_tail __user *tail;
 
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+	if (perf_guest_cbs && perf_guest_cbs->state()) {
 		/* We don't support guest os callchain now */
 		return;
 	}
@@ -100,7 +100,7 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
 {
 	struct stackframe fr;
 
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+	if (perf_guest_cbs && perf_guest_cbs->state()) {
 		/* We don't support guest os callchain now */
 		return;
 	}
@@ -111,8 +111,8 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
 
 unsigned long perf_instruction_pointer(struct pt_regs *regs)
 {
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest())
-		return perf_guest_cbs->get_guest_ip();
+	if (perf_guest_cbs && perf_guest_cbs->state())
+		return perf_guest_cbs->get_ip();
 
 	return instruction_pointer(regs);
 }
@@ -120,9 +120,13 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
 unsigned long perf_misc_flags(struct pt_regs *regs)
 {
 	int misc = 0;
+	unsigned int state = 0;
 
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
-		if (perf_guest_cbs->is_user_mode())
+	if (perf_guest_cbs)
+		state = perf_guest_cbs->state();
+
+	if (perf_guest_cbs && state) {
+		if (state & PERF_GUEST_USER)
 			misc |= PERF_RECORD_MISC_GUEST_USER;
 		else
 			misc |= PERF_RECORD_MISC_GUEST_KERNEL;
diff --git a/arch/arm64/kernel/perf_callchain.c b/arch/arm64/kernel/perf_callchain.c
index 4a72c2727309..1b344e23fd2f 100644
--- a/arch/arm64/kernel/perf_callchain.c
+++ b/arch/arm64/kernel/perf_callchain.c
@@ -5,6 +5,7 @@
  * Copyright (C) 2015 ARM Limited
  */
 #include 
+#include 
 #include 
 
 #include 
@@ -99,10 +100,25 @@ compat_user_backtrace(struct compat_frame_tail __user *tail,
 }
 #endif /* CONFIG_COMPAT */
 
+DEFINE_STATIC_CALL_RET0(arm64_guest_state, *(perf_guest_cbs->state));
+DEFINE_STATIC_CALL_RET0(arm64_guest_get_ip, *(perf_guest_cbs->get_ip));
+
+void arch_perf_update_guest_cbs(void)
+{
+	static_call_update(arm64_guest_state, (void *)&__static_call_return0);
+	static_call_update(arm64_guest_get_ip, (void *)&__static_call_return0);
+
+	if (perf_guest_cbs && perf_guest_cbs->state)
+		static_call_update(arm64_guest_state, perf_guest_cbs->state);
+
+	if (perf_guest_cbs && perf_guest_cbs->get_ip)
+		static_call_update(arm64_guest_get_ip, perf_guest_cbs->get_ip);
+}
+
 void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
 			 struct pt_regs *regs)
 {
-	if (perf_guest_cbs && perf_guest_cbs->is_in_guest()) {
+	if (static_call(arm64_guest_state)()) {
 		/* We don't support guest os ca