On Thu, Sep 17, 2020 at 12:53:02PM +0100, Marc Zyngier wrote: > On 2020-09-17 12:42, Leo Yan wrote: > > On Thu, Sep 17, 2020 at 11:21:15AM +0100, Marc Zyngier wrote: > > > > [...] > > > > > > > > +const char *vcpu_id_str = "id"; > > > > > > > > > > On Arm64, ftrace tracepoint "kvm_entry" doesn't contain the field "id" > > > > > or field "vcpu_id", thus it always reads out the "id" is 0 and it is > > > > > recorded into Perf's structure vcpu_event_record::vcpu_id and assigned > > > > > to perf thread's private data "thread::private". > > > > > > > > > > With current code, it will not mess up different vcpus' samples > > > > > because > > > > > now the samples are analyzed based on thread context, but since all > > > > > threads' "vcpu_id" is zero, thus all samples are accounted for > > > > > "vcpu_id=0" and cannot print out correct result with option "--vcpu": > > > > > > > > > > > > > > > $ perf kvm stat report --vcpu 4 > > > > > > > > > > Analyze events for all VMs, VCPU 4: > > > > > > > > > > VM-EXIT Samples Samples% Time% Min Time > > > > > Max Time Avg time > > > > > > > > > > Total Samples:0, Total events handled time:0.00us. > > > > > > > > > > > > > > > This is an issue I observed, if we want to support option "--vcpu", > > > > > seems we need to change ftrace event for "kvm_entry", but this will > > > > > break ABI. > > > > > > > > > > Essentially, this issue is caused by different archs using different > > > > > format for ftrace event "kvm_entry", on x86 it contains feild > > > > > "vcpu_id" but arm64 only just records "vcpu_pc". > > > > > > > > > > @Marc, @Will, do you have any suggestion for this? Do you think it's > > > > > feasible to add a new field "vcpu_id" into the tracepoint "kvm_entry" > > > > > for Arm64's version? > > > > > > The question really is: how will you handle the ABI breackage? > > > I don't see a good solution for it, apart from having a *separate* > > > tracepoint that collects all the information you need. And even that > > > is > > > really ugly. > > > > I searched a bit and found in practice it's not impossible to add new > > parameters for existed tracepoint, e.g. [1][2] are two examples to add > > new parameters for existed tracepoints and have been merged into > > mainline kernel. IIUC, we keep the old parameters for a tracepoint > > so this can avoid to break ABI if any apps have used this tracepoint, > > and adding a new parameter for the tracepoint should be safe. > > > > If you agree with this, I'd like to suggest to apply below change. > > How about you think for this? > > > > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index 46dc3d75cf13..d9f9b8e1df77 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -736,7 +736,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu) > > /************************************************************** > > * Enter the guest > > */ > > - trace_kvm_entry(*vcpu_pc(vcpu)); > > + trace_kvm_entry(vcpu->vcpu_id, *vcpu_pc(vcpu)); > > guest_enter_irqoff(); > > > > ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu); > > diff --git a/arch/arm64/kvm/trace_arm.h b/arch/arm64/kvm/trace_arm.h > > index 4691053c5ee4..e1d3e7a67e8b 100644 > > --- a/arch/arm64/kvm/trace_arm.h > > +++ b/arch/arm64/kvm/trace_arm.h > > @@ -12,18 +12,20 @@ > > * Tracepoints for entry/exit to guest > > */ > > TRACE_EVENT(kvm_entry, > > - TP_PROTO(unsigned long vcpu_pc), > > - TP_ARGS(vcpu_pc), > > + TP_PROTO(unsigned int vcpu_id, unsigned long vcpu_pc), > > + TP_ARGS(vcpu_id, vcpu_pc), > > > > TP_STRUCT__entry( > > + __field( unsigned int, vcpu_id ) > > __field( unsigned long, vcpu_pc ) > > ), > > > > TP_fast_assign( > > + __entry->vcpu_id = vcpu_id; > > __entry->vcpu_pc = vcpu_pc; > > ), > > > > - TP_printk("PC: 0x%08lx", __entry->vcpu_pc) > > + TP_printk("vcpu: %u, PC: 0x%08lx", __entry->vcpu_id, __entry->vcpu_pc) > > ); > > > > TRACE_EVENT(kvm_exit, > > > > How is that not breaking the ABI? You are adding a new field, and anything > that expect to read 'PC: 0x.....' at the beginning of the line now fails. > The examples you give are also blatant ABI breakages. because it is done > somewhere else doesn't make it valid. > > Anything that can be parsed by userspace is ABI. If you don't believe me, > please read the entertaining discussion we had when we tried to drop > Bogomips from /proc/cpuinfo.
The discussion thread was too long [1] to read all replies :) ... but I understand we should be very careful for ABI breakage. > So unless you get me Linus' stamp of approval for this, it's not happening. > Feel free to add a *new* tracepoint instead. Okay, thanks for the info and suggestions. Thanks, Leo [1] https://lkml.org/lkml/2015/1/4/132