Re: Unable to use perf in VM
On 30/11/16 19:17, Wei Huang wrote:
>
>
> On 11/30/2016 07:37 AM, Marc Zyngier wrote:
>> On 30/11/16 11:48, Marc Zyngier wrote:
>>> + Shannon
>>>
>>> On 29/11/16 22:04, Itaru Kitayama wrote:
>>>> Hi,
>>>>
>>>> In a VM (virsh controlled, KVM acceleration enabled) on a recent
>>>> kvmarm kernel host, I find I am unable to use perf to obtain
>>>> performance statistics for a complex task like kernel build.
>>>> (I've verified this is seen with a Fedora 25 VM and host combination
>>>> as well)
>>>> APM folks CC'ed think this might be caused by a bug in the core PMU
>>>> framework code, thus I'd like to have experts opinion on this issue.
>>>>
>>>> [root@localhost linux]# perf stat -B make
>>>>   CHK include/config/kernel.release
>>>> [ 119.617684] git[1144]: undefined instruction: pc=fc000808ff30
>>>> [ 119.623040] Code: 51000442 92401042 d51b9ca2 d5033fdf (d53b9d40)
>>>> [ 119.627607] Internal error: undefined instruction: 0 [#1] SMP
>>>
>>> [...]
>>>
>>> In a VM running mainline hosted on an AMD Seattle box:
>>>
>>>  Performance counter stats for 'make':
>>>
>>>     1526089.499304      task-clock:u (msec)       #    0.932 CPUs utilized
>>>                  0      context-switches:u        #    0.000 K/sec
>>>                  0      cpu-migrations:u          #    0.000 K/sec
>>>           29527793      page-faults:u             #    0.019 M/sec
>>>      2913174122673      cycles:u                  #    1.909 GHz
>>>      2365040892322      instructions:u            #    0.81  insn per cycle
>>>                         branches:u
>>>        32049215378      branch-misses:u           #    0.00% of all branches
>>>
>>>     1637.531444837 seconds time elapsed
>>>
>>> Running the same host kernel on a Mustang system, the guest explodes
>>> in the way you reported. The failing instruction always seems to be
>>> an access to pmxevcntr_el0 (I've seen both reads and writes).
>>>
>>> Funnily enough, it dies if you try any HW event other than cycles
>>> ("perf stat -e cycles ls" works, and "perf stat -e instructions ls"
>>> explodes). Which would tend to indicate that we're screwing up
>>> the counter selection, but I have no proof of that (specially that
>>> the Seattle guest is working just as expected).
>>
>> It turns out that we *don't* inject an undef. It seems to be generated
>> locally at EL1.
>>
>> Still digging.
>
> Just FYI: I saw it on Mustang before. My initial thought was HW related,
> but without proof. I am interested to see your findings...

It would have been good to report it earlier. Anyway, I've identified
the root issue, which seems to boil down to how you interpret a small
corner of the PMU architecture. I've raised it with the architecture
team here, and I should have a workaround/fix shortly.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: Unable to use perf in VM
On 11/30/2016 07:37 AM, Marc Zyngier wrote:
> On 30/11/16 11:48, Marc Zyngier wrote:
>> + Shannon
>>
>> On 29/11/16 22:04, Itaru Kitayama wrote:
>>> Hi,
>>>
>>> In a VM (virsh controlled, KVM acceleration enabled) on a recent
>>> kvmarm kernel host, I find I am unable to use perf to obtain
>>> performance statistics for a complex task like kernel build.
>>> (I've verified this is seen with a Fedora 25 VM and host combination
>>> as well)
>>> APM folks CC'ed think this might be caused by a bug in the core PMU
>>> framework code, thus I'd like to have experts opinion on this issue.
>>>
>>> [root@localhost linux]# perf stat -B make
>>>   CHK include/config/kernel.release
>>> [ 119.617684] git[1144]: undefined instruction: pc=fc000808ff30
>>> [ 119.623040] Code: 51000442 92401042 d51b9ca2 d5033fdf (d53b9d40)
>>> [ 119.627607] Internal error: undefined instruction: 0 [#1] SMP
>>
>> [...]
>>
>> In a VM running mainline hosted on an AMD Seattle box:
>>
>>  Performance counter stats for 'make':
>>
>>     1526089.499304      task-clock:u (msec)       #    0.932 CPUs utilized
>>                  0      context-switches:u        #    0.000 K/sec
>>                  0      cpu-migrations:u          #    0.000 K/sec
>>           29527793      page-faults:u             #    0.019 M/sec
>>      2913174122673      cycles:u                  #    1.909 GHz
>>      2365040892322      instructions:u            #    0.81  insn per cycle
>>                         branches:u
>>        32049215378      branch-misses:u           #    0.00% of all branches
>>
>>     1637.531444837 seconds time elapsed
>>
>> Running the same host kernel on a Mustang system, the guest explodes
>> in the way you reported. The failing instruction always seems to be
>> an access to pmxevcntr_el0 (I've seen both reads and writes).
>>
>> Funnily enough, it dies if you try any HW event other than cycles
>> ("perf stat -e cycles ls" works, and "perf stat -e instructions ls"
>> explodes). Which would tend to indicate that we're screwing up
>> the counter selection, but I have no proof of that (specially that
>> the Seattle guest is working just as expected).
>
> It turns out that we *don't* inject an undef. It seems to be generated
> locally at EL1.
>
> Still digging.

Just FYI: I saw it on Mustang before. My initial thought was HW related,
but without proof. I am interested to see your findings...

>
> 	M.
>
Re: Unable to use perf in VM
On 30/11/16 11:48, Marc Zyngier wrote:
> + Shannon
>
> On 29/11/16 22:04, Itaru Kitayama wrote:
>> Hi,
>>
>> In a VM (virsh controlled, KVM acceleration enabled) on a recent
>> kvmarm kernel host, I find I am unable to use perf to obtain
>> performance statistics for a complex task like kernel build.
>> (I've verified this is seen with a Fedora 25 VM and host combination
>> as well)
>> APM folks CC'ed think this might be caused by a bug in the core PMU
>> framework code, thus I'd like to have experts opinion on this issue.
>>
>> [root@localhost linux]# perf stat -B make
>>   CHK include/config/kernel.release
>> [ 119.617684] git[1144]: undefined instruction: pc=fc000808ff30
>> [ 119.623040] Code: 51000442 92401042 d51b9ca2 d5033fdf (d53b9d40)
>> [ 119.627607] Internal error: undefined instruction: 0 [#1] SMP
>
> [...]
>
> In a VM running mainline hosted on an AMD Seattle box:
>
>  Performance counter stats for 'make':
>
>     1526089.499304      task-clock:u (msec)       #    0.932 CPUs utilized
>                  0      context-switches:u        #    0.000 K/sec
>                  0      cpu-migrations:u          #    0.000 K/sec
>           29527793      page-faults:u             #    0.019 M/sec
>      2913174122673      cycles:u                  #    1.909 GHz
>      2365040892322      instructions:u            #    0.81  insn per cycle
>                         branches:u
>        32049215378      branch-misses:u           #    0.00% of all branches
>
>     1637.531444837 seconds time elapsed
>
> Running the same host kernel on a Mustang system, the guest explodes
> in the way you reported. The failing instruction always seems to be
> an access to pmxevcntr_el0 (I've seen both reads and writes).
>
> Funnily enough, it dies if you try any HW event other than cycles
> ("perf stat -e cycles ls" works, and "perf stat -e instructions ls"
> explodes). Which would tend to indicate that we're screwing up
> the counter selection, but I have no proof of that (specially that
> the Seattle guest is working just as expected).

It turns out that we *don't* inject an undef. It seems to be generated
locally at EL1.

Still digging.

	M.
-- 
Jazz is not dead. It just smells funny...
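As an aside for anyone reading the oops: the words on the quoted "Code:" line can be decoded by hand from the A64 MRS/MSR (register) instruction layout. What follows is a minimal Python sketch, not kernel code. The function name decode_sysreg_insn and the small register table are made up for illustration, and only the four PMU registers relevant to this thread are listed. Under those assumptions, the faulting word d53b9d40 decodes as a read of pmxevcntr_el0, and d51b9ca2 two words earlier as a write to pmselr_el0, which fits the PC being inside armv8pmu_read_counter.

```python
# Hypothetical helper for illustration only: decodes AArch64 MRS/MSR
# (register) instruction words into a readable form.

# (op0, op1, CRn, CRm, op2) -> name; only the PMU registers discussed here.
PMU_SYSREGS = {
    (3, 3, 9, 12, 5): "pmselr_el0",
    (3, 3, 9, 13, 0): "pmccntr_el0",
    (3, 3, 9, 13, 1): "pmxevtyper_el0",
    (3, 3, 9, 13, 2): "pmxevcntr_el0",
}


def decode_sysreg_insn(word):
    """Return e.g. 'mrs x0, pmxevcntr_el0', or None if not MRS/MSR (register)."""
    # Bits [31:22] identify the system instruction class; bit 20 set means
    # op0 is 2 or 3, i.e. a system register access rather than a hint/barrier.
    if (word >> 22) != 0b1101010100 or not (word >> 20) & 1:
        return None

    is_read = (word >> 21) & 1          # L bit: 1 = MRS (read), 0 = MSR (write)
    op0 = 2 + ((word >> 19) & 1)
    op1 = (word >> 16) & 0x7
    crn = (word >> 12) & 0xF
    crm = (word >> 8) & 0xF
    op2 = (word >> 5) & 0x7
    rt = word & 0x1F

    # Fall back to the generic S<op0>_<op1>_C<n>_C<m>_<op2> spelling.
    name = PMU_SYSREGS.get(
        (op0, op1, crn, crm, op2),
        "s%d_%d_c%d_c%d_%d" % (op0, op1, crn, crm, op2),
    )
    reg = "xzr" if rt == 31 else "x%d" % rt
    return "mrs %s, %s" % (reg, name) if is_read else "msr %s, %s" % (name, reg)


if __name__ == "__main__":
    # Words taken from the "Code:" line of the oops.
    for w in (0xD51B9CA2, 0xD5033FDF, 0xD53B9D40):
        print("%08x  %s" % (w, decode_sysreg_insn(w)))
```

The middle word, d5033fdf, is an ISB barrier, which this sketch deliberately does not decode and reports as None.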
Re: Unable to use perf in VM
+ Shannon

On 29/11/16 22:04, Itaru Kitayama wrote:
> Hi,
>
> In a VM (virsh controlled, KVM acceleration enabled) on a recent
> kvmarm kernel host, I find I am unable to use perf to obtain
> performance statistics for a complex task like kernel build.
> (I've verified this is seen with a Fedora 25 VM and host combination
> as well)
> APM folks CC'ed think this might be caused by a bug in the core PMU
> framework code, thus I'd like to have experts opinion on this issue.
>
> [root@localhost linux]# perf stat -B make
>   CHK include/config/kernel.release
> [ 119.617684] git[1144]: undefined instruction: pc=fc000808ff30
> [ 119.623040] Code: 51000442 92401042 d51b9ca2 d5033fdf (d53b9d40)
> [ 119.627607] Internal error: undefined instruction: 0 [#1] SMP

[...]

In a VM running mainline hosted on an AMD Seattle box:

 Performance counter stats for 'make':

    1526089.499304      task-clock:u (msec)       #    0.932 CPUs utilized
                 0      context-switches:u        #    0.000 K/sec
                 0      cpu-migrations:u          #    0.000 K/sec
          29527793      page-faults:u             #    0.019 M/sec
     2913174122673      cycles:u                  #    1.909 GHz
     2365040892322      instructions:u            #    0.81  insn per cycle
                        branches:u
       32049215378      branch-misses:u           #    0.00% of all branches

    1637.531444837 seconds time elapsed

Running the same host kernel on a Mustang system, the guest explodes
in the way you reported. The failing instruction always seems to be
an access to pmxevcntr_el0 (I've seen both reads and writes).

Funnily enough, it dies if you try any HW event other than cycles
("perf stat -e cycles ls" works, and "perf stat -e instructions ls"
explodes). Which would tend to indicate that we're screwing up
the counter selection, but I have no proof of that (specially that
the Seattle guest is working just as expected).

Shannon, any idea?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
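For reference, the derived figures in the right-hand column of the Seattle output follow from the raw counts. Below is a small Python sketch of that arithmetic. It assumes the usual perf stat conventions (task-clock as the time base for rates, elapsed wall time for CPU utilisation); the function and key names are hypothetical and not part of perf.

```python
# Illustrative only: recompute the derived perf stat columns from the raw
# counters quoted in this thread. Names here are made up for the example.

def derived_metrics(task_clock_msec, elapsed_sec, cycles, instructions,
                    page_faults):
    cpu_sec = task_clock_msec / 1000.0   # CPU time consumed by the task
    return {
        # Share of wall-clock time during which the task was on a CPU.
        "cpus_utilized": cpu_sec / elapsed_sec,
        # Average clock rate while the task was running.
        "ghz": cycles / cpu_sec / 1e9,
        # Instructions retired per cycle.
        "insn_per_cycle": instructions / cycles,
        # Page-fault rate in millions per second of task time.
        "page_faults_m_per_sec": page_faults / cpu_sec / 1e6,
    }


if __name__ == "__main__":
    m = derived_metrics(
        task_clock_msec=1526089.499304,
        elapsed_sec=1637.531444837,
        cycles=2913174122673,
        instructions=2365040892322,
        page_faults=29527793,
    )
    for key, value in m.items():
        print("%-24s %.3f" % (key, value))
```

With the Seattle numbers these come out in line with what perf printed (about 0.932 CPUs utilized, 1.909 GHz, 0.81 insn per cycle, 0.019 M/sec), a quick sanity check that the guest counters are behaving on that host.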
Unable to use perf in VM
Hi,

In a VM (virsh controlled, KVM acceleration enabled) on a recent
kvmarm kernel host, I find I am unable to use perf to obtain
performance statistics for a complex task like kernel build.
(I've verified this is seen with a Fedora 25 VM and host combination
as well)
APM folks CC'ed think this might be caused by a bug in the core PMU
framework code, thus I'd like to have experts opinion on this issue.

[root@localhost linux]# perf stat -B make
  CHK include/config/kernel.release
[ 119.617684] git[1144]: undefined instruction: pc=fc000808ff30
[ 119.623040] Code: 51000442 92401042 d51b9ca2 d5033fdf (d53b9d40)
[ 119.627607] Internal error: undefined instruction: 0 [#1] SMP
[ 119.633600] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_broute bridge stp llc ebtable_nat ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_mangle ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw iptable_mangle iptable_security ebtable_filter ebtables ip6table_filter ip6_tables vfat fat chipreg mtd virtio_net qemu_fw_cfg nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c virtio_console virtio_scsi gpio_keys virtio_mmio virtio_ring virtio
[ 119.677249] CPU: 0 PID: 1144 Comm: git Tainted: P4.8.0 #3
[ 119.682480] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[ 119.687660] task: fe00ea96cb00 task.stack: fe00e6a6
[ 119.692167] PC is at armv8pmu_read_counter+0x30/0x68
[ 119.695973] LR is at armpmu_event_update+0x34/0x98
[ 119.699615] pc : [] lr : [] pstate: 81c5
[ 119.705137] sp : fe00e6a63790
[ 119.707663] x29: fe00e6a63790 x28: fe00f275ca08
[ 119.711875] x27: x26: 0001
[ 119.716097] x25: fe00fff22328 x24: 1af18c08
[ 119.720345] x23: fe00f275ca00 x22: fe00ff8d5c00
[ 119.724589] x21: 87b2b5f5 x20: fe00e6a53800
[ 119.728827] x19: fe00e6a539c0 x18:
[ 119.733018] x17: 03ff8b323c20 x16:
[ 119.737274] x15: x14: 726573752f656369
[ 119.741395] x13: 6c732e726573752f x12:
[ 119.745596] x11: 0001 x10: 0001
[ 119.749800] x9 : x8 : 0001
[ 119.754042] x7 : x6 : 00d2ea7d1b28
[ 119.758232] x5 : 0002 x4 : 0200f722
[ 119.762446] x3 : x2 :
[ 119.766655] x1 : fc000808ff00 x0 : 0004
[ 119.770757]
[ 119.771947] Process git (pid: 1144, stack limit = 0xfe00e6a60020)
[ 119.776829] Stack: (0xfe00e6a63790 to 0xfe00e6a64000)
[ 119.781234] 3780: fe00e6a637c0 fc00086fe790
[ 119.787139] 37a0: fe00e6a53800 fe00e6a53800 fe00ff8d5c00 0001
[ 119.793007] 37c0: fe00e6a637e0 fc00086fe7e4 fe00fff22210 fe00e6a53800
[ 119.798999] 37e0: fe00e6a63810 fc00081cc260 fe00e6a53800 fe00e6a53800
[ 119.804893] 3800: fe00fff22418 fe00fff2241c fe00e6a63860 fc00081cc404
[ 119.810826] 3820: fe00f275ca00 fe00e6a53800 fe00fff22418 fe00fff2241c
[ 119.816766] 3840: fe00f275ca00 fe00e6a53800 fe00fff22328 0007
[ 119.822690] 3860: fe00e6a638b0 fc00081cc900 fe00f275ca00 fe00e6a53800
[ 119.828597] 3880: fe00fff22328 fe00f275ca58 fe00f275ca48 fe00ea96d3b0
[ 119.834537] 38a0: fc0008d7ba00 fe00fff22328 fe00e6a638f0 fc00081cca00
[ 119.840379] 38c0: fc0008cf7000 fe00ea96d3a0 fc0008d7c2a0 fe00f275ca00
[ 119.846291] 38e0: fe00ea96cb00 fc00081ce554 fe00e6a63900 fc00081ce560
[ 119.852112] 3900: fe00e6a63990 fc000884655c fc0008d7ba00 fe00fff1ad80
[ 119.858904] 3920: fc0008d74000 fe00ea96cb00 fc00088469b0
[ 119.864738] 3940: fe00ea96d108 fe00fff1ad80 0054 fe00e44b39a8
[ 119.870551] 3960: fc0008d7ba00 fe00fff1ad80 fc0008d74000 fe00ea96cb00
[ 119.876396] 3980: fc00088469b0 fe00fff1ad80 fe00e6a63a00 fc00088469b0
[ 119.882286] 39a0: fe00e6a6 fc0008846208 fe00fff1b698 7fff
[ 119.888225] 39c0: fe00e6a63b70 fe00f296ca80 fdff80387c80 fe00f27f3f00
[ 119.894004] 39e0: 0054 fc00083c6dec fe00ee7e7e00
[ 119.899745] 3a00: fe00e6a63a20 fc00088499ec 7fff fc00083c6df8
[ 119.905577] 3a20: fe00e6a63ac0 fc0008846208 fe00fff1ad80
[ 119.911378] 3a40: fe00fff1b698 7fff fe00e6a63b70 fe00f296ca80
[ 119.917247] 3a60: fdff80387c80 fe00f27f3f00 fe00e6a63aa0 fc000818ac7c
[ 119.923206] 3a80: fe00ebdb4d00 fe005a50