On Wed, 28 Jun 2017 12:26:02 +0100 Mark Rutland <mark.rutl...@arm.com> wrote:
> On Tue, Jun 27, 2017 at 04:07:58PM -0500, Kim Phillips wrote: > > I'm close to finishing the bts version of userspace, and have been > > testing a bit more thoroughly, so now I consistently see the excessive > > PADding when recording a CPU that's idle. I.e., when I taskset the perf > > record to the same CPU I specify to record's -C (taskset -c n perf > > record -C n), I get max. twenty-odd number of PAD bytes at the end of > > the AUX buffers in the perf.data file. If, OTOH, I taskset -c n perf > > record -C m, where m != n, I get a couple of valid event records in the > > buffer, and the rest of the buffer is filled with PADding. > > > > It wouldn't be a problem except that it's wastes too much space > > sometimes. Here is a good output buffer sample from a --mmap-pages=,12 > > run, with only 4 PADs tacked onto the end: > > > > 0xd190 [0x30]: PERF_RECORD_AUXTRACE size: 0x48 offset: 0 ref: > > 0xe914f7e3ce idx: 0 tid: -1 cpu: 2 > > . > > . ... ARM SPE data: size 72 bytes > > . 00000000: 4a 01 B COND > > [...] > > > . 0000003b: 71 a5 39 e1 14 e9 00 00 00 TS > > 1001077684645 > > . 00000044: 00 PAD > > . 00000045: 00 PAD > > . 00000046: 00 PAD > > . 00000047: 00 PAD > > > > whereas this one - from later on in the same run - is over 99% PADs: > > > > 0xd250 [0x30]: PERF_RECORD_AUXTRACE size: 0x5fc0 offset: 0xfffff4ae0044 > > ref: 0xe91cead1dd idx: 0 tid: -1 cpu: 2 > > . > > . ... ARM SPE data: size 24512 bytes > > . 00000000: 4a 00 B > > [...] > > > . 000000b0: 71 8f 4e e1 14 e9 00 00 00 TS > > 1001077689999 > > . 000000b9: 00 PAD > > ...ALL PADs...ALL PADs...ALL PADs...ALL PADs...ALL PADs...ALL PADs... > > . 00005fbf: 00 PAD > > Interesting. > > If you cat /proc/interrupts, do you see many more SPE interrupts on CPU > n than on m? When n == m, I see approx. 1 IRQ per SPE buffer full. When n != m, I see neither CPU n or m incur SPE interrupts; the workload ran but didn't get recorded, or, rather, 'idleness' got recorded instead. > Otherwise, I wonder if this is some odd interaction with idle. Can you > try to forcefully load that other CPU? > > e.g. run something like: > > taskset -c <n> sh -c 'while true; do done' > > ... in parallel with the tracer. If I do a: taskset -c 1 sh -c 'while true; do echo blah > /dev/null' & taskset -c 0 perf record -C 1 ... then non-idleness and non-PADdingness get recorded. > For reference, what was your event sample period (i.e. the value of > perf_event_attr::sample_period)? > > Did you modify that at all with PERF_EVENT_IOC_PERIOD? If that's the same as 'perf record -c <period>', then, yes, I set the period to values such as 512, 1024. > > > > Meanwhile, when using fvp-base.dtb, my model setup stops booting the > > > > kernel after "smp: Bringing up secondary CPUs ...". If I however take > > > > the second SPE node from fvp-base.dts and add it to my working device > > > > tree, I get this during the driver probe: > > > > > > > > [ 1.042063] arm_spe_pmu spe-pmu@0: probed for CPUs 0-7 > > > > [max_record_sz 64, align 1, features 0xf] > > > > [ 1.043582] arm_spe_pmu spe-pmu@1: probed for CPUs 0-7 > > > > [max_record_sz 64, align 1, features 0xf] > > > > [ 1.043631] genirq: Flags mismatch irq 6. 00004404 (arm_spe_pmu) vs. > > > > 00004404 (arm_spe_pmu) > > > > > > Looks like you've screwed up your IRQ partitions, so you are effectively > > > registering the same device twice, which then blows up due to lack of > > > shared > > > irqs. > > > > > > Either remove one of the devices, or use IRQ partitions to restrict them > > > to unique sets of CPUs. > > > > Right, but since I want to get parity with what you're running - > > fvp_base.dtb - I tried to debug the hang after "smp: Bringing up > > secondary CPUs ..." problem, and could only debug it to the PSCI driver > > hitting one of these cases: > > > > case PSCI_RET_INVALID_PARAMS: > > case PSCI_RET_INVALID_ADDRESS: > > Sounds like your DT is describing CPUs that don't exist (or perhaps the > same CPU several times). Certainly, PSCI and the kernel disagree on > which CPUS exist. > > What exact DT are you using? the one this commit to linux-will's perf/spe branch provides: commit 2a73de57eaf61cdfd61be1e20a46e4a2c326775f Author: Marc Zyngier <marc.zyng...@arm.com> Date: Tue Mar 11 18:14:45 2014 +0000 arm64: dts: add model device-tree Signed-off-by: Marc Zyngier <marc.zyng...@arm.com> Signed-off-by: Will Deacon <will.dea...@arm.com> > Are you using the bootwrapper, or ATF? I'm guessing you're using the > bootwrapper. I'm using the wrapper to wrap arm-trusted-firmware (ATF?) objects, so, both? I noticed the wrapper I was using was pretty old, so I updated it. arm-trusted-firmware, btw, has just been updated to enable SPE at lower ELs, so I don't have to use a hacked-up version anymore. I also updated my BL33 to the latest upstream u-boot vexpress_aemv8a_dram_defconfig, and at least now the kernel continues to boot, even though it can't bring up 6 of the 7 secondary CPUs. > Which version of the bootwrapepr are you using? If it doesn't have > commit: > > ccdc936924b3682d ("Dynamically determine the set of CPUs") > > ... have you configured it appropriately with --with-cpu-ids? > > How is your model configured? CLUSTER0_NUM_CORES=4 CLUSTER1_NUM_CORES=4 > Which CPU IDs does it think exist? 1,2,3,4,0x100,0x101,0x102,0x103 ...which are different from the above device tree!: 0,0x100,0x200,0x300,0x10000,0x10100,0x10200,0x10300 So I imagine that's the problem, thanks! I don't see how to tell the model to put the CPUs at different addresses, only a lot of GICv3 redistributor switches? btw, where can I get updates to the run-model.sh scripts? Answer off-list? > > Note: it's yet another place I have to manually instrument the error > > path in a kernel driver in lieu of it being more naturally verbose by > > itself; I *implore* you to reconsider adding proper user messaging to > > arm_spe_pmu_event_init(). > > Given this is a FW configuration issue (i.e. a system-level error), I'm > more than happy to make the PSCI driver messages more helpful where > possible. > > That's completely orthogonal to the SPE debug messages for requests made > by the user. I respectfully disagree, given the current state of the interfaces involved. > > I can't tell which part of the fvp-base device tree is not liked by the > > firmware; I tried different combinations of the PSCI node, different CPU > > enumerations (cpu@100 vs cpu@1), removing idle-states properties...any > > hints appreciated. > > The bootwrapper doesn't support idle. So no idle-states should be in the > DT. > > If you can share your DT, bootwrapper configuration, and model > configuration, I can try to debug this with you. I reverted the wrapper's ccdc936924b3682d ("Dynamically determine the set of CPUs") commit you mentioned above, and specified the cpu-ids manually, and am now running with 8 CPUs, although linux enumerates them as 0,1,8,9,10,11,12,13? Thanks for your continued support, Kim