On Wed, Apr 14, 2021 at 05:41:46PM +0300, James Clark wrote: > Hi, > > For this change, I also tried removing the setting of PERF_SAMPLE_TIME in > cs_etm__synth_events(). In theory, this would remove the sorting when opening > the file, but the change doesn't affect when the built-in events are saved to > the inject file. Resulting in events like MMAP and COMM with timestamps, but > the synthesised events without. This results in the same issue of the > synthesised events appearing before the COMM and MMAP events. If it was > possible to somehow tell perf to remove timestamps from built-in events, > removing PERF_SAMPLE_TIME would probably be the right solution, because we > don't set sample.time. > > For Arm v8.4 we will have the kernel time in the etm timestamps, so an if can > be added to switch between this behaviour and the next (more correct) one > depending on the hardware. > > On the subject of timestamps, but not related to this change, some > combinations of timestamp options aren't working. For example: > > perf record -e cs_etm/time,@tmc_etr0/u --per-thread > or perf record -e cs_etm/@tmc_etr0/u --timestamp --per-thread > > These don't work because of the assumption that etm->timeless_decoding == > --per-thread > and kernel timestamps enabled (/time/ or --timestamp) == etm timestamps > enabled (/timestamp/), which isn't necessarily true. > > This can be made to work with a few code changes for cs_etm/time,timestamp/u > --per-thread, but cs_etm/time/u --per-thread could be a bit more work. > Changes involved would be using "per_cpu_mmaps" in some places instead of > etm->timeless_decoding, and also setting etm->timeless_decoding based on > whether there are any etm timestamps, not kernel ones. Although to search for > any etm timestamp would involve a full decode ahead of time which might not > be feasible (or maybe just checking the options, although that's not how it's > done in cs_etm__is_timeless_decoding() currently).
Confirm for one thing: For the orignal perf data file with "--per-thread" option, the decoder runs into the condition for "etm->timeless_decoding"; and it doesn't contain ETM timestamp. Afterwards, the injected perf data file also misses ETM timestamp and hit the condition "etm->timeless_decoding". So I am confusing why the original perf data can be processed properly but fails to handle the injected perf data file. Thanks, Leo > Or, we could force /time/ and /timestamp/ options to always be enabled > together in the record stage. > > > Thanks > James > > On 14/04/2021 17:39, James Clark wrote: > > The following attribute is set when synthesising samples in > > timed decoding mode: > > > > attr.sample_type |= PERF_SAMPLE_TIME; > > > > This results in new samples that appear to have timestamps but > > because we don't assign any timestamps to the samples, when the > > resulting inject file is opened again, the synthesised samples > > will be on the wrong side of the MMAP or COMM events. > > > > For example this results in the samples being associated with > > the perf binary, rather than the target of the record: > > > > perf record -e cs_etm/@tmc_etr0/u top > > perf inject -i perf.data -o perf.inject --itrace=i100il > > perf report -i perf.inject > > > > Where 'Command' == perf should show as 'top': > > > > # Overhead Command Source Shared Object Source Symbol > > Target Symbol Basic Block Cycles > > # ........ ....... .................... ...................... > > ...................... .................. > > # > > 31.08% perf [unknown] [.] 0x000000000040c3f8 [.] > > 0x000000000040c3e8 - > > > > If the perf.data file is opened directly with perf, without the > > inject step, then this already works correctly because the > > events are synthesised after the COMM and MMAP events and > > no second sorting happens. Re-sorting only happens when opening > > the perf.inject file for the second time so timestamps are > > needed. > > > > Using the timestamp from the AUX record mirrors the current > > behaviour when opening directly with perf, because the events > > are generated on the call to cs_etm__process_queues(). > > > > Signed-off-by: James Clark <[email protected]> > > Co-developed-by: Al Grant <[email protected]> > > Signed-off-by: Al Grant <[email protected]> > > --- > > tools/perf/util/cs-etm.c | 10 ++++++++-- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c > > index c25da2ffa8f3..d0fa9dce47f1 100644 > > --- a/tools/perf/util/cs-etm.c > > +++ b/tools/perf/util/cs-etm.c > > @@ -54,6 +54,7 @@ struct cs_etm_auxtrace { > > u8 sample_instructions; > > > > int num_cpu; > > + u64 latest_kernel_timestamp; > > u32 auxtrace_type; > > u64 branches_sample_type; > > u64 branches_id; > > @@ -1192,6 +1193,8 @@ static int cs_etm__synth_instruction_sample(struct > > cs_etm_queue *etmq, > > event->sample.header.misc = cs_etm__cpu_mode(etmq, addr); > > event->sample.header.size = sizeof(struct perf_event_header); > > > > + if (!etm->timeless_decoding) > > + sample.time = etm->latest_kernel_timestamp; > > sample.ip = addr; > > sample.pid = tidq->pid; > > sample.tid = tidq->tid; > > @@ -1248,6 +1251,8 @@ static int cs_etm__synth_branch_sample(struct > > cs_etm_queue *etmq, > > event->sample.header.misc = cs_etm__cpu_mode(etmq, ip); > > event->sample.header.size = sizeof(struct perf_event_header); > > > > + if (!etm->timeless_decoding) > > + sample.time = etm->latest_kernel_timestamp; > > sample.ip = ip; > > sample.pid = tidq->pid; > > sample.tid = tidq->tid; > > @@ -2412,9 +2417,10 @@ static int cs_etm__process_event(struct perf_session > > *session, > > else if (event->header.type == PERF_RECORD_SWITCH_CPU_WIDE) > > return cs_etm__process_switch_cpu_wide(etm, event); > > > > - if (!etm->timeless_decoding && > > - event->header.type == PERF_RECORD_AUX) > > + if (!etm->timeless_decoding && event->header.type == PERF_RECORD_AUX) { > > + etm->latest_kernel_timestamp = sample_kernel_timestamp; > > return cs_etm__process_queues(etm); > > + } > > > > return 0; > > } > >

