On 21.10.2020 10:34, Namhyung Kim wrote:
> On Wed, Oct 14, 2020 at 9:09 PM Alexey Budankov
> <alexey.budan...@linux.intel.com> wrote:
>>
>> Hi,
>>
>> On 14.10.2020 13:52, Namhyung Kim wrote:
>>> Hi,
>>>
>>> On Mon, Oct 12, 2020 at 6:01 PM Alexey Budankov
>>> <alexey.budan...@linux.intel.com> wrote:
>>>>
>>>>
>>>> Write trace data into per mmap trace files located
>>>> at data directory. Streaming thread adjusts its affinity
>>>> according to mask of the buffer being processed.
>>>>
>>>> Signed-off-by: Alexey Budankov <alexey.budan...@linux.intel.com>
>>>> ---
>>> [SNIP]
>>>> @@ -1184,8 +1203,12 @@ static int record__mmap_read_evlist(struct record
>>>> *rec, struct evlist *evlist,
>>>>     /*
>>>>      * Mark the round finished in case we wrote
>>>>      * at least one event.
>>>> +    *
>>>> +    * No need for round events in directory mode,
>>>> +    * because per-cpu maps and files have data
>>>> +    * sorted by kernel.
>>>>      */
>>>> -   if (bytes_written != rec->bytes_written)
>>>> +   if (!record__threads_enabled(rec) && bytes_written !=
>>>> rec->bytes_written)
>>>>         rc = record__write(rec, NULL, &finished_round_event,
>>>> sizeof(finished_round_event));
>>>
>>> This means it needs to keep all events in the ordered events queue
>>> when perf report processes the data, right?
>>
>> Looks so.
>
> Maybe it's not related to this directly. But we need to think about
> how to make perf report faster and more efficient as well.

Makes sense. Agreed.

>
> In my previous attempt, I separated samples from other events
> to be in different mmaps so they were saved to different files
> (or in a separate part of the data file).
>
> And perf report processes the meta events (FORK/MMAP/...)
> first to construct the system image and then processes samples
> with multi-threads.

This looks like separating events into global, per-process and
per-thread ones. An alternative could be to make multiple passes over
the trace data. The first pass captures global events and builds a
picture of how process state evolves over time. The second pass
captures per-thread samples and/or other events and maps them into
that process state according to their timestamps.

>
> Once it has the image, it could bypass the ordered events queue
> entirely.
>
> Thanks
>
> Namhyung
>

Thanks,
Alexei
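
P.S. To make the two-pass idea above a bit more concrete, here is a rough
sketch. It is only an illustration and not perf code: the event struct, the
EV_* types, the fixed-size table and all helper names are hypothetical, and
it assumes the meta events seen by the first pass are already in time order.
The first pass builds process lifetimes from FORK/EXIT events. The second
pass maps each sample to the process alive at the sample's timestamp, so
the samples themselves can arrive in any order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds; these are NOT perf's PERF_RECORD_* values. */
enum ev_type { EV_FORK, EV_EXIT, EV_SAMPLE };

struct event {
	enum ev_type type;
	uint64_t time;
	int pid;
};

#define MAX_PROCS 16

/* Lifetime of one process, built during the first pass. */
struct proc_state {
	int pid;
	uint64_t start;
	uint64_t end;	/* UINT64_MAX while the process has not exited */
};

struct image {
	struct proc_state procs[MAX_PROCS];
	size_t nr;
};

/* Pass 1: look at meta events only and record process lifetimes. */
static void build_image(struct image *img, const struct event *ev, size_t n)
{
	assert(img && ev);
	img->nr = 0;
	for (size_t i = 0; i < n; i++) {
		if (ev[i].type == EV_FORK && img->nr < MAX_PROCS) {
			struct proc_state *p = &img->procs[img->nr++];

			p->pid = ev[i].pid;
			p->start = ev[i].time;
			p->end = UINT64_MAX;
		} else if (ev[i].type == EV_EXIT) {
			/* Close the most recent open lifetime of this pid. */
			for (size_t j = img->nr; j-- > 0; ) {
				struct proc_state *p = &img->procs[j];

				if (p->pid == ev[i].pid && p->end == UINT64_MAX) {
					p->end = ev[i].time;
					break;
				}
			}
		}
	}
}

/* Find the process with this pid that was alive at the given time. */
static const struct proc_state *
image_lookup(const struct image *img, int pid, uint64_t time)
{
	for (size_t i = 0; i < img->nr; i++) {
		const struct proc_state *p = &img->procs[i];

		if (p->pid == pid && time >= p->start && time < p->end)
			return p;
	}
	return NULL;
}

/* Pass 2: count the samples that resolve against the image. */
static size_t map_samples(const struct image *img,
			  const struct event *ev, size_t n)
{
	size_t mapped = 0;

	for (size_t i = 0; i < n; i++)
		if (ev[i].type == EV_SAMPLE &&
		    image_lookup(img, ev[i].pid, ev[i].time))
			mapped++;
	return mapped;
}
```

Since the second pass only reads the image, samples from different per-cpu
files could in principle be handled by separate threads without going
through an ordered queue, which is the point Namhyung makes above. Whether
that holds for real perf data (MMAP events, pid reuse, lost events) would
need checking.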