On Tue, Oct 09, 2018 at 11:58:53AM +0300, Alexey Budankov wrote:
>
> Trace file offset is read once before mmaps iterating loop and written
> back after all performance data enqueued for aio writing. Trace file offset
> is incremented linearly after every successful aio write operation.
>
> record__aio_sync() blocks till completion of started AIO operation
> and then proceeds.
>
> record__mmap_read_sync() implements a barrier for all incomplete
> aio write requests.
>
> Signed-off-by: Alexey Budankov <alexey.budan...@linux.intel.com>
> ---
> Changes in v12:
> - implemented record__aio_get/set_pos(), record__aio_enabled()
> - implemented simple --aio option
> Changes in v11:
> - replacing the both lseek() syscalls in every loop iteration by the only
>   two syscalls just before and after the loop at record__mmap_read_evlist()
>   and advancing *in-flight* off file pos value at perf_mmap__aio_push()
> Changes in v10:
> - avoided lseek() setting file pos back in case of record__aio_write()
>   failure
> - compacted code selecting between serial and AIO streaming
> - optimized call places of record__mmap_read_sync()
> Changes in v9:
> - enable AIO streaming only when --aio-cblocks option is specified explicitly
> Changes in v8:
> - split AIO completion check into separate record__aio_complete()
> Changes in v6:
> - handled errno == EAGAIN case from aio_write();
> Changes in v5:
> - data loss metrics decreased from 25% to 2x in trialed configuration;
> - avoided nanosleep() prior calling aio_suspend();
> - switched to per cpu multi record__aio_sync() aio
> - record_mmap_read_sync() now does global barrier just before
>   switching trace file or collection stop;
> - resolved livelock on perf record -e intel_pt// -- dd if=/dev/zero of=/dev/null count=100000
> Changes in v4:
> - converted void *bf to struct perf_mmap *md in signatures
> - written comment in perf_mmap__push() just before perf_mmap__get();
> - written comment in record__mmap_read_sync() on possible restarting
>   of aio_write() operation and releasing perf_mmap object after all;
> - added perf_mmap__put() for the cases of failed aio_write();
> Changes in v3:
> - written comments about nanosleep(0.5ms) call prior aio_suspend()
>   to cope with intrusiveness of its implementation in glibc;
> - written comments about rationale behind coping profiling data
>   into mmap->data buffer;
> ---
>  tools/perf/Documentation/perf-record.txt |   5 +
>  tools/perf/builtin-record.c              | 220 ++++++++++++++++++++++++++++++-
>  tools/perf/perf.h                        |   1 +
>  tools/perf/util/evlist.c                 |   6 +-
>  tools/perf/util/evlist.h                 |   2 +-
>  tools/perf/util/mmap.c                   |  86 +++++++++++-
>  tools/perf/util/mmap.h                   |   5 +
>  7 files changed, 316 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 246dee081efd..5cedb3e75434 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -435,6 +435,11 @@ Specify vmlinux path which has debuginfo.
>  --buildid-all::
>  Record build-id of all DSOs regardless whether it's actually hit or not.
>
> +--aio::
> +Enable asynchronous (Posix AIO) trace writing mode.

nit, there's an extra whitespace at the end of the above line, which
makes 'git am' fail to apply your patch

thanks,
jirka
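
For readers following the thread, the flow described in the quoted changelog (read the file offset once, advance an in-flight offset per queued write, wait on all requests, then write the offset back) can be sketched roughly as below. This is not the code from the patch: every sketch_* name is hypothetical, partial writes and EAGAIN retries are ignored, and only the standard POSIX AIO calls aio_write(), aio_error(), aio_suspend() and aio_return() are assumed.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: queue one write at *off and, if queueing
 * succeeded, advance the in-flight offset by the requested size. */
static int sketch_aio_push(int fd, struct aiocb *cb, const void *buf,
			   size_t size, off_t *off)
{
	assert(cb && off);
	memset(cb, 0, sizeof(*cb));
	cb->aio_fildes = fd;
	cb->aio_buf = (void *)buf;
	cb->aio_nbytes = size;
	cb->aio_offset = *off;
	cb->aio_sigevent.sigev_notify = SIGEV_NONE;

	if (aio_write(cb) != 0)
		return -1;	/* offset is left untouched on failure */
	*off += (off_t)size;
	return 0;
}

/* Hypothetical helper: block until the request completes.
 * Returns the byte count from aio_return(), or -1 with errno set. */
static ssize_t sketch_aio_sync(struct aiocb *cb)
{
	const struct aiocb *const list[1] = { cb };
	ssize_t ret;
	int err;

	/* aio_suspend() may return early (EINTR), so re-check the state. */
	while ((err = aio_error(cb)) == EINPROGRESS)
		(void)aio_suspend(list, 1, NULL);

	ret = aio_return(cb);	/* must be called once to release the request */
	if (err != 0) {
		errno = err;
		return -1;
	}
	return ret;
}

/* Hypothetical driver: one lseek() before the loop, one after it. */
static int sketch_write_all(int fd, const char *const chunks[], size_t n,
			    struct aiocb cbs[])
{
	off_t off = lseek(fd, 0, SEEK_CUR);	/* read the position once */
	size_t queued, i;
	int rc = 0;

	if (off < 0)
		return -1;
	for (queued = 0; queued < n; queued++) {
		if (sketch_aio_push(fd, &cbs[queued], chunks[queued],
				    strlen(chunks[queued]), &off) != 0) {
			rc = -1;
			break;
		}
	}
	/* Barrier: wait for every request that was actually queued. */
	for (i = 0; i < queued; i++)
		if (sketch_aio_sync(&cbs[i]) < 0)
			rc = -1;
	if (lseek(fd, off, SEEK_SET) < 0)	/* write the position back */
		rc = -1;
	return rc;
}

/* Small self-check on a temporary file; returns 0 on success. */
int sketch_demo(void)
{
	char path[] = "/tmp/aio_sketch_XXXXXX";
	const char *const chunks[] = { "hello ", "world" };
	struct aiocb cbs[2];
	char back[16] = { 0 };
	int fd = mkstemp(path);
	int rc;

	if (fd < 0)
		return -1;
	rc = sketch_write_all(fd, chunks, 2, cbs);
	if (rc == 0 && lseek(fd, 0, SEEK_CUR) != 11)
		rc = -1;
	if (rc == 0 && pread(fd, back, sizeof(back) - 1, 0) != 11)
		rc = -1;
	if (rc == 0 && strcmp(back, "hello world") != 0)
		rc = -1;
	close(fd);
	unlink(path);
	return rc;
}
```

The explicit lseek() at the end matters because POSIX leaves the file position unspecified after aio_write(). On glibc older than 2.34 the sketch needs to be linked with -lrt.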