Re: [PATCH 3/3] perf record: mmap output file

2013-10-16 Thread Ingo Molnar
* David Ahern wrote: > On 10/15/13 10:06 AM, Ingo Molnar wrote: > > splice() is very fast and should be able to process a lot of pages in > > one go, so the feedback loop should be pretty weak. mmap() triggers > > kernel code as well, every time we run out of the 64 MB window we got > > to r

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Peter Zijlstra
On Tue, Oct 15, 2013 at 06:06:46PM +0200, Ingo Molnar wrote: > > splice() is very fast and should be able to process a lot of pages in one > go, so the feedback loop should be pretty weak. mmap() triggers kernel > code as well, every time we run out of the 64 MB window we got to remap > it, rig

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread David Ahern
On 10/15/13 10:06 AM, Ingo Molnar wrote: splice() is very fast and should be able to process a lot of pages in one go, so the feedback loop should be pretty weak. mmap() triggers kernel code as well, every time we run out of the 64 MB window we got to remap it, right? Yes, 1 mmap, 1 munmap for

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Ingo Molnar
* David Ahern wrote: > On 10/15/13 9:27 AM, Ingo Molnar wrote: > > > >* Peter Zijlstra wrote: > > > >>On Tue, Oct 15, 2013 at 11:32:45AM -0300, Arnaldo Carvalho de Melo wrote: > >> > >>>Jiri and PeterZ probaby will have comments here... ;-) :-) > >> > >>The only complication with splice is the

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Peter Zijlstra
On Tue, Oct 15, 2013 at 05:27:47PM +0200, Ingo Molnar wrote: > Wanna send a patch for people to try? Looks like there's real interest in > speeding up perf record as much as possible! I've got a pile of pending bug reports to sort through before I can start making more bugs ;-)

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread David Ahern
On 10/15/13 9:27 AM, Ingo Molnar wrote: * Peter Zijlstra wrote: On Tue, Oct 15, 2013 at 11:32:45AM -0300, Arnaldo Carvalho de Melo wrote: Jiri and PeterZ probaby will have comments here... ;-) :-) The only complication with splice is the vmalloc support; other than that it should be fairl

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Ingo Molnar
* Peter Zijlstra wrote: > On Tue, Oct 15, 2013 at 11:32:45AM -0300, Arnaldo Carvalho de Melo wrote: > > > Jiri and PeterZ probaby will have comments here... ;-) :-) > > The only complication with splice is the vmalloc support; other than > that it should be fairly straight fwd. In the initia

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Peter Zijlstra
On Tue, Oct 15, 2013 at 11:32:45AM -0300, Arnaldo Carvalho de Melo wrote: > Jiri and PeterZ probaby will have comments here... ;-) :-) The only complication with splice is the vmalloc support; other than that it should be fairly straight fwd.

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread Arnaldo Carvalho de Melo
Em Tue, Oct 15, 2013 at 08:04:15AM -0600, David Ahern escreveu: > On 10/8/13 11:59 PM, Ingo Molnar wrote: > > 2) > > Yet another method would be to avoid the copies altogether via the splice > > system-call - see: > > git grep splice kernel/trace/ > > To make splice low-overhead we'd have to

Re: [PATCH 3/3] perf record: mmap output file

2013-10-15 Thread David Ahern
On 10/8/13 11:59 PM, Ingo Molnar wrote: Here are some thoughts on how 'perf record' tracing performance could be further improved: 1) The use of non-temporal stores (MOVNTQ) to copy the ring-buffer into the file buffer makes sure the CPU cache is not trashed by the copying - which is the larges

Re: [PATCH 3/3] perf record: mmap output file

2013-10-09 Thread Mike Galbraith
On Tue, 2013-10-08 at 21:26 -0600, David Ahern wrote: > When recording raw_syscalls for the entire system, e.g., > perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1 > > you end up with a negative feedback loop as perf itself calls > write() fairly often. This patch handles the pr

Re: [PATCH 3/3] perf record: mmap output file

2013-10-08 Thread Ingo Molnar
* David Ahern wrote: > When recording raw_syscalls for the entire system, e.g., > perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1 > > you end up with a negative feedback loop as perf itself calls > write() fairly often. This patch handles the problem by mmap'ing the > file in

[PATCH 3/3] perf record: mmap output file

2013-10-08 Thread David Ahern
When recording raw_syscalls for the entire system, e.g., perf record -e raw_syscalls:*,sched:sched_switch -a -- sleep 1 you end up with a negative feedback loop as perf itself calls write() fairly often. This patch handles the problem by mmap'ing the file in chunks of 64M at a time and copies
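
For readers following the thread, here is a rough sketch of the idea in the patch description: instead of calling write() for every batch of events, map the output file one window at a time and copy into the mapping, remapping when the window fills up. This is not the patch itself (perf is written in C); it is a Python illustration for brevity. The class name `MmapWriter`, its methods, and the trimming of the file on close are assumptions made for this example. Only the 64 MB window size comes from the description above.

```python
import mmap
import os

WINDOW = 64 * 1024 * 1024  # 64 MB, the chunk size named in the patch description


class MmapWriter:
    """Hypothetical helper: append bytes to a file through a sliding mmap window."""

    def __init__(self, path, window=WINDOW):
        # Assumption: window is a multiple of mmap.ALLOCATIONGRANULARITY,
        # because mmap offsets must be aligned to it.
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        self.window = window
        self.map = None
        self._remap(0)

    def _remap(self, base):
        # One unmap plus one map per window, rather than one write() per batch.
        if self.map is not None:
            self.map.close()
        os.ftruncate(self.fd, base + self.window)  # grow the file to cover the window
        self.map = mmap.mmap(self.fd, self.window, offset=base)
        self.base = base  # file offset where the current window starts
        self.pos = 0      # write position inside the current window

    def write(self, data):
        off = 0
        while off < len(data):
            if self.pos == self.window:
                # Window is full: slide it forward by one window.
                self._remap(self.base + self.window)
            n = min(self.window - self.pos, len(data) - off)
            self.map[self.pos:self.pos + n] = data[off:off + n]
            self.pos += n
            off += n

    def close(self):
        total = self.base + self.pos
        self.map.close()
        os.ftruncate(self.fd, total)  # drop the unused tail of the last window
        os.close(self.fd)
        return total
```

Each window costs one map and one unmap, which is the per-window kernel work discussed in the replies; the copies in between are plain memory stores and do not enter the kernel through a system call.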