Hello, I'd like to revive the topic...
On Tue, 2012-10-16 at 18:23 +0100, Peter Zijlstra wrote:
> On Tue, 2012-10-16 at 12:13 +0200, Stephane Eranian wrote:
> > Hi,
> >
> > There are many situations where we want to correlate events happening at
> > the user level with samples recorded in the perf_event kernel sampling
> > buffer. For instance, we might want to correlate the call to a function
> > or creation of a file with samples. Similarly, when we want to monitor a
> > JVM with jitted code, we need to be able to correlate jitted code
> > mappings with perf event samples for symbolization.
> >
> > Perf_events allows timestamping of samples with PERF_SAMPLE_TIME.
> > That causes each PERF_RECORD_SAMPLE to include a timestamp
> > generated by calling the local_clock() -> sched_clock_cpu() function.
> >
> > To make correlating user vs. kernel samples easy, we would need to
> > access that sched_clock() functionality. However, none of the existing
> > clock calls permit this at this point. They all return timestamps which
> > are not using the same source and/or offset as sched_clock.
> >
> > I believe a similar issue exists with the ftrace subsystem.
> >
> > The problem needs to be addressed in a portable manner. Solutions
> > based on reading TSC for the user level to reconstruct sched_clock()
> > don't seem appropriate to me.
> >
> > One possibility to address this limitation would be to extend
> > clock_gettime() with a new clock time, e.g., CLOCK_PERF.
> >
> > However, I understand that sched_clock_cpu() provides ordering
> > guarantees only when invoked on the same CPU repeatedly, i.e., it's not
> > globally synchronized. But we already have to deal with this problem
> > when merging samples obtained from different CPU sampling buffer in
> > per-thread mode. So this is not necessarily a showstopper.
> >
> > Alternatives could be to use uprobes but that's less practical to setup.
> >
> > Anyone with better ideas?
>
> You forgot to CC the time people ;-)
>
> I've no problem with adding CLOCK_PERF (or another/better name).
>
> Thomas, John?

I've just faced the same issue - correlating an event in userspace with
data from the perf stream, but to my mind what I want to get is a value
returned by perf_clock() _in the current "session" context_.

Stephane didn't like the idea of opening a "fake" perf descriptor in
order to get the timestamp, but surely one must have the "session"
already running to be interested in such data in the first place? So I
think the ioctl() idea is not out of place here... How about the simple
change below?

Regards

Pawel

8<---
From 2ad51a27fbf64bf98cee190efc3fbd7002819692 Mon Sep 17 00:00:00 2001
From: Pawel Moll <pawel.m...@arm.com>
Date: Fri, 1 Feb 2013 14:03:56 +0000
Subject: [PATCH] perf: Add ioctl to return current time value

To correlate user space events with the perf events stream
a current (as in: "what time(stamp) is it now?") time value
must be made available.

This patch adds a perf ioctl that makes this possible.

Signed-off-by: Pawel Moll <pawel.m...@arm.com>
---
 include/uapi/linux/perf_event.h |    1 +
 kernel/events/core.c            |    8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4f63c05..b745fb0 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -316,6 +316,7 @@ struct perf_event_attr {
 #define PERF_EVENT_IOC_PERIOD		_IOW('$', 4, __u64)
 #define PERF_EVENT_IOC_SET_OUTPUT	_IO ('$', 5)
 #define PERF_EVENT_IOC_SET_FILTER	_IOW('$', 6, char *)
+#define PERF_EVENT_IOC_GET_TIME		_IOR('$', 7, __u64)
 
 enum perf_event_ioc_flags {
 	PERF_IOC_FLAG_GROUP		= 1U << 0,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 301079d..4202b1c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3298,6 +3298,14 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case PERF_EVENT_IOC_SET_FILTER:
 		return perf_event_set_filter(event, (void __user *)arg);
 
+	case PERF_EVENT_IOC_GET_TIME:
+	{
+		u64 time = perf_clock();
+		if (copy_to_user((void __user *)arg, &time, sizeof(time)))
+			return -EFAULT;
+		return 0;
+	}
+
 	default:
 		return -ENOTTY;
 	}
-- 
1.7.10.4
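
For illustration, here is roughly how I would expect user space to use
this. It is only a sketch: it assumes a kernel with the patch above
applied, and that perf_fd is a descriptor from an already running perf
"session" (opened with PERF_SAMPLE_TIME). The helper names
perf_get_time() and perf_emit_marker() are made up for this example. On
a kernel without the patch the request number is either rejected or may
mean something else entirely, so the value is only meaningful with the
patch in place.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/types.h>

/* Proposed in the patch above; unpatched headers do not define it. */
#ifndef PERF_EVENT_IOC_GET_TIME
#define PERF_EVENT_IOC_GET_TIME _IOR('$', 7, __u64)
#endif

/*
 * Hypothetical helper: ask the kernel for the current perf_clock() value
 * through an existing perf event descriptor.
 * Returns 0 and stores the timestamp in *now on success, or a negative
 * errno value on failure (in which case *now is left untouched).
 */
static int perf_get_time(int perf_fd, uint64_t *now)
{
	__u64 t = 0;

	if (ioctl(perf_fd, PERF_EVENT_IOC_GET_TIME, &t) < 0)
		return -errno;

	*now = t;
	return 0;
}

/*
 * Hypothetical helper: log a user space event together with a timestamp
 * taken from the same clock as the PERF_SAMPLE_TIME values, so that a
 * post-processing tool can line the two up.
 */
static int perf_emit_marker(int perf_fd, const char *label, FILE *out)
{
	uint64_t now;
	int err = perf_get_time(perf_fd, &now);

	if (err)
		return err;

	fprintf(out, "%llu %s\n", (unsigned long long)now, label);
	return 0;
}
```

A JIT could then call something like perf_emit_marker(fd, "jit: compiled
foo()", logfile) whenever it maps new code, and the log could be merged
with the samples by timestamp afterwards.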