On Tue, Jan 10, 2017 at 12:20:00PM -0800, David Carrillo-Cisneros wrote: > On Tue, Jan 10, 2017 at 6:14 AM, Mark Rutland <mark.rutl...@arm.com> wrote: > > On Tue, Jan 10, 2017 at 02:24:58AM -0800, David Carrillo-Cisneros wrote: > > For example, on a big.LITTLE system, big and little CPU PMUs share the > > same context, but their events are mutually incompatible. On big CPUs we > > only want to consider the sub-tree of big events, and on little CPUs we > > only want to consider little events. Hence, we need to be abel to search > > by PMU. > > I see it now. So, if PMU were added to the rb-tree keys. How can the > generic code know what's the PMU of the current CPU?
I'm not immediately sure. We might need to augment struct pmu or perf_event_context with information such that we can determine that. That's not something I'd considered in great detail, and I'm not sure if peter had something in mind. > > For SW PMUs, pmu::add() should never fail, and regardless of the order > > of the list we should be able to pmu::add() all events. Given that, why > > does the manner in which rotation occurs matter for SW PMUs? > > > >> Another complicatino is that using ctx->time (or timestamp) implies that > >> groups added during the same context switch may not have unique key. > >> This increases the complexity of that finds all events in the rb-tree > >> that are within a time interval. > > > > Could you elaborate on this? I don't understand what the problem is > > here. If we need uniqueness where {pmu,cpu,runtime} are equal, can't we > > extend the comparison to {pmu,cpu,runtime,event pointer}? That way > > everything we need is already implicit in the event, and we don't need > > perf_event::rbtree_key nor do we need > > perf_event_context::nr_inactive_added. > > Yes, we could extend the comparison. But I am trying to keep the key a > u64 to speed up things. > > I found it easier to simply create a counter and use it as an equivalent to > (timestamp, unique id). Both ways induce the same order of events. As I mentioned before, I believe that Peter's intent was to consider runtime, rather than a last-scheduled timestamp, so I don't think the counter is equivalent. It might be that either way is fine; I'll leave it to Peter to weigh in. Do we have any benchmark figures either way? Thanks, Mark.