RE: [RFC 3/6] perf/core: use rb-tree to sched in event groups

Liang, Kan Wed, 11 Jan 2017 12:34:13 -0800


> 
> Kan, in your per-cpu event list patch you mentioned that you saw a large
> overhead in perf_iterate_ctx() when skipping events for other CPUs.
> Which callers of perf_iterate_ctx() specifically was that problematic for? Do
> those callers only care about the *active* events, for example?
>


In my tests, the large overhead was observed in perf_iterate_sb().
And yes, that caller only cares about the *active* events.
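
To make the cost concrete, here is a rough user-space model of that kind
of walk. It is only a sketch, not the kernel code: struct sketch_event and
the sketch_* helpers are hypothetical names made up for illustration, and
the assumption is simply that every event in the context is visited and
the ones that are inactive or bound to another CPU are discarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct perf_event. */
struct sketch_event {
	int cpu;	/* -1 means "any CPU" */
	int active;	/* non-zero if currently scheduled in */
};

/* An event is relevant on this_cpu if it is bound to it or to any CPU. */
static int sketch_event_matches_cpu(const struct sketch_event *e, int this_cpu)
{
	return e->cpu == -1 || e->cpu == this_cpu;
}

/*
 * Walk every event in the context. Returns how many events are active and
 * match this_cpu; *skipped receives how many entries were visited only to
 * be discarded. The skipped entries are the overhead discussed above.
 */
static size_t sketch_iterate(const struct sketch_event *events, size_t n,
			     int this_cpu, size_t *skipped)
{
	size_t hits = 0;

	assert(skipped != NULL);
	*skipped = 0;

	for (size_t i = 0; i < n; i++) {
		if (!events[i].active ||
		    !sketch_event_matches_cpu(&events[i], this_cpu)) {
			(*skipped)++;
			continue;
		}
		hits++;
	}
	return hits;
}
```

With many events spread over many CPUs, most iterations land in the
skipped branch, so a list that holds only the active events for the
current CPU would avoid nearly all of that work.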

> Maybe the overhead of skipping !current_cpu events is ok at sched_in time
> in most cases. If the overhead of skipping those only matters for a subset of
> perf_iterate_ctx() callers, then maybe we can optimise them in another
> fashion (e.g. use the active events lists, or a new list specific to that
> iterate user, depending on what they actually need).
> That way we can drop cpu from the sort.
> 
> > The rb-tree allows us to find events with minimum and maximum
> > timestamp for a given CPU/cgroup + flexible type. The list
> > ctx->inactive_groups is sorted by timestamp.
> >
> > We could find a list position for the first event of each CPU/cgroup
> > that is to be scheduled and iterate over all of them, selecting events
> > from the list's head with the smallest timestamp, but it's too complicated.
> >
> > A simpler alternative is to find the smallest subinterval of
> > ctx->inactive_groups that contains all eligible events. Let's call this
> > minimum subinterval S.
> >
> > S is formed of smaller, not necessarily exclusive, subintervals.
> > Each one has all the events that are eligible for a given CPU or cgroup.
> > We find S by searching for the start/end of each one of these
> > CPU/cgroup subintervals and combining them. The drawback is that there
> > may be events in S that are not eligible (since ctx->inactive_groups is
> > in stamp order).
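
The construction of S described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the list
is modelled as an array already sorted by stamp, cgroups are left out,
the per-CPU first/last lookup is a linear scan standing in for the
rb-tree min/max search, and struct sketch_group, struct sketch_span and
the sketch_* functions are hypothetical names, not anything from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of ctx->inactive_groups. */
struct sketch_group {
	unsigned long stamp;	/* the array is sorted by this field */
	int cpu;		/* -1 means "any CPU" */
};

/* Inclusive index range [first, last]; found == 0 means empty. */
struct sketch_span {
	size_t first;
	size_t last;
	int found;
};

/* Linear-scan stand-in for the rb-tree min/max lookup for one CPU. */
static struct sketch_span sketch_span_for_cpu(const struct sketch_group *g,
					      size_t n, int cpu)
{
	struct sketch_span s = { 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (g[i].cpu != cpu)
			continue;
		if (!s.found) {
			s.first = i;
			s.found = 1;
		}
		s.last = i;
	}
	return s;
}

/* Smallest span that covers both a and b. */
static struct sketch_span sketch_span_union(struct sketch_span a,
					    struct sketch_span b)
{
	if (!a.found)
		return b;
	if (!b.found)
		return a;
	if (b.first < a.first)
		a.first = b.first;
	if (b.last > a.last)
		a.last = b.last;
	return a;
}

/*
 * Compute S for this_cpu by combining the span of cpu == -1 events with
 * the span of cpu == this_cpu events. *ineligible receives the number of
 * entries inside S that belong to some other CPU and must be skipped.
 */
static struct sketch_span sketch_find_S(const struct sketch_group *g, size_t n,
					int this_cpu, size_t *ineligible)
{
	struct sketch_span S = sketch_span_union(
		sketch_span_for_cpu(g, n, -1),
		sketch_span_for_cpu(g, n, this_cpu));

	assert(ineligible != NULL);
	*ineligible = 0;
	if (!S.found)
		return S;

	for (size_t i = S.first; i <= S.last; i++)
		if (g[i].cpu != -1 && g[i].cpu != this_cpu)
			(*ineligible)++;
	return S;
}
```

The ineligible count is the drawback mentioned in the quoted text: since
the list is in stamp order, events for other CPUs can sit between the
first and last eligible entry and still have to be stepped over.
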
> 
> The other drawback is that this is not fair, since CPU comes before runtime
> in the sort order. You'll always try some events before others (e.g. cpu == -1
> before cpu == current), before considering runtime. I believe this means
> that events can be permanently starved.
> 
> So either we need to fold those together somehow, or drop CPU from the
> sort order (assuming that we can, as above).
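
The fairness concern can be shown with a small comparator sketch. It is a
user-space model, not the patch's actual sort key: struct sketch_key and
the sketch_* functions are hypothetical, and qsort() stands in for the
tree ordering. The assumption is that events are tried in sorted order
until the counters run out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sort key for one event group. */
struct sketch_key {
	int cpu;		/* -1 means "any CPU" */
	unsigned long runtime;	/* time the group has already run */
	int id;			/* label used only to identify the group */
};

/* CPU first, then runtime: the ordering under discussion. */
static int sketch_cmp_cpu_then_runtime(const void *a, const void *b)
{
	const struct sketch_key *x = a, *y = b;

	if (x->cpu != y->cpu)
		return x->cpu < y->cpu ? -1 : 1;
	if (x->runtime != y->runtime)
		return x->runtime < y->runtime ? -1 : 1;
	return 0;
}

/* Runtime only: what dropping CPU from the sort would give. */
static int sketch_cmp_runtime_only(const void *a, const void *b)
{
	const struct sketch_key *x = a, *y = b;

	if (x->runtime != y->runtime)
		return x->runtime < y->runtime ? -1 : 1;
	return 0;
}

/* Sort the keys in place and return the id of the group tried first. */
static int sketch_first_tried(struct sketch_key *k, size_t n,
			      int (*cmp)(const void *, const void *))
{
	assert(k != NULL && n > 0);
	qsort(k, n, sizeof(*k), cmp);
	return k[0].id;
}
```

For example, take one group bound to CPU 2 that has never run and two
cpu == -1 groups with runtimes 500 and 900. With CPU first in the key,
the CPU 2 group is tried last every time, whatever its runtime; if the
counters only fit two groups it never gets in. With runtime alone, it is
tried first.
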
> 
> Thanks,
> Mark.
