> Kan, in your per-cpu event list patch you mentioned that you saw a large
> overhead in perf_iterate_ctx() when skipping events for other CPUs.
> Which callers of perf_iterate_ctx() specifically was that problematic for? Do
> those callers only care about the *active* events, for example?
>
Based on my test, the large overhead was observed in perf_iterate_sb().
Yes, it only cares about the *active* events.

> Maybe the overhead of skipping !current_cpu events is ok at sched_in time
> in most cases. If the overhead of skipping those only matters for a subset of
> perf_iterate_ctx() callers, then maybe we can optimise them in another
> fashion (e.g. use the active events lists, or a new list specific to that
> iterate user, depending on what they actually need).
>
> That way we can drop cpu from the sort.
>
> > The rb-tree allows us to find events with minimum and maximum
> > timestamp for a given CPU/cgroup + flexible type. The list
> > ctx->inactive_groups is sorted by timestamp.
> >
> > We could find a list position for the first event of each CPU/cgroup
> > that is to be scheduled and iterate over all of them, selecting events
> > from the list's head with the smallest timestamp, but it's too complicated.
> >
> > A simpler alternative is to find the smallest subinterval of
> > ctx->inactive_groups that contains all eligible events. Let's call this
> > minimum subinterval S.
> >
> > S is formed of smaller, not necessarily exclusive, subintervals.
> > Each one has all the events that are eligible for a given CPU or cgroup.
> > We find S by searching for the start/end of each one of these
> > CPU/cgroup subintervals and combining them. The drawback is that there
> > may be events in S that are not eligible (since ctx->inactive_groups is
> > in stamp order).
>
> The other drawback is that this is not fair, since CPU comes before runtime
> in the sort order. You'll always try some events before others (e.g. cpu == -1
> before cpu == current), before considering runtime. I believe this means
> that events can be permanently starved.
>
> So either we need to fold those together somehow, or drop CPU from the
> sort order (assuming that we can, as above).
>
> Thanks,
> Mark.
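
To make the "minimum subinterval S" idea from the quoted text concrete, here is
a small toy model in Python. It is only a sketch, not kernel code: the Event
tuple, find_s() and the sample data are hypothetical names made up for
illustration, and a linear scan stands in for the rb-tree lookup of the
first/last event per CPU. cgroups are left out entirely.

```python
from typing import List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    name: str
    cpu: int    # -1 means "any CPU"
    stamp: int  # hypothetical timestamp; the list is kept sorted by this


def eligible(ev: Event, cpu: int) -> bool:
    """An event may be scheduled on `cpu` if it is unbound or bound to it."""
    return ev.cpu == -1 or ev.cpu == cpu


def find_s(inactive: List[Event], cpu: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (lo, hi) indices of the smallest slice of the
    stamp-sorted list that contains every event eligible for `cpu`.

    Each key (-1 and `cpu`) contributes its own subinterval; S is the
    span covering all of them. Returns None if nothing is eligible.
    """
    lo = hi = None
    for key in (-1, cpu):
        # Stand-in for the rb-tree min/max lookup for this key.
        idx = [i for i, ev in enumerate(inactive) if ev.cpu == key]
        if not idx:
            continue
        lo = idx[0] if lo is None else min(lo, idx[0])
        hi = idx[-1] if hi is None else max(hi, idx[-1])
    return None if lo is None else (lo, hi)


# Sample list, already in stamp order.
inactive = [
    Event("a", 0, 10),
    Event("b", 1, 20),
    Event("c", -1, 30),
    Event("d", 1, 40),
    Event("e", 0, 50),
    Event("f", 2, 60),
]

s = find_s(inactive, cpu=0)
lo, hi = s
# Events inside S that still have to be skipped on CPU 0.
skipped = [ev.name for ev in inactive[lo:hi + 1] if not eligible(ev, 0)]
```

For CPU 0, S spans indices 0 through 4, and "b" and "d" sit inside S even
though they are bound to CPU 1. That is the first drawback mentioned in the
quoted text: walking S still means skipping ineligible events.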

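The fairness concern can also be shown with a toy model. This is a sketch
under strong assumptions, not the kernel's scheduling logic: simulate(), the
single counter slot and the event names are all hypothetical. Each round, the
first `slots` eligible events in sort order run and are charged one unit of
runtime.

```python
from typing import Callable, Dict


def simulate(events: Dict[str, int], cpu: int, slots: int, rounds: int,
             key: Callable[[int, int], object]) -> Dict[str, int]:
    """Toy rotation model.

    events maps an event name to its cpu filter (-1 means any CPU).
    key(cpu_filter, runtime) gives the sort key used to pick events.
    Returns the accumulated runtime per event after `rounds` rounds.
    """
    runtime = {name: 0 for name in events}
    for _ in range(rounds):
        runnable = [n for n, c in events.items() if c in (-1, cpu)]
        # Stable sort: ties keep insertion order.
        runnable.sort(key=lambda n: key(events[n], runtime[n]))
        for n in runnable[:slots]:
            runtime[n] += 1
    return runtime


events = {"any0": -1, "any1": -1, "cpu0": 0}

# CPU comes before runtime in the sort key.
cpu_first = simulate(events, cpu=0, slots=1, rounds=6,
                     key=lambda c, rt: (c, rt))

# Runtime only, CPU dropped from the sort key.
runtime_only = simulate(events, cpu=0, slots=1, rounds=6,
                        key=lambda c, rt: rt)
```

In this model, with CPU first in the key, the two cpu == -1 events take turns
and "cpu0" never accumulates any runtime, however many rounds are run. With
runtime as the only key, all three events end up with equal runtime. Real
contexts have more counters and constraints, so this only illustrates the
shape of the problem.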
