Hi Peter, On Tue, Apr 20, 2021 at 7:28 PM Peter Zijlstra <pet...@infradead.org> wrote: > > On Fri, Apr 16, 2021 at 06:49:09PM +0900, Namhyung Kim wrote: > > On Thu, Apr 15, 2021 at 11:51 PM Peter Zijlstra <pet...@infradead.org> > > wrote: > > > > +static void perf_update_cgroup_node(struct perf_event *event, struct > > > > cgroup *cgrp) > > > > +{ > > > > + u64 delta_count, delta_time_enabled, delta_time_running; > > > > + int i; > > > > + > > > > + if (event->cgrp_node_count == 0) > > > > + goto out; > > > > + > > > > + delta_count = local64_read(&event->count) - > > > > event->cgrp_node_count; > > From here... > > > > > + delta_time_enabled = event->total_time_enabled - > > > > event->cgrp_node_time_enabled; > > > > + delta_time_running = event->total_time_running - > > > > event->cgrp_node_time_running; > > > > + > > > > + /* account delta to all ancestor cgroups */ > > > > + for (i = 0; i <= cgrp->level; i++) { > > > > + struct perf_cgroup_node *node; > > > > + > > > > + node = find_cgroup_node(event, cgrp->ancestor_ids[i]); > > > > + if (node) { > > > > + node->count += delta_count; > > > > + node->time_enabled += delta_time_enabled; > > > > + node->time_running += delta_time_running; > > > > + } > > > > + } > > ... till here, NMI could hit and increment event->count, which then > means that: > > > > > + > > > > +out: > > > > + event->cgrp_node_count = local64_read(&event->count); > > This load doesn't match the delta_count load and events will go missing. > > Obviously correct solution is: > > event->cgrp_node_count += delta_count; > > > > > > + event->cgrp_node_time_enabled = event->total_time_enabled; > > > > + event->cgrp_node_time_running = event->total_time_running; > > And while total_time doesn't have that problem, consistency would then > have you do: > > event->cgrp_node_time_foo += delta_time_foo; > > > > > > > This is wrong; there's no guarantee these are the same values you read > > > at the begin, IOW you could be loosing events. > > > > Could you please elaborate? > > You forgot NMI.
Thanks for your explanation. Maybe I'm missing something but this event is basically for counting and doesn't allow sampling. Do you say it's affected by other sampling events? Note that it's not reading from the PMU here, what it reads is a snapshot of last pmu->read(event) afaik. Thanks, Namhyung