On Tue, Feb 06, 2018 at 12:58:23PM -0500, Liang, Kan wrote:
> >
> > With the exception of handling 'empty' buffers, I ended up with the
> > below. Please try again.
> >
>
> There are two small errors. After fixing them, the patch works well.

Well, it still doesn't do A, two read()s without PEBS record in between.
So that needs fixing. What 3/5 does, call x86_perf_event_update() after
drain_pebs() is actively wrong after this patch.

> > +
> > +	/*
> > +	 * Careful, not all hw sign-extends above the physical width
> > +	 * of the counter.
> > +	 */
> > +	delta = (new_raw_count << shift) - (prev_raw_count << shift);
> > +	delta >>= shift;
>
> new_raw_count could be smaller than prev_raw_count.
> The sign bit will be set. The delta>> could be wrong.
>
> I think we can add a period here to prevent it.
> +	delta = (period << shift) + (new_raw_count << shift) -
> +		(prev_raw_count << shift);
> +	delta >>= shift;
> ......
> +	local64_add(delta + period * (count - 1), &event->count);
>

Right it does, but that wrecks case A again, because then we get here
with !@count.

Maybe something like:

	s64 new, old;

	new = ((s64)(new_raw_count << shift) >> shift);
	old = ((s64)(old_raw_count << shift) >> shift);

	local64_add(new - old + count * period, &event->count);

And then make intel_pmu_drain_pebs_*(), call this function even when !n.
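
To make the arithmetic concrete, here is a minimal userspace sketch of the
sign-extend-then-subtract idea. It is not the kernel code. The 48-bit counter
width, the helper names sign_extend() and pebs_delta(), and the assumption
that the counter is reloaded with -period after each PEBS record are all
illustrative. It also relies on arithmetic right shift of negative values,
which is what GCC and Clang do but which ISO C leaves implementation-defined.

```c
#include <stdint.h>

/* Assumed physical counter width; real hardware reports its own width. */
#define CNTVAL_BITS	48
#define CNTVAL_MASK	((1ULL << CNTVAL_BITS) - 1)

/*
 * Sign-extend a raw counter value from CNTVAL_BITS to 64 bits.
 * Shifting left discards whatever the hardware left above the physical
 * width; the signed right shift then replicates the top counter bit.
 */
static int64_t sign_extend(uint64_t raw, int shift)
{
	return (int64_t)(raw << shift) >> shift;
}

/*
 * Hypothetical helper: events counted between two raw reads, given that
 * 'count' PEBS records (each one a reload to -period) happened in between.
 * With count == 0 this is the plain "two reads, no record" case.
 */
static int64_t pebs_delta(uint64_t prev_raw, uint64_t new_raw,
			  int64_t period, int count)
{
	int shift = 64 - CNTVAL_BITS;
	int64_t new_val = sign_extend(new_raw, shift);
	int64_t old_val = sign_extend(prev_raw, shift);

	return new_val - old_val + (int64_t)count * period;
}
```

With period = 1000, a read at -1000 followed by a read at -400 with no record
in between gives 600. A read at -200, one record, then a read at -900 gives
-700 + 1000 = 300, so new < old no longer needs special handling.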