Namhyung Kim <namhy...@kernel.org> writes: > Hi Peter and Kan, > > (Adding PPC folks) > > On Tue, Nov 17, 2020 at 2:01 PM Namhyung Kim <namhy...@kernel.org> wrote: >> >> Hello, >> >> On Thu, Nov 12, 2020 at 4:54 AM Liang, Kan <kan.li...@linux.intel.com> wrote: >> > >> > >> > >> > On 11/11/2020 11:25 AM, Peter Zijlstra wrote: >> > > On Mon, Nov 09, 2020 at 09:49:31AM -0500, Liang, Kan wrote: >> > > >> > >> - When the large PEBS was introduced (9c964efa4330), the sched_task() >> > >> should >> > >> be invoked to flush the PEBS buffer in each context switch. However, The >> > >> perf_sched_events in account_event() is not updated accordingly. The >> > >> perf_event_task_sched_* never be invoked for a pure per-CPU context. >> > >> Only >> > >> per-task event works. >> > >> At that time, the perf_pmu_sched_task() is outside of >> > >> perf_event_context_sched_in/out. It means that perf has to double >> > >> perf_pmu_disable() for per-task event. >> > > >> > >> - The patch 1 tries to fix broken per-CPU events. The CPU context >> > >> cannot be >> > >> retrieved from the task->perf_event_ctxp. So it has to be tracked in the >> > >> sched_cb_list. Yes, the code is very similar to the original codes, but >> > >> it >> > >> is actually the new code for per-CPU events. The optimization for >> > >> per-task >> > >> events is still kept. >> > >> For the case, which has both a CPU context and a task context, yes, >> > >> the >> > >> __perf_pmu_sched_task() in this patch is not invoked. Because the >> > >> sched_task() only need to be invoked once in a context switch. The >> > >> sched_task() will be eventually invoked in the task context. >> > > >> > > The thing is; your first two patches rely on PERF_ATTACH_SCHED_CB and >> > > only set that for large pebs. Are you sure the other users (Intel LBR >> > > and PowerPC BHRB) don't need it? >> > >> > I didn't set it for LBR, because the perf_sched_events is always enabled >> > for LBR. But, yes, we should explicitly set the PERF_ATTACH_SCHED_CB >> > for LBR. >> > >> > if (has_branch_stack(event)) >> > inc = true; >> > >> > > >> > > If they indeed do not require the pmu::sched_task() callback for CPU >> > > events, then I still think the whole perf_sched_cb_{inc,dec}() interface >> > >> > No, LBR requires the pmu::sched_task() callback for CPU events. >> > >> > Now, The LBR registers have to be reset in sched in even for CPU events. >> > >> > To fix the shorter LBR callstack issue for CPU events, we also need to >> > save/restore LBRs in pmu::sched_task(). >> > https://lore.kernel.org/lkml/1578495789-95006-4-git-send-email-kan.li...@linux.intel.com/ >> > >> > > is confusing at best. >> > > >> > > Can't we do something like this instead? >> > > >> > I think the below patch may have two issues. >> > - PERF_ATTACH_SCHED_CB is required for LBR (maybe PowerPC BHRB as well) >> > now. >> > - We may disable the large PEBS later if not all PEBS events support >> > large PEBS. The PMU need a way to notify the generic code to decrease >> > the nr_sched_task. >> >> Any updates on this? I've reviewed and tested Kan's patches >> and they all look good. >> >> Maybe we can talk to PPC folks to confirm the BHRB case? > > Can we move this forward? I saw patch 3/3 also adds PERF_ATTACH_SCHED_CB > for PowerPC too. But it'd be nice if ppc folks can confirm the change.
Sorry I've read the whole thread, but I'm still not entirely sure I understand the question. cheers