On Mon, 29 Mar 2021 11:38:04 +0100,
Jingyi Wang <wangjingy...@huawei.com> wrote:
>
> On 3/29/2021 5:55 PM, Marc Zyngier wrote:
> > On Mon, 29 Mar 2021 09:52:08 +0100,
> > Jingyi Wang <wangjingy...@huawei.com> wrote:
> >>
> >> IRM, bit[40] in ICC_SGI1R, determines how the generated SGIs
> >> are distributed to PEs. If the bit is set, interrupts are routed
> >> to all PEs in the system excluding "self". We use cpumask to
> >> determine if this bit should be set and make use of that.
> >>
> >> This will reduce VM traps when broadcast IPIs are sent.
> >
> > I remember writing similar code about 4 years ago, only to realise
> > that:
> >
> > - the cost of computing the resulting mask is pretty high for large
> >   machines
> > - Linux almost never sends broadcast IPIs, so the complexity was all
> >   in vain
> >
> > What changed? Please provide supporting data showing how many IPIs we
> > actually save, and for which workload.
> Maybe we can implement send_IPI_allbutself hooks as some other
> archs do instead of computing cpumask here?

The question remains: how often is that used? x86 uses it only for NMI
(we don't broadcast our pseudo-NMI) and reboot, it seems. Anything I
missed? Do we have a different use case on arm64?

At the moment, this doesn't seem very useful.

Thanks,

        M.

-- 
Without deviation from the norm, progress is not possible.
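
For readers following the thread, here is a minimal userspace sketch of
the check being discussed. It is not taken from the posted patch or from
the kernel. It assumes at most 64 CPUs packed into a single 64-bit mask,
whereas the kernel's cpumask spans many words on large machines, which is
where the comparison cost mentioned above comes from. The helper names
is_all_but_self() and sgi1r_broadcast_val() are hypothetical. IRM at
bit[40] is as described in the quoted patch; the INTID field at bits
[27:24] is an assumption based on the GICv3 register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IRM, bit[40] of ICC_SGI1R: route to all PEs excluding "self". */
#define SGI1R_IRM_BIT      (UINT64_C(1) << 40)
/* Assumed position of the SGI INTID field (bits [27:24]). */
#define SGI1R_INTID_SHIFT  24

/*
 * Hypothetical helper: true if 'targets' is exactly the set of online
 * CPUs minus the sending CPU. Simplified to one 64-bit word.
 */
static bool is_all_but_self(uint64_t targets, uint64_t online,
                            unsigned int self)
{
    assert(self < 64);
    return targets == (online & ~(UINT64_C(1) << self));
}

/*
 * Hypothetical helper: register value for a broadcast SGI. With IRM set,
 * no affinity or target-list fields are filled in by this sketch.
 */
static uint64_t sgi1r_broadcast_val(unsigned int sgi_id)
{
    return SGI1R_IRM_BIT | ((uint64_t)(sgi_id & 0xf) << SGI1R_INTID_SHIFT);
}
```

A sender would call is_all_but_self() first and only use the broadcast
value when it returns true, falling back to per-target SGIs otherwise.
Whether that extra comparison pays off depends on how often broadcast
IPIs are actually sent, which is the open question in this thread.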