Hi John,

On Thu, Jan 02, 2020 at 05:44:39PM +0000, John Garry wrote:
> And for the overall system, we have:
>
>   PerfTop:   85864 irqs/sec  kernel:89.6%  exact:  0.0% lost: 0/34434 drop: 0/40116 [4000Hz cycles],  (all, 96 CPUs)
> --------------------------------------------------------------------------------------------------------------------------
>
>     27.43%  [kernel]  [k] arm_smmu_cmdq_issue_cmdlist
>     11.71%  [kernel]  [k] _raw_spin_unlock_irqrestore
>      6.35%  [kernel]  [k] _raw_spin_unlock_irq
>      2.65%  [kernel]  [k] get_user_pages_fast
>      2.03%  [kernel]  [k] __slab_free
>      1.55%  [kernel]  [k] tick_nohz_idle_exit
>      1.47%  [kernel]  [k] arm_lpae_map
>      1.39%  [kernel]  [k] __fget
>      1.14%  [kernel]  [k] __lock_text_start
>      1.09%  [kernel]  [k] _raw_spin_lock
>      1.08%  [kernel]  [k] bio_release_pages.part.42
>      1.03%  [kernel]  [k] __sbitmap_get_word
>      0.97%  [kernel]  [k] arm_smmu_atc_inv_domain.constprop.42
>      0.91%  [kernel]  [k] fput_many
>      0.88%  [kernel]  [k] __arm_lpae_map
>
> One thing to note is that we still spend an appreciable amount of time
> in arm_smmu_atc_inv_domain(), which is disappointing considering that
> it should effectively be a no-op.
>
> As for arm_smmu_cmdq_issue_cmdlist(), I do note that during the testing
> our batch size is 1, so we're not seeing the real benefit of the
> batching. I can't help but think that we could improve this code to try
> to combine CMD_SYNCs for small batches.
>
> Anyway, let me know your thoughts or any questions. I'll have a look
> for other possible bottlenecks if I get a chance.
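For reference, the "should effectively be a no-op" expectation comes from
the early-out on the invalidation path: if no master attached to the
domain has ATS enabled, there is nothing to invalidate. A minimal sketch
of that shape (field and helper names are from memory and approximate,
not a quote of the in-tree code):

static int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain,
				   int ssid, unsigned long iova, size_t size)
{
	/*
	 * No ATS-enabled masters means no ATC can hold a stale
	 * translation, so return without touching the command queue
	 * at all.  nr_ats_masters is an approximate name for a
	 * per-domain count maintained when ATS is enabled/disabled.
	 */
	if (!atomic_read(&smmu_domain->nr_ats_masters))
		return 0;

	/* ... otherwise build CMD_ATC_INV per master, then CMD_SYNC ... */
	return 0;
}

If that early-out is being taken, the 0.97% above is presumably just the
cost of making the call at this rate rather than real invalidation work.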
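And the "combine CMD_SYNCs for small batches" idea is essentially:
accumulate commands and let one trailing CMD_SYNC cover the lot, rather
than paying for a sync per call. A rough sketch, with names that are
made up purely to pin down the shape (the real cmdq code is considerably
more involved):

struct arm_smmu_cmdq_batch {
	u64	cmds[CMDQ_BATCH_ENTRIES * CMDQ_ENT_DWORDS];
	int	num;
};

static void arm_smmu_cmdq_batch_add(struct arm_smmu_device *smmu,
				    struct arm_smmu_cmdq_batch *batch,
				    struct arm_smmu_cmdq_ent *ent)
{
	/* Flush a full batch without a sync; submit issues the one SYNC. */
	if (batch->num == CMDQ_BATCH_ENTRIES) {
		arm_smmu_cmdq_issue_cmdlist(smmu, batch->cmds, batch->num,
					    false);
		batch->num = 0;
	}

	arm_smmu_cmdq_build_cmd(&batch->cmds[batch->num * CMDQ_ENT_DWORDS],
				ent);
	batch->num++;
}

static int arm_smmu_cmdq_batch_submit(struct arm_smmu_device *smmu,
				      struct arm_smmu_cmdq_batch *batch)
{
	/* One CMD_SYNC amortised over batch->num commands. */
	return arm_smmu_cmdq_issue_cmdlist(smmu, batch->cmds, batch->num,
					   true);
}

The hard part, of course, is getting a similar effect when each caller
only has a single command, which is where combining SYNCs from
concurrent small batches on the shared queue would come in.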
Did you ever get any more information on this? I don't have any SMMUv3
hardware any more, so I can't really dig into this myself.

Will