Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-08-28 Thread Elliot Berman




On 8/28/2023 3:56 PM, Ackerley Tng wrote:
> 1. Since the physical memory's representation is the inode and should be
> coupled to the virtual machine (as a concept, not struct kvm), should
> the binding/coupling be with the file, or the inode?
>

I've been working on Gunyah's implementation in parallel (not yet posted 
anywhere). Thus far, I've coupled the virtual machine struct to the 
struct file so that I can increment the file refcount when mapping the 
gmem to the virtual machine.


> 2. Should struct kvm still be bound to the file/inode at gmem file
> creation time, since
>
> + struct kvm isn't a good representation of a "virtual machine"
> + we currently don't have anything that really represents a "virtual
>   machine" without hardware support
>
>
> I'd also like to bring up another userspace use case that Google has:
> re-use of gmem files for rebooting guests when the KVM instance is
> destroyed and rebuilt.
>
> When rebooting a VM there are some steps relating to gmem that are
> performance-sensitive:
>
> a. Zeroing pages from the old VM when we close a gmem file/inode
> b. Deallocating pages from the old VM when we close a gmem file/inode
> c. Allocating pages for the new VM from the new gmem file/inode
> d. Zeroing pages on page allocation
>
> We want to reuse the gmem file to save re-allocating pages (b. and c.),
> and one of the two page-zeroing passes (a. or d.).
>
> Binding the gmem file to a struct kvm on creation time means the gmem
> file can't be reused with another VM on reboot. Also, host userspace is
> forced to close the gmem file to allow the old VM to be freed.
>
> For other places where files pin KVM, like the stats fd pinning vCPUs, I
> guess that matters less since there isn't much of a penalty to close and
> re-open the stats fd.

I had a 3rd question that's related to how to wire the gmem up to a 
virtual machine:


I learned of a use case for implementing copy-on-write for gmem. The premise 
would be to have a "golden copy" of the memory that multiple virtual 
machines can map in as RO. If a virtual machine tries to write to those 
pages, they get copied to a virtual machine-specific page that isn't 
shared with other VMs. How do we track those pages?


Thanks,
Elliot


Re: [PATCH v7 24/24] iommu: Convert remaining simple drivers to domain_alloc_paging()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:38PM -0300, Jason Gunthorpe wrote:
> These drivers don't support IOMMU_DOMAIN_DMA, so this commit effectively
> allows them to support that mode.
> 
> The prior work to require default_domains makes this safe because every
> one of these drivers is either compilation incompatible with dma-iommu.c,
> or already establishing a default_domain. In both cases alloc_domain()
> will never be called with IOMMU_DOMAIN_DMA for these drivers so it is safe
> to drop the test.
> 
> Removing these tests clarifies that the domain allocation path is only
> about the functionality of a paging domain and has nothing to do with
> policy of how the paging domain is used for UNMANAGED/DMA/DMA_FQ.
> 
> Tested-by: Niklas Schnelle 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/msm_iommu.c| 7 ++-
>  drivers/iommu/mtk_iommu_v1.c | 7 ++-
>  drivers/iommu/omap-iommu.c   | 7 ++-
>  drivers/iommu/s390-iommu.c   | 7 ++-
>  4 files changed, 8 insertions(+), 20 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 23/24] iommu: Convert simple drivers with DOMAIN_DMA to domain_alloc_paging()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:37PM -0300, Jason Gunthorpe wrote:
> These drivers are all trivially converted since the function is only
> called if the domain type is going to be
> IOMMU_DOMAIN_UNMANAGED/DMA.
> 
> Tested-by: Heiko Stuebner 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c | 6 ++
>  drivers/iommu/exynos-iommu.c| 7 ++-
>  drivers/iommu/ipmmu-vmsa.c  | 7 ++-
>  drivers/iommu/mtk_iommu.c   | 7 ++-
>  drivers/iommu/rockchip-iommu.c  | 7 ++-
>  drivers/iommu/sprd-iommu.c  | 7 ++-
>  drivers/iommu/sun50i-iommu.c| 9 +++--
>  drivers/iommu/tegra-smmu.c  | 7 ++-
>  8 files changed, 17 insertions(+), 40 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 22/24] iommu: Add ops->domain_alloc_paging()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:36PM -0300, Jason Gunthorpe wrote:
> This callback requests the driver to create only a __IOMMU_DOMAIN_PAGING
> domain, so it saves a few lines in a lot of drivers needlessly checking
> the type.
> 
> More critically, this allows us to sweep out all the
> IOMMU_DOMAIN_UNMANAGED and IOMMU_DOMAIN_DMA checks from a lot of the
> drivers, simplifying what is going on in the code and ultimately removing
> the now-unused special cases in drivers where they did not support
> IOMMU_DOMAIN_DMA.
> 
> domain_alloc_paging() should return a struct iommu_domain that is
> functionally compatible with ARM_DMA_USE_IOMMU, dma-iommu.c and iommufd.
> 
> Be forwards looking and pass in a 'struct device *' argument. We can
> provide this when allocating the default_domain. No drivers will look at
> this.
> 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/iommu.c | 17 ++---
>  include/linux/iommu.h |  3 +++
>  2 files changed, 17 insertions(+), 3 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



[PATCH] Update creation of flash_block_cache to accout for potential panic

2023-08-28 Thread Audra Mitchell
With PPC builds enabling CONFIG_HARDENED_USERCOPY, interacting with the RunTime
Abstraction Services (RTAS) firmware by writing to
/proc/powerpc/rtas/firmware_flash will end up triggering the mm/usercopy.c:101
assertion:

[   38.647148] rw /proc/powerpc/rtas/firmware_flash
[   38.650254] usercopy: Kernel memory overwrite attempt detected to SLUB object 'rtas_flash_cache' (offset 0, size 34)!
[   38.650264] ------------[ cut here ]------------
[   38.650264] kernel BUG at mm/usercopy.c:101!
[   38.650267] Oops: Exception in kernel mode, sig: 5 [#1]
[   38.650283] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[   38.650287] Modules linked in: binfmt_misc loop rfkill bonding tls sunrpc pseries_rng drm fuse drm_panel_orientation_quirks xfs libcrc32c sd_mod t10_pi sg ibmveth ibmvscsi scsi_transport_srp vmx_crypto
[   38.650306] CPU: 0 PID: 12898 Comm: echo Kdump: loaded Not tainted 5.14.0-299.el9.ppc64le #1
[   38.650311] NIP:  c056d870 LR: c056d86c CTR: c0886090
[   38.650314] REGS: c000ba6e78c0 TRAP: 0700   Not tainted  
(5.14.0-299.el9.ppc64le)
[   38.650318] MSR:  80029033   CR: 28002203  
XER: 2004
[   38.650326] CFAR: c01f76fc IRQMASK: 0
[   38.650326] GPR00: c056d86c c000ba6e7b60 c2b15a00 
0069
[   38.650326] GPR04: c00fff447f90 c00fff4ccd00 000f 
0027
[   38.650326] GPR08:  c00fff44adc0 000ffd2f 
2000
[   38.650326] GPR12: 6174722720746365 c2ea  

[   38.650326] GPR16:    

[   38.650326] GPR20:   0002 
0001
[   38.650326] GPR24:  00012eef55a0 c25f39e0 
c000b988d000
[   38.650326] GPR28: c000b988d022 0022  
c134d6e8
[   38.650366] NIP [c056d870] usercopy_abort+0xb0/0xc0
[   38.650373] LR [c056d86c] usercopy_abort+0xac/0xc0
[   38.650377] Call Trace:
[   38.650379] [c000ba6e7b60] [c056d86c] usercopy_abort+0xac/0xc0 (unreliable)
[   38.650384] [c000ba6e7be0] [c05178f0] __check_heap_object+0xf0/0x120
[   38.650389] [c000ba6e7c00] [c056d5e0] check_heap_object+0x1f0/0x220
[   38.650394] [c000ba6e7c40] [c056d6a0] __check_object_size+0x90/0x1b0
[   38.650399] [c000ba6e7c80] [c00462fc] rtas_flash_write+0x11c/0x2b0
[   38.650404] [c000ba6e7ce0] [c064d2ec] proc_reg_write+0xfc/0x160
[   38.650409] [c000ba6e7d10] [c0579e64] vfs_write+0xe4/0x390
[   38.650413] [c000ba6e7d60] [c057a414] ksys_write+0x84/0x140
[   38.650417] [c000ba6e7db0] [c002f314] system_call_exception+0x164/0x310
[   38.650421] [c000ba6e7e10] [c000bfe8] system_call_vectored_common+0xe8/0x278
[   38.650426] --- interrupt: 3000 at 0x7fff87f3aa34
[   38.650430] NIP:  7fff87f3aa34 LR:  CTR: 
[   38.650433] REGS: c000ba6e7e80 TRAP: 3000   Not tainted  
(5.14.0-299.el9.ppc64le)
[   38.650436] MSR:  8280f033   CR: 
42002408  XER: 
[   38.650446] IRQMASK: 0

This used to be caught with a warning in __check_heap_object to allow impacted
drivers time to update to kmem_cache_create_usercopy, but commit 53944f171a89d
("mm: remove HARDENED_USERCOPY_FALLBACK") removed that fallback, so the copy
now hits the BUG shown above. To resolve this issue, create the
flash_block_cache with kmem_cache_create_usercopy and whitelist the whole
object for user copies (useroffset 0, usersize RTAS_BLK_SIZE).

Signed-off-by: Audra Mitchell 
---
 arch/powerpc/kernel/rtas_flash.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index a99179d83538..0a156a600f31 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -710,9 +710,9 @@ static int __init rtas_flash_init(void)
 	if (!rtas_validate_flash_data.buf)
 		return -ENOMEM;
 
-	flash_block_cache = kmem_cache_create("rtas_flash_cache",
+	flash_block_cache = kmem_cache_create_usercopy("rtas_flash_cache",
 					      RTAS_BLK_SIZE, RTAS_BLK_SIZE, 0,
-					      NULL);
+					      0, RTAS_BLK_SIZE, NULL);
 	if (!flash_block_cache) {
 		printk(KERN_ERR "%s: failed to create block cache\n",
 		       __func__);
-- 
2.40.1
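
For readers unfamiliar with the check that fires in the log above, here is a
minimal userspace sketch of the window test that hardened usercopy applies to
slab objects. It is only a model: struct cache_model and usercopy_allowed()
are hypothetical names, and the 4096-byte object size in the note below is an
assumed stand-in for RTAS_BLK_SIZE. A cache created without a usercopy window
has usersize 0, so any non-empty copy is rejected; a window covering the
whole object lets the same copy through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a slab cache's usercopy window. */
struct cache_model {
	size_t object_size;	/* size of each object in the cache */
	size_t useroffset;	/* start of the region user copies may touch */
	size_t usersize;	/* length of that region; 0 means no window */
};

/*
 * Return true if copying n bytes, starting offset bytes into an object,
 * stays entirely inside the whitelisted window.
 */
static bool usercopy_allowed(const struct cache_model *c, size_t offset,
			     size_t n)
{
	/* The window itself must fit inside the object. */
	assert(c->useroffset + c->usersize <= c->object_size);

	if (offset < c->useroffset)
		return false;
	if (offset - c->useroffset > c->usersize)
		return false;
	/* Bytes left in the window from offset onwards must cover n. */
	return n <= c->usersize - (offset - c->useroffset);
}
```

With a model cache of { 4096, 0, 0 } the 34-byte write at offset 0 from the
log is refused, while { 4096, 0, 4096 } accepts it, which is the effect the
patch is after.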



Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory

2023-08-28 Thread Ackerley Tng
Sean Christopherson  writes:

> On Mon, Aug 21, 2023, Ackerley Tng wrote:
>> Sean Christopherson  writes:
>>
>> > On Tue, Aug 15, 2023, Ackerley Tng wrote:
>> >> Sean Christopherson  writes:
>> >> > Nullifying the KVM pointer isn't sufficient, because without additional
>> >> > actions userspace could extract data from a VM by deleting its memslots
>> >> > and then binding the guest_memfd to an attacker-controlled VM.  Or more
>> >> > likely with TDX and SNP, induce badness by coercing KVM into mapping
>> >> > memory into a guest with the wrong ASID/HKID.
>> >> >
>> >> > I can think of three ways to handle that:
>> >> >
>> >> >   (a) prevent a different VM from *ever* binding to the gmem instance
>> >> >   (b) free/zero physical pages when unbinding
>> >> >   (c) free/zero when binding to a different VM
>> >> >
>> >> > Option (a) is easy, but that pretty much defeats the purpose of
>> >> > decoupling guest_memfd from a VM.
>> >> >
>> >> > Option (b) isn't hard to implement, but it screws up the lifecycle of
>> >> > the memory, e.g. would require memory when a memslot is deleted.  That
>> >> > isn't necessarily a deal-breaker, but it runs counter to how KVM
>> >> > memslots currently operate.  Memslots are basically just weird page
>> >> > tables, e.g. deleting a memslot doesn't have any impact on the
>> >> > underlying data in memory.  TDX throws a wrench in this as removing a
>> >> > page from the Secure EPT is effectively destructive to the data (can't
>> >> > be mapped back in to the VM without zeroing the data), but IMO that's
>> >> > an oddity with TDX and not necessarily something we want to carry over
>> >> > to other VM types.
>> >> >
>> >> > There would also be performance implications (probably a non-issue in
>> >> > practice), and weirdness if/when we get to sharing, linking and/or
>> >> > mmap()ing gmem.  E.g. what should happen if the last memslot (binding)
>> >> > is deleted, but there are outstanding userspace mappings?
>> >> >
>> >> > Option (c) is better from a lifecycle perspective, but it adds its own
>> >> > flavor of complexity, e.g. the performant way to reclaim TDX memory
>> >> > requires the TDMR (effectively the VM pointer), and so a deferred
>> >> > reclaim doesn't really work for TDX.  And I'm pretty sure it *can't*
>> >> > work for SNP, because RMP entries must not outlive the VM; KVM can't
>> >> > reuse an ASID if there are pages assigned to that ASID in the RMP,
>> >> > i.e. until all memory belonging to the VM has been fully freed.
>
> ...
>
>> I agree with you that nulling the KVM pointer is insufficient to keep
>> host userspace out of the TCB. Among the three options (a) preventing a
>> different VM (HKID/ASID) from binding to the gmem instance, or zeroing
>> the memory either (b) on unbinding, or (c) on binding to another VM
>> (HKID/ASID),
>>
>> (a) sounds like adding a check issued to TDX/SNP upon binding and this
>> check would just return OK for software-protected VMs (line of sight
>> to removing host userspace from TCB).
>>
>> Or, we could go further for software-protected VMs and add tracking in
>> the inode to prevent the same inode from being bound to different
>> "HKID/ASID"s, perhaps like this:
>>
>> + On first binding, store the KVM pointer in the inode - not file (but
>>   not hold a refcount)
>> + On rebinding, check that the KVM matches the pointer in the inode
>> + On intra-host migration, update the KVM pointer in the inode to allow
>>   binding to the new struct kvm
>>
>> I think you meant associating the file with a struct kvm at creation
>> time as an implementation for (a), but technically since the inode is
>> the representation of memory, tracking of struct kvm should be with the
>> inode instead of the file.
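
To make the three tracking rules listed above concrete, here is a small
userspace sketch. It is an illustration only, not proposed kernel code:
struct vm_model, struct gmem_inode_model, gmem_bind() and gmem_migrate() are
all hypothetical names, and the error codes are my own choice. The inode
remembers which VM first bound it without holding a reference, rejects a bind
from any other VM, and lets an explicit migration step move ownership.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm. */
struct vm_model {
	int id;
};

/* Hypothetical stand-in for the gmem inode's private data. */
struct gmem_inode_model {
	const struct vm_model *owner;	/* NULL until first bind; no refcount */
};

/* First bind records the VM; later binds must come from the same VM. */
static int gmem_bind(struct gmem_inode_model *inode,
		     const struct vm_model *vm)
{
	if (!inode->owner) {
		inode->owner = vm;
		return 0;
	}
	return inode->owner == vm ? 0 : -EBUSY;
}

/* Intra-host migration hands the inode over to the new VM. */
static int gmem_migrate(struct gmem_inode_model *inode,
			const struct vm_model *from,
			const struct vm_model *to)
{
	if (inode->owner != from)
		return -EINVAL;
	inode->owner = to;
	return 0;
}
```

In this model a second VM gets -EBUSY from gmem_bind() until gmem_migrate()
has moved the owner, after which the old VM is the one that is refused.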
>>
>> (b) You're right that this messes up the lifecycle of the memory and
>> wouldn't work with intra-host migration.
>>
>> (c) sounds like doing the clearing on a check similar to that of (a)
>
> Sort of, though it's much nastier, because it requires the "old" KVM
> instance to be alive enough to support various operations.  I.e. we'd have
> to make stronger guarantees about exactly when the handoff/transition could
> happen.
>

Good point!

>> If we track struct kvm with the inode, then I think (a), (b) and (c) can
>> be independent of the refcounting method. What do you think?
>
> No go.  Because again, the inode (physical memory) is coupled to the
> virtual machine as a thing, not to a "struct kvm".  Or more concretely, the
> inode is coupled to an ASID or an HKID, and there can be multiple "struct
> kvm" objects associated with a single ASID.  And at some point in the
> future, I suspect we'll have multiple KVM objects per HKID too.
>
> The current SEV use case is for the migration helper, where two KVM objects
> share a single ASID (the "real" VM 

Re: [PATCH v7 21/24] iommu: Add __iommu_group_domain_alloc()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:35PM -0300, Jason Gunthorpe wrote:
> Allocate a domain from a group. Automatically obtains the iommu_ops to use
> from the device list of the group. Convert the internal callers to use it.
> 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---

Reviewed-by: Jerry Snitselaar 



Re: [PATCH] Update creation of flash_block_cache to accout for potential panic

2023-08-28 Thread Nathan Lynch
Audra Mitchell  writes:
> With PPC builds enabling CONFIG_HARDENED_USERCOPY, interacting with the
> RunTime Abstraction Services (RTAS) firmware by writing to
> /proc/powerpc/rtas/firmware_flash will end up triggering the
> mm/usercopy.c:101 assertion:

Thanks, this was fixed already:

4f3175979e62 "powerpc/rtas_flash: allow user copy to flash block cache objects"


Re: [PATCH v7 20/24] iommu: Require a default_domain for all iommu drivers

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:34PM -0300, Jason Gunthorpe wrote:
> At this point every iommu driver will cause a default_domain to be
> selected, so we can finally remove this gap from the core code.
> 
> The following table explains what each driver supports and what the
> resulting default_domain will be:
> 
>                                     ops->default_domain
>                     IDENTITY  DMA  PLATFORM  v    ARM32     dma-iommu  ARCH
> amd/iommu.c         Y         Y                   N/A       either
> apple-dart.c        Y         Y                   N/A       either
> arm-smmu.c          Y         Y                   IDENTITY  either
> qcom_iommu.c        G         Y                   IDENTITY  either
> arm-smmu-v3.c       Y         Y                   N/A       either
> exynos-iommu.c      G         Y                   IDENTITY  either
> fsl_pamu_domain.c                  Y         Y    N/A       N/A        PLATFORM
> intel/iommu.c       Y         Y                   N/A       either
> ipmmu-vmsa.c        G         Y                   IDENTITY  either
> msm_iommu.c         G                             IDENTITY  N/A
> mtk_iommu.c         G         Y                   IDENTITY  either
> mtk_iommu_v1.c      G                             IDENTITY  N/A
> omap-iommu.c        G                             IDENTITY  N/A
> rockchip-iommu.c    G         Y                   IDENTITY  either
> s390-iommu.c                       Y         Y    N/A       N/A        PLATFORM
> sprd-iommu.c                  Y                   N/A       DMA
> sun50i-iommu.c      G         Y                   IDENTITY  either
> tegra-smmu.c        G         Y                   IDENTITY  IDENTITY
> virtio-iommu.c      Y         Y                   N/A       either
> spapr                              Y         Y    N/A       N/A        PLATFORM
>  * G means ops->identity_domain is used
>  * N/A means the driver will not compile in this configuration
> 
> ARM32 drivers select an IDENTITY default domain either through
> ops->identity_domain or by directly requesting an IDENTITY domain through
> alloc_domain().
> 
> In ARM64 mode tegra-smmu will still block the use of dma-iommu.c and
> forces an IDENTITY domain.
> 
> S390 uses a PLATFORM domain to represent when the dma_ops are set to the
> s390 iommu code.
> 
> fsl_pamu uses a PLATFORM domain.
> 
> POWER SPAPR uses PLATFORM and blocking to enable its weird VFIO mode.
> 
> The x86 drivers continue unchanged.
> 
> After this patch group->default_domain is only NULL for a short period
> during bus iommu probing while all the groups are constituted. Otherwise
> it is always !NULL.
> 
> This completes changing the iommu subsystem driver contract to a system
> where the current iommu_domain always represents some form of translation
> and the driver is continuously asserting a definable translation mode.
> 
> It resolves the confusion that the original ops->detach_dev() caused
> around what translation, exactly, is the IOMMU performing after
> detach. There were at least three different answers to that question in
> the tree, they are all now clearly named with domain types.
> 
> Tested-by: Heiko Stuebner 
> Tested-by: Niklas Schnelle 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/iommu.c | 22 +++---
>  1 file changed, 7 insertions(+), 15 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 19/24] iommu/sun50i: Add an IOMMU_IDENTITIY_DOMAIN

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:33PM -0300, Jason Gunthorpe wrote:
> Prior to commit 1b932ceddd19 ("iommu: Remove detach_dev callbacks") the
> sun50i_iommu_detach_device() function was being called by
> ops->detach_dev().
> 
> This is an IDENTITY domain so convert sun50i_iommu_detach_device() into
> sun50i_iommu_identity_attach() and a full IDENTITY domain and thus hook it
> back up the same way as the old ops->detach_dev().
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/sun50i-iommu.c | 26 +++---
>  1 file changed, 19 insertions(+), 7 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 18/24] iommu/mtk_iommu: Add an IOMMU_IDENTITIY_DOMAIN

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:32PM -0300, Jason Gunthorpe wrote:
> This brings back the ops->detach_dev() code that commit
> 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it
> into an IDENTITY domain.
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/mtk_iommu.c | 23 +++
>  1 file changed, 23 insertions(+)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 17/24] iommu/ipmmu: Add an IOMMU_IDENTITIY_DOMAIN

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:31PM -0300, Jason Gunthorpe wrote:
> This brings back the ops->detach_dev() code that commit
> 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it
> into an IDENTITY domain.
> 
> Also reverts commit 584d334b1393 ("iommu/ipmmu-vmsa: Remove
> ipmmu_utlb_disable()")
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/ipmmu-vmsa.c | 43 ++
>  1 file changed, 43 insertions(+)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 16/24] iommu/qcom_iommu: Add an IOMMU_IDENTITIY_DOMAIN

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:30PM -0300, Jason Gunthorpe wrote:
> This brings back the ops->detach_dev() code that commit
> 1b932ceddd19 ("iommu: Remove detach_dev callbacks") deleted and turns it
> into an IDENTITY domain.
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/arm/arm-smmu/qcom_iommu.c | 39 +
>  1 file changed, 39 insertions(+)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 15/24] iommu: Remove ops->set_platform_dma_ops()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:29PM -0300, Jason Gunthorpe wrote:
> All drivers are now using IDENTITY or PLATFORM domains for what this did,
> so we can remove it now. It is no longer possible to attach to a NULL domain.
> 
> Tested-by: Heiko Stuebner 
> Tested-by: Niklas Schnelle 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/iommu.c | 30 +-
>  include/linux/iommu.h |  4 
>  2 files changed, 5 insertions(+), 29 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 14/24] iommu/msm: Implement an IDENTITY domain

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:28PM -0300, Jason Gunthorpe wrote:
> What msm does during msm_iommu_set_platform_dma() is actually putting the
> iommu into identity mode.
> 
> Move to the new core support for ARM_DMA_USE_IOMMU by defining
> ops->identity_domain.
> 
> This driver does not support IOMMU_DOMAIN_DMA, however it cannot be
> compiled on ARM64 either. Most likely it is fine to support dma-iommu.c
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/msm_iommu.c | 23 +++
>  1 file changed, 19 insertions(+), 4 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 13/24] iommu/omap: Implement an IDENTITY domain

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:27PM -0300, Jason Gunthorpe wrote:
> What omap does during omap_iommu_set_platform_dma() is actually putting
> the iommu into identity mode.
> 
> Move to the new core support for ARM_DMA_USE_IOMMU by defining
> ops->identity_domain.
> 
> This driver does not support IOMMU_DOMAIN_DMA, however it cannot be
> compiled on ARM64 either. Most likely it is fine to support dma-iommu.c
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/omap-iommu.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



[PATCH v2] reapply: powerpc/xmon: Relax frame size for clang

2023-08-28 Thread Nick Desaulniers
This is a manual revert of commit
7f3c5d099b6f8452dc4dcfe4179ea48e6a13d0eb, but using
ccflags-$(CONFIG_CC_IS_CLANG) which is shorter.

Turns out that this is still reproducible under specific compiler
versions (mea culpa: I did not test every supported version of clang),
and a few randconfig bots even found it.

We'll have to revisit this again in the future, for now back this out.

Reported-by: Nathan Chancellor 
Closes: https://github.com/ClangBuiltLinux/linux/issues/252#issuecomment-1690371256
Reported-by: kernel test robot 
Closes: https://lore.kernel.org/llvm/202308260344.vc4giuk7-...@intel.com/
Suggested-by: Nathan Chancellor 
Reviewed-by: Nathan Chancellor 
Signed-off-by: Nick Desaulniers 
---
Changes in v2:
- Use ccflags-$(CONFIG_CC_IS_CLANG) as per Nathan.
- Move that to be below the initial setting of ccflags-y as per Nathan.
- Add Nathan's Suggested-by and Reviewed-by tags.
- Update commit message slightly, including oneline.
- Link to v1: https://lore.kernel.org/r/20230828-ppc_rerevert-v1-1-74f55b818...@google.com
---
 arch/powerpc/xmon/Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index 7705aa74a24d..682c7c0a6f77 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -12,6 +12,10 @@ ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
 ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
 
+# Clang stores addresses on the stack causing the frame size to blow
+# out. See https://github.com/ClangBuiltLinux/linux/issues/252
+ccflags-$(CONFIG_CC_IS_CLANG) += -Wframe-larger-than=4096
+
 obj-y  += xmon.o nonstdio.o spr_access.o xmon_bpts.o
 
 ifdef CONFIG_XMON_DISASSEMBLY

---
base-commit: 2ee82481c392eec06a7ef28df61b7f0d8e45be2e
change-id: 20230828-ppc_rerevert-647427f04ce1

Best regards,
-- 
Nick Desaulniers 



Re: [PATCH v7 12/24] iommu/tegra-smmu: Support DMA domains in tegra

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:26PM -0300, Jason Gunthorpe wrote:
> All ARM64 iommu drivers should support IOMMU_DOMAIN_DMA to enable
> dma-iommu.c.
> 
> tegra is blocking dma-iommu usage, and also default_domain's, because it
> wants an identity translation. This is needed for some device quirk. The
> correct way to do this is to support IDENTITY domains and use
> ops->def_domain_type() to return IOMMU_DOMAIN_IDENTITY for only the quirky
> devices.
> 
> Add support for IOMMU_DOMAIN_DMA and force IOMMU_DOMAIN_IDENTITY mode for
> everything so no behavior changes.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/tegra-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 11/24] iommu/tegra-smmu: Implement an IDENTITY domain

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:25PM -0300, Jason Gunthorpe wrote:
> What tegra-smmu does during tegra_smmu_set_platform_dma() is actually
> putting the iommu into identity mode.
> 
> Move to the new core support for ARM_DMA_USE_IOMMU by defining
> ops->identity_domain.
> 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/tegra-smmu.c | 37 -
>  1 file changed, 32 insertions(+), 5 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 10/24] iommu/exynos: Implement an IDENTITY domain

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:24PM -0300, Jason Gunthorpe wrote:
> What exynos calls exynos_iommu_detach_device is actually putting the iommu
> into identity mode.
> 
> Move to the new core support for ARM_DMA_USE_IOMMU by defining
> ops->identity_domain.
> 
> Tested-by: Marek Szyprowski 
> Acked-by: Marek Szyprowski 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/exynos-iommu.c | 66 +---
>  1 file changed, 32 insertions(+), 34 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 09/24] iommu: Allow an IDENTITY domain as the default_domain in ARM32

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:23PM -0300, Jason Gunthorpe wrote:
> Even though dma-iommu.c and CONFIG_ARM_DMA_USE_IOMMU do approximately the
> same stuff, the way they relate to the IOMMU core is quite different.
> 
> dma-iommu.c expects the core code to setup an UNMANAGED domain (of type
> IOMMU_DOMAIN_DMA) and then configures itself to use that domain. This
> becomes the default_domain for the group.
> 
> ARM_DMA_USE_IOMMU does not use the default_domain, instead it directly
> allocates an UNMANAGED domain and operates it just like an external
> driver. In this case group->default_domain is NULL.
> 
> If the driver provides a global static identity_domain then automatically
> use it as the default_domain when in ARM_DMA_USE_IOMMU mode.
> 
> This allows drivers that implemented default_domain == NULL as an IDENTITY
> translation to trivially get a properly labeled non-NULL default_domain on
> ARM32 configs.
> 
> With this arrangement when ARM_DMA_USE_IOMMU wants to disconnect from the
> device the normal detach_domain flow will restore the IDENTITY domain as
> the default domain. Overall this makes attach_dev() of the IDENTITY domain
> called in the same places as detach_dev().
> 
> This effectively migrates these drivers to default_domain mode. For
> drivers that support ARM64 they will gain support for the IDENTITY
> translation mode for the dma_api and behave in a uniform way.
> 
> Drivers use this by setting ops->identity_domain to a static singleton
> iommu_domain that implements the identity attach. If the core detects
> ARM_DMA_USE_IOMMU mode then it automatically attaches the IDENTITY domain
> during probe.
> 
> Drivers can continue to prevent the use of DMA translation by returning
> IOMMU_DOMAIN_IDENTITY from def_domain_type, this will completely prevent
> IOMMU_DMA from running but will not impact ARM_DMA_USE_IOMMU.
> 
> This allows removing the set_platform_dma_ops() from every remaining
> driver.
> 
> Remove the set_platform_dma_ops from rockchip and mtk_v1 as all it does
> is set an existing global static identity domain. mtk_v1 does not support
> IOMMU_DOMAIN_DMA and it does not compile on ARM64 so this transformation
> is safe.
> 
> Tested-by: Steven Price 
> Tested-by: Marek Szyprowski 
> Tested-by: Nicolin Chen 
> Reviewed-by: Lu Baolu 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/iommu.c  | 21 -
>  drivers/iommu/mtk_iommu_v1.c   | 12 
>  drivers/iommu/rockchip-iommu.c | 10 --
>  3 files changed, 20 insertions(+), 23 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 



Re: [PATCH v7 08/24] iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()

2023-08-28 Thread Jerry Snitselaar
On Wed, Aug 23, 2023 at 01:47:22PM -0300, Jason Gunthorpe wrote:
> Except for dart (which forces IOMMU_DOMAIN_DMA) every driver returns 0 or
> IDENTITY from ops->def_domain_type().
> 
> The drivers that return IDENTITY have some kind of good reason, typically
> that quirky hardware really can't support anything other than IDENTITY.
> 
> Arrange things so that if the driver says it needs IDENTITY then
> iommu_get_default_domain_type() either fails or returns IDENTITY.  It will
> not ignore the driver's override to IDENTITY.
> 
> Split the function into two steps, reducing the group device list to the
> driver's def_domain_type() and the untrusted flag.
> 
> Then compute the result based on those two reduced variables. Fully reject
> combining untrusted with IDENTITY.
> 
> Remove the debugging print on the iommu_group_store_type() failure path,
> userspace should not be able to trigger kernel prints.
> 
> This makes the next patch, which wants to always force IDENTITY for
> ARM_IOMMU because there is no support for DMA, cleaner.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/iommu.c | 117 --
>  1 file changed, 79 insertions(+), 38 deletions(-)
> 

Reviewed-by: Jerry Snitselaar 
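
The two-step computation described in the quoted commit message can be
modelled in a few lines of userspace C. This is a simplified sketch under my
own assumptions, not the kernel function: enum dom_type, struct dev_model and
group_default_type() are hypothetical names, and -1 stands in for whatever
error the real code returns. Step one reduces the device list to a single
driver-requested type plus an untrusted flag; step two derives the result
from those two values and refuses to combine untrusted with IDENTITY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domain types; DOM_ANY means "no preference". */
enum dom_type { DOM_ANY = 0, DOM_IDENTITY, DOM_DMA };

/* Hypothetical per-device view: driver preference and trust flag. */
struct dev_model {
	enum dom_type def_type;
	bool untrusted;
};

/* Returns the chosen type, or -1 if the constraints cannot be met. */
static int group_default_type(const struct dev_model *devs, size_t n,
			      enum dom_type target)
{
	enum dom_type driver_type = DOM_ANY;
	bool untrusted = false;
	size_t i;

	/* Step 1: reduce the device list to two variables. */
	for (i = 0; i < n; i++) {
		if (devs[i].def_type != DOM_ANY) {
			if (driver_type != DOM_ANY &&
			    driver_type != devs[i].def_type)
				return -1;	/* conflicting requests */
			driver_type = devs[i].def_type;
		}
		untrusted = untrusted || devs[i].untrusted;
	}

	/* Step 2: compute the result from the reduced variables. */
	if (untrusted) {
		if (driver_type == DOM_IDENTITY)
			return -1;	/* untrusted + IDENTITY is rejected */
		driver_type = DOM_DMA;
	}
	if (target != DOM_ANY) {
		if (driver_type != DOM_ANY && driver_type != target)
			return -1;	/* never override the driver */
		return target;
	}
	return driver_type;
}
```

In this model a group whose driver asks for IDENTITY either gets IDENTITY or
an error, which mirrors the "will not ignore the driver's override" wording
in the commit message.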



Re: [PATCH v3] fsl_ucc_hdlc: process the result of hold_open()

2023-08-28 Thread Jakub Kicinski
On Mon, 28 Aug 2023 15:12:35 +0300 Alexandra Diupina wrote:
> Process the result of hdlc_open() and return it from
> uhdlc_open() in case of an error.
> It is necessary to pass the error code up the control flow,
> similar to a possible error in request_irq().
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Alexandra Diupina 
> ---
> v3: Fix the commits tree
> v2: Remove the 'rc' variable (stores the return value of hdlc_open())
>     as Christophe Leroy suggested
>  drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> index 47c2ad7a3e42..4164abea7725 100644
> --- a/drivers/net/wan/fsl_ucc_hdlc.c
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -731,7 +731,7 @@ static int uhdlc_open(struct net_device *dev)
>   napi_enable(&priv->napi);
>   netdev_reset_queue(dev);
>   netif_start_queue(dev);
> - hdlc_open(dev);
> + return hdlc_open(dev);

Don't you have to undo all the things done prior to hdlc_open()?
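
A minimal sketch of the unwinding this question refers to, written as standalone C with stub state instead of the real driver. Every name here (struct fake_dev, fake_hdlc_open(), fake_uhdlc_open()) is hypothetical; the real driver would have to release its IRQ, disable NAPI and stop the queue with the proper kernel calls.

```c
#include <stdbool.h>

/* Hypothetical device state standing in for the real driver's resources. */
struct fake_dev {
	bool irq_requested;
	bool napi_enabled;
	bool queue_started;
	int open_result; /* value the stubbed hdlc_open() will return */
};

/* Stub for hdlc_open(): returns 0 on success or a negative error code. */
static int fake_hdlc_open(struct fake_dev *d)
{
	return d->open_result;
}

static int fake_uhdlc_open(struct fake_dev *d)
{
	int err;

	/* Setup steps performed before the final open call. */
	d->irq_requested = true;
	d->napi_enabled = true;
	d->queue_started = true;

	err = fake_hdlc_open(d);
	if (err)
		goto err_undo;

	return 0;

err_undo:
	/* Undo the setup in reverse order before reporting the error. */
	d->queue_started = false;
	d->napi_enabled = false;
	d->irq_requested = false;
	return err;
}
```

Returning the error without the err_undo path would leave the stub device with its IRQ, NAPI and queue still marked active, which is the situation the review comment asks about.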

Before you post v4 please make sure that you've read:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#resending-after-review

Zhao, please review the next version.
-- 
pw-bot: cr


Re: [PATCH] Revert "Revert "powerpc/xmon: Relax frame size for clang""

2023-08-28 Thread Nathan Chancellor
On Mon, Aug 28, 2023 at 10:35:26AM -0700, ndesaulni...@google.com wrote:
> This reverts commit 7f3c5d099b6f8452dc4dcfe4179ea48e6a13d0eb.
> 
> Turns out that this is still reproducible under specific compiler
> versions (mea culpa: I did not test every supported version of clang),
> and a few randconfig bots even found it.
> 
> We'll have to revisit this again in the future; for now, back this out.
> 
> Reported-by: Nathan Chancellor 
> Closes: 
> https://github.com/ClangBuiltLinux/linux/issues/252#issuecomment-1690371256
> Reported-by: kernel test robot 
> Closes: https://lore.kernel.org/llvm/202308260344.vc4giuk7-...@intel.com/
> Signed-off-by: Nick Desaulniers 

I know this is just a straight reapplication of the original workaround
but could we use ccflags here instead of KBUILD_CFLAGS (with it placed
after the NO_MINIMAL_TOC assignment)?

  # clang stores addresses on the stack causing the frame size to blow
  # out. See https://github.com/ClangBuiltLinux/linux/issues/252
  ccflags-$(CONFIG_CC_IS_CLANG) += -Wframe-larger-than=4096

The addition to KBUILD_CFLAGS makes me think this flag will get applied
to the rest of the kernel, but I have not actually verified that.

Regardless:

Reviewed-by: Nathan Chancellor 

Side note, seems like b4 is still doing the thing with "From:".

> ---
>  arch/powerpc/xmon/Makefile | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
> index 7705aa74a24d..d334de392e6c 100644
> --- a/arch/powerpc/xmon/Makefile
> +++ b/arch/powerpc/xmon/Makefile
> @@ -10,6 +10,12 @@ KCSAN_SANITIZE := n
>  # Disable ftrace for the entire directory
>  ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
>  
> +ifdef CONFIG_CC_IS_CLANG
> +# clang stores addresses on the stack causing the frame size to blow
> +# out. See https://github.com/ClangBuiltLinux/linux/issues/252
> +KBUILD_CFLAGS += -Wframe-larger-than=4096
> +endif
> +
>  ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
>  
>  obj-y+= xmon.o nonstdio.o spr_access.o xmon_bpts.o
> 
> ---
> base-commit: 2ee82481c392eec06a7ef28df61b7f0d8e45be2e
> change-id: 20230828-ppc_rerevert-647427f04ce1
> 
> Best regards,
> -- 
> Nick Desaulniers 
> 


[PATCH] Revert "Revert "powerpc/xmon: Relax frame size for clang""

2023-08-28 Thread ndesaulniers
This reverts commit 7f3c5d099b6f8452dc4dcfe4179ea48e6a13d0eb.

Turns out that this is still reproducible under specific compiler
versions (mea culpa: I did not test every supported version of clang),
and a few randconfig bots even found it.

We'll have to revisit this again in the future; for now, back this out.

Reported-by: Nathan Chancellor 
Closes: 
https://github.com/ClangBuiltLinux/linux/issues/252#issuecomment-1690371256
Reported-by: kernel test robot 
Closes: https://lore.kernel.org/llvm/202308260344.vc4giuk7-...@intel.com/
Signed-off-by: Nick Desaulniers 
---
 arch/powerpc/xmon/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/xmon/Makefile b/arch/powerpc/xmon/Makefile
index 7705aa74a24d..d334de392e6c 100644
--- a/arch/powerpc/xmon/Makefile
+++ b/arch/powerpc/xmon/Makefile
@@ -10,6 +10,12 @@ KCSAN_SANITIZE := n
 # Disable ftrace for the entire directory
 ccflags-remove-$(CONFIG_FUNCTION_TRACER) += $(CC_FLAGS_FTRACE)
 
+ifdef CONFIG_CC_IS_CLANG
+# clang stores addresses on the stack causing the frame size to blow
+# out. See https://github.com/ClangBuiltLinux/linux/issues/252
+KBUILD_CFLAGS += -Wframe-larger-than=4096
+endif
+
 ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
 
 obj-y  += xmon.o nonstdio.o spr_access.o xmon_bpts.o

---
base-commit: 2ee82481c392eec06a7ef28df61b7f0d8e45be2e
change-id: 20230828-ppc_rerevert-647427f04ce1

Best regards,
-- 
Nick Desaulniers 



Re: [PATCH v3] fsl_ucc_hdlc: process the result of hold_open()

2023-08-28 Thread Christophe Leroy


On 28/08/2023 at 14:12, Alexandra Diupina wrote:
> 
> Process the result of hdlc_open() and return it from
> uhdlc_open() in case of an error.
> It is necessary to pass the error code up the control flow,
> similar to a possible error in request_irq().
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Alexandra Diupina 

Reviewed-by: Christophe Leroy 

> ---
> v3: Fix the commits tree
> v2: Remove the 'rc' variable (stores the return value of the
> hdlc_open()) as Christophe Leroy  suggested
>   drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> index 47c2ad7a3e42..4164abea7725 100644
> --- a/drivers/net/wan/fsl_ucc_hdlc.c
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -731,7 +731,7 @@ static int uhdlc_open(struct net_device *dev)
>  napi_enable(&priv->napi);
>  netdev_reset_queue(dev);
>  netif_start_queue(dev);
> -   hdlc_open(dev);
> +   return hdlc_open(dev);
>  }
> 
>  return 0;
> --
> 2.30.2
> 


[PATCH v3] fsl_ucc_hdlc: process the result of hold_open()

2023-08-28 Thread Alexandra Diupina
Process the result of hdlc_open() and return it from
uhdlc_open() in case of an error.
It is necessary to pass the error code up the control flow,
similar to a possible error in request_irq().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Alexandra Diupina 
---
v3: Fix the commits tree
v2: Remove the 'rc' variable (stores the return value of the 
hdlc_open()) as Christophe Leroy  suggested
 drivers/net/wan/fsl_ucc_hdlc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index 47c2ad7a3e42..4164abea7725 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -731,7 +731,7 @@ static int uhdlc_open(struct net_device *dev)
napi_enable(&priv->napi);
netdev_reset_queue(dev);
netif_start_queue(dev);
-   hdlc_open(dev);
+   return hdlc_open(dev);
}
 
return 0;
-- 
2.30.2



Re: (subset) [PATCH 00/17] -Wmissing-prototype warning fixes

2023-08-28 Thread Michael Schmitz

Hi Geert,

On 28.08.2023 at 18:42, Geert Uytterhoeven wrote:
> On Sat, Aug 26, 2023 at 12:44 AM Michael Schmitz  wrote:
>> (Incidentally - did you ever publish the m68k full history tree anywhere
>> in git?)
>
> You mean the gitified version of the Linux/m68k CVS tree Ralf created
> for me because my machine wasn't powerful enough?

The very same ...

> No, and I should look into doing that...

No pressure!

Cheers,

Michael

> Gr{oetje,eeting}s,
>
> Geert



Re: (subset) [PATCH 00/17] -Wmissing-prototype warning fixes

2023-08-28 Thread Geert Uytterhoeven
On Sat, Aug 26, 2023 at 12:44 AM Michael Schmitz  wrote:
> (Incidentally - did you ever publish the m68k full history tree anywhere
> in git?)

You mean the gitified version of the Linux/m68k CVS tree Ralf created
for me because my machine wasn't powerful enough?
No, and I should look into doing that...

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


crypto: powerpc/chacha20,poly1305-p10 - Add dependency on VSX

2023-08-28 Thread Herbert Xu
On Fri, Aug 25, 2023 at 07:44:32PM +0800, kernel test robot wrote:
>
> All errors (new ones prefixed by >>):
> 
>In file included from arch/powerpc/crypto/poly1305-p10-glue.c:19:

...

> ae3a197e3d0bfe3 David Howells2012-03-28  75  
> ae3a197e3d0bfe3 David Howells2012-03-28  76  #ifdef CONFIG_VSX
> d1e1cf2e38def30 Anton Blanchard  2015-10-29  77  extern void 
> enable_kernel_vsx(void);
> ae3a197e3d0bfe3 David Howells2012-03-28  78  extern void 
> flush_vsx_to_thread(struct task_struct *);
> 3eb5d5888dc68c9 Anton Blanchard  2015-10-29  79  static inline void 
> disable_kernel_vsx(void)
> 3eb5d5888dc68c9 Anton Blanchard  2015-10-29  80  {
> 3eb5d5888dc68c9 Anton Blanchard  2015-10-29  81   
> msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX);
> 3eb5d5888dc68c9 Anton Blanchard  2015-10-29  82  }
> bd73758803c2eed Christophe Leroy 2021-03-09  83  #else
> bd73758803c2eed Christophe Leroy 2021-03-09  84  static inline void 
> enable_kernel_vsx(void)
> bd73758803c2eed Christophe Leroy 2021-03-09  85  {
> bd73758803c2eed Christophe Leroy 2021-03-09 @86   BUILD_BUG();
> bd73758803c2eed Christophe Leroy 2021-03-09  87  }
> bd73758803c2eed Christophe Leroy 2021-03-09  88  

---8<---
Add dependency on VSX as otherwise the build will fail without
it.

Fixes: 161fca7e3e90 ("crypto: powerpc - Add chacha20/poly1305-p10 to Kconfig 
and Makefile")
Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202308251906.syawej6g-...@intel.com/
Signed-off-by: Herbert Xu 

diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig
index f25024afdda5..7a66d7c0e6a2 100644
--- a/arch/powerpc/crypto/Kconfig
+++ b/arch/powerpc/crypto/Kconfig
@@ -113,7 +113,7 @@ config CRYPTO_AES_GCM_P10
 
 config CRYPTO_CHACHA20_P10
tristate "Ciphers: ChaCha20, XChacha20, XChacha12 (P10 or later)"
-   depends on PPC64 && CPU_LITTLE_ENDIAN
+   depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
select CRYPTO_SKCIPHER
select CRYPTO_LIB_CHACHA_GENERIC
select CRYPTO_ARCH_HAVE_LIB_CHACHA
@@ -127,7 +127,7 @@ config CRYPTO_CHACHA20_P10
 
 config CRYPTO_POLY1305_P10
tristate "Hash functions: Poly1305 (P10 or later)"
-   depends on PPC64 && CPU_LITTLE_ENDIAN
+   depends on PPC64 && CPU_LITTLE_ENDIAN && VSX
select CRYPTO_HASH
select CRYPTO_LIB_POLY1305_GENERIC
help
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2] fsl_ucc_hdlc: add a check of the return value from hdlc_open

2023-08-28 Thread Christophe Leroy


On 28/08/2023 at 10:27, Alexandra Diupina wrote:
> 
> Process the result of hdlc_open() and return it from
> uhdlc_open() in case of an error.
> It is necessary to pass the error code up the control flow,
> similar to a possible error in request_irq().
> 
> Found by Linux Verification Center (linuxtesting.org) with SVACE.
> 
> Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
> Signed-off-by: Alexandra Diupina 
> ---
> v2: Remove the 'rc' variable (stores the return value of the
> hdlc_open()) as Christophe Leroy  suggested
>   drivers/net/wan/fsl_ucc_hdlc.c | 7 ++-
>   1 file changed, 2 insertions(+), 5 deletions(-)

I think you made a mistake. A v2 should replace v1, not come in 
addition to it. So you have to squash this patch into the previous one 
before resending.

Christophe

> 
> diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
> index cdd9489c712e..4164abea7725 100644
> --- a/drivers/net/wan/fsl_ucc_hdlc.c
> +++ b/drivers/net/wan/fsl_ucc_hdlc.c
> @@ -708,7 +708,6 @@ static int uhdlc_open(struct net_device *dev)
>  hdlc_device *hdlc = dev_to_hdlc(dev);
>  struct ucc_hdlc_private *priv = hdlc->priv;
>  struct ucc_tdm *utdm = priv->utdm;
> -   int rc = 0;
> 
>  if (priv->hdlc_busy != 1) {
>  if (request_irq(priv->ut_info->uf_info.irq,
> @@ -732,12 +731,10 @@ static int uhdlc_open(struct net_device *dev)
>  napi_enable(&priv->napi);
>  netdev_reset_queue(dev);
>  netif_start_queue(dev);
> -   rc = hdlc_open(dev);
> -   if (rc)
> -   return rc;
> +   return hdlc_open(dev);
>  }
> 
> -   return rc;
> +   return 0;
>   }
> 
>   static void uhdlc_memclean(struct ucc_hdlc_private *priv)
> --
> 2.30.2
> 


[PATCH v2] fsl_ucc_hdlc: add a check of the return value from hdlc_open

2023-08-28 Thread Alexandra Diupina
Process the result of hdlc_open() and return it from
uhdlc_open() in case of an error.
It is necessary to pass the error code up the control flow,
similar to a possible error in request_irq().

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
Signed-off-by: Alexandra Diupina 
---
v2: Remove the 'rc' variable (stores the return value of the 
hdlc_open()) as Christophe Leroy  suggested
 drivers/net/wan/fsl_ucc_hdlc.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wan/fsl_ucc_hdlc.c b/drivers/net/wan/fsl_ucc_hdlc.c
index cdd9489c712e..4164abea7725 100644
--- a/drivers/net/wan/fsl_ucc_hdlc.c
+++ b/drivers/net/wan/fsl_ucc_hdlc.c
@@ -708,7 +708,6 @@ static int uhdlc_open(struct net_device *dev)
hdlc_device *hdlc = dev_to_hdlc(dev);
struct ucc_hdlc_private *priv = hdlc->priv;
struct ucc_tdm *utdm = priv->utdm;
-   int rc = 0;
 
if (priv->hdlc_busy != 1) {
if (request_irq(priv->ut_info->uf_info.irq,
@@ -732,12 +731,10 @@ static int uhdlc_open(struct net_device *dev)
napi_enable(&priv->napi);
netdev_reset_queue(dev);
netif_start_queue(dev);
-   rc = hdlc_open(dev);
-   if (rc)
-   return rc;
+   return hdlc_open(dev);
}
 
-   return rc;
+   return 0;
 }
 
 static void uhdlc_memclean(struct ucc_hdlc_private *priv)
-- 
2.30.2



Re: [PATCH 1/2] powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled

2023-08-28 Thread Aneesh Kumar K V
On 8/28/23 1:16 PM, Aneesh Kumar K.V wrote:
> With CONFIG_SPARSEMEM disabled the below kernel build error is observed.
> 
>  arch/powerpc/mm/init_64.c:477:38: error: use of undeclared identifier 
> 'SECTION_SIZE_BITS'
> 
> CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, and it is clearer
> to describe the code dependency in terms of MEMORY_HOTPLUG. Outside
> memory hotplug the kernel uses memory_block_size for the kernel direct map.
> Instead of depending on SECTION_SIZE_BITS to compute the direct map
> page size, add a new #define which defaults to 16M (the same as the existing
> SECTION_SIZE).
> 

Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202308251532.k9ppwead-...@intel.com/

> Fixes: 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing")
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/mm/init_64.c | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index fcda46c2b8df..e3d7379ef480 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -472,12 +472,23 @@ static int __init dt_scan_mmu_pid_width(unsigned long 
> node,
>   return 1;
>  }
>  
> +/*
> + * Outside hotplug the kernel uses this value to map the kernel direct map
> + * with radix. To be compatible with older kernels, let's keep this value
> + * as 16M which is also SECTION_SIZE with SPARSEMEM. We can ideally map
> + * things with 1GB size in the case where we don't support hotplug.
> + */
> +#ifndef CONFIG_MEMORY_HOTPLUG
> +#define DEFAULT_MEMORY_BLOCK_SIZE  SZ_16M
> +#else
> +#define DEFAULT_MEMORY_BLOCK_SIZE  MIN_MEMORY_BLOCK_SIZE
> +#endif
> +
>  static void update_memory_block_size(unsigned long *block_size, unsigned 
> long mem_size)
>  {
> - unsigned long section_size = 1UL << SECTION_SIZE_BITS;
> -
> - for (; *block_size > section_size; *block_size >>= 2) {
> + unsigned long min_memory_block_size = DEFAULT_MEMORY_BLOCK_SIZE;
>  
> + for (; *block_size > min_memory_block_size; *block_size >>= 2) {
>   if ((mem_size & *block_size) == 0)
>   break;
>   }
> @@ -507,7 +518,7 @@ static int __init probe_memory_block_size(unsigned long 
> node, const char *uname,
>   /*
>* Nothing in the device tree
>*/
> - *block_size = MIN_MEMORY_BLOCK_SIZE;
> + *block_size = DEFAULT_MEMORY_BLOCK_SIZE;
>   else
>   *block_size = of_read_number(prop, dt_root_size_cells);
>   /*



[PATCH 2/2] powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached

2023-08-28 Thread Aneesh Kumar K.V
Commit 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing")
used 256MB as the memory block size when an
ibm,coherent-device-memory device tree node is present. Instead of
returning immediately with a 256MB memory block size, continue to check the
rest of the memory regions and make sure we can still map them using a 256MB
memory block size.

Fixes: 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing")
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/init_64.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index e3d7379ef480..a8557867ece0 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -569,8 +569,12 @@ static int __init probe_memory_block_size(unsigned long 
node, const char *uname,
 */
compatible = of_get_flat_dt_prop(node, "compatible", NULL);
if (compatible && !strcmp(compatible, 
"ibm,coherent-device-memory")) {
-   *block_size = SZ_256M;
-   return 1;
+   if (*block_size > SZ_256M)
+   *block_size = SZ_256M;
+   /*
+* We keep 256M as the upper limit with GPU present.
+*/
+   return 0;
}
}
/* continue looking for other memory device types */
-- 
2.41.0
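
The behavioural change in this patch can be illustrated with a small standalone helper. This is a sketch under assumptions, not the kernel function: SKETCH_SZ_256M and sketch_clamp_for_coherent_memory() are hypothetical names, and the device tree scan itself is left out.

```c
/* Hypothetical constant mirroring the 256M limit named in the patch. */
#define SKETCH_SZ_256M (256UL << 20)

/*
 * When coherent device memory is present, 256M becomes an upper limit
 * rather than a forced value: a smaller block size found so far is kept,
 * and the caller continues scanning the remaining memory nodes.
 */
static unsigned long sketch_clamp_for_coherent_memory(unsigned long block_size)
{
	if (block_size > SKETCH_SZ_256M)
		block_size = SKETCH_SZ_256M;
	return block_size;
}
```

For example, a 1G block size is reduced to 256M, while a 16M block size is left untouched.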



[PATCH 1/2] powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled

2023-08-28 Thread Aneesh Kumar K.V
With CONFIG_SPARSEMEM disabled the below kernel build error is observed.

 arch/powerpc/mm/init_64.c:477:38: error: use of undeclared identifier 
'SECTION_SIZE_BITS'

CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, and it is clearer
to describe the code dependency in terms of MEMORY_HOTPLUG. Outside
memory hotplug the kernel uses memory_block_size for the kernel direct map.
Instead of depending on SECTION_SIZE_BITS to compute the direct map
page size, add a new #define which defaults to 16M (the same as the existing
SECTION_SIZE).

Fixes: 4d15721177d5 ("powerpc/mm: Cleanup memory block size probing")
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/init_64.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index fcda46c2b8df..e3d7379ef480 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -472,12 +472,23 @@ static int __init dt_scan_mmu_pid_width(unsigned long 
node,
return 1;
 }
 
+/*
+ * Outside hotplug the kernel uses this value to map the kernel direct map
+ * with radix. To be compatible with older kernels, let's keep this value
+ * as 16M which is also SECTION_SIZE with SPARSEMEM. We can ideally map
+ * things with 1GB size in the case where we don't support hotplug.
+ */
+#ifndef CONFIG_MEMORY_HOTPLUG
+#define DEFAULT_MEMORY_BLOCK_SIZE  SZ_16M
+#else
+#define DEFAULT_MEMORY_BLOCK_SIZE  MIN_MEMORY_BLOCK_SIZE
+#endif
+
 static void update_memory_block_size(unsigned long *block_size, unsigned long 
mem_size)
 {
-   unsigned long section_size = 1UL << SECTION_SIZE_BITS;
-
-   for (; *block_size > section_size; *block_size >>= 2) {
+   unsigned long min_memory_block_size = DEFAULT_MEMORY_BLOCK_SIZE;
 
+   for (; *block_size > min_memory_block_size; *block_size >>= 2) {
if ((mem_size & *block_size) == 0)
break;
}
@@ -507,7 +518,7 @@ static int __init probe_memory_block_size(unsigned long 
node, const char *uname,
/*
 * Nothing in the device tree
 */
-   *block_size = MIN_MEMORY_BLOCK_SIZE;
+   *block_size = DEFAULT_MEMORY_BLOCK_SIZE;
else
*block_size = of_read_number(prop, dt_root_size_cells);
/*
-- 
2.41.0
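
The shrinking loop touched by this patch can be tried out in userspace. The sketch below follows the loop shown in the diff but returns the value instead of updating a pointer; SKETCH_SZ_16M and sketch_shrink_block_size() are hypothetical names, and the 16M floor is the non-hotplug default the patch describes.

```c
/* Hypothetical constant mirroring the 16M default from the patch. */
#define SKETCH_SZ_16M (16UL << 20)

/*
 * Starting from block_size, divide by four until either the matching bit
 * in mem_size is clear or the 16M floor is reached.
 */
static unsigned long sketch_shrink_block_size(unsigned long block_size,
					      unsigned long mem_size)
{
	const unsigned long min_block_size = SKETCH_SZ_16M;

	for (; block_size > min_block_size; block_size >>= 2) {
		if ((mem_size & block_size) == 0)
			break;
	}
	return block_size;
}
```

Starting from a 1G block size, a memory size of 0x50000000 ends up with a 64M block size: the 1G and 256M bits are set in the memory size, the 64M bit is not.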



Re: [PATCH v2 0/2] kbuild: Show Kconfig fragments in "help"

2023-08-28 Thread Michael Ellerman
Masahiro Yamada  writes:
> On Sat, Aug 26, 2023 at 4:55 AM Kees Cook  wrote:
>>
>> Hi,
>>
>> This is my series to show *.config targets in the "help" target so these
>> various topics can be more easily discovered.
>>
>> v2:
>>  - split .fragment from .config to hide "internal" fragments
>
> Please do not do this churn.

That was my idea :}

> Like Randy, I did not get the "why" part quite well,
> but if you are keen on this,
> you can show the help message only when the following
> ("# Help:" prefix, for example) is found in the first line.
>
> # Help: blah blah
> # other comment

I did think of that, but wasn't sure how to do it in make.

cheers
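
The "# Help:" convention suggested in the quoted text amounts to a prefix check on the first line of a fragment. The thread is about doing this in make, which is not shown here; the C sketch below only illustrates the matching rule, and fragment_help_text() is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Given the first line of a config fragment, return the help text that
 * follows the "# Help: " prefix, or NULL when the prefix is absent and
 * the fragment should stay hidden from the help output.
 */
static const char *fragment_help_text(const char *first_line)
{
	static const char prefix[] = "# Help: ";
	const size_t len = sizeof(prefix) - 1;

	if (strncmp(first_line, prefix, len) != 0)
		return NULL;
	return first_line + len;
}
```

A first line of "# Help: blah blah" yields "blah blah", while "# other comment" yields NULL.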


[RFC v2 2/2] powerpc/selftest: Add support for cpuidle latency measurement

2023-08-28 Thread Aboorva Devarajan
From: Pratik R. Sampat 

The cpuidle latency selftest provides support to systematically extract,
analyse and present IPI and timer based wakeup latencies for each CPU
and each idle state available on the system.

The selftest leverages test_cpuidle_latency module's debugfs interface
to interact and extract latency information from the kernel.

The selftest inserts the module if it is not already inserted, disables all
the idle states and enables them one by one, testing the following:

1. Keeping the source CPU constant, iterate through all the cores and pick
   a single CPU for each core, measuring IPI latency for the baseline
   (CPU is busy with a cat /dev/random > /dev/null workload) and then
   when the CPU is idle.
2. Iterate through all the CPU cores and select one CPU for each core.
   A timer whose expected duration equals the residency of the deepest
   enabled idle state is queued on the selected target CPU, and the
   difference between the expected timer duration and the time of wakeup
   is then determined.

To run this test specifically:
$ sudo make -C tools/testing/selftests \
  TARGETS="powerpc/cpuidle_latency" run_tests

There are a few optional arguments too that the script can take
[-h ]
[-i ]
[-m ]
[-s ]
[-o ]
[-v  (run on all cpus)]

Default output location:
tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.log

To run the test without re-compiling:
$ cd tools/testing/selftests/powerpc/cpuidle_latency/
$ sudo ./cpuidle_latency.sh

Signed-off-by: Pratik R. Sampat 
Signed-off-by: Aboorva Devarajan 
Reviewed-by: Shrikanth Hegde 
---
 tools/testing/selftests/powerpc/Makefile  |   1 +
 .../powerpc/cpuidle_latency/.gitignore|   2 +
 .../powerpc/cpuidle_latency/Makefile  |   6 +
 .../cpuidle_latency/cpuidle_latency.sh| 443 ++
 .../powerpc/cpuidle_latency/settings  |   1 +
 5 files changed, 453 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
 create mode 100755 
tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/settings

diff --git a/tools/testing/selftests/powerpc/Makefile 
b/tools/testing/selftests/powerpc/Makefile
index 49f2ad1793fd..efac7270ce1f 100644
--- a/tools/testing/selftests/powerpc/Makefile
+++ b/tools/testing/selftests/powerpc/Makefile
@@ -17,6 +17,7 @@ SUB_DIRS = alignment  \
   benchmarks   \
   cache_shape  \
   copyloops\
+  cpuidle_latency  \
   dexcr\
   dscr \
   mm   \
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore 
b/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
new file mode 100644
index ..987f8852dc59
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+cpuidle_latency.log
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/Makefile 
b/tools/testing/selftests/powerpc/cpuidle_latency/Makefile
new file mode 100644
index ..04492b6d2582
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+all:
+
+TEST_PROGS := cpuidle_latency.sh
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh 
b/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
new file mode 100755
index ..f7b7a9dc2e08
--- /dev/null
+++ b/tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
@@ -0,0 +1,443 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# CPU-Idle latency selftest enables systematic retrieval and presentation
+# of IPI and timer-triggered wake-up latencies for every CPU and available
+# system idle state by leveraging the test_cpuidle_latency module.
+#
+# Author: Pratik R. Sampat  
+# Author: Aboorva Devarajan 
+
+DISABLE=1
+ENABLE=0
+
+LOG=cpuidle_latency.log
+MODULE=/lib/modules/$(uname 
-r)/kernel/arch/powerpc/kernel/test_cpuidle_latency.ko
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+exit_status=0
+
+RUN_TIMER_TEST=1
+TIMEOUT=100
+VERBOSE=0
+
+IPI_SRC_CPU=0
+
+helpme() {
+printf "Usage: %s [-h] [-todg args]
+   [-h ]
+   [-s  (default: 0)]
+   [-m ]
+   [-o ]
+   [-v  (execute test across all CPU threads)]
+   [-i ]
+   \n" "$0"
+exit 2
+}
+
+cpu_is_online() {
+local cpu=$1
+if [ ! -f "/sys/devices/system/cpu/cpu$cpu/online" ]; then
+printf "CPU %s: file not found: /sys/devices/system/cpu/cpu%s/online" 
"$cpu" "$cpu"
+return 0
+fi
+status=$(cat /sys/devices/system/cpu/cpu"$cpu"/online)
+return "$status"
+}
+

[RFC v2 0/2] CPU-Idle latency selftest framework

2023-08-28 Thread Aboorva Devarajan
Changelog: v1 -> v2

* Rebased on v6.5-rc6
* Moved the test directory to powerpc debugfs
* Minimal code refactoring

RFC v1: 
https://lore.kernel.org/all/20210611124154.56427-1-psam...@linux.ibm.com/

Other related RFC:
https://lore.kernel.org/all/20210430082804.38018-1-psam...@linux.ibm.com/

Userspace selftest:
https://lkml.org/lkml/2020/9/2/356



A kernel module + userspace driver to estimate the wakeup latency
caused by going into stop states. The motivation behind this program is
to find significant deviations from the advertised latency and residency
values.

The patchset measures latencies for two kinds of events: IPIs and timers.
As this is a software-only mechanism, there will be additional latencies
from the kernel-firmware-hardware interactions. To account for that, the
program also measures a baseline latency on a 100 percent loaded CPU,
and the measured latencies must be viewed relative to that.

To achieve this, we introduce a kernel module and expose its control
knobs through the debugfs interface that the selftests can engage with.

The kernel module provides the following interfaces within
/sys/kernel/debug/powerpc/latency_test/:

IPI test:
    ipi_cpu_dest = Destination CPU for the IPI
    ipi_cpu_src = Origin of the IPI
    ipi_latency_ns = Measured latency time in ns
Timeout test:
    timeout_cpu_src = CPU on which the timer is to be queued
    timeout_expected_ns = Timer duration
    timeout_diff_ns = Difference between the actual and the expected timer duration

Sample output is as follows:

# --IPI Latency Test---
# Baseline Avg IPI latency(ns): 2720
# Observed Avg IPI latency(ns) - State snooze: 2565
# Observed Avg IPI latency(ns) - State stop0_lite: 3856
# Observed Avg IPI latency(ns) - State stop0: 3670
# Observed Avg IPI latency(ns) - State stop1: 3872
# Observed Avg IPI latency(ns) - State stop2: 17421
# Observed Avg IPI latency(ns) - State stop4: 1003922
# Observed Avg IPI latency(ns) - State stop5: 1058870
#
# --Timeout Latency Test--
# Baseline Avg timeout diff(ns): 1435
# Observed Avg timeout diff(ns) - State snooze: 1709
# Observed Avg timeout diff(ns) - State stop0_lite: 2028
# Observed Avg timeout diff(ns) - State stop0: 1954
# Observed Avg timeout diff(ns) - State stop1: 1895
# Observed Avg timeout diff(ns) - State stop2: 14556
# Observed Avg timeout diff(ns) - State stop4: 873988
# Observed Avg timeout diff(ns) - State stop5: 959137
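
Since the cover letter asks for the observed figures to be read against the baseline, a tiny helper makes that comparison explicit. This is only an illustrative sketch; latency_over_baseline_ns() is a hypothetical name and is not part of the posted selftest.

```c
#include <stdint.h>

/*
 * Latency attributable to the idle state: observed average minus the
 * baseline measured on a fully loaded CPU. The result is signed because
 * a shallow state can measure slightly below the baseline.
 */
static int64_t latency_over_baseline_ns(uint64_t observed_ns,
					uint64_t baseline_ns)
{
	return (int64_t)observed_ns - (int64_t)baseline_ns;
}
```

Using the IPI numbers from the sample output, stop2 (17421 ns) against the 2720 ns baseline gives 14701 ns, while snooze (2565 ns) gives -155 ns.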

Aboorva Devarajan (2):
  powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events
  powerpc/selftest: Add support for cpuidle latency measurement

 arch/powerpc/Kconfig.debug|  10 +
 arch/powerpc/kernel/Makefile  |   1 +
 arch/powerpc/kernel/test_cpuidle_latency.c| 156 ++
 tools/testing/selftests/powerpc/Makefile  |   1 +
 .../powerpc/cpuidle_latency/.gitignore|   2 +
 .../powerpc/cpuidle_latency/Makefile  |   6 +
 .../cpuidle_latency/cpuidle_latency.sh| 443 ++
 .../powerpc/cpuidle_latency/settings  |   1 +
 8 files changed, 620 insertions(+)
 create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/.gitignore
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/Makefile
 create mode 100755 
tools/testing/selftests/powerpc/cpuidle_latency/cpuidle_latency.sh
 create mode 100644 tools/testing/selftests/powerpc/cpuidle_latency/settings

-- 
2.25.1



[RFC v2 1/2] powerpc/cpuidle: cpuidle wakeup latency based on IPI and timer events

2023-08-28 Thread Aboorva Devarajan
From: Pratik R. Sampat 

Introduce a mechanism to fire directed IPIs from a source CPU to a
specified target CPU and measure the time incurred on waking up the
target CPU in response.

Also, introduce a mechanism to queue a hrtimer on a specified CPU and
subsequently measure the time taken to wakeup the CPU.

Define a simple debugfs interface that allows for adjusting the
settings to trigger IPI and timer events on a designated CPU, and to
observe the resulting cpuidle wakeup latencies.

Signed-off-by: Pratik R. Sampat 
Signed-off-by: Aboorva Devarajan 
Reviewed-by: Shrikanth Hegde 
---
 arch/powerpc/Kconfig.debug |  10 ++
 arch/powerpc/kernel/Makefile   |   1 +
 arch/powerpc/kernel/test_cpuidle_latency.c | 156 +
 3 files changed, 167 insertions(+)
 create mode 100644 arch/powerpc/kernel/test_cpuidle_latency.c

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 2a54fadbeaf5..e175fc3028ac 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -391,3 +391,13 @@ config KASAN_SHADOW_OFFSET
default 0xe000 if PPC32
default 0xa80e if PPC_BOOK3S_64
default 0xa8001c00 if PPC_BOOK3E_64
+
+config CPUIDLE_LATENCY_SELFTEST
+   tristate "Cpuidle latency selftests"
+   depends on CPU_IDLE
+   help
+ Provides a kernel module that runs tests using IPIs and
+ timers to measure cpuidle latency.
+
+ Say M if you want these self tests to build as a module.
+ Say N if you are unsure.
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 2919433be355..3205ecbd9d8f 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -87,6 +87,7 @@ obj-$(CONFIG_PPC_WATCHDOG)+= watchdog.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)   += hw_breakpoint.o
 obj-$(CONFIG_PPC_DAWR) += dawr.o
 obj-$(CONFIG_PPC_BOOK3S_64)+= cpu_setup_ppc970.o cpu_setup_pa6t.o
+obj-$(CONFIG_CPUIDLE_LATENCY_SELFTEST)  += test_cpuidle_latency.o
 obj-$(CONFIG_PPC_BOOK3S_64)+= cpu_setup_power.o
 obj-$(CONFIG_PPC_BOOK3S_64)+= mce.o mce_power.o
 obj-$(CONFIG_PPC_BOOK3E_64)+= exceptions-64e.o idle_64e.o
diff --git a/arch/powerpc/kernel/test_cpuidle_latency.c 
b/arch/powerpc/kernel/test_cpuidle_latency.c
new file mode 100644
index ..3c3c119389c1
--- /dev/null
+++ b/arch/powerpc/kernel/test_cpuidle_latency.c
@@ -0,0 +1,156 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Module-based API test facility for cpuidle latency using IPIs and timers
+ */
+
+#include 
+#include 
+#include 
+
+/*
+ * IPI based wakeup latencies
+ * Measure time taken for a CPU to wake up on an IPI sent from another CPU
+ * The latency measured also includes the latency of sending the IPI
+ */
+struct latency {
+   unsigned int src_cpu;
+   unsigned int dest_cpu;
+   ktime_t time_start;
+   ktime_t time_end;
+   u64 latency_ns;
+} ipi_wakeup;
+
+static void measure_latency(void *info)
+{
+   struct latency *v;
+   ktime_t time_diff;
+
+   v = (struct latency *)info;
+   v->time_end = ktime_get();
+   time_diff = ktime_sub(v->time_end, v->time_start);
+   v->latency_ns = ktime_to_ns(time_diff);
+}
+
+void run_smp_call_function_test(unsigned int cpu)
+{
+   ipi_wakeup.src_cpu = smp_processor_id();
+   ipi_wakeup.dest_cpu = cpu;
+   ipi_wakeup.time_start = ktime_get();
+   smp_call_function_single(cpu, measure_latency, &ipi_wakeup, 1);
+}
+
+/*
+ * Timer based wakeup latencies
+ * Measure time taken for a CPU to wakeup on a timer being armed and fired
+ */
+struct timer_data {
+   unsigned int src_cpu;
+   u64 timeout;
+   ktime_t time_start;
+   ktime_t time_end;
+   struct hrtimer timer;
+   u64 timeout_diff_ns;
+} timer_wakeup;
+
+static enum hrtimer_restart hrtimer_callback(struct hrtimer *hrtimer)
+{
+   struct timer_data *w;
+   ktime_t time_diff;
+
+   w = container_of(hrtimer, struct timer_data, timer);
+   w->time_end = ktime_get();
+
+   time_diff = ktime_sub(w->time_end, w->time_start);
+   time_diff = ktime_sub(time_diff, ns_to_ktime(w->timeout));
+   w->timeout_diff_ns = ktime_to_ns(time_diff);
+   return HRTIMER_NORESTART;
+}
+
+static void run_timer_test(unsigned int ns)
+{
+   hrtimer_init(&timer_wakeup.timer, CLOCK_MONOTONIC,
+HRTIMER_MODE_REL);
+   timer_wakeup.timer.function = hrtimer_callback;
+   timer_wakeup.src_cpu = smp_processor_id();
+   timer_wakeup.timeout = ns;
+   timer_wakeup.time_start = ktime_get();
+
+   hrtimer_start(&timer_wakeup.timer, ns_to_ktime(ns),
+ HRTIMER_MODE_REL_PINNED);
+}
+
+static struct dentry *dir;
+
+static int cpu_read_op(void *data, u64 *dest_cpu)
+{
+   *dest_cpu = ipi_wakeup.dest_cpu;
+   return 0;
+}
+
+/*
+ * Send a directed IPI from the current CPU (source) to the destination CPU and
+ * measure the latency on