Re: [PATCH v2 4/9] powerpc/powernv/npu: use helper pci_dev_id

2019-04-25 Thread Alexey Kardashevskiy



On 25/04/2019 05:14, Heiner Kallweit wrote:
> Use new helper pci_dev_id() to simplify the code.
> 
> Signed-off-by: Heiner Kallweit 



Reviewed-by: Alexey Kardashevskiy 


> ---
>  arch/powerpc/platforms/powernv/npu-dma.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c 
> b/arch/powerpc/platforms/powernv/npu-dma.c
> index dc23d9d2a..495550432 100644
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ b/arch/powerpc/platforms/powernv/npu-dma.c
> @@ -1213,9 +1213,8 @@ int pnv_npu2_map_lpar_dev(struct pci_dev *gpdev, 
> unsigned int lparid,
>* Currently we only support radix and non-zero LPCR only makes sense
>* for hash tables so skiboot expects the LPCR parameter to be a zero.
>*/
> - ret = opal_npu_map_lpar(nphb->opal_id,
> - PCI_DEVID(gpdev->bus->number, gpdev->devfn), lparid,
> - 0 /* LPCR bits */);
> + ret = opal_npu_map_lpar(nphb->opal_id, pci_dev_id(gpdev), lparid,
> + 0 /* LPCR bits */);
>   if (ret) {
>   dev_err(&gpdev->dev, "Error %d mapping device to LPAR\n", ret);
>   return ret;
> @@ -1224,7 +1223,7 @@ int pnv_npu2_map_lpar_dev(struct pci_dev *gpdev, 
> unsigned int lparid,
>   dev_dbg(&gpdev->dev, "init context opalid=%llu msr=%lx\n",
>   nphb->opal_id, msr);
>   ret = opal_npu_init_context(nphb->opal_id, 0/*__unused*/, msr,
> - PCI_DEVID(gpdev->bus->number, gpdev->devfn));
> + pci_dev_id(gpdev));
>   if (ret < 0)
>   dev_err(&gpdev->dev, "Failed to init context: %d\n", ret);
>   else
> @@ -1258,7 +1257,7 @@ int pnv_npu2_unmap_lpar_dev(struct pci_dev *gpdev)
>   dev_dbg(&gpdev->dev, "destroy context opalid=%llu\n",
>   nphb->opal_id);
>   ret = opal_npu_destroy_context(nphb->opal_id, 0/*__unused*/,
> - PCI_DEVID(gpdev->bus->number, gpdev->devfn));
> +pci_dev_id(gpdev));
>   if (ret < 0) {
>   dev_err(&gpdev->dev, "Failed to destroy context: %d\n", ret);
>   return ret;
> @@ -1266,9 +1265,8 @@ int pnv_npu2_unmap_lpar_dev(struct pci_dev *gpdev)
>  
>   /* Set LPID to 0 anyway, just to be safe */
>   dev_dbg(&gpdev->dev, "Map LPAR opalid=%llu lparid=0\n", nphb->opal_id);
> - ret = opal_npu_map_lpar(nphb->opal_id,
> - PCI_DEVID(gpdev->bus->number, gpdev->devfn), 0 /*LPID*/,
> - 0 /* LPCR bits */);
> + ret = opal_npu_map_lpar(nphb->opal_id, pci_dev_id(gpdev), 0 /*LPID*/,
> + 0 /* LPCR bits */);
>   if (ret)
>   dev_err(&gpdev->dev, "Error %d mapping device to LPAR\n", ret);
>  
> 
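For readers following along, the helper simply wraps the PCI_DEVID()
pattern removed above; its definition (added earlier in this series) is
essentially:

static inline u32 pci_dev_id(struct pci_dev *dev)
{
	return PCI_DEVID(dev->bus->number, dev->devfn);
}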

-- 
Alexey


Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation

2019-04-25 Thread Jacob Pan
On Wed, 24 Apr 2019 19:27:52 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > From: Lu Baolu 
> > 
> > If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> > IOMMU driver should rely on the emulation software to allocate
> > and free PASID IDs.  
> Do we make the decision depending on the CM or depending on the
> VCCAP_REG?
> 
> VCCAP_REG description says:
> 
> If Set, software must use Virtual Command Register interface to
> allocate and free PASIDs.
> 
>  The Intel vt-d spec revision 3.0 defines a
> > register set to support this. This includes a capability register,
> > a virtual command register and a virtual response register. Refer
> > to section 10.4.42, 10.4.43, 10.4.44 for more information.
> > 
> > This patch adds the enlightened PASID allocation/free interfaces  
> For my curiosity, why is it called "enlightened"?
I don't know the origin, but "enlightened" means the guest is tipped off
that it is not running on real HW.

> > via the virtual command register.
> > 
> > Cc: Ashok Raj 
> > Cc: Jacob Pan 
> > Cc: Kevin Tian 
> > Signed-off-by: Liu Yi L 
> > Signed-off-by: Lu Baolu 
> > ---
> >  drivers/iommu/intel-pasid.c | 70 +
> >  drivers/iommu/intel-pasid.h | 13 -
> >  include/linux/intel-iommu.h |  2 ++
> >  3 files changed, 84 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/intel-pasid.c
> > b/drivers/iommu/intel-pasid.c index 03b12d2..5b1d3be 100644
> > --- a/drivers/iommu/intel-pasid.c
> > +++ b/drivers/iommu/intel-pasid.c
> > @@ -63,6 +63,76 @@ void *intel_pasid_lookup_id(int pasid)
> > return p;
> >  }
> >  
> > +int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int
> > *pasid) +{
> > +   u64 res;
> > +   u64 cap;
> > +   u8 err_code;
> > +   unsigned long flags;
> > +   int ret = 0;
> > +
> > +   if (!ecap_vcs(iommu->ecap)) {
> > +   pr_warn("IOMMU: %s: Hardware doesn't support
> > virtual command\n",
> > +   iommu->name);  
> nit: other pr_* messages don't have the "IOMMU: %s:" prefix.
Are you suggesting to just use the prefix defined in pr_fmt? I guess I can
remove "IOMMU" if Allen is OK with it :).

> > +   return -ENODEV;
> > +   }
> > +
> > +   cap = dmar_readq(iommu->reg + DMAR_VCCAP_REG);
> > +   if (!(cap & DMA_VCS_PAS)) {
> > +   pr_warn("IOMMU: %s: Emulation software doesn't
> > support PASID allocation\n",
> > +   iommu->name);
> > +   return -ENODEV;
> > +   }
> > +
> > +   raw_spin_lock_irqsave(&iommu->register_lock, flags);
> > +   dmar_writeq(iommu->reg + DMAR_VCMD_REG, VCMD_CMD_ALLOC);
> > +   IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> > + !(res & VCMD_VRSP_IP), res);
> > +   raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> > +
> > +   err_code = VCMD_VRSP_EC(res);
> > +   switch (err_code) {
> > +   case VCMD_VRSP_EC_SUCCESS:
> > +   *pasid = VCMD_VRSP_RESULE(res);
> > +   break;
> > +   case VCMD_VRSP_EC_UNAVAIL:
> > +   pr_info("IOMMU: %s: No PASID available\n",
> > iommu->name);
> > +   ret = -ENOMEM;
> > +   break;
> > +   default:
> > +   ret = -ENODEV;
> > +   pr_warn("IOMMU: %s: Unkonwn error code %d\n",  
> unknown
> > +   iommu->name, err_code);
> > +   }
> > +
> > +   return ret;
> > +}
> > +
> > +void vcmd_free_pasid(struct intel_iommu *iommu, unsigned int pasid)
> > +{
> > +   u64 res;
> > +   u8 err_code;
> > +   unsigned long flags;  
> Shall we check as well the cap is set?
yes, good point.

> > +
> > +   raw_spin_lock_irqsave(&iommu->register_lock, flags);
> > +   dmar_writeq(iommu->reg + DMAR_VCMD_REG, (pasid << 8) |
> > VCMD_CMD_FREE);
> > +   IOMMU_WAIT_OP(iommu, DMAR_VCRSP_REG, dmar_readq,
> > + !(res & VCMD_VRSP_IP), res);
> > +   raw_spin_unlock_irqrestore(&iommu->register_lock, flags);
> > +
> > +   err_code = VCMD_VRSP_EC(res);
> > +   switch (err_code) {
> > +   case VCMD_VRSP_EC_SUCCESS:
> > +   break;
> > +   case VCMD_VRSP_EC_INVAL:
> > +   pr_info("IOMMU: %s: Invalid PASID\n", iommu->name);
> > +   break;
> > +   default:
> > +   pr_warn("IOMMU: %s: Unkonwn error code %d\n",  
> unknown
> > +   iommu->name, err_code);
> > +   }
> > +}
> > +
> >  /*
> >   * Per device pasid table management:
> >   */
> > diff --git a/drivers/iommu/intel-pasid.h
> > b/drivers/iommu/intel-pasid.h index 23537b3..0999dfe 100644
> > --- a/drivers/iommu/intel-pasid.h
> > +++ b/drivers/iommu/intel-pasid.h
> > @@ -19,6 +19,16 @@
> >  #define PASID_PDE_SHIFT6
> >  #define MAX_NR_PASID_BITS  20
> >  
> > +/* Virtual command interface for enlightened pasid management. */
> > +#define VCMD_CMD_ALLOC 0x1
> > +#define VCMD_CMD_FREE  0x2
> > +#define VCMD_VRSP_IP   0x1
> > +#define VCMD_VRSP_EC(e)(((e) >> 1) & 0x3)  
> s/EC/SC? for 

Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation

2019-04-25 Thread Jacob Pan
On Thu, 25 Apr 2019 09:40:31 +0200
Auger Eric  wrote:

> Hi Liu,
> 
> On 4/25/19 9:12 AM, Liu, Yi L wrote:
> > Hi Eric,
> >   
> >> From: Auger Eric [mailto:eric.au...@redhat.com]
> >> Sent: Thursday, April 25, 2019 1:28 AM
> >> To: Jacob Pan ;
> >> iommu@lists.linux-foundation.org; Subject: Re: [PATCH v2 09/19]
> >> iommu/vt-d: Enlightened PASID allocation
> >>
> >> Hi Jacob,
> >>
> >> On 4/24/19 1:31 AM, Jacob Pan wrote:  
> >>> From: Lu Baolu 
> >>>
> >>> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the
> >>> IOMMU driver should rely on the emulation software to allocate
> >>> and free PASID IDs.  
> >> Do we make the decision depending on the CM or depending on the
> >> VCCAP_REG?
> >>
> >> VCCAP_REG description says:
> >>
> >> If Set, software must use Virtual Command Register interface to
> >> allocate and free PASIDs.  
> > 
> > The answer is it depends on the ECAP.VCS and then the PASID
> > allocation bit in VCCAP_REG. But VCS bit implies the iommu is a
> > software implementation (vIOMMU) of vt-d architecture. Pls refer to
> > the descriptions of "Virtual Command Support" in vt-d 3.0 spec.
> > 
> > "Hardware implementations of this architecture report a value of 0
> > in this field. Software implementations (emulation) of this
> > architecture may report VCS=1."  
> 
> OK I understand. But strictly speaking a vIOMMU may not implement CM.
> But that's nitpicking ;-)
> 
CAP.CM (caching mode) and ECAP.VCS (virtual command support) are
separate. I think we are mixing the two here, since either one is a
sufficient condition to indicate that we are running in a guest.
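So the gating I have in mind is on the virtual command capability rather
than on caching mode, roughly like the sketch below (the two helpers are
made-up names, not code from the patch):

	if (ecap_vcs(iommu->ecap) &&
	    (dmar_readq(iommu->reg + DMAR_VCCAP_REG) & DMA_VCS_PAS))
		use_vcmd_pasid_allocation(iommu);	/* made-up helper */
	else
		use_native_pasid_allocation(iommu);	/* made-up helper */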
> Thanks
> 
> Eric
> > 
> > Thanks,
> > Yi Liu
> >   

[Jacob Pan]


Re: [PATCH v2 08/19] ioasid: Add custom IOASID allocator

2019-04-25 Thread Jacob Pan
Hi Eric,

Thanks for the review.

On Thu, 25 Apr 2019 12:03:42 +0200
Auger Eric  wrote:

> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > Sometimes, IOASID allocation must be handled by platform specific
> > code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs need
> > to be allocated by the host via enlightened or paravirt interfaces.
> > 
> > This patch adds an extension to the IOASID allocator APIs such that
> > platform drivers can register a custom allocator, possibly at boot
> > time, to take over the allocation. Xarray is still used for tracking
> > and searching purposes internal to the IOASID code. Private data of
> > an IOASID can also be set after the allocation.
> > 
> > There can be multiple custom allocators registered but only one is
> > used at a time. In case of hot removal of devices that provides the
> > allocator, all IOASIDs must be freed prior to unregistering the
> > allocator. Default XArray based allocator cannot be mixed with
> > custom allocators, i.e. custom allocators will not be used if there
> > are outstanding IOASIDs allocated by the default XA allocator.  
> 
> What's the exact use case behind allowing several custom IOASID
> allocators to be registered?
It is mainly for supporting multiple PCI segments, and thus multiple
vIOMMUs, even though all allocators end up calling the host to
allocate PASIDs. QEMU does not support multiple PCI segments/domains
AFAIK, but others might.
> > 
> > Signed-off-by: Jacob Pan 
> > ---
> >  drivers/base/ioasid.c  | 182 ++---
> >  include/linux/ioasid.h |  15 +++-
> >  2 files changed, 187 insertions(+), 10 deletions(-)
> > 
> > diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> > index c4012aa..5cb36a4 100644
> > --- a/drivers/base/ioasid.c
> > +++ b/drivers/base/ioasid.c
> > @@ -17,6 +17,120 @@ struct ioasid_data {
> >  };
> >  
> >  static DEFINE_XARRAY_ALLOC(ioasid_xa);
> > +static DEFINE_MUTEX(ioasid_allocator_lock);
> > +static struct ioasid_allocator *ioasid_allocator;  
> A more explicit name may be chosen. If I understand correctly that's
> the active_custom_allocator
Yes, it is clearer this way.

> > +
> > +static LIST_HEAD(custom_allocators);
> > +/*
> > + * A flag to track if ioasid default allocator already been used,
> > this will  
> is already in use?
> > + * prevent custom allocator from being used. The reason is that
> > custom allocator  
> s/The reason is that custom allocator/The reason is that custom
> allocators
> > + * must have unadulterated space to track private data with
> > xarray, there cannot
> > + * be a mix been default and custom allocated IOASIDs.
> > + */
> > +static int default_allocator_used;
> > +
> > +/**
> > + * ioasid_register_allocator - register a custom allocator
> > + * @allocator: the custom allocator to be registered
> > + *
> > + * Custom allocator take precedence over the default xarray based
> > allocator.
> > + * Private data associated with the ASID are managed by ASID
> > common code
> > + * similar to data stored in xa.
> > + *
> > + * There can be multiple allocators registered but only one is
> > active. In case
> > + * of runtime removal of an custom allocator, the next one is
> > activated based
> > + * on the registration ordering.  
> This last sentence may be moved to the unregister() kerneldoc
> > + */
> > +int ioasid_register_allocator(struct ioasid_allocator *allocator)
> > +{
> > +   struct ioasid_allocator *pallocator;
> > +   int ret = 0;
> > +
> > +   if (!allocator)
> > +   return -EINVAL;
> > +
> > +   mutex_lock(&ioasid_allocator_lock);
> > +   if (list_empty(&custom_allocators))
> > +   ioasid_allocator = allocator;  
> The fact the first registered custom allocator gets automatically
> active was not obvious to me and may deserve a comment.
Will do. I will add:
"No particular preference since all custom allocators end up calling
the host to allocate IOASIDs. We activate the first allocator and keep
the later ones in a list in case the first one gets removed due to
hotplug."
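For clarity, the removal side would then look roughly like this
(conceptual sketch only, reusing the names from the patch; not the
actual implementation):

static void ioasid_unregister_allocator_sketch(struct ioasid_allocator *allocator)
{
	mutex_lock(&ioasid_allocator_lock);
	list_del(&allocator->list);
	/* if the active allocator goes away, promote the next one in line */
	if (allocator == ioasid_allocator)
		ioasid_allocator = list_first_entry_or_null(&custom_allocators,
					struct ioasid_allocator, list);
	mutex_unlock(&ioasid_allocator_lock);
}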

> > +   else {
> > +   /* Check if the allocator is already registered */
> > +   list_for_each_entry(pallocator,
> > &custom_allocators, list) {
> > +   if (pallocator == allocator) {
> > +   pr_err("IOASID allocator already
> > exist\n");  
> s/exist/registered?
Makes sense.
> > +   ret = -EEXIST;
> > +   goto out_unlock;
> > +   }
> > +   }
> > +   }
> > +   list_add_tail(&allocator->list, &custom_allocators);
> > +
> > +out_unlock:
> > +   mutex_unlock(&ioasid_allocator_lock);
> > +   return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(ioasid_register_allocator);
> > +
> > +/**
> > + * ioasid_unregister_allocator - Remove a custom IOASID allocator
> > + * @allocator: the custom allocator to be removed
> > + *
> > + * Remove an allocator from the list, activate the next allocator
> > in
> > + * the order it was  registration.
> > + */
> > 

Re: [PATCH v2 9/9] platform/chrome: chromeos_laptop: use helper pci_dev_id

2019-04-25 Thread Benson Leung via iommu
Hi Heiner,

On Wed, Apr 24, 2019 at 09:17:19PM +0200, Heiner Kallweit wrote:
> Use new helper pci_dev_id() to simplify the code.
> 
> Signed-off-by: Heiner Kallweit 

Acked-By: Benson Leung 

> ---
>  drivers/platform/chrome/chromeos_laptop.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/platform/chrome/chromeos_laptop.c 
> b/drivers/platform/chrome/chromeos_laptop.c
> index 24326eecd..7abbb6167 100644
> --- a/drivers/platform/chrome/chromeos_laptop.c
> +++ b/drivers/platform/chrome/chromeos_laptop.c
> @@ -125,7 +125,7 @@ static bool chromeos_laptop_match_adapter_devid(struct 
> device *dev, u32 devid)
>   return false;
>  
>   pdev = to_pci_dev(dev);
> - return devid == PCI_DEVID(pdev->bus->number, pdev->devfn);
> + return devid == pci_dev_id(pdev);
>  }
>  
>  static void chromeos_laptop_check_adapter(struct i2c_adapter *adapter)
> -- 
> 2.21.0
> 
> 

-- 
Benson Leung
Staff Software Engineer
Chrome OS Kernel
Google Inc.
ble...@google.com
Chromium OS Project
ble...@chromium.org



Re: [PATCH] dma-mapping: remove an unnecessary NULL check

2019-04-25 Thread Ian Abbott

On 24/04/2019 15:27, Christoph Hellwig wrote:

On Wed, Apr 24, 2019 at 05:24:37PM +0300, Dan Carpenter wrote:

We already dereferenced "dev" when we called get_dma_ops() so this NULL
check is too late.  We're not supposed to pass NULL "dev" pointers to
dma_alloc_attrs().


Thanks, applied to the dma-mapping for-next tree.


Signed-off-by: Dan Carpenter 
---
There are still at least two drivers which do pass a NULL unfortunately.

drivers/staging/comedi/drivers/comedi_isadma.c:195 comedi_isadma_alloc() error: 
NULL dereference inside function 'dma_alloc_coherent()'
drivers/staging/comedi/drivers/comedi_isadma.c:227 comedi_isadma_free() error: 
NULL dereference inside function 'dma_free_coherent()'


This is staging code.  Per official decree from Linus we can just
ignore it, and I tend to do so to keep my sanity.


So for comedi_isadma, we can just replace the NULL with a pointer to a 
static dummy device with a 24-bit coherent mask?


--
-=( Ian Abbott  || Web: www.mev.co.uk )=-
-=( MEV Ltd. is a company registered in England & Wales. )=-
-=( Registered number: 02862268.  Registered address:)=-
-=( 15 West Park Road, Bramhall, STOCKPORT, SK7 3JZ, UK. )=-


Re: [PATCH] dma-mapping: remove an unnecessary NULL check

2019-04-25 Thread Ian Abbott

On 25/04/2019 15:18, Christoph Hellwig wrote:

On Thu, Apr 25, 2019 at 03:13:49PM +0100, Ian Abbott wrote:

So for comedi_isadma, we can just replace the NULL with a pointer to a
static dummy device with a 24-bit coherent mask?


It should be converted to a proper isa_driver, so that it gets a real
struct device.


But it will work as a short term solution?  comedi_isadma isn't a 
driver, but is used by a few comedi drivers for ISA cards.


--
-=( Ian Abbott  || Web: www.mev.co.uk )=-
-=( MEV Ltd. is a company registered in England & Wales. )=-
-=( Registered number: 02862268.  Registered address:)=-
-=( 15 West Park Road, Bramhall, STOCKPORT, SK7 3JZ, UK. )=-


Re: a few xen swiotlb cleanups

2019-04-25 Thread Konrad Rzeszutek Wilk
On Thu, Apr 11, 2019 at 09:19:56AM +0200, Christoph Hellwig wrote:
> Hi all,

I will slurp these up.. right after I test them for correctness.

> 
> below are a couple of cleanups for swiotlb-xen.c.  They were done in
> preparation of eventually using the dma-noncoherent.h cache flushing
> hooks, but that final goal will need some major work to the arm32 code
> first.  Until then I think these patches might be better in mainline
> than in my local tree so that they don't bitrot.


Re: [PATCH v2 06/19] drivers core: Add I/O ASID allocator

2019-04-25 Thread Jacob Pan
Hi Christoph,

On Tue, 23 Apr 2019 23:19:03 -0700
Christoph Hellwig  wrote:

> On Tue, Apr 23, 2019 at 04:31:06PM -0700, Jacob Pan wrote:
> > The allocator doesn't really belong in drivers/iommu because some
> > drivers would like to allocate PASIDs for devices that aren't
> > managed by an IOMMU, using the same ID space as IOMMU. It doesn't
> > really belong in drivers/pci either since platform device also
> > support PASID. Add the allocator in drivers/base.  
> 
> I'd still add it to drivers/iommu, just selectable separately from the
> core iommu code..
Perhaps I misunderstood. If a driver wants to use IOASIDs without the iommu
subsystem even turned on, how could selecting it from the core iommu code
help? Could you elaborate on "selectable"?

From VT-d's perspective, PASIDs are only used with IOMMU on. Jean
knows other use cases.

Jacob


Re: [PATCH v2 02/19] iommu: introduce device fault data

2019-04-25 Thread Jacob Pan
On Thu, 25 Apr 2019 15:33:17 +0100
Jean-Philippe Brucker  wrote:

> On 25/04/2019 14:21, Auger Eric wrote:
>   We could add a
> >> IOMMU_FAULT_PAGE_REQUEST_PERM_VALID bit instead, but I still find
> >> it weird to denote the validity of a bitfield using a separate bit.
> >>
> >> Given that three different series now rely on this, how about we
> >> send the fault patches separately for v5.2?  
> 
> Sorry I meant v5.3 - after the merge window
> 
> >> I pushed the recoverable fault
> >> support applied on top of this, with the PERM_READ bit and cleaned
> >> up kernel doc, to git://linux-arm.org/linux-jpb.git sva/api  
> > 
Sounds good to me. We need the READ perm. I will pick the fault reporting
patches from this tree for my next rev. My plan is to add PRQ support
for vSVA after the current series.
> > my only concern is: is it likely to be upstreamed without any actual
> > user? If it is, of course, I don't have any objection.
> 
> Possibly, I don't think my I/O page fault stuff for SVA is likely to
> get in v5.3, it depends on one or two more patch sets. But your
> nested work and Jacob's one may be in good shape for next version? I
> find it difficult to keep track of the same patches in three
> different series.
Same here, hard to track especially for minor tweaks. I am working
towards the next version for vSVA page fault. Then I will look into
converting VT-d native IO page fault to yours.

Thanks,

Jacob


Re: [PATCH v2 02/19] iommu: introduce device fault data

2019-04-25 Thread Jean-Philippe Brucker
On 25/04/2019 14:21, Auger Eric wrote:
  We could add a
>> IOMMU_FAULT_PAGE_REQUEST_PERM_VALID bit instead, but I still find it
>> weird to denote the validity of a bitfield using a separate bit.
>>
>> Given that three different series now rely on this, how about we send
>> the fault patches separately for v5.2?

Sorry I meant v5.3 - after the merge window

>> I pushed the recoverable fault
>> support applied on top of this, with the PERM_READ bit and cleaned up
>> kernel doc, to git://linux-arm.org/linux-jpb.git sva/api
> 
> my only concern is: is it likely to be upstreamed without any actual
> user? If it is, of course, I don't have any objection.

Possibly, I don't think my I/O page fault stuff for SVA is likely to get
in v5.3, it depends on one or two more patch sets. But your nested work
and Jacob's one may be in good shape for next version? I find it
difficult to keep track of the same patches in three different series.

Thanks,
Jean


Re: [PATCH] dma-mapping: remove an unnecessary NULL check

2019-04-25 Thread Christoph Hellwig
On Thu, Apr 25, 2019 at 03:31:01PM +0100, Ian Abbott wrote:
> On 25/04/2019 15:18, Christoph Hellwig wrote:
>> On Thu, Apr 25, 2019 at 03:13:49PM +0100, Ian Abbott wrote:
>>> So for comedi_isadma, we can just replace the NULL with a pointer to a
>>> static dummy device with a 24-bit coherent mask?
>>
>> It should be converted to a proper isa_driver, so that it gets a real
>> struct device.
>
> But it will work as a short term solution?  comedi_isadma isn't a driver, 
> but is used by a few comedi drivers for ISA cards.

Then convert the drivers properly and pass on the device from them.


Re: [PATCH] dma-mapping: remove an unnecessary NULL check

2019-04-25 Thread Christoph Hellwig
On Thu, Apr 25, 2019 at 03:13:49PM +0100, Ian Abbott wrote:
> So for comedi_isadma, we can just replace the NULL with a pointer to a 
> static dummy device with a 24-bit coherent mask?

It should be converted to a proper isa_driver, so that it gets a real
struct device.


Re: [patch V3 18/29] lockdep: Remove save argument from check_prev_add()

2019-04-25 Thread Peter Zijlstra
On Thu, Apr 25, 2019 at 11:45:11AM +0200, Thomas Gleixner wrote:
> There is only one caller which hands in save_trace as function pointer.
> 
> Signed-off-by: Thomas Gleixner 

Acked-by: Peter Zijlstra (Intel) 


Re: [patch V3 00/29] stacktrace: Consolidate stack trace usage

2019-04-25 Thread Josh Poimboeuf
On Thu, Apr 25, 2019 at 11:44:53AM +0200, Thomas Gleixner wrote:
> This is an update to V2 which can be found here:
> 
>   https://lkml.kernel.org/r/20190418084119.056416...@linutronix.de
> 
> Changes vs. V2:
> 
>   - Fixed the kernel-doc issue pointed out by Mike
> 
>   - Removed the '-1' oddity from the tracer
> 
>   - Restricted the tracer nesting to 4
> 
>   - Restored the lockdep magic to prevent redundant stack traces
> 
>   - Addressed the small nitpicks here and there
> 
>   - Picked up Acked/Reviewed tags

Other than the 2 minor nits:

Reviewed-by: Josh Poimboeuf 

-- 
Josh


Re: [patch V3 21/29] tracing: Use percpu stack trace buffer more intelligently

2019-04-25 Thread Josh Poimboeuf
On Thu, Apr 25, 2019 at 11:45:14AM +0200, Thomas Gleixner wrote:
> @@ -2788,29 +2798,32 @@ static void __ftrace_trace_stack(struct
>*/
>   preempt_disable_notrace();
>  
> - use_stack = __this_cpu_inc_return(ftrace_stack_reserve);
> + stackidx = __this_cpu_inc_return(ftrace_stack_reserve);
> +
> + /* This should never happen. If it does, yell once and skip */
> + if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING))
> + goto out;
> +
>   /*
> -  * We don't need any atomic variables, just a barrier.
> -  * If an interrupt comes in, we don't care, because it would
> -  * have exited and put the counter back to what we want.
> -  * We just need a barrier to keep gcc from moving things
> -  * around.
> +  * The above __this_cpu_inc_return() is 'atomic' cpu local. An
> +  * interrupt will either see the value pre increment or post
> +  * increment. If the interrupt happens pre increment it will have
> +  * restored the counter when it returns.  We just need a barrier to
> +  * keep gcc from moving things around.
>*/
>   barrier();
> - if (use_stack == 1) {
> - trace.entries   = this_cpu_ptr(ftrace_stack.calls);
> - trace.max_entries   = FTRACE_STACK_MAX_ENTRIES;
> -
> - if (regs)
> - save_stack_trace_regs(regs, &trace);
> - else
> - save_stack_trace(&trace);
> -
> - if (trace.nr_entries > size)
> - size = trace.nr_entries;
> - } else
> - /* From now on, use_stack is a boolean */
> - use_stack = 0;
> +
> + fstack = this_cpu_ptr(ftrace_stacks.stacks) + (stackidx - 1);

nit: it would be slightly less surprising if stackidx were 0-based:

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index d3f6ec7eb729..4fc93004feab 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2798,10 +2798,10 @@ static void __ftrace_trace_stack(struct ring_buffer 
*buffer,
 */
preempt_disable_notrace();
 
-   stackidx = __this_cpu_inc_return(ftrace_stack_reserve);
+   stackidx = __this_cpu_inc_return(ftrace_stack_reserve) - 1;
 
/* This should never happen. If it does, yell once and skip */
-   if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING))
+   if (WARN_ON_ONCE(stackidx > FTRACE_KSTACK_NESTING))
goto out;
 
/*
@@ -2813,7 +2813,7 @@ static void __ftrace_trace_stack(struct ring_buffer 
*buffer,
 */
barrier();
 
-   fstack = this_cpu_ptr(ftrace_stacks.stacks) + (stackidx - 1);
+   fstack = this_cpu_ptr(ftrace_stacks.stacks) + stackidx;
trace.entries   = fstack->calls;
trace.max_entries   = FTRACE_KSTACK_ENTRIES;
 


Re: [PATCH v2 02/19] iommu: introduce device fault data

2019-04-25 Thread Auger Eric
Hi Jean-Philippe,

On 4/25/19 2:46 PM, Jean-Philippe Brucker wrote:
> On 24/04/2019 00:31, Jacob Pan wrote:
>> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
>> new file mode 100644
>> index 000..edcc0dd
>> --- /dev/null
>> +++ b/include/uapi/linux/iommu.h
>> @@ -0,0 +1,115 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * IOMMU user API definitions
>> + */
>> +
>> +#ifndef _UAPI_IOMMU_H
>> +#define _UAPI_IOMMU_H
>> +
>> +#include 
>> +
>> +#define IOMMU_FAULT_PERM_WRITE  (1 << 0) /* write */
>> +#define IOMMU_FAULT_PERM_EXEC   (1 << 1) /* exec */
>> +#define IOMMU_FAULT_PERM_PRIV   (1 << 2) /* privileged */
> 
> Could we add IOMMU_FAULT_PERM_READ back? The PRI Page Request has both R
> and W fields, and R=W=0 encodes the PASID Stop Markers. Even though the
> IOMMU drivers currently filter out the Stop Markers, we may want to
> inject them into guests at some point in the future, which wouldn't be
> possible with the current API.

OK for me.

 We could add a
> IOMMU_FAULT_PAGE_REQUEST_PERM_VALID bit instead, but I still find it
> weird to denote the validity of a bitfield using a separate bit.
> 
> Given that three different series now rely on this, how about we send
> the fault patches separately for v5.2? I pushed the recoverable fault
> support applied on top of this, with the PERM_READ bit and cleaned up
> kernel doc, to git://linux-arm.org/linux-jpb.git sva/api

my only concern is: is it likely to be upstreamed without any actual
user? If it is, of course, I don't have any objection.

Thanks

Eric
> 
> Thanks,
> Jean
> 


Re: [PATCH v2 02/19] iommu: introduce device fault data

2019-04-25 Thread Jean-Philippe Brucker
On 24/04/2019 00:31, Jacob Pan wrote:
> diff --git a/include/uapi/linux/iommu.h b/include/uapi/linux/iommu.h
> new file mode 100644
> index 000..edcc0dd
> --- /dev/null
> +++ b/include/uapi/linux/iommu.h
> @@ -0,0 +1,115 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * IOMMU user API definitions
> + */
> +
> +#ifndef _UAPI_IOMMU_H
> +#define _UAPI_IOMMU_H
> +
> +#include 
> +
> +#define IOMMU_FAULT_PERM_WRITE   (1 << 0) /* write */
> +#define IOMMU_FAULT_PERM_EXEC(1 << 1) /* exec */
> +#define IOMMU_FAULT_PERM_PRIV(1 << 2) /* privileged */

Could we add IOMMU_FAULT_PERM_READ back? The PRI Page Request has both R
and W fields, and R=W=0 encodes the PASID Stop Markers. Even though the
IOMMU drivers currently filter out the Stop Markers, we may want to
inject them into guests at some point in the future, which wouldn't be
possible with the current API. We could add a
IOMMU_FAULT_PAGE_REQUEST_PERM_VALID bit instead, but I still find it
weird to denote the validity of a bitfield using a separate bit.
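As a small illustration (not part of the patch; prm_perm is a made-up
local variable), a READ bit lets us tell a Stop Marker apart from an
ordinary read fault:

	/* R=W=0 in a PRI Page Request encodes a PASID Stop Marker */
	bool is_stop_marker = !(prm_perm & (IOMMU_FAULT_PERM_READ |
					    IOMMU_FAULT_PERM_WRITE));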

Given that three different series now rely on this, how about we send
the fault patches separately for v5.2? I pushed the recoverable fault
support applied on top of this, with the PERM_READ bit and cleaned up
kernel doc, to git://linux-arm.org/linux-jpb.git sva/api

Thanks,
Jean


Re: [RFC PATCH] dma-mapping: create iommu mapping for newly allocated dma coherent mem

2019-04-25 Thread Laurentiu Tudor
Hi Christoph,

On 24.04.2019 17:57, Christoph Hellwig wrote:
> I'd be happy to offload all of the mentioned tasks to you if you
> volunteer.

Alright, I think I mostly got it and can start with the two usb drivers 
you mentioned. Just need a few clarifications, please see inline.

> I think the first step is to move the two USB controller that can only
> DMA to their own BAR off the existing DMA coherent infrastructure.  The
> controllers are already identified using the HCD_LOCAL_MEM flag, so we
> just need to key off that in the hcd_buffer_* routines and call into a

So if HCD_LOCAL_MEM is set I should call into the gen_pool API instead
of the existing dma_{alloc,free}_coherent().
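Something like this is what I have in mind for hcd_buffer_alloc() (rough
sketch only; 'local_pool' is the member I'd add, name to be decided):

void *hcd_buffer_alloc_sketch(struct usb_hcd *hcd, size_t size,
			      gfp_t mem_flags, dma_addr_t *dma)
{
	/* HCD_LOCAL_MEM: the controller can only DMA to its own BAR */
	if (hcd->driver->flags & HCD_LOCAL_MEM)
		return gen_pool_dma_alloc(hcd->local_pool, size, dma);

	return dma_alloc_coherent(hcd->self.sysdev, size, dma, mem_flags);
}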

> genalloc that has been fed using the bar, replacing the current
> dma_declare_coherent usage.  Take a look at drivers/pci/p2pdma.c
> for another example of allocating bits of a BAR using genalloc.

Where should I place the reference to the gen_pool? How does a local_pool
member in 'struct usb_hcd' sound?

---
Thanks & Best Regards, Laurentiu


Re: [PATCH v2 06/19] drivers core: Add I/O ASID allocator

2019-04-25 Thread Jean-Philippe Brucker
On 25/04/2019 11:17, Auger Eric wrote:
>> +/**
>> + * ioasid_alloc - Allocate an IOASID
>> + * @set: the IOASID set
>> + * @min: the minimum ID (inclusive)
>> + * @max: the maximum ID (exclusive)
>> + * @private: data private to the caller
>> + *
>> + * Allocate an ID between @min and @max (or %0 and %INT_MAX). Return the
> I would remove "(or %0 and %INT_MAX)".

Agreed, those were the default values of idr, but the xarray doesn't
define a default max value. By the way, I do think squashing patches 6
and 7 would be better (keeping my SOB but you can change the author).

>> +typedef int (*ioasid_iter_t)(ioasid_t ioasid, void *private, void *data);
> I don't see it used in this series.

There used to be a "ioasid_for_each()", which isn't needed by anyone at
the moment. This can be removed.

Thanks,
Jean


[patch V3 23/29] tracing: Simplify stack trace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace.c |   40 +---
 1 file changed, 13 insertions(+), 27 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2774,22 +2774,18 @@ static void __ftrace_trace_stack(struct
 {
struct trace_event_call *call = &event_kernel_stack;
struct ring_buffer_event *event;
+   unsigned int size, nr_entries;
struct ftrace_stack *fstack;
struct stack_entry *entry;
-   struct stack_trace trace;
-   int size = FTRACE_KSTACK_ENTRIES;
int stackidx;
 
-   trace.nr_entries= 0;
-   trace.skip  = skip;
-
/*
 * Add one, for this function and the call to save_stack_trace()
 * If regs is set, then these functions will not be in the way.
 */
 #ifndef CONFIG_UNWINDER_ORC
if (!regs)
-   trace.skip++;
+   skip++;
 #endif
 
/*
@@ -2816,28 +2812,24 @@ static void __ftrace_trace_stack(struct
barrier();
 
fstack = this_cpu_ptr(ftrace_stacks.stacks) + (stackidx - 1);
-   trace.entries   = fstack->calls;
-   trace.max_entries   = FTRACE_KSTACK_ENTRIES;
-
-   if (regs)
-   save_stack_trace_regs(regs, &trace);
-   else
-   save_stack_trace(&trace);
-
-   if (trace.nr_entries > size)
-   size = trace.nr_entries;
+   size = ARRAY_SIZE(fstack->calls);
 
-   size *= sizeof(unsigned long);
+   if (regs) {
+   nr_entries = stack_trace_save_regs(regs, fstack->calls,
+  size, skip);
+   } else {
+   nr_entries = stack_trace_save(fstack->calls, size, skip);
+   }
 
+   size = nr_entries * sizeof(unsigned long);
event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,
sizeof(*entry) + size, flags, pc);
if (!event)
goto out;
entry = ring_buffer_event_data(event);
 
-   memcpy(&entry->caller, trace.entries, size);
-
-   entry->size = trace.nr_entries;
+   memcpy(&entry->caller, fstack->calls, size);
+   entry->size = nr_entries;
 
if (!call_filter_check_discard(call, entry, buffer, event))
__buffer_unlock_commit(buffer, event);
@@ -2916,7 +2908,6 @@ ftrace_trace_userstack(struct ring_buffe
struct trace_event_call *call = &event_user_stack;
struct ring_buffer_event *event;
struct userstack_entry *entry;
-   struct stack_trace trace;
 
if (!(global_trace.trace_flags & TRACE_ITER_USERSTACKTRACE))
return;
@@ -2947,12 +2938,7 @@ ftrace_trace_userstack(struct ring_buffe
entry->tgid = current->tgid;
memset(&entry->caller, 0, sizeof(entry->caller));
 
-   trace.nr_entries= 0;
-   trace.max_entries   = FTRACE_STACK_ENTRIES;
-   trace.skip  = 0;
-   trace.entries   = entry->caller;
-
-   save_stack_trace_user(&trace);
+   stack_trace_save_user(entry->caller, FTRACE_STACK_ENTRIES);
if (!call_filter_check_discard(call, entry, buffer, event))
__buffer_unlock_commit(buffer, event);
 




[patch V3 04/29] backtrace-test: Simplify stack trace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner 
---
 kernel/backtracetest.c |   11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

--- a/kernel/backtracetest.c
+++ b/kernel/backtracetest.c
@@ -48,19 +48,14 @@ static void backtrace_test_irq(void)
 #ifdef CONFIG_STACKTRACE
 static void backtrace_test_saved(void)
 {
-   struct stack_trace trace;
unsigned long entries[8];
+   unsigned int nr_entries;
 
pr_info("Testing a saved backtrace.\n");
pr_info("The following trace is a kernel self test and not a bug!\n");
 
-   trace.nr_entries = 0;
-   trace.max_entries = ARRAY_SIZE(entries);
-   trace.entries = entries;
-   trace.skip = 0;
-
-   save_stack_trace(&trace);
-   print_stack_trace(&trace, 0);
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
+   stack_trace_print(entries, nr_entries, 0);
 }
 #else
 static void backtrace_test_saved(void)




[patch V3 18/29] lockdep: Remove save argument from check_prev_add()

2019-04-25 Thread Thomas Gleixner
There is only one caller which hands in save_trace as function pointer.

Signed-off-by: Thomas Gleixner 
---
 kernel/locking/lockdep.c |   16 
 1 file changed, 8 insertions(+), 8 deletions(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2158,8 +2158,7 @@ check_deadlock(struct task_struct *curr,
  */
 static int
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
-  struct held_lock *next, int distance, struct stack_trace *trace,
-  int (*save)(struct stack_trace *trace))
+  struct held_lock *next, int distance, struct stack_trace *trace)
 {
struct lock_list *uninitialized_var(target_entry);
struct lock_list *entry;
@@ -2199,11 +2198,11 @@ check_prev_add(struct task_struct *curr,
if (unlikely(!ret)) {
if (!trace->entries) {
/*
-* If @save fails here, the printing might trigger
-* a WARN but because of the !nr_entries it should
-* not do bad things.
+* If save_trace fails here, the printing might
+* trigger a WARN but because of the !nr_entries it
+* should not do bad things.
 */
-   save(trace);
+   save_trace(trace);
}
return print_circular_bug(&this, target_entry, next, prev);
}
@@ -2253,7 +2252,7 @@ check_prev_add(struct task_struct *curr,
return print_bfs_bug(ret);
 
 
-   if (!trace->entries && !save(trace))
+   if (!trace->entries && !save_trace(trace))
return 0;
 
/*
@@ -2318,7 +2317,8 @@ check_prevs_add(struct task_struct *curr
 * added:
 */
if (hlock->read != 2 && hlock->check) {
-   int ret = check_prev_add(curr, hlock, next, distance,
&trace, save_trace);
+   int ret = check_prev_add(curr, hlock, next, distance,
+ &trace);
if (!ret)
return 0;
 




[patch V3 15/29] dm persistent data: Simplify stack trace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface. This results in less storage space and
indirection.

Signed-off-by: Thomas Gleixner 
Cc: dm-de...@redhat.com
Cc: Mike Snitzer 
Cc: Alasdair Kergon 
---
 drivers/md/persistent-data/dm-block-manager.c |   19 +--
 1 file changed, 9 insertions(+), 10 deletions(-)

--- a/drivers/md/persistent-data/dm-block-manager.c
+++ b/drivers/md/persistent-data/dm-block-manager.c
@@ -35,7 +35,10 @@
 #define MAX_HOLDERS 4
 #define MAX_STACK 10
 
-typedef unsigned long stack_entries[MAX_STACK];
+struct stack_store {
+   unsigned intnr_entries;
+   unsigned long   entries[MAX_STACK];
+};
 
 struct block_lock {
spinlock_t lock;
@@ -44,8 +47,7 @@ struct block_lock {
struct task_struct *holders[MAX_HOLDERS];
 
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
-   struct stack_trace traces[MAX_HOLDERS];
-   stack_entries entries[MAX_HOLDERS];
+   struct stack_store traces[MAX_HOLDERS];
 #endif
 };
 
@@ -73,7 +75,7 @@ static void __add_holder(struct block_lo
 {
unsigned h = __find_holder(lock, NULL);
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
-   struct stack_trace *t;
+   struct stack_store *t;
 #endif
 
get_task_struct(task);
@@ -81,11 +83,7 @@ static void __add_holder(struct block_lo
 
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
t = lock->traces + h;
-   t->nr_entries = 0;
-   t->max_entries = MAX_STACK;
-   t->entries = lock->entries[h];
-   t->skip = 2;
-   save_stack_trace(t);
+   t->nr_entries = stack_trace_save(t->entries, MAX_STACK, 2);
 #endif
 }
 
@@ -106,7 +104,8 @@ static int __check_holder(struct block_l
DMERR("recursive lock detected in metadata");
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
DMERR("previously held here:");
-   print_stack_trace(lock->traces + i, 4);
+   stack_trace_print(lock->traces[i].entries,
+ lock->traces[i].nr_entries, 4);
 
DMERR("subsequent acquisition attempted here:");
dump_stack();




[patch V3 03/29] lib/stackdepot: Provide functions which operate on plain storage arrays

2019-04-25 Thread Thomas Gleixner
The struct stack_trace indirection in the stack depot functions is a truly
pointless excercise which requires horrible code at the callsites.

Provide interfaces based on plain storage arrays.

Signed-off-by: Thomas Gleixner 
Acked-by: Alexander Potapenko 
---
V3: Fix kernel-doc
---
 include/linux/stackdepot.h |4 ++
 lib/stackdepot.c   |   70 -
 2 files changed, 55 insertions(+), 19 deletions(-)

--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -26,7 +26,11 @@ typedef u32 depot_stack_handle_t;
 struct stack_trace;
 
 depot_stack_handle_t depot_save_stack(struct stack_trace *trace, gfp_t flags);
+depot_stack_handle_t stack_depot_save(unsigned long *entries,
+ unsigned int nr_entries, gfp_t gfp_flags);
 
 void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace);
+unsigned int stack_depot_fetch(depot_stack_handle_t handle,
+  unsigned long **entries);
 
 #endif
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -194,40 +194,60 @@ static inline struct stack_record *find_
return NULL;
 }
 
-void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace)
+/**
+ * stack_depot_fetch - Fetch stack entries from a depot
+ *
+ * @handle:Stack depot handle which was returned from
+ * stack_depot_save().
+ * @entries:   Pointer to store the entries address
+ *
+ * Return: The number of trace entries for this depot.
+ */
+unsigned int stack_depot_fetch(depot_stack_handle_t handle,
+  unsigned long **entries)
 {
union handle_parts parts = { .handle = handle };
void *slab = stack_slabs[parts.slabindex];
size_t offset = parts.offset << STACK_ALLOC_ALIGN;
struct stack_record *stack = slab + offset;
 
-   trace->nr_entries = trace->max_entries = stack->size;
-   trace->entries = stack->entries;
-   trace->skip = 0;
+   *entries = stack->entries;
+   return stack->size;
+}
+EXPORT_SYMBOL_GPL(stack_depot_fetch);
+
+void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace)
+{
+   unsigned int nent = stack_depot_fetch(handle, &trace->entries);
+
+   trace->max_entries = trace->nr_entries = nent;
 }
 EXPORT_SYMBOL_GPL(depot_fetch_stack);
 
 /**
- * depot_save_stack - save stack in a stack depot.
- * @trace - the stacktrace to save.
- * @alloc_flags - flags for allocating additional memory if required.
+ * stack_depot_save - Save a stack trace from an array
+ *
+ * @entries:   Pointer to storage array
+ * @nr_entries:Size of the storage array
+ * @alloc_flags:   Allocation gfp flags
  *
- * Returns the handle of the stack struct stored in depot.
+ * Return: The handle of the stack struct stored in depot
  */
-depot_stack_handle_t depot_save_stack(struct stack_trace *trace,
-   gfp_t alloc_flags)
+depot_stack_handle_t stack_depot_save(unsigned long *entries,
+ unsigned int nr_entries,
+ gfp_t alloc_flags)
 {
-   u32 hash;
-   depot_stack_handle_t retval = 0;
struct stack_record *found = NULL, **bucket;
-   unsigned long flags;
+   depot_stack_handle_t retval = 0;
struct page *page = NULL;
void *prealloc = NULL;
+   unsigned long flags;
+   u32 hash;
 
-   if (unlikely(trace->nr_entries == 0))
+   if (unlikely(nr_entries == 0))
goto fast_exit;
 
-   hash = hash_stack(trace->entries, trace->nr_entries);
+   hash = hash_stack(entries, nr_entries);
bucket = &stack_table[hash & STACK_HASH_MASK];
 
/*
@@ -235,8 +255,8 @@ depot_stack_handle_t depot_save_stack(st
 * The smp_load_acquire() here pairs with smp_store_release() to
 * |bucket| below.
 */
-   found = find_stack(smp_load_acquire(bucket), trace->entries,
-  trace->nr_entries, hash);
+   found = find_stack(smp_load_acquire(bucket), entries,
+  nr_entries, hash);
if (found)
goto exit;
 
@@ -264,10 +284,10 @@ depot_stack_handle_t depot_save_stack(st
 
spin_lock_irqsave(&depot_lock, flags);
 
-   found = find_stack(*bucket, trace->entries, trace->nr_entries, hash);
+   found = find_stack(*bucket, entries, nr_entries, hash);
if (!found) {
struct stack_record *new =
-   depot_alloc_stack(trace->entries, trace->nr_entries,
+   depot_alloc_stack(entries, nr_entries,
  hash, , alloc_flags);
if (new) {
new->next = *bucket;
@@ -297,4 +317,16 @@ depot_stack_handle_t depot_save_stack(st
 fast_exit:
return retval;
 }
+EXPORT_SYMBOL_GPL(stack_depot_save);
+
+/**
+ * depot_save_stack - save stack in 

[patch V3 26/29] stacktrace: Remove obsolete functions

2019-04-25 Thread Thomas Gleixner
No more users of the struct stack_trace based interfaces. Remove them.

Remove the macro stubs for !CONFIG_STACKTRACE as well as they are pointless
because the storage on the call sites is conditional on CONFIG_STACKTRACE
already. No point to be 'smart'.
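For illustration, a typical call site after this series keeps its storage
under the same ifdef, so no stubs are needed (sketch, mirroring the
backtracetest conversion earlier in the series):

#ifdef CONFIG_STACKTRACE
	unsigned long entries[8];
	unsigned int nr_entries;

	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
	stack_trace_print(entries, nr_entries, 0);
#endif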

Signed-off-by: Thomas Gleixner 
---
 include/linux/stacktrace.h |   17 -
 kernel/stacktrace.c|   14 --
 2 files changed, 31 deletions(-)

--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -36,24 +36,7 @@ extern void save_stack_trace_tsk(struct
struct stack_trace *trace);
 extern int save_stack_trace_tsk_reliable(struct task_struct *tsk,
 struct stack_trace *trace);
-
-extern void print_stack_trace(struct stack_trace *trace, int spaces);
-extern int snprint_stack_trace(char *buf, size_t size,
-   struct stack_trace *trace, int spaces);
-
-#ifdef CONFIG_USER_STACKTRACE_SUPPORT
 extern void save_stack_trace_user(struct stack_trace *trace);
-#else
-# define save_stack_trace_user(trace)  do { } while (0)
-#endif
-
-#else /* !CONFIG_STACKTRACE */
-# define save_stack_trace(trace)   do { } while (0)
-# define save_stack_trace_tsk(tsk, trace)  do { } while (0)
-# define save_stack_trace_user(trace)  do { } while (0)
-# define print_stack_trace(trace, spaces)  do { } while (0)
-# define snprint_stack_trace(buf, size, trace, spaces) do { } while (0)
-# define save_stack_trace_tsk_reliable(tsk, trace) ({ -ENOSYS; })
 #endif /* CONFIG_STACKTRACE */
 
 #if defined(CONFIG_STACKTRACE) && defined(CONFIG_HAVE_RELIABLE_STACKTRACE)
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -30,12 +30,6 @@ void stack_trace_print(unsigned long *en
 }
 EXPORT_SYMBOL_GPL(stack_trace_print);
 
-void print_stack_trace(struct stack_trace *trace, int spaces)
-{
-   stack_trace_print(trace->entries, trace->nr_entries, spaces);
-}
-EXPORT_SYMBOL_GPL(print_stack_trace);
-
 /**
  * stack_trace_snprint - Print the entries in the stack trace into a buffer
  * @buf:   Pointer to the print buffer
@@ -72,14 +66,6 @@ int stack_trace_snprint(char *buf, size_
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
-int snprint_stack_trace(char *buf, size_t size,
-   struct stack_trace *trace, int spaces)
-{
-   return stack_trace_snprint(buf, size, trace->entries,
-  trace->nr_entries, spaces);
-}
-EXPORT_SYMBOL_GPL(snprint_stack_trace);
-
 /*
  * Architectures that do not implement save_stack_trace_*()
  * get these weak aliases and once-per-bootup warnings




[patch V3 05/29] proc: Simplify task stack retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Alexey Dobriyan 
Cc: Andrew Morton 
---
 fs/proc/base.c |   14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -407,7 +407,6 @@ static void unlock_trace(struct task_str
 static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
  struct pid *pid, struct task_struct *task)
 {
-   struct stack_trace trace;
unsigned long *entries;
int err;
 
@@ -430,20 +429,17 @@ static int proc_pid_stack(struct seq_fil
if (!entries)
return -ENOMEM;
 
-   trace.nr_entries= 0;
-   trace.max_entries   = MAX_STACK_TRACE_DEPTH;
-   trace.entries   = entries;
-   trace.skip  = 0;
-
err = lock_trace(task);
if (!err) {
-   unsigned int i;
+   unsigned int i, nr_entries;
 
-   save_stack_trace_tsk(task, &trace);
+   nr_entries = stack_trace_save_tsk(task, entries,
+ MAX_STACK_TRACE_DEPTH, 0);
 
-   for (i = 0; i < trace.nr_entries; i++) {
+   for (i = 0; i < nr_entries; i++) {
seq_printf(m, "[<0>] %pB\n", (void *)entries[i]);
}
+
unlock_trace(task);
}
kfree(entries);




[patch V3 25/29] livepatch: Simplify stack trace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner 
Acked-by: Miroslav Benes 
---
 kernel/livepatch/transition.c |   22 +-
 1 file changed, 9 insertions(+), 13 deletions(-)

--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -202,15 +202,15 @@ void klp_update_patch_state(struct task_
  * Determine whether the given stack trace includes any references to a
  * to-be-patched or to-be-unpatched function.
  */
-static int klp_check_stack_func(struct klp_func *func,
-   struct stack_trace *trace)
+static int klp_check_stack_func(struct klp_func *func, unsigned long *entries,
+   unsigned int nr_entries)
 {
unsigned long func_addr, func_size, address;
struct klp_ops *ops;
int i;
 
-   for (i = 0; i < trace->nr_entries; i++) {
-   address = trace->entries[i];
+   for (i = 0; i < nr_entries; i++) {
+   address = entries[i];
 
if (klp_target_state == KLP_UNPATCHED) {
 /*
@@ -254,29 +254,25 @@ static int klp_check_stack_func(struct k
 static int klp_check_stack(struct task_struct *task, char *err_buf)
 {
static unsigned long entries[MAX_STACK_ENTRIES];
-   struct stack_trace trace;
struct klp_object *obj;
struct klp_func *func;
-   int ret;
+   int ret, nr_entries;
 
-   trace.skip = 0;
-   trace.nr_entries = 0;
-   trace.max_entries = MAX_STACK_ENTRIES;
-   trace.entries = entries;
-   ret = save_stack_trace_tsk_reliable(task, &trace);
+   ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
WARN_ON_ONCE(ret == -ENOSYS);
-   if (ret) {
+   if (ret < 0) {
snprintf(err_buf, STACK_ERR_BUF_SIZE,
 "%s: %s:%d has an unreliable stack\n",
 __func__, task->comm, task->pid);
return ret;
}
+   nr_entries = ret;
 
klp_for_each_object(klp_transition_patch, obj) {
if (!obj->patched)
continue;
klp_for_each_func(obj, func) {
-   ret = klp_check_stack_func(func, &trace);
+   ret = klp_check_stack_func(func, entries, nr_entries);
if (ret) {
snprintf(err_buf, STACK_ERR_BUF_SIZE,
 "%s: %s:%d is sleeping on function 
%s\n",




[patch V3 16/29] drm: Simplify stacktrace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

The original code in all printing functions is really wrong. It allocates a
storage array on stack which is unused because depot_fetch_stack() does not
store anything in it. It overwrites the entries pointer in the stack_trace
struct so it points to the depot storage.

Signed-off-by: Thomas Gleixner 
Acked-by: Daniel Vetter 
Cc: intel-...@lists.freedesktop.org
Cc: Joonas Lahtinen 
Cc: Maarten Lankhorst 
Cc: dri-de...@lists.freedesktop.org
Cc: David Airlie 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
---
 drivers/gpu/drm/drm_mm.c|   22 +++---
 drivers/gpu/drm/i915/i915_vma.c |   11 ---
 drivers/gpu/drm/i915/intel_runtime_pm.c |   21 +++--
 3 files changed, 18 insertions(+), 36 deletions(-)

--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -106,22 +106,19 @@
 static noinline void save_stack(struct drm_mm_node *node)
 {
unsigned long entries[STACKDEPTH];
-   struct stack_trace trace = {
-   .entries = entries,
-   .max_entries = STACKDEPTH,
-   .skip = 1
-   };
+   unsigned int n;
 
-   save_stack_trace(&trace);
+   n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
 
/* May be called under spinlock, so avoid sleeping */
-   node->stack = depot_save_stack(&trace, GFP_NOWAIT);
+   node->stack = stack_depot_save(entries, n, GFP_NOWAIT);
 }
 
 static void show_leaks(struct drm_mm *mm)
 {
struct drm_mm_node *node;
-   unsigned long entries[STACKDEPTH];
+   unsigned long *entries;
+   unsigned int nr_entries;
char *buf;
 
buf = kmalloc(BUFSZ, GFP_KERNEL);
@@ -129,19 +126,14 @@ static void show_leaks(struct drm_mm *mm
return;
 
list_for_each_entry(node, drm_mm_nodes(mm), node_list) {
-   struct stack_trace trace = {
-   .entries = entries,
-   .max_entries = STACKDEPTH
-   };
-
if (!node->stack) {
DRM_ERROR("node [%08llx + %08llx]: unknown owner\n",
  node->start, node->size);
continue;
}
 
-   depot_fetch_stack(node->stack, &trace);
-   snprint_stack_trace(buf, BUFSZ, &trace, 0);
+   nr_entries = stack_depot_fetch(node->stack, &entries);
+   stack_trace_snprint(buf, BUFSZ, entries, nr_entries, 0);
DRM_ERROR("node [%08llx + %08llx]: inserted at\n%s",
  node->start, node->size, buf);
}
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -36,11 +36,8 @@
 
 static void vma_print_allocator(struct i915_vma *vma, const char *reason)
 {
-   unsigned long entries[12];
-   struct stack_trace trace = {
-   .entries = entries,
-   .max_entries = ARRAY_SIZE(entries),
-   };
+   unsigned long *entries;
+   unsigned int nr_entries;
char buf[512];
 
if (!vma->node.stack) {
@@ -49,8 +46,8 @@ static void vma_print_allocator(struct i
return;
}
 
-   depot_fetch_stack(vma->node.stack, &trace);
-   snprint_stack_trace(buf, sizeof(buf), &trace, 0);
+   nr_entries = stack_depot_fetch(vma->node.stack, &entries);
+   stack_trace_snprint(buf, sizeof(buf), entries, nr_entries, 0);
DRM_DEBUG_DRIVER("vma.node [%08llx + %08llx] %s: inserted at %s\n",
 vma->node.start, vma->node.size, reason, buf);
 }
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -60,27 +60,20 @@
 static noinline depot_stack_handle_t __save_depot_stack(void)
 {
unsigned long entries[STACKDEPTH];
-   struct stack_trace trace = {
-   .entries = entries,
-   .max_entries = ARRAY_SIZE(entries),
-   .skip = 1,
-   };
+   unsigned int n;
 
-   save_stack_trace(&trace);
-   return depot_save_stack(&trace, GFP_NOWAIT | __GFP_NOWARN);
+   n = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
+   return stack_depot_save(entries, n, GFP_NOWAIT | __GFP_NOWARN);
 }
 
 static void __print_depot_stack(depot_stack_handle_t stack,
char *buf, int sz, int indent)
 {
-   unsigned long entries[STACKDEPTH];
-   struct stack_trace trace = {
-   .entries = entries,
-   .max_entries = ARRAY_SIZE(entries),
-   };
+   unsigned long *entries;
+   unsigned int nr_entries;
 
-   depot_fetch_stack(stack, &trace);
-   snprint_stack_trace(buf, sz, &trace, indent);
+   nr_entries = stack_depot_fetch(stack, &entries);
+   stack_trace_snprint(buf, sz, entries, nr_entries, indent);
 }
 
 static void init_intel_runtime_pm_wakeref(struct drm_i915_private *i915)



[patch V3 01/29] tracing: Cleanup stack trace code

2019-04-25 Thread Thomas Gleixner
- Remove the extra array member of stack_dump_trace[] along with the
  ARRAY_SIZE - 1 initialization for struct stack_trace :: max_entries.

  Both are historical leftovers of no value. The stack tracer never exceeds
  the array and there is no extra storage requirement either.

- Make variables which are only used in trace_stack.c static.

- Simplify the enable/disable logic.

- Rename stack_trace_print() as it's using the stack_trace_ namespace. Free
  the name up for stack trace related functions.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt 
---
V3: Remove the -1 init and split the variable declaration as requested by 
Steven.
V2: Add more cleanups and use print_max_stack() as requested by Steven.
---
 include/linux/ftrace.h |   18 --
 kernel/trace/trace_stack.c |   42 +-
 2 files changed, 17 insertions(+), 43 deletions(-)

--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -241,21 +241,11 @@ static inline void ftrace_free_mem(struc
 
 #ifdef CONFIG_STACK_TRACER
 
-#define STACK_TRACE_ENTRIES 500
-
-struct stack_trace;
-
-extern unsigned stack_trace_index[];
-extern struct stack_trace stack_trace_max;
-extern unsigned long stack_trace_max_size;
-extern arch_spinlock_t stack_trace_max_lock;
-
 extern int stack_tracer_enabled;
-void stack_trace_print(void);
-int
-stack_trace_sysctl(struct ctl_table *table, int write,
-  void __user *buffer, size_t *lenp,
-  loff_t *ppos);
+
+int stack_trace_sysctl(struct ctl_table *table, int write,
+  void __user *buffer, size_t *lenp,
+  loff_t *ppos);
 
 /* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
 DECLARE_PER_CPU(int, disable_stack_tracer);
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -18,30 +18,26 @@
 
 #include "trace.h"
 
-static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES + 1];
-unsigned stack_trace_index[STACK_TRACE_ENTRIES];
+#define STACK_TRACE_ENTRIES 500
+
+static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
+static unsigned stack_trace_index[STACK_TRACE_ENTRIES];
 
-/*
- * Reserve one entry for the passed in ip. This will allow
- * us to remove most or all of the stack size overhead
- * added by the stack tracer itself.
- */
 struct stack_trace stack_trace_max = {
-   .max_entries= STACK_TRACE_ENTRIES - 1,
+   .max_entries= STACK_TRACE_ENTRIES,
.entries= &stack_dump_trace[0],
 };
 
-unsigned long stack_trace_max_size;
-arch_spinlock_t stack_trace_max_lock =
+static unsigned long stack_trace_max_size;
+static arch_spinlock_t stack_trace_max_lock =
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
 DEFINE_PER_CPU(int, disable_stack_tracer);
 static DEFINE_MUTEX(stack_sysctl_mutex);
 
 int stack_tracer_enabled;
-static int last_stack_tracer_enabled;
 
-void stack_trace_print(void)
+static void print_max_stack(void)
 {
long i;
int size;
@@ -61,16 +57,7 @@ void stack_trace_print(void)
}
 }
 
-/*
- * When arch-specific code overrides this function, the following
- * data should be filled up, assuming stack_trace_max_lock is held to
- * prevent concurrent updates.
- * stack_trace_index[]
- * stack_trace_max
- * stack_trace_max_size
- */
-void __weak
-check_stack(unsigned long ip, unsigned long *stack)
+static void check_stack(unsigned long ip, unsigned long *stack)
 {
unsigned long this_size, flags; unsigned long *p, *top, *start;
static int tracer_frame;
@@ -179,7 +166,7 @@ check_stack(unsigned long ip, unsigned l
stack_trace_max.nr_entries = x;
 
if (task_stack_end_corrupted(current)) {
-   stack_trace_print();
+   print_max_stack();
BUG();
}
 
@@ -412,23 +399,21 @@ stack_trace_sysctl(struct ctl_table *tab
   void __user *buffer, size_t *lenp,
   loff_t *ppos)
 {
+   int was_enabled;
int ret;
 
mutex_lock(&stack_sysctl_mutex);
+   was_enabled = !!stack_tracer_enabled;
 
ret = proc_dointvec(table, write, buffer, lenp, ppos);
 
-   if (ret || !write ||
-   (last_stack_tracer_enabled == !!stack_tracer_enabled))
+   if (ret || !write || (was_enabled == !!stack_tracer_enabled))
goto out;
 
-   last_stack_tracer_enabled = !!stack_tracer_enabled;
-
if (stack_tracer_enabled)
register_ftrace_function(&trace_ops);
else
unregister_ftrace_function(&trace_ops);
-
  out:
mutex_unlock(&stack_sysctl_mutex);
return ret;
@@ -444,7 +429,6 @@ static __init int enable_stacktrace(char
strncpy(stack_trace_filter_buf, str + len, COMMAND_LINE_SIZE);
 
stack_tracer_enabled = 1;
-   last_stack_tracer_enabled = 1;
return 1;
 }
 __setup("stacktrace", enable_stacktrace);



[patch V3 02/29] stacktrace: Provide helpers for common stack trace operations

2019-04-25 Thread Thomas Gleixner
All operations with stack traces are based on struct stack_trace. That's a
horrible construct as the struct is a kitchen sink for input and
output. Quite a few usage sites embed it in their own data structures, which
creates weird indirections.

There is absolutely no point in doing so. For all use cases a storage array
and the number of valid stack trace entries in the array are sufficient.

Provide helper functions which avoid the struct stack_trace indirection so
the usage sites can be cleaned up.
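
As an illustration only (not part of this patch), a converted usage site
then boils down to a plain storage array plus an entry count:

  unsigned long entries[16];
  unsigned int nr_entries;

  /* skip the two innermost frames of the caller */
  nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2);
  stack_trace_print(entries, nr_entries, 0);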

Signed-off-by: Thomas Gleixner 
---
V3: Fix kernel doc.
---
 include/linux/stacktrace.h |   27 +++
 kernel/stacktrace.c|  170 +
 2 files changed, 182 insertions(+), 15 deletions(-)

--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -3,11 +3,26 @@
 #define __LINUX_STACKTRACE_H
 
 #include 
+#include 
 
 struct task_struct;
 struct pt_regs;
 
 #ifdef CONFIG_STACKTRACE
+void stack_trace_print(unsigned long *trace, unsigned int nr_entries,
+  int spaces);
+int stack_trace_snprint(char *buf, size_t size, unsigned long *entries,
+   unsigned int nr_entries, int spaces);
+unsigned int stack_trace_save(unsigned long *store, unsigned int size,
+ unsigned int skipnr);
+unsigned int stack_trace_save_tsk(struct task_struct *task,
+ unsigned long *store, unsigned int size,
+ unsigned int skipnr);
+unsigned int stack_trace_save_regs(struct pt_regs *regs, unsigned long *store,
+  unsigned int size, unsigned int skipnr);
+unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
+
+/* Internal interfaces. Do not use in generic code */
 struct stack_trace {
unsigned int nr_entries, max_entries;
unsigned long *entries;
@@ -41,4 +56,16 @@ extern void save_stack_trace_user(struct
 # define save_stack_trace_tsk_reliable(tsk, trace) ({ -ENOSYS; })
 #endif /* CONFIG_STACKTRACE */
 
+#if defined(CONFIG_STACKTRACE) && defined(CONFIG_HAVE_RELIABLE_STACKTRACE)
+int stack_trace_save_tsk_reliable(struct task_struct *tsk, unsigned long 
*store,
+ unsigned int size);
+#else
+static inline int stack_trace_save_tsk_reliable(struct task_struct *tsk,
+   unsigned long *store,
+   unsigned int size)
+{
+   return -ENOSYS;
+}
+#endif
+
 #endif /* __LINUX_STACKTRACE_H */
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -11,35 +11,54 @@
 #include 
 #include 
 
-void print_stack_trace(struct stack_trace *trace, int spaces)
+/**
+ * stack_trace_print - Print the entries in the stack trace
+ * @entries:   Pointer to storage array
+ * @nr_entries:Number of entries in the storage array
+ * @spaces:Number of leading spaces to print
+ */
+void stack_trace_print(unsigned long *entries, unsigned int nr_entries,
+  int spaces)
 {
-   int i;
+   unsigned int i;
 
-   if (WARN_ON(!trace->entries))
+   if (WARN_ON(!entries))
return;
 
-   for (i = 0; i < trace->nr_entries; i++)
-   printk("%*c%pS\n", 1 + spaces, ' ', (void *)trace->entries[i]);
+   for (i = 0; i < nr_entries; i++)
+   printk("%*c%pS\n", 1 + spaces, ' ', (void *)entries[i]);
+}
+EXPORT_SYMBOL_GPL(stack_trace_print);
+
+void print_stack_trace(struct stack_trace *trace, int spaces)
+{
+   stack_trace_print(trace->entries, trace->nr_entries, spaces);
 }
 EXPORT_SYMBOL_GPL(print_stack_trace);
 
-int snprint_stack_trace(char *buf, size_t size,
-   struct stack_trace *trace, int spaces)
+/**
+ * stack_trace_snprint - Print the entries in the stack trace into a buffer
+ * @buf:   Pointer to the print buffer
+ * @size:  Size of the print buffer
+ * @entries:   Pointer to storage array
+ * @nr_entries:Number of entries in the storage array
+ * @spaces:Number of leading spaces to print
+ *
+ * Return: Number of bytes printed.
+ */
+int stack_trace_snprint(char *buf, size_t size, unsigned long *entries,
+   unsigned int nr_entries, int spaces)
 {
-   int i;
-   int generated;
-   int total = 0;
+   unsigned int generated, i, total = 0;
 
-   if (WARN_ON(!trace->entries))
+   if (WARN_ON(!entries))
return 0;
 
-   for (i = 0; i < trace->nr_entries; i++) {
+   for (i = 0; i < nr_entries && size; i++) {
generated = snprintf(buf, size, "%*c%pS\n", 1 + spaces, ' ',
-(void *)trace->entries[i]);
+(void *)entries[i]);
 
total += generated;
-
-   /* Assume that generated isn't a negative number */
if (generated >= size) {
buf += size;
   

[patch V3 28/29] stacktrace: Provide common infrastructure

2019-04-25 Thread Thomas Gleixner
All architectures which support stacktrace carry duplicated code and
do the stack storage and filtering on the architecture side.

Provide a consolidated interface with a callback function for consuming the
stack entries provided by the architecture specific stack walker. This
removes lots of duplicated code and makes it possible to implement better
filtering than 'skip number of entries' in the future without touching any
architecture-specific code.
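
To illustrate the direction (a sketch, not code from this patch), a generic
consumer is just a callback plus a cookie; the architecture walker feeds each
resolved return address into the callback and stops when it returns false:

  struct count_cookie {
          unsigned int len;
  };

  static bool count_entry(void *cookie, unsigned long addr, bool reliable)
  {
          struct count_cookie *c = cookie;

          c->len++;
          return true;    /* returning false would stop the walk */
  }

  static unsigned int count_current_stack(void)
  {
          struct count_cookie c = { .len = 0 };

          arch_stack_walk(count_entry, &c, current, NULL);
          return c.len;
  }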

Signed-off-by: Thomas Gleixner 
Cc: linux-a...@vger.kernel.org
---
V3: Fix kernel doc
---
 include/linux/stacktrace.h |   39 ++
 kernel/stacktrace.c|  173 +
 lib/Kconfig|4 +
 3 files changed, 216 insertions(+)

--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -23,6 +23,44 @@ unsigned int stack_trace_save_regs(struc
 unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
 
 /* Internal interfaces. Do not use in generic code */
+#ifdef CONFIG_ARCH_STACKWALK
+
+/**
+ * stack_trace_consume_fn - Callback for arch_stack_walk()
+ * @cookie:Caller supplied pointer handed back by arch_stack_walk()
+ * @addr:  The stack entry address to consume
+ * @reliable:  True when the stack entry is reliable. Required by
+ * some printk based consumers.
+ *
+ * Return: True, if the entry was consumed or skipped
+ * False, if there is no space left to store
+ */
+typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
+  bool reliable);
+/**
+ * arch_stack_walk - Architecture specific function to walk the stack
+ * @consume_entry: Callback which is invoked by the architecture code for
+ * each entry.
+ * @cookie:Caller supplied pointer which is handed back to
+ * @consume_entry
+ * @task:  Pointer to a task struct, can be NULL
+ * @regs:  Pointer to registers, can be NULL
+ *
+ *  === 
+ * taskregs
+ *  === 
+ * taskNULLStack trace from task (can be current)
+ * current regsStack trace starting on regs->stackpointer
+ *  === 
+ */
+void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
+struct task_struct *task, struct pt_regs *regs);
+int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry, void 
*cookie,
+struct task_struct *task);
+void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
+ const struct pt_regs *regs);
+
+#else /* CONFIG_ARCH_STACKWALK */
 struct stack_trace {
unsigned int nr_entries, max_entries;
unsigned long *entries;
@@ -37,6 +75,7 @@ extern void save_stack_trace_tsk(struct
 extern int save_stack_trace_tsk_reliable(struct task_struct *tsk,
 struct stack_trace *trace);
 extern void save_stack_trace_user(struct stack_trace *trace);
+#endif /* !CONFIG_ARCH_STACKWALK */
 #endif /* CONFIG_STACKTRACE */
 
 #if defined(CONFIG_STACKTRACE) && defined(CONFIG_HAVE_RELIABLE_STACKTRACE)
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -5,6 +5,8 @@
  *
  *  Copyright (C) 2006 Red Hat, Inc., Ingo Molnar 
  */
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -66,6 +68,175 @@ int stack_trace_snprint(char *buf, size_
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
+#ifdef CONFIG_ARCH_STACKWALK
+
+struct stacktrace_cookie {
+   unsigned long   *store;
+   unsigned intsize;
+   unsigned intskip;
+   unsigned intlen;
+};
+
+static bool stack_trace_consume_entry(void *cookie, unsigned long addr,
+ bool reliable)
+{
+   struct stacktrace_cookie *c = cookie;
+
+   if (c->len >= c->size)
+   return false;
+
+   if (c->skip > 0) {
+   c->skip--;
+   return true;
+   }
+   c->store[c->len++] = addr;
+   return c->len < c->size;
+}
+
+static bool stack_trace_consume_entry_nosched(void *cookie, unsigned long addr,
+ bool reliable)
+{
+   if (in_sched_functions(addr))
+   return true;
+   return stack_trace_consume_entry(cookie, addr, reliable);
+}
+
+/**
+ * stack_trace_save - Save a stack trace into a storage array
+ * @store: Pointer to storage array
+ * @size:  Size of the storage array
+ * @skipnr:Number of entries to skip at the start of the stack trace
+ *
+ * Return: Number of trace entries stored.
+ */
+unsigned int stack_trace_save(unsigned long *store, unsigned int size,
+ unsigned int skipnr)
+{
+   stack_trace_consume_fn consume_entry = stack_trace_consume_entry;
+  

[patch V3 06/29] latency_top: Simplify stack trace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.
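
For illustration only (not part of the patch), grabbing and dumping another
task's backtrace with the new helpers reduces to:

  unsigned long entries[12];
  unsigned int nr;

  nr = stack_trace_save_tsk(tsk, entries, ARRAY_SIZE(entries), 0);
  stack_trace_print(entries, nr, 0);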

Signed-off-by: Thomas Gleixner 
---
 kernel/latencytop.c |   17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

--- a/kernel/latencytop.c
+++ b/kernel/latencytop.c
@@ -141,20 +141,6 @@ account_global_scheduler_latency(struct
memcpy(&latency_record[i], lat, sizeof(struct latency_record));
 }
 
-/*
- * Iterator to store a backtrace into a latency record entry
- */
-static inline void store_stacktrace(struct task_struct *tsk,
-   struct latency_record *lat)
-{
-   struct stack_trace trace;
-
-   memset(&trace, 0, sizeof(trace));
-   trace.max_entries = LT_BACKTRACEDEPTH;
-   trace.entries = &lat->backtrace[0];
-   save_stack_trace_tsk(tsk, &trace);
-}
-
 /**
  * __account_scheduler_latency - record an occurred latency
  * @tsk - the task struct of the task hitting the latency
@@ -191,7 +177,8 @@ void __sched
lat.count = 1;
lat.time = usecs;
lat.max = usecs;
-   store_stacktrace(tsk, &lat);
+
+   stack_trace_save_tsk(tsk, lat.backtrace, LT_BACKTRACEDEPTH, 0);
 
raw_spin_lock_irqsave(&latency_lock, flags);
 




[patch V3 21/29] tracing: Use percpu stack trace buffer more intelligently

2019-04-25 Thread Thomas Gleixner
The per-cpu stack trace buffer usage pattern is odd at best. The buffer has
room for 512 stack trace entries on 64-bit and 1024 on 32-bit. When
interrupts or exceptions nest after the per-cpu buffer has been acquired, the
stacktrace length is hardcoded to 8 entries. 512/1024 stack trace entries
in kernel stacks are unrealistic, so the buffer is a complete waste.

Split the buffer into 4 nest levels, which are 128/256 entries per
level. This allows nesting contexts (interrupts, exceptions) to utilize the
cpu buffer for stack retrieval and avoids the fixed length allocation along
with the conditional execution paths.
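
As a back-of-the-envelope check (assuming 4k pages): PAGE_SIZE /
sizeof(unsigned long) is 512 entries on 64-bit and 1024 on 32-bit, so
splitting the buffer into 4 nest levels yields the 128/256 entries per level
quoted above.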

Signed-off-by: Thomas Gleixner 
Cc: Steven Rostedt 
---
V3: Limit to 4 nest levels and increase size per level.
---
 kernel/trace/trace.c |   77 +--
 1 file changed, 39 insertions(+), 38 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2749,12 +2749,21 @@ trace_function(struct trace_array *tr,
 
 #ifdef CONFIG_STACKTRACE
 
-#define FTRACE_STACK_MAX_ENTRIES (PAGE_SIZE / sizeof(unsigned long))
+/* Allow 4 levels of nesting: normal, softirq, irq, NMI */
+#define FTRACE_KSTACK_NESTING  4
+
+#define FTRACE_KSTACK_ENTRIES  (PAGE_SIZE / FTRACE_KSTACK_NESTING)
+
 struct ftrace_stack {
-   unsigned long   calls[FTRACE_STACK_MAX_ENTRIES];
+   unsigned long   calls[FTRACE_KSTACK_ENTRIES];
+};
+
+
+struct ftrace_stacks {
+   struct ftrace_stack stacks[FTRACE_KSTACK_NESTING];
 };
 
-static DEFINE_PER_CPU(struct ftrace_stack, ftrace_stack);
+static DEFINE_PER_CPU(struct ftrace_stacks, ftrace_stacks);
 static DEFINE_PER_CPU(int, ftrace_stack_reserve);
 
 static void __ftrace_trace_stack(struct ring_buffer *buffer,
@@ -2763,10 +2772,11 @@ static void __ftrace_trace_stack(struct
 {
struct trace_event_call *call = &event_kernel_stack;
struct ring_buffer_event *event;
+   struct ftrace_stack *fstack;
struct stack_entry *entry;
struct stack_trace trace;
-   int use_stack;
-   int size = FTRACE_STACK_ENTRIES;
+   int size = FTRACE_KSTACK_ENTRIES;
+   int stackidx;
 
trace.nr_entries= 0;
trace.skip  = skip;
@@ -2788,29 +2798,32 @@ static void __ftrace_trace_stack(struct
 */
preempt_disable_notrace();
 
-   use_stack = __this_cpu_inc_return(ftrace_stack_reserve);
+   stackidx = __this_cpu_inc_return(ftrace_stack_reserve);
+
+   /* This should never happen. If it does, yell once and skip */
+   if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING))
+   goto out;
+
/*
-* We don't need any atomic variables, just a barrier.
-* If an interrupt comes in, we don't care, because it would
-* have exited and put the counter back to what we want.
-* We just need a barrier to keep gcc from moving things
-* around.
+* The above __this_cpu_inc_return() is 'atomic' cpu local. An
+* interrupt will either see the value pre increment or post
+* increment. If the interrupt happens pre increment it will have
+* restored the counter when it returns.  We just need a barrier to
+* keep gcc from moving things around.
 */
barrier();
-   if (use_stack == 1) {
-   trace.entries   = this_cpu_ptr(ftrace_stack.calls);
-   trace.max_entries   = FTRACE_STACK_MAX_ENTRIES;
-
-   if (regs)
-   save_stack_trace_regs(regs, &trace);
-   else
-   save_stack_trace(&trace);
-
-   if (trace.nr_entries > size)
-   size = trace.nr_entries;
-   } else
-   /* From now on, use_stack is a boolean */
-   use_stack = 0;
+
+   fstack = this_cpu_ptr(ftrace_stacks.stacks) + (stackidx - 1);
+   trace.entries   = fstack->calls;
+   trace.max_entries   = FTRACE_KSTACK_ENTRIES;
+
+   if (regs)
+   save_stack_trace_regs(regs, &trace);
+   else
+   save_stack_trace(&trace);
+
+   if (trace.nr_entries > size)
+   size = trace.nr_entries;
 
size *= sizeof(unsigned long);
 
@@ -2820,19 +2833,7 @@ static void __ftrace_trace_stack(struct
goto out;
entry = ring_buffer_event_data(event);
 
-   memset(&entry->caller, 0, size);
-
-   if (use_stack)
-   memcpy(&entry->caller, trace.entries,
-  trace.nr_entries * sizeof(unsigned long));
-   else {
-   trace.max_entries   = FTRACE_STACK_ENTRIES;
-   trace.entries   = entry->caller;
-   if (regs)
-   save_stack_trace_regs(regs, &trace);
-   else
-   save_stack_trace(&trace);
-   }
+   memcpy(&entry->caller, trace.entries, size);
 
entry->size = trace.nr_entries;
 



[patch V3 14/29] dm bufio: Simplify stack trace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.

Signed-off-by: Thomas Gleixner 
Cc: dm-de...@redhat.com
Cc: Mike Snitzer 
Cc: Alasdair Kergon 
---
 drivers/md/dm-bufio.c |   15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -150,7 +150,7 @@ struct dm_buffer {
void (*end_io)(struct dm_buffer *, blk_status_t);
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
 #define MAX_STACK 10
-   struct stack_trace stack_trace;
+   unsigned int stack_len;
unsigned long stack_entries[MAX_STACK];
 #endif
 };
@@ -232,11 +232,7 @@ static DEFINE_MUTEX(dm_bufio_clients_loc
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
 static void buffer_record_stack(struct dm_buffer *b)
 {
-   b->stack_trace.nr_entries = 0;
-   b->stack_trace.max_entries = MAX_STACK;
-   b->stack_trace.entries = b->stack_entries;
-   b->stack_trace.skip = 2;
-   save_stack_trace(&b->stack_trace);
+   b->stack_len = stack_trace_save(b->stack_entries, MAX_STACK, 2);
 }
 #endif
 
@@ -438,7 +434,7 @@ static struct dm_buffer *alloc_buffer(st
adjust_total_allocated(b->data_mode, (long)c->block_size);
 
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
-   memset(>stack_trace, 0, sizeof(b->stack_trace));
+   b->stack_len = 0;
 #endif
return b;
 }
@@ -1520,8 +1516,9 @@ static void drop_buffers(struct dm_bufio
DMERR("leaked buffer %llx, hold count %u, list %d",
  (unsigned long long)b->block, b->hold_count, i);
 #ifdef CONFIG_DM_DEBUG_BLOCK_STACK_TRACING
-   print_stack_trace(&b->stack_trace, 1);
-   b->hold_count = 0; /* mark unclaimed to avoid BUG_ON 
below */
+   stack_trace_print(b->stack_entries, b->stack_len, 1);
+   /* mark unclaimed to avoid BUG_ON below */
+   b->hold_count = 0;
 #endif
}
 




[patch V3 24/29] tracing: Remove the last struct stack_trace usage

2019-04-25 Thread Thomas Gleixner
Simplify the stack retrieval code by using the storage array based
interface.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_stack.c |   37 -
 1 file changed, 16 insertions(+), 21 deletions(-)

--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -23,11 +23,7 @@
 static unsigned long stack_dump_trace[STACK_TRACE_ENTRIES];
 static unsigned stack_trace_index[STACK_TRACE_ENTRIES];
 
-struct stack_trace stack_trace_max = {
-   .max_entries= STACK_TRACE_ENTRIES,
-   .entries= &stack_dump_trace[0],
-};
-
+static unsigned int stack_trace_entries;
 static unsigned long stack_trace_max_size;
 static arch_spinlock_t stack_trace_max_lock =
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
@@ -44,10 +40,10 @@ static void print_max_stack(void)
 
pr_emerg("DepthSize   Location(%d entries)\n"
   "-   \n",
-  stack_trace_max.nr_entries);
+  stack_trace_entries);
 
-   for (i = 0; i < stack_trace_max.nr_entries; i++) {
-   if (i + 1 == stack_trace_max.nr_entries)
+   for (i = 0; i < stack_trace_entries; i++) {
+   if (i + 1 == stack_trace_entries)
size = stack_trace_index[i];
else
size = stack_trace_index[i] - stack_trace_index[i+1];
@@ -93,13 +89,12 @@ static void check_stack(unsigned long ip
 
stack_trace_max_size = this_size;
 
-   stack_trace_max.nr_entries = 0;
-   stack_trace_max.skip = 0;
-
-   save_stack_trace(&stack_trace_max);
+   stack_trace_entries = stack_trace_save(stack_dump_trace,
+  ARRAY_SIZE(stack_dump_trace) - 1,
+  0);
 
/* Skip over the overhead of the stack tracer itself */
-   for (i = 0; i < stack_trace_max.nr_entries; i++) {
+   for (i = 0; i < stack_trace_entries; i++) {
if (stack_dump_trace[i] == ip)
break;
}
@@ -108,7 +103,7 @@ static void check_stack(unsigned long ip
 * Some archs may not have the passed in ip in the dump.
 * If that happens, we need to show everything.
 */
-   if (i == stack_trace_max.nr_entries)
+   if (i == stack_trace_entries)
i = 0;
 
/*
@@ -126,13 +121,13 @@ static void check_stack(unsigned long ip
 * loop will only happen once. This code only takes place
 * on a new max, so it is far from a fast path.
 */
-   while (i < stack_trace_max.nr_entries) {
+   while (i < stack_trace_entries) {
int found = 0;
 
stack_trace_index[x] = this_size;
p = start;
 
-   for (; p < top && i < stack_trace_max.nr_entries; p++) {
+   for (; p < top && i < stack_trace_entries; p++) {
/*
 * The READ_ONCE_NOCHECK is used to let KASAN know that
 * this is not a stack-out-of-bounds error.
@@ -163,7 +158,7 @@ static void check_stack(unsigned long ip
i++;
}
 
-   stack_trace_max.nr_entries = x;
+   stack_trace_entries = x;
 
if (task_stack_end_corrupted(current)) {
print_max_stack();
@@ -265,7 +260,7 @@ static void *
 {
long n = *pos - 1;
 
-   if (n >= stack_trace_max.nr_entries)
+   if (n >= stack_trace_entries)
return NULL;
 
m->private = (void *)n;
@@ -329,7 +324,7 @@ static int t_show(struct seq_file *m, vo
seq_printf(m, "DepthSize   Location"
   "(%d entries)\n"
   "-   \n",
-  stack_trace_max.nr_entries);
+  stack_trace_entries);
 
if (!stack_tracer_enabled && !stack_trace_max_size)
print_disabled(m);
@@ -339,10 +334,10 @@ static int t_show(struct seq_file *m, vo
 
i = *(long *)v;
 
-   if (i >= stack_trace_max.nr_entries)
+   if (i >= stack_trace_entries)
return 0;
 
-   if (i + 1 == stack_trace_max.nr_entries)
+   if (i + 1 == stack_trace_entries)
size = stack_trace_index[i];
else
size = stack_trace_index[i] - stack_trace_index[i+1];




[patch V3 27/29] lib/stackdepot: Remove obsolete functions

2019-04-25 Thread Thomas Gleixner
No more users of the struct stack_trace based interfaces.

Signed-off-by: Thomas Gleixner 
Acked-by: Alexander Potapenko 
---
 include/linux/stackdepot.h |4 
 lib/stackdepot.c   |   20 
 2 files changed, 24 deletions(-)

--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -23,13 +23,9 @@
 
 typedef u32 depot_stack_handle_t;
 
-struct stack_trace;
-
-depot_stack_handle_t depot_save_stack(struct stack_trace *trace, gfp_t flags);
 depot_stack_handle_t stack_depot_save(unsigned long *entries,
  unsigned int nr_entries, gfp_t gfp_flags);
 
-void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace);
 unsigned int stack_depot_fetch(depot_stack_handle_t handle,
   unsigned long **entries);
 
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -216,14 +216,6 @@ unsigned int stack_depot_fetch(depot_sta
 }
 EXPORT_SYMBOL_GPL(stack_depot_fetch);
 
-void depot_fetch_stack(depot_stack_handle_t handle, struct stack_trace *trace)
-{
-   unsigned int nent = stack_depot_fetch(handle, &trace->entries);
-
-   trace->max_entries = trace->nr_entries = nent;
-}
-EXPORT_SYMBOL_GPL(depot_fetch_stack);
-
 /**
  * stack_depot_save - Save a stack trace from an array
  *
@@ -318,15 +310,3 @@ depot_stack_handle_t stack_depot_save(un
return retval;
 }
 EXPORT_SYMBOL_GPL(stack_depot_save);
-
-/**
- * depot_save_stack - save stack in a stack depot.
- * @trace - the stacktrace to save.
- * @alloc_flags - flags for allocating additional memory if required.
- */
-depot_stack_handle_t depot_save_stack(struct stack_trace *trace,
- gfp_t alloc_flags)
-{
-   return stack_depot_save(trace->entries, trace->nr_entries, alloc_flags);
-}
-EXPORT_SYMBOL_GPL(depot_save_stack);




[patch V3 29/29] x86/stacktrace: Use common infrastructure

2019-04-25 Thread Thomas Gleixner
Replace the stack_trace_save*() functions with the new arch_stack_walk()
interfaces.

Signed-off-by: Thomas Gleixner 
Cc: linux-a...@vger.kernel.org
---
 arch/x86/Kconfig |1 
 arch/x86/kernel/stacktrace.c |  116 +++
 2 files changed, 20 insertions(+), 97 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -74,6 +74,7 @@ config X86
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
+   select ARCH_STACKWALK
select ARCH_SUPPORTS_ACPI
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -12,75 +12,31 @@
 #include 
 #include 
 
-static int save_stack_address(struct stack_trace *trace, unsigned long addr,
- bool nosched)
-{
-   if (nosched && in_sched_functions(addr))
-   return 0;
-
-   if (trace->skip > 0) {
-   trace->skip--;
-   return 0;
-   }
-
-   if (trace->nr_entries >= trace->max_entries)
-   return -1;
-
-   trace->entries[trace->nr_entries++] = addr;
-   return 0;
-}
-
-static void noinline __save_stack_trace(struct stack_trace *trace,
-  struct task_struct *task, struct pt_regs *regs,
-  bool nosched)
+void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
+struct task_struct *task, struct pt_regs *regs)
 {
struct unwind_state state;
unsigned long addr;
 
-   if (regs)
-   save_stack_address(trace, regs->ip, nosched);
+   if (regs && !consume_entry(cookie, regs->ip, false))
+   return;
 
for (unwind_start(&state, task, regs, NULL); !unwind_done(&state);
 unwind_next_frame(&state)) {
addr = unwind_get_return_address(&state);
-   if (!addr || save_stack_address(trace, addr, nosched))
+   if (!addr || !consume_entry(cookie, addr, false))
break;
}
 }
 
 /*
- * Save stack-backtrace addresses into a stack_trace buffer.
+ * This function returns an error if it detects any unreliable features of the
+ * stack.  Otherwise it guarantees that the stack trace is reliable.
+ *
+ * If the task is not 'current', the caller *must* ensure the task is inactive.
  */
-void save_stack_trace(struct stack_trace *trace)
-{
-   trace->skip++;
-   __save_stack_trace(trace, current, NULL, false);
-}
-EXPORT_SYMBOL_GPL(save_stack_trace);
-
-void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
-{
-   __save_stack_trace(trace, current, regs, false);
-}
-
-void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
-{
-   if (!try_get_task_stack(tsk))
-   return;
-
-   if (tsk == current)
-   trace->skip++;
-   __save_stack_trace(trace, tsk, NULL, true);
-
-   put_task_stack(tsk);
-}
-EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
-
-#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
-
-static int __always_inline
-__save_stack_trace_reliable(struct stack_trace *trace,
-   struct task_struct *task)
+int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
+void *cookie, struct task_struct *task)
 {
struct unwind_state state;
struct pt_regs *regs;
@@ -117,7 +73,7 @@ static int __always_inline
if (!addr)
return -EINVAL;
 
-   if (save_stack_address(trace, addr, false))
+   if (!consume_entry(cookie, addr, false))
return -EINVAL;
}
 
@@ -132,32 +88,6 @@ static int __always_inline
return 0;
 }
 
-/*
- * This function returns an error if it detects any unreliable features of the
- * stack.  Otherwise it guarantees that the stack trace is reliable.
- *
- * If the task is not 'current', the caller *must* ensure the task is inactive.
- */
-int save_stack_trace_tsk_reliable(struct task_struct *tsk,
- struct stack_trace *trace)
-{
-   int ret;
-
-   /*
-* If the task doesn't have a stack (e.g., a zombie), the stack is
-* "reliably" empty.
-*/
-   if (!try_get_task_stack(tsk))
-   return 0;
-
-   ret = __save_stack_trace_reliable(trace, tsk);
-
-   put_task_stack(tsk);
-
-   return ret;
-}
-#endif /* CONFIG_HAVE_RELIABLE_STACKTRACE */
-
 /* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */
 
 struct stack_frame_user {
@@ -182,15 +112,15 @@ copy_stack_frame(const void __user *fp,
return ret;
 }
 
-static inline void __save_stack_trace_user(struct stack_trace *trace)
+void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
+ const struct pt_regs 

[patch V3 11/29] fault-inject: Simplify stacktrace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.

Signed-off-by: Thomas Gleixner 
Cc: Akinobu Mita 
---
 lib/fault-inject.c |   12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -65,22 +65,16 @@ static bool fail_task(struct fault_attr
 
 static bool fail_stacktrace(struct fault_attr *attr)
 {
-   struct stack_trace trace;
int depth = attr->stacktrace_depth;
unsigned long entries[MAX_STACK_TRACE_DEPTH];
-   int n;
+   int n, nr_entries;
bool found = (attr->require_start == 0 && attr->require_end == 
ULONG_MAX);
 
if (depth == 0)
return found;
 
-   trace.nr_entries = 0;
-   trace.entries = entries;
-   trace.max_entries = depth;
-   trace.skip = 1;
-
-   save_stack_trace(&trace);
-   for (n = 0; n < trace.nr_entries; n++) {
+   nr_entries = stack_trace_save(entries, depth, 1);
+   for (n = 0; n < nr_entries; n++) {
if (attr->reject_start <= entries[n] &&
   entries[n] < attr->reject_end)
return false;




[patch V3 13/29] btrfs: ref-verify: Simplify stack trace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.

Signed-off-by: Thomas Gleixner 
Reviewed-by: Johannes Thumshirn 
Acked-by: David Sterba 
Cc: Chris Mason 
Cc: Josef Bacik 
Cc: linux-bt...@vger.kernel.org
---
 fs/btrfs/ref-verify.c |   15 ++-
 1 file changed, 2 insertions(+), 13 deletions(-)

--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -205,28 +205,17 @@ static struct root_entry *lookup_root_en
 #ifdef CONFIG_STACKTRACE
 static void __save_stack_trace(struct ref_action *ra)
 {
-   struct stack_trace stack_trace;
-
-   stack_trace.max_entries = MAX_TRACE;
-   stack_trace.nr_entries = 0;
-   stack_trace.entries = ra->trace;
-   stack_trace.skip = 2;
-   save_stack_trace(&stack_trace);
-   ra->trace_len = stack_trace.nr_entries;
+   ra->trace_len = stack_trace_save(ra->trace, MAX_TRACE, 2);
 }
 
 static void __print_stack_trace(struct btrfs_fs_info *fs_info,
struct ref_action *ra)
 {
-   struct stack_trace trace;
-
if (ra->trace_len == 0) {
btrfs_err(fs_info, "  ref-verify: no stacktrace");
return;
}
-   trace.nr_entries = ra->trace_len;
-   trace.entries = ra->trace;
-   print_stack_trace(&trace, 2);
+   stack_trace_print(ra->trace, ra->trace_len, 2);
 }
 #else
 static void inline __save_stack_trace(struct ref_action *ra)




[patch V3 22/29] tracing: Make ftrace_trace_userstack() static and conditional

2019-04-25 Thread Thomas Gleixner
It's only used in trace.c and there is absolutely no point in compiling it
in when user space stack traces are not supported.
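
For reference, the usual shape of this (a sketch; the real hunk is below) is
one implementation under the config option and an empty static stub
otherwise, so the call sites need no #ifdefs:

  #ifdef CONFIG_USER_STACKTRACE_SUPPORT
  static void ftrace_trace_userstack(struct ring_buffer *buffer,
                                     unsigned long flags, int pc)
  {
          /* record the user space stack into the ring buffer */
  }
  #else
  static void ftrace_trace_userstack(struct ring_buffer *buffer,
                                     unsigned long flags, int pc)
  {
  }
  #endif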

Signed-off-by: Thomas Gleixner 
Reviewed-by: Steven Rostedt 
---
 kernel/trace/trace.c |   14 --
 kernel/trace/trace.h |8 
 2 files changed, 8 insertions(+), 14 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -159,6 +159,8 @@ static union trace_eval_map_item *trace_
 #endif /* CONFIG_TRACE_EVAL_MAP_FILE */
 
 static int tracing_set_tracer(struct trace_array *tr, const char *buf);
+static void ftrace_trace_userstack(struct ring_buffer *buffer,
+  unsigned long flags, int pc);
 
 #define MAX_TRACER_SIZE100
 static char bootup_tracer_buf[MAX_TRACER_SIZE] __initdata;
@@ -2905,9 +2907,10 @@ void trace_dump_stack(int skip)
 }
 EXPORT_SYMBOL_GPL(trace_dump_stack);
 
+#ifdef CONFIG_USER_STACKTRACE_SUPPORT
 static DEFINE_PER_CPU(int, user_stack_count);
 
-void
+static void
 ftrace_trace_userstack(struct ring_buffer *buffer, unsigned long flags, int pc)
 {
struct trace_event_call *call = &event_user_stack;
@@ -2958,13 +2961,12 @@ ftrace_trace_userstack(struct ring_buffe
  out:
preempt_enable();
 }
-
-#ifdef UNUSED
-static void __trace_userstack(struct trace_array *tr, unsigned long flags)
+#else /* CONFIG_USER_STACKTRACE_SUPPORT */
+static void ftrace_trace_userstack(struct ring_buffer *buffer,
+  unsigned long flags, int pc)
 {
-   ftrace_trace_userstack(tr, flags, preempt_count());
 }
-#endif /* UNUSED */
+#endif /* !CONFIG_USER_STACKTRACE_SUPPORT */
 
 #endif /* CONFIG_STACKTRACE */
 
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -782,17 +782,9 @@ void update_max_tr_single(struct trace_a
 #endif /* CONFIG_TRACER_MAX_TRACE */
 
 #ifdef CONFIG_STACKTRACE
-void ftrace_trace_userstack(struct ring_buffer *buffer, unsigned long flags,
-   int pc);
-
 void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
   int pc);
 #else
-static inline void ftrace_trace_userstack(struct ring_buffer *buffer,
- unsigned long flags, int pc)
-{
-}
-
 static inline void __trace_stack(struct trace_array *tr, unsigned long flags,
 int skip, int pc)
 {




Re: [PATCH v2 06/19] drivers core: Add I/O ASID allocator

2019-04-25 Thread Auger Eric
Hi Jean-Philippe, Jacob,
On 4/24/19 1:31 AM, Jacob Pan wrote:
> From: Jean-Philippe Brucker 
> 
> Some devices might support multiple DMA address spaces, in particular
> those that have the PCI PASID feature. PASID (Process Address Space ID)
> allows to share process address spaces with devices (SVA), partition a
> device into VM-assignable entities (VFIO mdev) or simply provide
> multiple DMA address space to kernel drivers. Add a global PASID
> allocator usable by different drivers at the same time. Name it I/O ASID
> to avoid confusion with ASIDs allocated by arch code, which are usually
> a separate ID space.
> 
> The IOASID space is global. Each device can have its own PASID space,
> but by convention the IOMMU ended up having a global PASID space, so
> that with SVA, each mm_struct is associated to a single PASID.
> 
> The allocator doesn't really belong in drivers/iommu because some
> drivers would like to allocate PASIDs for devices that aren't managed by
> an IOMMU, using the same ID space as IOMMU. It doesn't really belong in
> drivers/pci either since platform device also support PASID. Add the
> allocator in drivers/base.
> 
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/base/Kconfig   |   6 +++
>  drivers/base/Makefile  |   1 +
>  drivers/base/ioasid.c  | 106 
> +
>  include/linux/ioasid.h |  40 +++
>  4 files changed, 153 insertions(+)
>  create mode 100644 drivers/base/ioasid.c
>  create mode 100644 include/linux/ioasid.h
> 
> diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
> index 059700e..47c1348 100644
> --- a/drivers/base/Kconfig
> +++ b/drivers/base/Kconfig
> @@ -182,6 +182,12 @@ config DMA_SHARED_BUFFER
> APIs extension; the file's descriptor can then be passed on to other
> driver.
>  
> +config IOASID
> + bool
> + help
> +   Enable the I/O Address Space ID allocator. A single ID space shared
> +   between different users.
> +
>  config DMA_FENCE_TRACE
>   bool "Enable verbose DMA_FENCE_TRACE messages"
>   depends on DMA_SHARED_BUFFER
> diff --git a/drivers/base/Makefile b/drivers/base/Makefile
> index 1574520..aafa2ac 100644
> --- a/drivers/base/Makefile
> +++ b/drivers/base/Makefile
> @@ -23,6 +23,7 @@ obj-$(CONFIG_PINCTRL) += pinctrl.o
>  obj-$(CONFIG_DEV_COREDUMP) += devcoredump.o
>  obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o
>  obj-$(CONFIG_GENERIC_ARCH_TOPOLOGY) += arch_topology.o
> +obj-$(CONFIG_IOASID) += ioasid.o
>  
>  obj-y+= test/
>  
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> new file mode 100644
> index 000..cf122b2
> --- /dev/null
> +++ b/drivers/base/ioasid.c
> @@ -0,0 +1,106 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * I/O Address Space ID allocator. There is one global IOASID space, split 
> into
> + * subsets. Users create a subset with DECLARE_IOASID_SET, then allocate and
> + * free IOASIDs with ioasid_alloc and ioasid_free.
> + */
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct ioasid_data {
> + ioasid_t id;
> + struct ioasid_set *set;
> + void *private;
> + struct rcu_head rcu;
> +};
> +
> +static DEFINE_IDR(ioasid_idr);
> +
> +/**
> + * ioasid_alloc - Allocate an IOASID
> + * @set: the IOASID set
> + * @min: the minimum ID (inclusive)
> + * @max: the maximum ID (exclusive)
> + * @private: data private to the caller
> + *
> + * Allocate an ID between @min and @max (or %0 and %INT_MAX). Return the
I would remove "(or %0 and %INT_MAX)".
> + * allocated ID on success, or INVALID_IOASID on failure. The @private 
> pointer
> + * is stored internally and can be retrieved with ioasid_find().
> + */
> +ioasid_t ioasid_alloc(struct ioasid_set *set, ioasid_t min, ioasid_t max,
> +   void *private)
> +{
> + int id = -1;
> + struct ioasid_data *data;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> + if (!data)
> + return INVALID_IOASID;
> +
> + data->set = set;
> + data->private = private;
> +
> + idr_preload(GFP_KERNEL);
> + idr_lock(&ioasid_idr);
> + data->id = id = idr_alloc(&ioasid_idr, data, min, max, GFP_ATOMIC);
> + idr_unlock(&ioasid_idr);
> + idr_preload_end();
> +
> + if (id < 0) {
> + kfree(data);
> + return INVALID_IOASID;
> + }
> + return id;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_alloc);
> +
> +/**
> + * ioasid_free - Free an IOASID
> + * @ioasid: the ID to remove
> + */
> +void ioasid_free(ioasid_t ioasid)
> +{
> + struct ioasid_data *ioasid_data;
> +
> + idr_lock(&ioasid_idr);
> + ioasid_data = idr_remove(&ioasid_idr, ioasid);
> + idr_unlock(&ioasid_idr);
> +
> + if (ioasid_data)
> + kfree_rcu(ioasid_data, rcu);
> +}
> +EXPORT_SYMBOL_GPL(ioasid_free);
> +
> +/**
> + * ioasid_find - Find IOASID data
> + * @set: the IOASID set
> + * @ioasid: the IOASID to find
> + * @getter: function to call on the found object
> + *
> + * The optional getter 

Re: [patch V3 00/29] stacktrace: Consolidate stack trace usage

2019-04-25 Thread Ingo Molnar


* Thomas Gleixner  wrote:

> - if (unlikely(!ret))
> + if (unlikely(!ret)) {
> + if (!trace->nr_entries) {
> + /*
> +  * If save_trace fails here, the printing might
> +  * trigger a WARN but because of the !nr_entries it
> +  * should not do bad things.
> +  */
> + save_trace(trace);
> + }
>   return print_circular_bug(, target_entry, next, prev);
> + }
>   else if (unlikely(ret < 0))
>   return print_bfs_bug(ret);

Just a minor style nit: the 'else' should probably be on the same line as 
the '}' it belongs to, to make it really obvious that the 'if' has an 
'else' branch?

At that point the condition should probably also use balanced curly 
braces.
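
I.e. something like this (hypothetical shape, just to spell the suggestion
out, not the actual lockdep code):

	if (unlikely(!ret)) {
		if (!trace->nr_entries)
			save_trace(trace);
		return print_circular_bug(...);
	} else if (unlikely(ret < 0)) {
		return print_bfs_bug(ret);
	}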

Interdiff looks good otherwise.

Thanks,

Ingo


Re: [PATCH v2 11/19] iommu/vt-d: Replace Intel specific PASID allocator with IOASID

2019-04-25 Thread Auger Eric
Hi Jacob,

On 4/24/19 1:31 AM, Jacob Pan wrote:
> Make use of generic IOASID code to manage PASID allocation,
> free, and lookup.
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/iommu/Kconfig   |  1 +
>  drivers/iommu/intel-iommu.c |  9 -
>  drivers/iommu/intel-pasid.c | 36 
>  drivers/iommu/intel-svm.c   | 41 -
>  4 files changed, 29 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 6f07f3b..7f92009 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -204,6 +204,7 @@ config INTEL_IOMMU_SVM
>   bool "Support for Shared Virtual Memory with Intel IOMMU"
>   depends on INTEL_IOMMU && X86
>   select PCI_PASID
> + select IOASID
>   select MMU_NOTIFIER
>   help
> Shared Virtual Memory (SVM) provides a facility for devices
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index ec6f22d..785330a 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5153,7 +5153,7 @@ static void auxiliary_unlink_device(struct dmar_domain 
> *domain,
>   domain->auxd_refcnt--;
>  
>   if (!domain->auxd_refcnt && domain->default_pasid > 0)
> - intel_pasid_free_id(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>  }
>  
>  static int aux_domain_add_dev(struct dmar_domain *domain,
> @@ -5171,9 +5171,8 @@ static int aux_domain_add_dev(struct dmar_domain 
> *domain,
>   if (domain->default_pasid <= 0) {
>   int pasid;
>  
> - pasid = intel_pasid_alloc_id(domain, PASID_MIN,
> -  pci_max_pasids(to_pci_dev(dev)),
> -  GFP_KERNEL);
> + pasid = ioasid_alloc(NULL, PASID_MIN, 
> pci_max_pasids(to_pci_dev(dev)) - 1,
> + domain);
>   if (pasid <= 0) {
ioasid_alloc() returns an ioasid_t, which is a uint, with INVALID_IOASID on
error. Wouldn't it be simpler to make ioasid_alloc() return an int?
>   pr_err("Can't allocate default pasid\n");
>   return -ENODEV;
> @@ -5210,7 +5209,7 @@ static int aux_domain_add_dev(struct dmar_domain 
> *domain,
>   spin_unlock(>lock);
>   spin_unlock_irqrestore(_domain_lock, flags);
>   if (!domain->auxd_refcnt && domain->default_pasid > 0)
> - intel_pasid_free_id(domain->default_pasid);
> + ioasid_free(domain->default_pasid);
>  
>   return ret;
>  }
> diff --git a/drivers/iommu/intel-pasid.c b/drivers/iommu/intel-pasid.c
> index 5b1d3be..d339e8f 100644
> --- a/drivers/iommu/intel-pasid.c
> +++ b/drivers/iommu/intel-pasid.c
> @@ -26,42 +26,6 @@
>   */
>  static DEFINE_SPINLOCK(pasid_lock);
>  u32 intel_pasid_max_id = PASID_MAX;
> -static DEFINE_IDR(pasid_idr);
> -
> -int intel_pasid_alloc_id(void *ptr, int start, int end, gfp_t gfp)
> -{
> - int ret, min, max;
> -
> - min = max_t(int, start, PASID_MIN);
> - max = min_t(int, end, intel_pasid_max_id);
> -
> - WARN_ON(in_interrupt());
> - idr_preload(gfp);
> - spin_lock(&pasid_lock);
> - ret = idr_alloc(&pasid_idr, ptr, min, max, GFP_ATOMIC);
> - spin_unlock(&pasid_lock);
> - idr_preload_end();
> -
> - return ret;
> -}
> -
> -void intel_pasid_free_id(int pasid)
> -{
> - spin_lock(&pasid_lock);
> - idr_remove(&pasid_idr, pasid);
> - spin_unlock(&pasid_lock);
> -}
> -
> -void *intel_pasid_lookup_id(int pasid)
> -{
> - void *p;
> -
> - spin_lock(&pasid_lock);
> - p = idr_find(&pasid_idr, pasid);
> - spin_unlock(&pasid_lock);
> -
> - return p;
> -}
>  
>  int vcmd_alloc_pasid(struct intel_iommu *iommu, unsigned int *pasid)
>  {
> diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
> index 8f87304..8fff212 100644
> --- a/drivers/iommu/intel-svm.c
> +++ b/drivers/iommu/intel-svm.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include "intel-pasid.h"
> @@ -211,7 +212,9 @@ static void intel_mm_release(struct mmu_notifier *mn, 
> struct mm_struct *mm)
>   rcu_read_lock();
>   list_for_each_entry_rcu(sdev, >devs, list) {
>   intel_pasid_tear_down_entry(svm->iommu, sdev->dev, svm->pasid);
> - intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, !svm->mm);
> + /* for emulated iommu, PASID cache invalidation implies IOTLB/DTLB */
> + if (!cap_caching_mode(svm->iommu->cap))
> + intel_flush_svm_range_dev(svm, sdev, 0, -1, 0, 
> !svm->mm);
This change is not documented in the commit message. Isn't it a separate
fix?
>   }
>   rcu_read_unlock();
>  
> @@ -332,16 +335,15 @@ int intel_svm_bind_mm(struct device *dev, int *pasid, 
> int flags, struct svm_dev_
>   if (pasid_max > intel_pasid_max_id)
>   pasid_max = intel_pasid_max_id;
>  
> - /* Do not use PASID 0 in caching mode (virtualised 

Re: [PATCH v2 08/19] ioasid: Add custom IOASID allocator

2019-04-25 Thread Auger Eric
Hi Jacob,

On 4/24/19 1:31 AM, Jacob Pan wrote:
> Sometimes, IOASID allocation must be handled by platform specific
> code. The use cases are guest vIOMMU and pvIOMMU where IOASIDs need
> to be allocated by the host via enlightened or paravirt interfaces.
> 
> This patch adds an extension to the IOASID allocator APIs such that
> platform drivers can register a custom allocator, possibly at boot
> time, to take over the allocation. Xarray is still used for tracking
> and searching purposes internal to the IOASID code. Private data of
> an IOASID can also be set after the allocation.
> 
> There can be multiple custom allocators registered but only one is
> used at a time. In case of hot removal of devices that provides the
> allocator, all IOASIDs must be freed prior to unregistering the
> allocator. Default XArray based allocator cannot be mixed with
> custom allocators, i.e. custom allocators will not be used if there
> are outstanding IOASIDs allocated by the default XA allocator.

What's the exact use case behind allowing several custom IOASID
allocators to be registered?
> 
> Signed-off-by: Jacob Pan 
> ---
>  drivers/base/ioasid.c  | 182 
> ++---
>  include/linux/ioasid.h |  15 +++-
>  2 files changed, 187 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/base/ioasid.c b/drivers/base/ioasid.c
> index c4012aa..5cb36a4 100644
> --- a/drivers/base/ioasid.c
> +++ b/drivers/base/ioasid.c
> @@ -17,6 +17,120 @@ struct ioasid_data {
>  };
>  
>  static DEFINE_XARRAY_ALLOC(ioasid_xa);
> +static DEFINE_MUTEX(ioasid_allocator_lock);
> +static struct ioasid_allocator *ioasid_allocator;
A more explicit name may be chosen. If I understand correctly that's the
active_custom_allocator
> +
> +static LIST_HEAD(custom_allocators);
> +/*
> + * A flag to track if ioasid default allocator already been used, this will
is already in use?
> + * prevent custom allocator from being used. The reason is that custom 
> allocator
s/The reason is that custom allocator/The reason is that custom allocators
> + * must have unadulterated space to track private data with xarray, there 
> cannot
> + * be a mix been default and custom allocated IOASIDs.
> + */
> +static int default_allocator_used;
> +
> +/**
> + * ioasid_register_allocator - register a custom allocator
> + * @allocator: the custom allocator to be registered
> + *
> + * Custom allocator take precedence over the default xarray based allocator.
> + * Private data associated with the ASID are managed by ASID common code
> + * similar to data stored in xa.
> + *
> + * There can be multiple allocators registered but only one is active. In 
> case
> + * of runtime removal of an custom allocator, the next one is activated based
> + * on the registration ordering.
This last sentence may be moved to the unregister() kerneldoc
> + */
> +int ioasid_register_allocator(struct ioasid_allocator *allocator)
> +{
> + struct ioasid_allocator *pallocator;
> + int ret = 0;
> +
> + if (!allocator)
> + return -EINVAL;
> +
> + mutex_lock(_allocator_lock);
> + if (list_empty(_allocators))
> + ioasid_allocator = allocator;
The fact that the first registered custom allocator automatically becomes
active was not obvious to me and may deserve a comment.
> + else {
> + /* Check if the allocator is already registered */
> + list_for_each_entry(pallocator, _allocators, list) {
> + if (pallocator == allocator) {
> + pr_err("IOASID allocator already exist\n");
s/exist/registered?
> + ret = -EEXIST;
> + goto out_unlock;
> + }
> + }
> + }
> + list_add_tail(>list, _allocators);
> +
> +out_unlock:
> + mutex_unlock(_allocator_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(ioasid_register_allocator);
> +
> +/**
> + * ioasid_unregister_allocator - Remove a custom IOASID allocator
> + * @allocator: the custom allocator to be removed
> + *
> + * Remove an allocator from the list, activate the next allocator in
> + * the order it was  registration.
> + */
> +void ioasid_unregister_allocator(struct ioasid_allocator *allocator)
> +{
> + if (!allocator)
> + return;
> +
> + if (list_empty(_allocators)) {
> + pr_warn("No custom IOASID allocators active!\n");
s/active/registered?
> + return;
> + }
> +
> + mutex_lock(_allocator_lock);
> + list_del(>list);
> + if (list_empty(_allocators)) {
> + pr_info("No custom IOASID allocators\n");
> + /*
> +  * All IOASIDs should have been freed before the last allocator
> +  * is unregistered.
> +  */
> + BUG_ON(!xa_empty(_xa));
At this stage it is difficult to assess whether using a BUG_ON() is safe
here. Who is responsible for freeing the IOASIDs?
> + ioasid_allocator = NULL;
> 

[patch V3 10/29] mm/page_owner: Simplify stack trace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

The original code in all printing functions is really wrong. It allocates a
storage array on the stack which is unused because depot_fetch_stack() does not
store anything in it. It overwrites the entries pointer in the stack_trace
struct so it points to the depot storage.
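
For illustration only (not taken from the patch), the save/fetch pairing the
new depot interface encourages looks like this; on the fetch side the depot
hands back a pointer to its own storage, so no on-stack array is involved:

  unsigned long stack[16];
  unsigned long *entries;
  unsigned int nr;
  depot_stack_handle_t handle;

  nr = stack_trace_save(stack, ARRAY_SIZE(stack), 0);
  handle = stack_depot_save(stack, nr, GFP_KERNEL);

  /* later, e.g. when printing: */
  nr = stack_depot_fetch(handle, &entries);
  stack_trace_print(entries, nr, 0);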

Signed-off-by: Thomas Gleixner 
Cc: linux...@kvack.org
Cc: Mike Rapoport 
Cc: David Rientjes 
Cc: Andrew Morton 
---
 mm/page_owner.c |   79 +++-
 1 file changed, 28 insertions(+), 51 deletions(-)

--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -58,15 +58,10 @@ static bool need_page_owner(void)
 static __always_inline depot_stack_handle_t create_dummy_stack(void)
 {
unsigned long entries[4];
-   struct stack_trace dummy;
+   unsigned int nr_entries;
 
-   dummy.nr_entries = 0;
-   dummy.max_entries = ARRAY_SIZE(entries);
-   dummy.entries = &entries[0];
-   dummy.skip = 0;
-
-   save_stack_trace(&dummy);
-   return depot_save_stack(&dummy, GFP_KERNEL);
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
+   return stack_depot_save(entries, nr_entries, GFP_KERNEL);
 }
 
 static noinline void register_dummy_stack(void)
@@ -120,46 +115,39 @@ void __reset_page_owner(struct page *pag
}
 }
 
-static inline bool check_recursive_alloc(struct stack_trace *trace,
-   unsigned long ip)
+static inline bool check_recursive_alloc(unsigned long *entries,
+unsigned int nr_entries,
+unsigned long ip)
 {
-   int i;
+   unsigned int i;
 
-   if (!trace->nr_entries)
-   return false;
-
-   for (i = 0; i < trace->nr_entries; i++) {
-   if (trace->entries[i] == ip)
+   for (i = 0; i < nr_entries; i++) {
+   if (entries[i] == ip)
return true;
}
-
return false;
 }
 
 static noinline depot_stack_handle_t save_stack(gfp_t flags)
 {
unsigned long entries[PAGE_OWNER_STACK_DEPTH];
-   struct stack_trace trace = {
-   .nr_entries = 0,
-   .entries = entries,
-   .max_entries = PAGE_OWNER_STACK_DEPTH,
-   .skip = 2
-   };
depot_stack_handle_t handle;
+   unsigned int nr_entries;
 
-   save_stack_trace(&trace);
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 2);
 
/*
-* We need to check recursion here because our request to stackdepot
-* could trigger memory allocation to save new entry. New memory
-* allocation would reach here and call depot_save_stack() again
-* if we don't catch it. There is still not enough memory in stackdepot
-* so it would try to allocate memory again and loop forever.
+* We need to check recursion here because our request to
+* stackdepot could trigger memory allocation to save new
+* entry. New memory allocation would reach here and call
+* stack_depot_save_entries() again if we don't catch it. There is
+* still not enough memory in stackdepot so it would try to
+* allocate memory again and loop forever.
 */
-   if (check_recursive_alloc(&trace, _RET_IP_))
+   if (check_recursive_alloc(entries, nr_entries, _RET_IP_))
return dummy_handle;
 
-   handle = depot_save_stack(&trace, flags);
+   handle = stack_depot_save(entries, nr_entries, flags);
if (!handle)
handle = failure_handle;
 
@@ -337,16 +325,10 @@ print_page_owner(char __user *buf, size_
struct page *page, struct page_owner *page_owner,
depot_stack_handle_t handle)
 {
-   int ret;
-   int pageblock_mt, page_mt;
+   int ret, pageblock_mt, page_mt;
+   unsigned long *entries;
+   unsigned int nr_entries;
char *kbuf;
-   unsigned long entries[PAGE_OWNER_STACK_DEPTH];
-   struct stack_trace trace = {
-   .nr_entries = 0,
-   .entries = entries,
-   .max_entries = PAGE_OWNER_STACK_DEPTH,
-   .skip = 0
-   };
 
count = min_t(size_t, count, PAGE_SIZE);
kbuf = kmalloc(count, GFP_KERNEL);
@@ -375,8 +357,8 @@ print_page_owner(char __user *buf, size_
if (ret >= count)
goto err;
 
-   depot_fetch_stack(handle, &trace);
-   ret += snprint_stack_trace(kbuf + ret, count - ret, &trace, 0);
+   nr_entries = stack_depot_fetch(handle, );
+   ret += stack_trace_snprint(kbuf + ret, count - ret, entries, 
nr_entries, 0);
if (ret >= count)
goto err;
 
@@ -407,14 +389,9 @@ void __dump_page_owner(struct page *page
 {
struct page_ext *page_ext = lookup_page_ext(page);
struct page_owner *page_owner;
-   unsigned long entries[PAGE_OWNER_STACK_DEPTH];
-   struct 

[patch V3 19/29] lockdep: Simplify stack trace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces and storing the information in a small
lockdep-specific data structure.
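
For illustration (a sketch mirroring the helpers added below): a trace is now
just an offset/length pair into lockdep's static storage array, so looking it
up is plain pointer arithmetic:

  struct lock_trace {
          unsigned int nr_entries;
          unsigned int offset;    /* index into the shared stack_trace[] */
  };

  static unsigned long stack_trace[MAX_STACK_TRACE_ENTRIES];

  static void print_lock_trace(struct lock_trace *trace, unsigned int spaces)
  {
          unsigned long *entries = stack_trace + trace->offset;

          stack_trace_print(entries, trace->nr_entries, spaces);
  }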

Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
---
 include/linux/lockdep.h  |9 +--
 kernel/locking/lockdep.c |   55 +++
 2 files changed, 35 insertions(+), 29 deletions(-)

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -66,6 +66,11 @@ struct lock_class_key {
 
 extern struct lock_class_key __lockdep_no_validate__;
 
+struct lock_trace {
+   unsigned intnr_entries;
+   unsigned intoffset;
+};
+
 #define LOCKSTAT_POINTS4
 
 /*
@@ -100,7 +105,7 @@ struct lock_class {
 * IRQ/softirq usage tracking bits:
 */
unsigned long   usage_mask;
-   struct stack_trace  usage_traces[XXX_LOCK_USAGE_STATES];
+   struct lock_trace   usage_traces[XXX_LOCK_USAGE_STATES];
 
/*
 * Generation counter, when doing certain classes of graph walking,
@@ -188,7 +193,7 @@ struct lock_list {
struct list_headentry;
struct lock_class   *class;
struct lock_class   *links_to;
-   struct stack_trace  trace;
+   struct lock_trace   trace;
int distance;
 
/*
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -434,18 +434,14 @@ static void print_lockdep_off(const char
 #endif
 }
 
-static int save_trace(struct stack_trace *trace)
+static int save_trace(struct lock_trace *trace)
 {
-   trace->nr_entries = 0;
-   trace->max_entries = MAX_STACK_TRACE_ENTRIES - nr_stack_trace_entries;
-   trace->entries = stack_trace + nr_stack_trace_entries;
-
-   trace->skip = 3;
-
-   save_stack_trace(trace);
-
-   trace->max_entries = trace->nr_entries;
+   unsigned long *entries = stack_trace + nr_stack_trace_entries;
+   unsigned int max_entries;
 
+   trace->offset = nr_stack_trace_entries;
+   max_entries = MAX_STACK_TRACE_ENTRIES - nr_stack_trace_entries;
+   trace->nr_entries = stack_trace_save(entries, max_entries, 3);
nr_stack_trace_entries += trace->nr_entries;
 
if (nr_stack_trace_entries >= MAX_STACK_TRACE_ENTRIES-1) {
@@ -1196,7 +1192,7 @@ static struct lock_list *alloc_list_entr
 static int add_lock_to_list(struct lock_class *this,
struct lock_class *links_to, struct list_head *head,
unsigned long ip, int distance,
-   struct stack_trace *trace)
+   struct lock_trace *trace)
 {
struct lock_list *entry;
/*
@@ -1415,6 +1411,13 @@ static inline int __bfs_backwards(struct
  * checking.
  */
 
+static void print_lock_trace(struct lock_trace *trace, unsigned int spaces)
+{
+   unsigned long *entries = stack_trace + trace->offset;
+
+   stack_trace_print(entries, trace->nr_entries, spaces);
+}
+
 /*
  * Print a dependency chain entry (this is only done when a deadlock
  * has been detected):
@@ -1427,8 +1430,7 @@ print_circular_bug_entry(struct lock_lis
printk("\n-> #%u", depth);
print_lock_name(target->class);
printk(KERN_CONT ":\n");
-   print_stack_trace(&target->trace, 6);
-
+   print_lock_trace(&target->trace, 6);
return 0;
 }
 
@@ -1740,7 +1742,7 @@ static void print_lock_class_header(stru
 
len += printk("%*s   %s", depth, "", usage_str[bit]);
len += printk(KERN_CONT " at:\n");
-   print_stack_trace(class->usage_traces + bit, len);
+   print_lock_trace(class->usage_traces + bit, len);
}
}
printk("%*s }\n", depth, "");
@@ -1765,7 +1767,7 @@ print_shortest_lock_dependencies(struct
do {
print_lock_class_header(entry->class, depth);
printk("%*s ... acquired at:\n", depth, "");
-   print_stack_trace(&entry->trace, 2);
+   print_lock_trace(&entry->trace, 2);
printk("\n");
 
if (depth == 0 && (entry != root)) {
@@ -1878,14 +1880,14 @@ print_bad_irq_dependency(struct task_str
print_lock_name(backwards_entry->class);
pr_warn("\n... which became %s-irq-safe at:\n", irqclass);
 
-   print_stack_trace(backwards_entry->class->usage_traces + bit1, 1);
+   print_lock_trace(backwards_entry->class->usage_traces + bit1, 1);
 
pr_warn("\nto a %s-irq-unsafe lock:\n", irqclass);
print_lock_name(forwards_entry->class);
pr_warn("\n... which became %s-irq-unsafe at:\n", irqclass);
pr_warn("...");
 
-   print_stack_trace(forwards_entry->class->usage_traces + bit2, 1);
+   print_lock_trace(forwards_entry->class->usage_traces + bit2, 1);
 
pr_warn("\nother info 

[patch V3 20/29] tracing: Simplify stacktrace retrieval in histograms

2019-04-25 Thread Thomas Gleixner
The indirection through struct stack_trace is not necessary at all. Use the
storage array based interface.
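
For reference, the storage array based interface used here has roughly the
following shape -- paraphrased from how the call sites below use it, not quoted
from <linux/stacktrace.h>:

	/* store up to @size return addresses in @store, skipping the first
	 * @skipnr frames; the number of entries actually stored is returned */
	unsigned int stack_trace_save(unsigned long *store, unsigned int size,
				      unsigned int skipnr);

	/* print a previously saved trace, indented by @spaces */
	void stack_trace_print(const unsigned long *entries,
			       unsigned int nr_entries, int spaces);

In the histogram code the returned count is not even needed: the raw entries
array itself serves as the map key, which is why it is memset() before the
save.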

Signed-off-by: Thomas Gleixner 
Tested-by: Tom Zanussi 
Reviewed-by: Tom Zanussi 
Acked-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_events_hist.c |   12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -5186,7 +5186,6 @@ static void event_hist_trigger(struct ev
u64 var_ref_vals[TRACING_MAP_VARS_MAX];
char compound_key[HIST_KEY_SIZE_MAX];
struct tracing_map_elt *elt = NULL;
-   struct stack_trace stacktrace;
struct hist_field *key_field;
u64 field_contents;
void *key = NULL;
@@ -5198,14 +5197,9 @@ static void event_hist_trigger(struct ev
key_field = hist_data->fields[i];
 
if (key_field->flags & HIST_FIELD_FL_STACKTRACE) {
-   stacktrace.max_entries = HIST_STACKTRACE_DEPTH;
-   stacktrace.entries = entries;
-   stacktrace.nr_entries = 0;
-   stacktrace.skip = HIST_STACKTRACE_SKIP;
-
-   memset(stacktrace.entries, 0, HIST_STACKTRACE_SIZE);
-   save_stack_trace(&stacktrace);
-
+   memset(entries, 0, HIST_STACKTRACE_SIZE);
+   stack_trace_save(entries, HIST_STACKTRACE_DEPTH,
+HIST_STACKTRACE_SKIP);
key = entries;
} else {
 field_contents = key_field->fn(key_field, elt, rbe, rec);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 17/29] lockdep: Remove unused trace argument from print_circular_bug()

2019-04-25 Thread Thomas Gleixner
Signed-off-by: Thomas Gleixner 
---
 kernel/locking/lockdep.c |9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1522,10 +1522,9 @@ static inline int class_equal(struct loc
 }
 
 static noinline int print_circular_bug(struct lock_list *this,
-   struct lock_list *target,
-   struct held_lock *check_src,
-   struct held_lock *check_tgt,
-   struct stack_trace *trace)
+  struct lock_list *target,
+  struct held_lock *check_src,
+  struct held_lock *check_tgt)
 {
struct task_struct *curr = current;
struct lock_list *parent;
@@ -2206,7 +2205,7 @@ check_prev_add(struct task_struct *curr,
 */
save(trace);
}
-   return print_circular_bug(&this, target_entry, next, prev, trace);
+   return print_circular_bug(&this, target_entry, next, prev);
}
else if (unlikely(ret < 0))
return print_bfs_bug(ret);


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 12/29] dma/debug: Simplify stacktrace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.
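
The conversion recipe is the same as in the rest of the series; roughly (a
sketch of the idea, not the exact driver code):

	/* before: describe the buffer via struct stack_trace, then fill it */
	struct stack_trace trace = {
		.entries	= entry->stack_entries,
		.max_entries	= DMA_DEBUG_STACKTRACE_ENTRIES,
		.skip		= 1,
	};
	save_stack_trace(&trace);
	entry->stack_len = trace.nr_entries;

	/* after: one call, the length is simply the return value */
	entry->stack_len = stack_trace_save(entry->stack_entries,
					    DMA_DEBUG_STACKTRACE_ENTRIES, 1);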

Signed-off-by: Thomas Gleixner 
Reviewed-by: Christoph Hellwig 
Cc: iommu@lists.linux-foundation.org
Cc: Robin Murphy 
Cc: Marek Szyprowski 
---
 kernel/dma/debug.c |   14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -89,8 +89,8 @@ struct dma_debug_entry {
int  sg_mapped_ents;
enum map_err_types  map_err_type;
 #ifdef CONFIG_STACKTRACE
-   struct   stack_trace stacktrace;
-   unsigned longst_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
+   unsigned intstack_len;
+   unsigned long   stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
 #endif
 };
 
@@ -174,7 +174,7 @@ static inline void dump_entry_trace(stru
 #ifdef CONFIG_STACKTRACE
if (entry) {
pr_warning("Mapped at:\n");
-   print_stack_trace(&entry->stacktrace, 0);
+   stack_trace_print(entry->stack_entries, entry->stack_len, 0);
}
 #endif
 }
@@ -704,12 +704,10 @@ static struct dma_debug_entry *dma_entry
 spin_unlock_irqrestore(&free_entries_lock, flags);
 
 #ifdef CONFIG_STACKTRACE
-   entry->stacktrace.max_entries = DMA_DEBUG_STACKTRACE_ENTRIES;
-   entry->stacktrace.entries = entry->st_entries;
-   entry->stacktrace.skip = 1;
-   save_stack_trace(&entry->stacktrace);
+   entry->stack_len = stack_trace_save(entry->stack_entries,
+   ARRAY_SIZE(entry->stack_entries),
+   1);
 #endif
-
return entry;
 }
 


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 07/29] mm/slub: Simplify stack trace retrieval

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace with an invocation of
the storage array based interface.

Signed-off-by: Thomas Gleixner 
Acked-by: Christoph Lameter 
Cc: Andrew Morton 
Cc: Pekka Enberg 
Cc: linux...@kvack.org
Cc: David Rientjes 
---
 mm/slub.c |   12 
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/mm/slub.c
+++ b/mm/slub.c
@@ -552,18 +552,14 @@ static void set_track(struct kmem_cache
 
if (addr) {
 #ifdef CONFIG_STACKTRACE
-   struct stack_trace trace;
+   unsigned int nr_entries;
 
-   trace.nr_entries = 0;
-   trace.max_entries = TRACK_ADDRS_COUNT;
-   trace.entries = p->addrs;
-   trace.skip = 3;
metadata_access_enable();
-   save_stack_trace(&trace);
+   nr_entries = stack_trace_save(p->addrs, TRACK_ADDRS_COUNT, 3);
metadata_access_disable();
 
-   if (trace.nr_entries < TRACK_ADDRS_COUNT)
-   p->addrs[trace.nr_entries] = 0;
+   if (nr_entries < TRACK_ADDRS_COUNT)
+   p->addrs[nr_entries] = 0;
 #endif
p->addr = addr;
p->cpu = smp_processor_id();


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 09/29] mm/kasan: Simplify stacktrace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.
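
KASAN additionally keeps its traces in the stack depot, so both directions of
the conversion show up here. Roughly, based on the hunks below (sketch only):

	depot_stack_handle_t handle;
	unsigned long entries[KASAN_STACK_DEPTH];
	unsigned long *stored;
	unsigned int nr;

	/* save side: capture, trim at the irq entry point, hand to the depot */
	nr = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
	nr = filter_irq_stacks(entries, nr);
	handle = stack_depot_save(entries, nr, flags);

	/* report side: the depot hands back a pointer plus a count */
	nr = stack_depot_fetch(handle, &stored);
	stack_trace_print(stored, nr, 0);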

Signed-off-by: Thomas Gleixner 
Acked-by: Dmitry Vyukov 
Acked-by: Andrey Ryabinin 
Cc: Alexander Potapenko 
Cc: kasan-...@googlegroups.com
Cc: linux...@kvack.org
---
 mm/kasan/common.c |   30 --
 mm/kasan/report.c |7 ---
 2 files changed, 16 insertions(+), 21 deletions(-)

--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -48,34 +48,28 @@ static inline int in_irqentry_text(unsig
 ptr < (unsigned long)&__softirqentry_text_end);
 }
 
-static inline void filter_irq_stacks(struct stack_trace *trace)
+static inline unsigned int filter_irq_stacks(unsigned long *entries,
+unsigned int nr_entries)
 {
-   int i;
+   unsigned int i;
 
-   if (!trace->nr_entries)
-   return;
-   for (i = 0; i < trace->nr_entries; i++)
-   if (in_irqentry_text(trace->entries[i])) {
+   for (i = 0; i < nr_entries; i++) {
+   if (in_irqentry_text(entries[i])) {
/* Include the irqentry function into the stack. */
-   trace->nr_entries = i + 1;
-   break;
+   return i + 1;
}
+   }
+   return nr_entries;
 }
 
 static inline depot_stack_handle_t save_stack(gfp_t flags)
 {
unsigned long entries[KASAN_STACK_DEPTH];
-   struct stack_trace trace = {
-   .nr_entries = 0,
-   .entries = entries,
-   .max_entries = KASAN_STACK_DEPTH,
-   .skip = 0
-   };
+   unsigned int nr_entries;
 
-   save_stack_trace(&trace);
-   filter_irq_stacks(&trace);
-
-   return depot_save_stack(&trace, flags);
+   nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
+   nr_entries = filter_irq_stacks(entries, nr_entries);
+   return stack_depot_save(entries, nr_entries, flags);
 }
 
 static inline void set_track(struct kasan_track *track, gfp_t flags)
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -100,10 +100,11 @@ static void print_track(struct kasan_tra
 {
pr_err("%s by task %u:\n", prefix, track->pid);
if (track->stack) {
-   struct stack_trace trace;
+   unsigned long *entries;
+   unsigned int nr_entries;
 
-   depot_fetch_stack(track->stack, &trace);
-   print_stack_trace(&trace, 0);
+   nr_entries = stack_depot_fetch(track->stack, &entries);
+   stack_trace_print(entries, nr_entries, 0);
} else {
pr_err("(stack is not available)\n");
}


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 08/29] mm/kmemleak: Simplify stacktrace handling

2019-04-25 Thread Thomas Gleixner
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner 
Acked-by: Catalin Marinas 
Cc: linux...@kvack.org
---
 mm/kmemleak.c |   24 +++-
 1 file changed, 3 insertions(+), 21 deletions(-)

--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -410,11 +410,6 @@ static void print_unreferenced(struct se
  */
 static void dump_object_info(struct kmemleak_object *object)
 {
-   struct stack_trace trace;
-
-   trace.nr_entries = object->trace_len;
-   trace.entries = object->trace;
-
pr_notice("Object 0x%08lx (size %zu):\n",
  object->pointer, object->size);
pr_notice("  comm \"%s\", pid %d, jiffies %lu\n",
@@ -424,7 +419,7 @@ static void dump_object_info(struct kmem
pr_notice("  flags = 0x%x\n", object->flags);
pr_notice("  checksum = %u\n", object->checksum);
pr_notice("  backtrace:\n");
-   print_stack_trace(&trace, 4);
+   stack_trace_print(object->trace, object->trace_len, 4);
 }
 
 /*
@@ -553,15 +548,7 @@ static struct kmemleak_object *find_and_
  */
 static int __save_stack_trace(unsigned long *trace)
 {
-   struct stack_trace stack_trace;
-
-   stack_trace.max_entries = MAX_TRACE;
-   stack_trace.nr_entries = 0;
-   stack_trace.entries = trace;
-   stack_trace.skip = 2;
-   save_stack_trace(&stack_trace);
-
-   return stack_trace.nr_entries;
+   return stack_trace_save(trace, MAX_TRACE, 2);
 }
 
 /*
@@ -2019,13 +2006,8 @@ early_param("kmemleak", kmemleak_boot_co
 
 static void __init print_log_trace(struct early_log *log)
 {
-   struct stack_trace trace;
-
-   trace.nr_entries = log->trace_len;
-   trace.entries = log->trace;
-
pr_notice("Early log backtrace:\n");
-   print_stack_trace(&trace, 2);
+   stack_trace_print(log->trace, log->trace_len, 2);
 }
 
 /*


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[patch V3 00/29] stacktrace: Consolidate stack trace usage

2019-04-25 Thread Thomas Gleixner
This is an update to V2 which can be found here:

  https://lkml.kernel.org/r/20190418084119.056416...@linutronix.de

Changes vs. V2:

  - Fixed the kernel-doc issue pointed out by Mike

  - Removed the '-1' oddity from the tracer

  - Restricted the tracer nesting to 4

  - Restored the lockdep magic to prevent redundant stack traces

  - Addressed the small nitpicks here and there

  - Picked up Acked/Reviewed tags

The series is based on:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core/stacktrace

which contains the removal of the inconsistent and pointless ULONG_MAX
termination of stacktraces.

It's also available from git:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.core/stacktrace

   up to 5160eb663575 ("x86/stacktrace: Use common infrastructure")

Delta patch vs. v2 below.

Thanks,

tglx

8<-
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 654949dc1c16..f0cfd12cb45e 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -32,14 +32,13 @@ unsigned int stack_trace_save_user(unsigned long *store, unsigned int size);
  * @reliable:  True when the stack entry is reliable. Required by
  * some printk based consumers.
  *
- * Returns:  True, if the entry was consumed or skipped
+ * Return: True, if the entry was consumed or skipped
  * False, if there is no space left to store
  */
 typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
   bool reliable);
 /**
  * arch_stack_walk - Architecture specific function to walk the stack
-
  * @consume_entry: Callback which is invoked by the architecture code for
  * each entry.
  * @cookie:Caller supplied pointer which is handed back to
@@ -47,10 +46,12 @@ typedef bool (*stack_trace_consume_fn)(void *cookie, unsigned long addr,
  * @task:  Pointer to a task struct, can be NULL
  * @regs:  Pointer to registers, can be NULL
  *
- * @task   @regs:
- * NULL        NULL    Stack trace from current
+ *  =========== ======= ============================================
+ * task         regs
+ *  =========== ======= ============================================
  * task         NULL    Stack trace from task (can be current)
- * NULL         regs    Stack trace starting on regs->stackpointer
+ * current      regs    Stack trace starting on regs->stackpointer
+ *  =========== ======= ============================================
  */
 void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
 struct task_struct *task, struct pt_regs *regs);
diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 1e8e246edaad..badd77670d00 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -705,7 +705,8 @@ static struct dma_debug_entry *dma_entry_alloc(void)
 
 #ifdef CONFIG_STACKTRACE
entry->stack_len = stack_trace_save(entry->stack_entries,
-   ARRAY_SIZE(entry->stack_entries), 1);
+   ARRAY_SIZE(entry->stack_entries),
+   1);
 #endif
return entry;
 }
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 84423f0bb0b0..45bcaf2e4cb6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2160,12 +2160,11 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
  */
 static int
 check_prev_add(struct task_struct *curr, struct held_lock *prev,
-  struct held_lock *next, int distance)
+  struct held_lock *next, int distance, struct lock_trace *trace)
 {
struct lock_list *uninitialized_var(target_entry);
struct lock_list *entry;
struct lock_list this;
-   struct lock_trace trace;
int ret;
 
if (!hlock_class(prev)->key || !hlock_class(next)->key) {
@@ -2198,8 +2197,17 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
this.class = hlock_class(next);
this.parent = NULL;
 ret = check_noncircular(&this, hlock_class(prev), &target_entry);
-   if (unlikely(!ret))
+   if (unlikely(!ret)) {
+   if (!trace->nr_entries) {
+   /*
+* If save_trace fails here, the printing might
+* trigger a WARN but because of the !nr_entries it
+* should not do bad things.
+*/
+   save_trace(trace);
+   }
 return print_circular_bug(&this, target_entry, next, prev);
+   }
else if (unlikely(ret < 0))
return print_bfs_bug(ret);
 
@@ -2246,7 +2254,7 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
return print_bfs_bug(ret);
 
 
-   if (!save_trace(&trace))
+   if (!trace->nr_entries && 

Re: [Xen-devel] [PATCH 1/3] xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()

2019-04-25 Thread Jan Beulich
>>> On 23.04.19 at 12:54,  wrote:
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -360,8 +360,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>  
> - if (((dev_addr + size - 1 <= dma_mask)) ||
> - range_straddles_page_boundary(phys, size))
> + if ((dev_addr + size - 1 <= dma_mask) &&
> + !WARN_ON(range_straddles_page_boundary(phys, size)))
>   xen_destroy_contiguous_region(phys, order);

On the allocation side we have

if (((dev_addr + size - 1 <= dma_mask)) &&
!range_straddles_page_boundary(phys, size))
*dma_handle = dev_addr;
else {
if (xen_create_contiguous_region(phys, order,
  fls64(dma_mask), dma_handle) != 0) {
 xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
return NULL;
}
}

which is (as far as the function call is concerned)

if ((dev_addr + size - 1 > dma_mask) ||
range_straddles_page_boundary(phys, size))
xen_create_contiguous_region(...);

So I don't think your transformation is correct. Even worse, both
parts of the condition in xen_swiotlb_free_coherent() act on an
address that is the _result_ of the prior
xen_create_contiguous_region(), i.e. the address should always
match _both_ criteria anyway. Whereas what you really want is
undo the xen_create_contiguous_region() only when it actually
was called. Otherwise you also shatter contiguous allocations
that were contiguous already for other reasons (perhaps just
luck).

Jan


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()

2019-04-25 Thread Juergen Gross
On 25/04/2019 11:01, Jan Beulich wrote:
 On 23.04.19 at 20:36,  wrote:
>> On 23/04/2019 19:05, Stefano Stabellini wrote:
>>> On Tue, 23 Apr 2019, Juergen Gross wrote:
 Instead of always calling xen_destroy_contiguous_region() in case the
 memory is DMA-able for the used device, do so only in case it has been
 made DMA-able via xen_create_contiguous_region() before.

 This will avoid a lot of xen_destroy_contiguous_region() calls for
 64-bit capable devices.

 As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
 flag of the first allocated page can be used for remembering.
>>>
>>> Although the patch looks OK, this sentence puzzles me. Why do you say
>>> that the memory in question is owned by swiotlb-xen? Because it was
>>> returned by xen_alloc_coherent_pages? Both the x86 and the Arm
>>> implementation return fresh new memory, hence, it should be safe to set
>>> the PageOwnerPriv1 flag?
>>>
>>> My concern with this approach is with the semantics of PG_owner_priv_1.
>>> Is a page marked with PG_owner_priv_1 only supposed to be used by the
>>> owner?
>>
>> The owner of the page is free to use the flag.
>>
>> Like Grant pages are marked by the grant driver using this flag. And
>> Xen page tables are using it in PV-guests for indicating a "Pinned"
>> page table.
> 
> Considering the background of the series, isn't such multi-purpose use
> of the flag a possible problem? You're already suspecting a wrong call
> into here. The function finding the flag set (but for another reason)
> might add to the confusion. But I realize there are only so many page
> flags available.

Right.

> Perhaps the freeing function should, first thing, check the handed
> space actually matches the criteria (within dma_mask and contiguous)?

Yes, I think this is a good idea.


Juergen
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH 3/3] xen/swiotlb: remember having called xen_create_contiguous_region()

2019-04-25 Thread Jan Beulich
>>> On 23.04.19 at 20:36,  wrote:
> On 23/04/2019 19:05, Stefano Stabellini wrote:
>> On Tue, 23 Apr 2019, Juergen Gross wrote:
>>> Instead of always calling xen_destroy_contiguous_region() in case the
>>> memory is DMA-able for the used device, do so only in case it has been
>>> made DMA-able via xen_create_contiguous_region() before.
>>>
>>> This will avoid a lot of xen_destroy_contiguous_region() calls for
>>> 64-bit capable devices.
>>>
>>> As the memory in question is owned by swiotlb-xen the PG_owner_priv_1
>>> flag of the first allocated page can be used for remembering.
>> 
>> Although the patch looks OK, this sentence puzzles me. Why do you say
>> that the memory in question is owned by swiotlb-xen? Because it was
>> returned by xen_alloc_coherent_pages? Both the x86 and the Arm
>> implementation return fresh new memory, hence, it should be safe to set
>> the PageOwnerPriv1 flag?
>> 
>> My concern with this approach is with the semantics of PG_owner_priv_1.
>> Is a page marked with PG_owner_priv_1 only supposed to be used by the
>> owner?
> 
> The owner of the page is free to use the flag.
> 
> Like Grant pages are marked by the grant driver using this flag. And
> Xen page tables are using it in PV-guests for indicating a "Pinned"
> page table.

Considering the background of the series, isn't such multi-purpose use
of the flag a possible problem? You're already suspecting a wrong call
into here. The function finding the flag set (but for another reason)
might add to the confusion. But I realize there are only so many page
flags available.

Perhaps the freeing function should, first thing, check the handed
space actually matches the criteria (within dma_mask and contiguous)?
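
Expressed as code, that would be something like the below at the top of
xen_swiotlb_free_coherent(), before any teardown is attempted (a sketch of the
idea only, using the names already present in that function; the "was made
contiguous" bookkeeping is what patch 3 adds):

	/* Anything that came out of xen_swiotlb_alloc_coherent() should be
	 * below dma_mask and must not straddle a page boundary; a failure
	 * here points at a bogus caller rather than at our own allocation. */
	WARN_ON(dev_addr + size - 1 > dma_mask ||
		range_straddles_page_boundary(phys, size));

	/* ... and only undo what was actually done at allocation time */
	if (buffer_was_made_contiguous)		/* hypothetical flag from patch 3 */
		xen_destroy_contiguous_region(phys, order);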

Jan


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [Xen-devel] [PATCH 1/3] xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()

2019-04-25 Thread Juergen Gross
On 25/04/2019 10:53, Jan Beulich wrote:
 On 23.04.19 at 12:54,  wrote:
>> --- a/drivers/xen/swiotlb-xen.c
>> +++ b/drivers/xen/swiotlb-xen.c
>> @@ -360,8 +360,8 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
>>  /* Convert the size to actually allocated. */
>>  size = 1UL << (order + XEN_PAGE_SHIFT);
>>  
>> -if (((dev_addr + size - 1 <= dma_mask)) ||
>> -range_straddles_page_boundary(phys, size))
>> +if ((dev_addr + size - 1 <= dma_mask) &&
>> +!WARN_ON(range_straddles_page_boundary(phys, size)))
>>  xen_destroy_contiguous_region(phys, order);
> 
> On the allocation side we have
> 
>   if (((dev_addr + size - 1 <= dma_mask)) &&
>   !range_straddles_page_boundary(phys, size))
>   *dma_handle = dev_addr;
>   else {
>   if (xen_create_contiguous_region(phys, order,
>   fls64(dma_mask), dma_handle) != 0) {
>   xen_free_coherent_pages(hwdev, size, ret, (dma_addr_t)phys, attrs);
>   return NULL;
>   }
>   }
> 
> which is (as far as the function call is concerned)
> 
>   if ((dev_addr + size - 1 > dma_mask) ||
>   range_straddles_page_boundary(phys, size))
>   xen_create_contiguous_region(...);
> 
> So I don't think your transformation is correct. Even worse, both
> parts of the condition in xen_swiotlb_free_coherent() act on an
> address that is the _result_ of the prior
> xen_create_contiguous_region(), i.e. the address should always
> match _both_ criteria anyway. Whereas what you really want is
> undo the xen_create_contiguous_region() only when it actually
> was called. Otherwise you also shatter contiguous allocations
> that were contiguous already for other reasons (perhaps just
> luck).

Yes, that is what patch 3 does.


Juergen

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation

2019-04-25 Thread Auger Eric
Hi Liu,

On 4/25/19 9:12 AM, Liu, Yi L wrote:
> Hi Eric,
> 
>> From: Auger Eric [mailto:eric.au...@redhat.com]
>> Sent: Thursday, April 25, 2019 1:28 AM
>> To: Jacob Pan ; 
>> iommu@lists.linux-foundation.org;
>> Subject: Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation
>>
>> Hi Jacob,
>>
>> On 4/24/19 1:31 AM, Jacob Pan wrote:
>>> From: Lu Baolu 
>>>
>>> If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the IOMMU
>>> driver should rely on the emulation software to allocate and free
>>> PASID IDs.
>> Do we make the decision depending on the CM or depending on the VCCAP_REG?
>>
>> VCCAP_REG description says:
>>
>> If Set, software must use Virtual Command Register interface to allocate and 
>> free
>> PASIDs.
> 
> The answer is it depends on the ECAP.VCS and then the PASID allocation bit in
> VCCAP_REG. But VCS bit implies the iommu is a software implementation
> (vIOMMU) of vt-d architecture. Pls refer to the descriptions of "Virtual
> Command Support" in vt-d 3.0 spec.
> 
> "Hardware implementations of this architecture report a value of 0
> in this field. Software implementations (emulation) of this
> architecture may report VCS=1."

OK I understand. But strictly speaking a vIOMMU may not implement CM.
But that's nitpicking ;-)

Thanks

Eric
> 
> Thanks,
> Yi Liu
> 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation

2019-04-25 Thread Liu, Yi L
Hi Eric,

> From: Auger Eric [mailto:eric.au...@redhat.com]
> Sent: Thursday, April 25, 2019 1:28 AM
> To: Jacob Pan ; 
> iommu@lists.linux-foundation.org;
> Subject: Re: [PATCH v2 09/19] iommu/vt-d: Enlightened PASID allocation
> 
> Hi Jacob,
> 
> On 4/24/19 1:31 AM, Jacob Pan wrote:
> > From: Lu Baolu 
> >
> > If Intel IOMMU runs in caching mode, a.k.a. virtual IOMMU, the IOMMU
> > driver should rely on the emulation software to allocate and free
> > PASID IDs.
> Do we make the decision depending on the CM or depending on the VCCAP_REG?
> 
> VCCAP_REG description says:
> 
> If Set, software must use Virtual Command Register interface to allocate and 
> free
> PASIDs.

The answer is it depends on the ECAP.VCS and then the PASID allocation bit in
VCCAP_REG. But VCS bit implies the iommu is a software implementation
(vIOMMU) of vt-d architecture. Pls refer to the descriptions of "Virtual
Command Support" in vt-d 3.0 spec.

"Hardware implementations of this architecture report a value of 0
in this field. Software implementations (emulation) of this
architecture may report VCS=1."
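
So, in code, the decision boils down to something like this (macro and helper
names here are purely illustrative, not necessarily what the driver ends up
using):

	/*
	 * ECAP.VCS says the virtual command registers exist at all, i.e. we
	 * are running on an emulated (software) implementation of VT-d; the
	 * PASID allocation bit in VCCAP_REG says we must use them.
	 */
	if (ecap_vcs(iommu->ecap) && vccap_pasid(iommu->vccap))
		pasid = vcmd_alloc_pasid(iommu);	/* ask the emulation layer */
	else
		pasid = native_alloc_pasid(iommu);	/* regular local allocator */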

Thanks,
Yi Liu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 5/9] drm/amdkfd: use helper pci_dev_id

2019-04-25 Thread Koenig, Christian
Am 24.04.19 um 21:14 schrieb Heiner Kallweit:
> Use new helper pci_dev_id() to simplify the code.
>
> Signed-off-by: Heiner Kallweit 

Acked-by: Christian König 

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> index 2cb09e088..769dbc7be 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
> @@ -1272,8 +1272,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
>   
>   dev->node_props.vendor_id = gpu->pdev->vendor;
>   dev->node_props.device_id = gpu->pdev->device;
> - dev->node_props.location_id = PCI_DEVID(gpu->pdev->bus->number,
> - gpu->pdev->devfn);
> + dev->node_props.location_id = pci_dev_id(gpu->pdev);
>   dev->node_props.max_engine_clk_fcompute =
>   amdgpu_amdkfd_get_max_engine_clock_in_mhz(dev->gpu->kgd);
>   dev->node_props.max_engine_clk_ccompute =
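
For context, pci_dev_id() is presumably nothing more than a wrapper around the
PCI_DEVID() macro it replaces in the hunk above, i.e. roughly:

	static inline u16 pci_dev_id(struct pci_dev *dev)
	{
		return PCI_DEVID(dev->bus->number, dev->devfn);
	}

so the conversion is purely cosmetic -- same value, less boilerplate at each
call site.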

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu