Re: Requirements to merge new heaps in the kernel

2024-10-22 Thread John Stultz
On Tue, Oct 22, 2024 at 1:38 AM Maxime Ripard  wrote:
>
> I wanted to follow-up on the discussion we had at Plumbers with John and
> T.J. about (among other things) adding new heaps to the kernel.
>
> I'm still interested in merging a carve-out driver[1], since it seems to be
> in every vendor BSP and I got asked about it again last week.
>
> I remember from our discussion that for new heap types to be merged, we
> needed a kernel use-case. Looking back, I'm not entirely sure how one
> can provide that given that heaps are essentially facilities for
> user-space.
>
> Am I misremembering or missing something? What are the requirements for
> you to consider adding a new heap driver?

It's basically the same as the DRM subsystem rules.
https://docs.kernel.org/gpu/drm-uapi.html#open-source-userspace-requirements
i.e.: there has to be an open-source user for it, and the user has to be
more significant than a "toy" implementation (which can be a bit
subjective and contentious when trying to get out of a chicken-and-egg
loop).

thanks
-john


Re: [PATCH v2] dma-buf: fix S_IRUGO to 0444, block comments, func declaration

2024-10-11 Thread John Stultz
On Sat, Oct 5, 2024 at 11:10 AM Pintu Kumar  wrote:
>
> These warnings/errors are reported by checkpatch.
> Fix them with minor cleanups.
> No functional changes.
>
> WARNING: Block comments use * on subsequent lines
> +   /* only support discovering the end of the buffer,
> +  but also allow SEEK_SET to maintain the idiomatic
>
> WARNING: Block comments use a trailing */ on a separate line
> +  SEEK_END(0), SEEK_CUR(0) pattern */
>
> WARNING: Block comments use a trailing */ on a separate line
> +* before passing the sgt back to the exporter. */
>
> ERROR: "foo * bar" should be "foo *bar"
> +static struct sg_table * __map_dma_buf(struct dma_buf_attachment *attach,
>
> WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using 
> octal permissions '0444'.
> +   d = debugfs_create_file("bufinfo", S_IRUGO, dma_buf_debugfs_dir,
>
> total: 1 errors, 4 warnings, 1746 lines checked
>
> Signed-off-by: Pintu Kumar 

Looks ok to me. Thanks for sending these cleanups!
Acked-by: John Stultz 
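
For reference, a minimal sketch of the styles checkpatch is asking for
(generic shapes based on the surrounding dma-buf code, not the exact
patch hunks):

	/*
	 * Kernel-style block comment: subsequent lines start with '*'
	 * and the trailing close sits on its own line.
	 */

	/* the pointer binds to the name: "foo *bar", not "foo * bar" */
	static struct sg_table *__map_dma_buf(struct dma_buf_attachment *attach,
					      enum dma_data_direction direction);

	/* octal 0444 replaces the symbolic S_IRUGO */
	d = debugfs_create_file("bufinfo", 0444, dma_buf_debugfs_dir,
				NULL, &dma_buf_debug_fops);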


Re: [PATCH] dma-buf: heaps: Add __init to CMA and system heap module_init functions

2024-09-05 Thread John Stultz
On Thu, Sep 5, 2024 at 5:03 PM T.J. Mercier  wrote:
>
> Shrink the kernel .text a bit after successful initialization of the
> heaps.
>
> Signed-off-by: T.J. Mercier 

Acked-by: John Stultz 
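
For context, a sketch of what the change does, assuming the usual system
heap init shape (the real body may differ): tagging a module_init
function __init lets the kernel discard it from .text once the heap is
registered.

	static int __init system_heap_create(void)	/* __init is the new tag */
	{
		struct dma_heap_export_info exp_info;

		exp_info.name = "system";
		exp_info.ops = &system_heap_ops;
		exp_info.priv = NULL;

		sys_heap = dma_heap_add(&exp_info);
		if (IS_ERR(sys_heap))
			return PTR_ERR(sys_heap);

		return 0;
	}
	module_init(system_heap_create);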


Re: [PATCH] dma-buf: heaps: Fix off-by-one in CMA heap fault handler

2024-08-30 Thread John Stultz
On Fri, Aug 30, 2024 at 12:26 PM T.J. Mercier  wrote:
>
> Until VM_DONTEXPAND was added in commit 1c1914d6e8c6 ("dma-buf: heaps:
> Don't track CMA dma-buf pages under RssFile") it was possible to obtain
> a mapping larger than the buffer size via mremap and bypass the overflow
> check in dma_buf_mmap_internal. When using such a mapping to attempt to
> fault past the end of the buffer, the CMA heap fault handler also checks
> the fault offset against the buffer size, but gets the boundary wrong by
> 1. Fix the boundary check so that we don't read off the end of the pages
> array and insert an arbitrary page in the mapping.
>
> Reported-by: Xingyu Jin 
> Fixes: a5d2d29e24be ("dma-buf: heaps: Move heap-helper logic into the 
> cma_heap implementation")
> Cc: sta...@vger.kernel.org # Applicable >= 5.10. Needs adjustments only for 
> 5.10.
> Signed-off-by: T.J. Mercier 
> ---
>  drivers/dma-buf/heaps/cma_heap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/cma_heap.c 
> b/drivers/dma-buf/heaps/cma_heap.c
> index c384004b918e..93be88b805fe 100644
> --- a/drivers/dma-buf/heaps/cma_heap.c
> +++ b/drivers/dma-buf/heaps/cma_heap.c
> @@ -165,7 +165,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf)
> struct vm_area_struct *vma = vmf->vma;
> struct cma_heap_buffer *buffer = vma->vm_private_data;
>
> -   if (vmf->pgoff > buffer->pagecount)
> +   if (vmf->pgoff >= buffer->pagecount)
> return VM_FAULT_SIGBUS;


Thanks for fixing this! (And thanks to Xingyu Jin for catching it!)

Acked-by: John Stultz 

thanks
-john
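
For clarity, the boundary logic behind the one-character fix (a sketch
of the quoted hunk with the reasoning spelled out as a comment):

	/*
	 * A buffer of pagecount pages has valid page offsets
	 * 0 .. pagecount - 1.  With '>', a fault at pgoff == pagecount
	 * passed the check and indexed one element past the end of
	 * buffer->pages[], inserting an arbitrary page into the
	 * mapping.  '>=' rejects it with SIGBUS as intended.
	 */
	if (vmf->pgoff >= buffer->pagecount)
		return VM_FAULT_SIGBUS;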


Re: [PATCH v2 13/86] drm/hisilicon/kirin: Run DRM default client setup

2024-08-21 Thread John Stultz
On Wed, Aug 21, 2024 at 6:04 AM Thomas Zimmermann  wrote:
>
> Call drm_client_setup() to run the kernel's default client setup
> for DRM. Set fbdev_probe in struct drm_driver, so that the client
> setup can start the common fbdev client.
>
> The kirin driver specifies a preferred color mode of 32. As this
> is the default if no format has been given, leave it out entirely.
>
> Signed-off-by: Thomas Zimmermann 
> Cc: Xinliang Liu 
> Cc: Tian Tao 
> Cc: Xinwei Kong 
> Cc: Sumit Semwal 
> Cc: Yongqin Liu 
> Cc: John Stultz 
> ---
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 2 ++
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c | 4 ++--

I don't have hardware to test anymore, but it looks reasonable.
Acked-by: John Stultz 

thanks
-john


Re: [PATCH v7 05/28] dma-heap: Add proper kref handling on dma-buf heaps

2024-07-22 Thread John Stultz
On Sat, Jul 20, 2024 at 8:13 AM Markus Elfring  wrote:
>
> …
> > +++ b/drivers/dma-buf/dma-heap.c
> …
> > +static void dma_heap_release(struct kref *ref)
> > +{
> …
> > + mutex_lock(&heap_list_lock);
> > + list_del(&heap->list);
> > + mutex_unlock(&heap_list_lock);
> …
>
> Under which circumstances would you be interested in applying a statement
> like “guard(mutex)(&heap_list_lock);”?
> https://elixir.bootlin.com/linux/v6.10/source/include/linux/mutex.h#L196

This strikes me as a strange place to apply it, as it seems like it
would grow the lock hold time to the entire scope of the function
unless one created a subscope for just the list_del, at which point
you're not saving much or really improving readability.  I definitely
think guard usage is very interesting in places where locks are
released in multiple exit paths, etc. but this is a very trivial and
straightforward lock/unlock usage, so I fret I don't quite understand
the suggestion.

thanks
-john
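
To illustrate the scoping concern, a rough sketch (not the actual
dma-heap code; names are taken from the snippet quoted above, and the
container_of() line is assumed):

	static void dma_heap_release(struct kref *ref)
	{
		struct dma_heap *heap = container_of(ref, struct dma_heap, refcount);

		/* Explicit form: the lock is held only around the list_del(). */
		mutex_lock(&heap_list_lock);
		list_del(&heap->list);
		mutex_unlock(&heap_list_lock);

		/*
		 * guard() releases at end of scope, so without a subscope it
		 * would hold the lock for the rest of the function.  Keeping
		 * the hold time equivalent requires:
		 *
		 *	{
		 *		guard(mutex)(&heap_list_lock);
		 *		list_del(&heap->list);
		 *	}
		 *
		 * ...at which point little is saved over the explicit pair.
		 */

		/* ...remaining teardown runs without the lock held... */
	}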


Re: [PATCH] dma-buf/heaps: Correct the types of fd_flags and heap_flags

2024-06-05 Thread John Stultz
On Wed, Jun 5, 2024 at 7:02 PM Barry Song <21cn...@gmail.com> wrote:
>
> From: Barry Song 
>
> dma_heap_allocation_data defines the UAPI as follows:
>
>  struct dma_heap_allocation_data {
> __u64 len;
> __u32 fd;
> __u32 fd_flags;
> __u64 heap_flags;
>  };
>
> But the dma heaps cast both fd_flags and heap_flags to
> unsigned long. This patch gives the dma heaps (the cma heap and
> the system heap) types consistent with the UAPI.
>
> Signed-off-by: Barry Song 

Thanks for submitting this additional cleanup!

Acked-by: John Stultz 
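
For reference, the shape of the cleanup being acked (assumed prototype;
the merged code may differ), matching the heap drivers to the widths in
the UAPI struct:

	static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
						    unsigned long len,
						    u32 fd_flags,	/* was: unsigned long */
						    u64 heap_flags);	/* was: unsigned long */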


Re: [RFC PATCH v1] dma-buf: heaps: move the verification of heap_flags to the corresponding heap

2024-06-04 Thread John Stultz
On Mon, Jun 3, 2024 at 11:30 PM Hailong Liu  wrote:
> On 6/4/2024 2:06 AM, John Stultz wrote:
> > On Mon, Jun 3, 2024 at 10:21 AM Hailong Liu  wrote:
> >> We now aim to improve priority dma-buf allocation. Consider the Android
> >> animation scene:
> >>
> >> when the device is low on memory, allocating dma-bufs as animation
> >> buffers enters direct reclamation, and the longer allocation time results
> >> in a laggy UI. But if we know the usage of the dma-buf, we can use some
> >> mechanism to speed it up, e.g. an animation memory pool.
> >
> > Can you generalize this a bit further? When would userland know to use
> > this new flag?
> > If it is aware, would it make sense to just use a separate heap name 
> > instead?
> >
> > (Also: These other mechanisms you mention should probably also be
> > submitted upstream, however for upstream there's also the requirement
> > that we have open users and are not just enabling proprietary blob
> > userspace, which makes any changes to dma-buf heaps for out of tree
> > code quite difficult)
> >
> >> However, dma-buf usage identification becomes a challenge. A potential
> >> solution could be heap_flags. The use of heap_flags seems ugly and
> >> contrary to the intended design, as you said. How about extending
> >> dma_heap_allocation_data as follows?
> >>
> >> struct dma_heap_allocation_data {
> >> __u64 len;
> >> __u32 fd;
> >> __u32 fd_flags;
> >> __u64 heap_flags;
> >> __u64 buf_flags; // buf usage
> >> };
> >
> > This would affect the ABI (forcing a new ioctl number).  And it's
> > unclear what flags you envision as buffer specific (rather than heap
> > specific as this patch suggested).
> >
> > I think we need more details about the specific problem you're seeing
> > and trying to resolve.
> This patch mainly focuses on optimization for Android scenarios. Let’s
> discuss it on the issue website.
> Bug: 344501512

Ok, we can do that if you need.

But if this is ever going to go upstream (and it's more and more
important that we minimize out of tree technical debt), conversations
about how to generalize this will need to happen on the list.

thanks
-john


Re: [RFC PATCH v1] dma-buf: heaps: move the verification of heap_flags to the corresponding heap

2024-06-03 Thread John Stultz
On Mon, Jun 3, 2024 at 10:21 AM Hailong Liu  wrote:
> On Mon, 03. Jun 09:01, John Stultz wrote:
> > On Mon, Jun 3, 2024 at 4:40 AM  wrote:
> > >
> > > From: "Hailong.Liu" 
> > >
> > > > This helps modules use heap_flags to determine the type of dma-buf,
> > > > so that mechanisms such as a memory_pool can be used to speed up
> > > > allocation and optimize the allocation time of the dma-buf.
> >
> > This feels like it's trying to introduce heap specific flags, but
> > doesn't introduce any details about what those flags might be?
> >
> > This seems like it would re-allow the old opaque vendor specific heap
> > flags that we saw in the ION days, which was problematic as different
> > userspaces would use the same interface with potentially colliding
> > heap flags with different meanings, resulting in no way to properly
> > move to an upstream solution.
> >
> > With the dma-heaps interface, we're trying to make sure it is well
> > defined. One can register a number of heaps with different behaviors,
> > and the heap name is used to differentiate the behavior. Any flags
> > introduced will need to be well defined and behaviorally consistent
> > between heaps. That way when an upstream solution lands, if necessary
> > we can provide backwards compatibility via symlinks.
> >
> > So I don't think this is a good direction to go for dma-heaps.
> >
> > It would be better if you were able to clarify what flag requirements
> > you need, so we can better understand how they might apply to other
> > heaps, and see if it was something we would want to define as a flag
> > (see the discussion here for similar thoughts:
> > https://lore.kernel.org/lkml/candhncookwtpstfe2vdcuvzdxuwkz-zx+fz6xrdpwtycivx...@mail.gmail.com/
> > )
> >
> > But if your vendor heap really needs some sort of flags argument that
> > you can't generalize, you can always implement your own dmabuf
> > exporter driver with whatever ioctl interface you'd prefer.
>
> Thanks for your reply. Let’s continue our discussion here instead
> of on android-review. We aim to enhance memory allocation across
> all heaps. Your pointer towards heap_flags used in /dev/ion for heap
> identification was helpful.
>
> We now aim to improve priority dma-buf allocation. Consider the Android
> animation scene:
>
> when the device is low on memory, allocating dma-bufs as animation
> buffers enters direct reclamation, and the longer allocation time results
> in a laggy UI. But if we know the usage of the dma-buf, we can use some
> mechanism to speed it up, e.g. an animation memory pool.

Can you generalize this a bit further? When would userland know to use
this new flag?
If it is aware, would it make sense to just use a separate heap name instead?

(Also: These other mechanisms you mention should probably also be
submitted upstream, however for upstream there's also the requirement
that we have open users and are not just enabling proprietary blob
userspace, which makes any changes to dma-buf heaps for out of tree
code quite difficult)

> However, dma-buf usage identification becomes a challenge. A potential
> solution could be heap_flags. The use of heap_flags seems ugly and
> contrary to the intended design, as you said. How about extending
> dma_heap_allocation_data as follows?
>
> struct dma_heap_allocation_data {
> __u64 len;
> __u32 fd;
> __u32 fd_flags;
> __u64 heap_flags;
> __u64 buf_flags; // buf usage
> };

This would affect the ABI (forcing a new ioctl number).  And it's
unclear what flags you envision as buffer specific (rather than heap
specific as this patch suggested).

I think we need more details about the specific problem you're seeing
and trying to resolve.

thanks
-john
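
For background on the ABI point: the ioctl command value encodes the
struct size, so growing dma_heap_allocation_data changes the command
number itself. From include/uapi/linux/dma-heap.h:

	#define DMA_HEAP_IOC_MAGIC	'H'

	#define DMA_HEAP_IOCTL_ALLOC	_IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\
					      struct dma_heap_allocation_data)

	/*
	 * _IOWR() folds sizeof(struct dma_heap_allocation_data) into the
	 * command value, so appending a __u64 buf_flags field yields a
	 * new number that old binaries no longer match.
	 */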


Re: [RFC PATCH v1] dma-buf: heaps: move the verification of heap_flags to the corresponding heap

2024-06-03 Thread John Stultz
On Mon, Jun 3, 2024 at 4:40 AM  wrote:
>
> From: "Hailong.Liu" 
>
> This helps modules use heap_flags to determine the type of dma-buf,
> so that mechanisms such as a memory_pool can be used to speed up
> allocation and optimize the allocation time of the dma-buf.

This feels like it's trying to introduce heap specific flags, but
doesn't introduce any details about what those flags might be?

This seems like it would re-allow the old opaque vendor specific heap
flags that we saw in the ION days, which was problematic as different
userspaces would use the same interface with potentially colliding
heap flags with different meanings, resulting in no way to properly
move to an upstream solution.

With the dma-heaps interface, we're trying to make sure it is well
defined. One can register a number of heaps with different behaviors,
and the heap name is used to differentiate the behavior. Any flags
introduced will need to be well defined and behaviorally consistent
between heaps. That way when an upstream solution lands, if necessary
we can provide backwards compatibility via symlinks.

So I don't think this is a good direction to go for dma-heaps.

It would be better if you were able to clarify what flag requirements
you need, so we can better understand how they might apply to other
heaps, and see if it was something we would want to define as a flag
(see the discussion here for similar thoughts:
https://lore.kernel.org/lkml/candhncookwtpstfe2vdcuvzdxuwkz-zx+fz6xrdpwtycivx...@mail.gmail.com/
)

But if your vendor heap really needs some sort of flags argument that
you can't generalize, you can always implement your own dmabuf
exporter driver with whatever ioctl interface you'd prefer.

thanks
-john


Re: [RFC PATCH] dma-buf: align fd_flags and heap_flags with dma_heap_allocation_data

2024-05-29 Thread John Stultz
On Wed, May 22, 2024 at 2:02 AM Barry Song <21cn...@gmail.com> wrote:
>
> From: Barry Song 
>
> dma_heap_allocation_data defines the UAPI as follows:
>
>  struct dma_heap_allocation_data {
>  __u64 len;
>  __u32 fd;
>  __u32 fd_flags;
>  __u64 heap_flags;
>  };
>
> However, dma_heap_buffer_alloc() casts them into unsigned int. It's unclear
> whether this is intentional or what the purpose is, but it can be quite
> confusing for users.
>
> Adding to the confusion, dma_heap_ops.allocate defines both of these as
> unsigned long. Fortunately, since dma_heap_ops is not part of the UAPI,
> it is less of a concern.
>
> struct dma_heap_ops {
> struct dma_buf *(*allocate)(struct dma_heap *heap,
> unsigned long len,
> unsigned long fd_flags,
> unsigned long heap_flags);
> };
>
> I am sending this RFC in hopes of clarifying these confusions.
>
> If the goal is to constrain both flags to 32 bits while ensuring the struct
> is aligned to 64 bits, it would have been more suitable to define
> dma_heap_allocation_data accordingly from the beginning, like so:
>
>  struct dma_heap_allocation_data {
>  __u64 len;
>  __u32 fd;
>  __u32 fd_flags;
>  __u32 heap_flags;
>  __u32 padding;
>  };

So here, if I recall, the intent was to keep 64 bits for potential
future heap_flags.

But your point above that we're inconsistent with types in the non
UAPI arguments is valid.
So I think your patch makes sense.

Thanks for raising this issue!
Acked-by: John Stultz 


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-16 Thread John Stultz
On Thu, May 16, 2024 at 5:22 AM Maxime Ripard  wrote:
> On Wed, May 15, 2024 at 11:42:58AM -0700, John Stultz wrote:
> > I apologize as my worry is mostly born out of seeing vendors really
> > push opaque feature flags in their old ion heaps, so in providing a
> > flags argument, it was mostly intended as an escape hatch for
> > obviously common attributes. So having the first be something that
> > seems reasonable, but isn't actually that common makes me fret some.
> >
> > So again, not an objection, just something for folks to stew on to
> > make sure this is really the right approach.
>
> I understand your hesitation and concern :) Is there anything we could
> provide that would help moving the discussion forward?
>

Mostly I just want to make sure it's discussed, which is why I raise
it as a concern.

Getting APIs right is hard, and I know I've made my share of mistakes
(for instance, a situation that very much echoes this current
question: the *_ALARM clockids probably should have used a flag). So
I've found highlighting the trade-offs and getting other folk's
perspectives useful to avoid such issues.  But I also don't intend to
needlessly delay things.

> > Another thing to discuss, that I didn't see in your mail: Do we have
> > an open-source user of this new flag?
>
> Not at the moment. I guess it would be one of the things that would
> reduce your concerns, but is it a requirement?

So I'd defer to Sima on this. There have been a number of heap
related changes that have had to be held out of tree on this
requirement, but I'm hoping recent efforts on upstream device support
will eventually unblock those.

thanks
-john


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-16 Thread John Stultz
On Thu, May 16, 2024 at 3:56 AM Daniel Vetter  wrote:
> On Wed, May 15, 2024 at 11:42:58AM -0700, John Stultz wrote:
> > But it makes me a little nervous to add a new generic allocation flag
> > for a feature most hardware doesn't support (yet, at least). So it's
> > hard to weigh how common the actual usage will be across all the
> > heaps.
> >
> > I apologize as my worry is mostly born out of seeing vendors really
> > push opaque feature flags in their old ion heaps, so in providing a
> > flags argument, it was mostly intended as an escape hatch for
> > obviously common attributes. So having the first be something that
> > seems reasonable, but isn't actually that common makes me fret some.
> >
> > So again, not an objection, just something for folks to stew on to
> > make sure this is really the right approach.
>
> Another good reason to go with full heap names instead of opaque flags on
> existing heaps is that with the former we can use symlinks in sysfs to
> specify heaps, with the latter we need a new idea. We haven't yet gotten
> around to implement this anywhere, but it's been in the dma-buf/heap todo
> since forever, and I like it as a design approach. So would be a good idea
> to not toss it. With that, display would have symlinks to cma-ecc and cma,
> and rendering maybe cma-ecc, shmem, cma heaps (in priority order) for a
> SoC where the display needs contig memory for scanout.

So indeed that is a good point to keep in mind, but I also think it
might reinforce the choice of having ECC as a flag here.

Since my understanding of the sysfs symlinks to heaps idea is about
being able to figure out a common heap from a collection of devices,
it's really about the ability for the driver to access the type of
memory. If ECC is just an attribute of the type of memory (as in this
patch series), it being on or off won't necessarily affect
compatibility of the buffer with the device.  Similarly "uncached"
seems more of an attribute of memory type and not a type itself.
Hardware that can access non-contiguous "system" buffers can access
uncached system buffers.

thanks
-john


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-15 Thread John Stultz
On Wed, May 15, 2024 at 6:57 AM Maxime Ripard  wrote:
> This series is the follow-up of the discussion that John and I had a few
> months ago here:
>
> https://lore.kernel.org/all/candhncqujn6bh3kxkf65bwitylvqsd9892-xtfdhhqqyrro...@mail.gmail.com/
>
> The initial problem we were discussing was that I'm currently working on
> a platform which has a memory layout with ECC enabled. However, enabling
> the ECC has a number of drawbacks on that platform: lower performance,
> increased memory usage, etc. So for things like framebuffers, the
> trade-off isn't great and thus there's a memory region with ECC disabled
> to allocate from for such use cases.
>
> After a suggestion from John, I chose to start using heap allocations
> flags to allow for userspace to ask for a particular ECC setup. This is
> then backed by a new heap type that runs from reserved memory chunks
> flagged as such, and the existing DT properties to specify the ECC
> properties.
>
> We could also easily extend this mechanism to support more flags, or
> through a new ioctl to discover which flags a given heap supports.

Hey! Thanks for sending this along! I'm eager to see more heap related
work being done upstream.

The only thing that makes me a bit hesitant, is the introduction of
allocation flags (as opposed to a uniquely specified/named "ecc"
heap).

We did talk about this earlier, and my earlier push was that the
ECC flag only makes sense as a flag if it is general enough to apply
to the majority of heaps, and your patch here does apply it to all the
heaps. So I don't have an objection.

But it makes me a little nervous to add a new generic allocation flag
for a feature most hardware doesn't support (yet, at least). So it's
hard to weigh how common the actual usage will be across all the
heaps.

I apologize as my worry is mostly born out of seeing vendors really
push opaque feature flags in their old ion heaps, so in providing a
flags argument, it was mostly intended as an escape hatch for
obviously common attributes. So having the first be something that
seems reasonable, but isn't actually that common makes me fret some.

So again, not an objection, just something for folks to stew on to
make sure this is really the right approach.

Another thing to discuss, that I didn't see in your mail: Do we have
an open-source user of this new flag?

thanks
-john


Re: [PATCH 11/21] drm/hisilicon/kirin: Allow build with COMPILE_TEST=y

2024-04-08 Thread John Stultz
On Mon, Apr 8, 2024 at 10:05 AM Ville Syrjala wrote:
>
> From: Ville Syrjälä 
>
> Allow kirin to be built with COMPILE_TEST=y for greater
> coverage. Builds fine on x86/x86_64 at least.
>
> Cc: Xinliang Liu 
> Cc: Tian Tao 
> Cc: Xinwei Kong 
> Cc: Sumit Semwal 
> Cc: Yongqin Liu 
> Cc: John Stultz 
> Signed-off-by: Ville Syrjälä 

Acked-by: John Stultz 


Re: [PATCH 10/21] drm/hisilicon/kirin: Fix MASK(32) on 32bit architectures

2024-04-08 Thread John Stultz
On Mon, Apr 8, 2024 at 10:05 AM Ville Syrjala wrote:
>
> From: Ville Syrjälä 
>
> BIT(32) is illegal when sizeof(long)==4. Use BIT_ULL(32)
> instead.
>
> Cc: Xinliang Liu 
> Cc: Tian Tao 
> Cc: Xinwei Kong 
> Cc: Sumit Semwal 
> Cc: Yongqin Liu 
> Cc: John Stultz 
> Signed-off-by: Ville Syrjälä 

Acked-by: John Stultz 


Re: [PATCH 09/21] drm/hisilicon/kirin: Fix 64bit divisions

2024-04-08 Thread John Stultz
On Mon, Apr 8, 2024 at 10:05 AM Ville Syrjala wrote:
>
> From: Ville Syrjälä 
>
> Use the appropriate 64bit division helpers to make the code
> build on 32bit architectures.
>
> Cc: Xinliang Liu 
> Cc: Tian Tao 
> Cc: Xinwei Kong 
> Cc: Sumit Semwal 
> Cc: Yongqin Liu 
> Cc: John Stultz 
> Signed-off-by: Ville Syrjälä 

Acked-by: John Stultz 


Re: [PATCH 08/21] drm/hisilicon/kirin: Include linux/io.h for readl()/writel()

2024-04-08 Thread John Stultz
On Mon, Apr 8, 2024 at 10:04 AM Ville Syrjala wrote:
>
> From: Ville Syrjälä 
>
> Include linux/io.h for readl()/writel().
>
> When built on x86_64 w/ COMPILE_TEST=y:
> ../drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h:93:16: error: implicit 
> declaration of function ‘readl’ [-Werror=implicit-function-declaration]
>93 | orig = readl(addr);
>   |^
> ../drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h:96:9: error: implicit 
> declaration of function ‘writel’ [-Werror=implicit-function-declaration]
>96 | writel(tmp, addr);
>   | ^~
>
> Cc: Xinliang Liu 
> Cc: Tian Tao 
> Cc: Xinwei Kong 
> Cc: Sumit Semwal 
> Cc: Yongqin Liu 
> Cc: John Stultz 
> Signed-off-by: Ville Syrjälä 

Acked-by: John Stultz 


Re: ECC memory semantics for heaps

2024-03-04 Thread John Stultz
On Mon, Mar 4, 2024 at 5:46 AM Maxime Ripard  wrote:
> On Wed, Feb 28, 2024 at 08:17:55PM -0800, John Stultz wrote:
> > On Wed, Feb 28, 2024 at 7:24 AM Maxime Ripard  wrote:
> > >
> > > I'm currently working on a platform that seems to have togglable RAM ECC
> > > support. Enabling ECC reduces the memory capacity and memory bandwidth,
> > > so while it's a good idea to protect most of the system, it's not worth
> > > it for things like framebuffers that won't really be affected by a
> > > bitflip.
> > >
> > > It's currently set up by enabling ECC on the entire memory, and then
> > > having a region of memory where ECC is disabled and where we're supposed
> > > to allocate from for allocations that don't need it.
> > >
> > > My first thought to support this was to create a reserved memory region
> > > for the !ECC memory, and to create a heap to allocate buffers from that
> > > region. That would leave the system protected by ECC, while enabling
> > > userspace to be nicer to the system by allocating buffers from the !ECC
> > > region if it doesn't need it.
> > >
> > > However, this creates basically a new combination compared to the one we
> > > already have (ie, physically contiguous vs virtually contiguous), and we
> > > probably would want to throw in cacheable vs non-cacheable too.
> > >
> > > If we had to provide new heaps for each variation, we would have 8 heaps
> > > (and 6 new ones), which could be fine I guess but would still increase
> > > quite a lot the number of heaps we have so far.
> > >
> > > Is it something that would be a problem? If it is, do you see another
> > > way to support those kind of allocations (like providing hints through
> > > the ioctl maybe?)?
> >
> > So, the dma-buf heaps interface uses chardevs so that we can have a
> > lot of flexibility in the types of heaps (and don't have the risk of
> > bitmask exhaustion like ION had). So I don't see adding many
> > differently named heaps as particularly problematic.
>
> Ok
>
> > That said, if there are truly generic properties (cacheable vs
> > non-cachable is maybe one of those) which apply to most heaps, I'm
> > open to making use of the flags. But I want to avoid having per-heap
> > flags, it really needs to be a generic attribute.
> >
> > And I personally don't mind the idea of having things added as heaps
> > initially, and potentially upgrading them to flags if needed (allowing
> > heap drivers to optionally enumerate the old chardevs behind a config
> > option for backwards compatibility).
> >
> > How common is the hardware that is going to provide this configurable
> > ECC option
>
> In terms of number of SoCs with the feature, it's probably a handful. In
> terms of number of units shipped, we're in the fairly common range :)
>

Sure, I guess I was trying to get a sense of whether this is a feature we'll
likely be seeing commonly across hardware (such that internal kernel
allocators would be considering it as a flag), or is it more tied to a
single vendor such that enabling/isolating it in a driver is the right
place in the abstraction to put it.


> > and will you really want the option on all of the heap types?
>
> Aside from the cacheable/uncacheable discussion, yes. We could probably
> get away with only physically contiguous allocations at the moment
> though, I'll double check.

Ok, that will be useful to know.

> > Will there be any hardware constraint limitations caused by the
> > ECC/!ECC flags? (ie: Devices that can't use !ECC allocated buffers?)
>
> My understanding is that there's no device restriction. It will be a
> carved out memory so we will need to maintain a separate pool and it
> will be limited in size, but that's pretty much the only one afaik.

Ok.

> > If not, I wonder if it would make sense to have something more along
> > the lines using a fcntl()  like how F_SEAL_* is used with memfds?
> > With some of the discussion around "restricted"/"secure" heaps that
> > can change state, I've liked this idea of just allocating dmabufs from
> > normal heaps and then using fcntl or something similar to modify
> > properties of the buffer that are separate from the type of memory
> > that is needed to be allocated to satisfy device constraints.
>
> Sorry, I'm not super familiar with F_SEAL so I don't really follow what
> you have in mind here. Do you have any additional resources I could read
> to understand better?

Re: ECC memory semantics for heaps

2024-02-28 Thread John Stultz
On Wed, Feb 28, 2024 at 7:24 AM Maxime Ripard  wrote:
>
> I'm currently working on a platform that seems to have togglable RAM ECC
> support. Enabling ECC reduces the memory capacity and memory bandwidth,
> so while it's a good idea to protect most of the system, it's not worth
> it for things like framebuffers that won't really be affected by a
> bitflip.
>
> It's currently set up by enabling ECC on the entire memory, and then
> having a region of memory where ECC is disabled and where we're supposed
> to allocate from for allocations that don't need it.
>
> My first thought to support this was to create a reserved memory region
> for the !ECC memory, and to create a heap to allocate buffers from that
> region. That would leave the system protected by ECC, while enabling
> userspace to be nicer to the system by allocating buffers from the !ECC
> region if it doesn't need it.
>
> However, this creates basically a new combination compared to the one we
> already have (ie, physically contiguous vs virtually contiguous), and we
> probably would want to throw in cacheable vs non-cacheable too.
>
> If we had to provide new heaps for each variation, we would have 8 heaps
> (and 6 new ones), which could be fine I guess but would still increase
> quite a lot the number of heaps we have so far.
>
> Is it something that would be a problem? If it is, do you see another
> way to support those kind of allocations (like providing hints through
> the ioctl maybe?)?

So, the dma-buf heaps interface uses chardevs so that we can have a
lot of flexibility in the types of heaps (and don't have the risk of
bitmask exhaustion like ION had). So I don't see adding many
differently named heaps as particularly problematic.

That said, if there are truly generic properties (cacheable vs
non-cachable is maybe one of those) which apply to most heaps, I'm
open to making use of the flags. But I want to avoid having per-heap
flags, it really needs to be a generic attribute.

And I personally don't mind the idea of having things added as heaps
initially, and potentially upgrading them to flags if needed (allowing
heap drivers to optionally enumerate the old chardevs behind a config
option for backwards compatibility).

How common is the hardware that is going to provide this configurable
ECC option, and will you really want the option on all of the heap
types? Will there be any hardware constraint limitations caused by the
ECC/!ECC flags? (ie: Devices that can't use !ECC allocated buffers?)

If not, I wonder if it would make sense to have something more along
the lines using a fcntl()  like how F_SEAL_* is used with memfds?
With some of the discussion around "restricted"/"secure" heaps that
can change state, I've liked this idea of just allocating dmabufs from
normal heaps and then using fcntl or something similar to modify
properties of the buffer that are separate from the type of memory
that is needed to be allocated to satisfy device constraints.

thanks
-john
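
Purely to illustrate the direction floated above, a userspace sketch:
allocate from a normal heap, then flip a per-buffer property
afterwards. The fcntl() command and flag below are hypothetical
stand-ins; no such dma-buf fcntl() exists today.

	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/dma-heap.h>

	/* hypothetical constants, stand-ins for the mechanism discussed */
	#define F_SET_DMABUF_PROP	1024
	#define DMABUF_PROP_NO_ECC	1

	static int alloc_noecc_buffer(size_t len)
	{
		struct dma_heap_allocation_data data = {
			.len = len,
			.fd_flags = O_RDWR | O_CLOEXEC,
		};
		int heap = open("/dev/dma_heap/system", O_RDONLY | O_CLOEXEC);
		int ret;

		if (heap < 0)
			return -1;
		ret = ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &data);
		close(heap);
		if (ret < 0)
			return -1;

		/* hypothetical: mark the buffer as tolerating !ECC backing */
		fcntl(data.fd, F_SET_DMABUF_PROP, DMABUF_PROP_NO_ECC);
		return data.fd;
	}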


Re: [PATCH] drm/hisilicon: Use devm_platform_get_and_ioremap_resource() in dsi_parse_dt()

2024-02-05 Thread John Stultz
On Mon, Feb 5, 2024 at 12:13 AM Markus Elfring  wrote:
>
> From: Markus Elfring 
> Date: Mon, 5 Feb 2024 08:58:21 +0100
>
> A wrapper function is available since the commit 
> 890cc39a879906b63912482dfc41944579df2dc6
> ("drivers: provide devm_platform_get_and_ioremap_resource()").
> Thus reuse existing functionality instead of keeping duplicate source code.
>
> This issue was detected by using the Coccinelle software.
>
> Signed-off-by: Markus Elfring 

I no longer have hardware to test with this, but this looks reasonable to me.

Acked-by: John Stultz 

thanks
-john


Re: [PATCH v4 3/7] dma-buf: heaps: restricted_heap: Add private heap ops

2024-01-31 Thread John Stultz
On Wed, Jan 31, 2024 at 6:15 AM Joakim Bech  wrote:
> On Fri, Jan 12, 2024 at 05:23:07PM -0800, John Stultz wrote:
> > So we need some clarity in what "restricted" really means.  For
> > instance, it being not cpu mappable vs other potential variations like
> > being cpu mappable, but not cpu accessible.  Or not cpu mappable, but
> > only mappable between a set of 3 devices (Which 3 devices?! How can we
> > tell?).
> >
> Can we flip things around? I.e., instead of saying which devices are
> allowed to use the restricted buffer, can we instead say where it's not
> allowed to be used? My view has been that by default the contents of the
> types of buffers we're talking about here are only accessible to things
> running on the secure side, i.e, typically S-EL3, S-EL1 and a specific
> Trusted Application running in S-EL0. I guess that serves as some kind
> of baseline.

This seems like you're suggesting enumerating badness? I'm not sure
I understand the benefit of that.

> From there, things turn to a more dynamic nature, where firewalls, etc.,
> can be configured to give access to various IPs, blocks and runtimes.
>
> I understand that it's nice to be able to know all this from the Linux
> kernel point of view, but does it have to be aware of this? What's the
> major drawback if Linux doesn't know about it?

Indeed, it doesn't necessarily. The idea with DMABUF heaps is it
provides a name to abstract/wrap a type of constraint. So you can then
allocate buffers that satisfy that constraint.

Admittedly the downside with DMABUF heaps is that it has a bit of a
gap in the abstraction in that we don't have a mapping of device
constraints, so in Android gralloc provides a device specific
usage/pipeline -> heap mapping.
(Note: This I don't think is very problematic - I often use the
example of fstab as device-specific config everyone is comfortable
with - but I know folks would like to have something more generic)

I believe Christian has previously proposed to have the devices
provide something like symlinks from their sysfs nodes to the heaps
the device supports, which is an interesting idea to mostly close that
issue. Applications could then scan the devices in a pipeline and find
the type they all support, and the specific names wouldn't matter.

However, I'd expect the same hardware pipeline might support both
restricted and unrestricted playback, so there would need to be some
way to differentiate for the use case, so I'm not sure you can get
away from some heap name to functionality mapping.

My main concern with this patch series is that it seems to want to
bundle all the different types of "restricted" buffers that might be
possible under a single "restricted" heap name.

Since we likely have devices with different security domains, thus
different types of restrictions. So we may need to be able to
differentiate between "secure video playback" uses and "protected
device firmware" uses on the same machine. Thus, I'm not sure it's a
good idea to bundle all of these under the same heap name.

thanks
-john


Re: [PATCH v4 3/7] dma-buf: heaps: restricted_heap: Add private heap ops

2024-01-12 Thread John Stultz
On Fri, Jan 12, 2024 at 4:13 PM Jeffrey Kardatzke  wrote:
> On Fri, Jan 12, 2024 at 3:51 PM John Stultz  wrote:
> >
> > On Fri, Jan 12, 2024 at 3:27 PM Jeffrey Kardatzke  
> > wrote:
> > > On Fri, Jan 12, 2024 at 2:52 PM John Stultz  wrote:
> > > > I know part of this effort here is to start to centralize all these
> > > > vendor specific restricted buffer implementations, which I think is
> > > > great, but I just worry that in doing it in the dmabuf heap interface,
> > > > it is a bit too user-facing. The likelihood of encoding a particular
> > > > semantic to the singular "restricted_heap" name seems high.
> > >
> > > In patch #5 it has the actual driver implementation for the MTK heap
> > > that includes the heap name of "restricted_mtk_cm", so there shouldn't
> > > be a heap named "restricted_heap" that's actually getting exposed. The
> >
> > Ah, I appreciate that clarification! Indeed, you're right the name is
> > passed through. Apologies for missing that detail.
> >
> > > restricted_heap code is a framework, and not a driver itself.  Unless
> > > I'm missing something in this patchset...but that's the way it's
> > > *supposed* to be.
> >
> > So I guess I'm not sure I understand the benefit of the extra
> > indirection. What then does the restricted_heap.c logic itself
> > provide?
> > The dmabuf heaps framework already provides a way to add heap 
> > implementations.
>
> So in the v1 patchset, it was done with just a Mediatek specific heap
> with no framework or abstractions for another vendor to build on top
> of. The feedback was to make this more generic since Mediatek won't be
> the only vendor who wants a restricted heap..which is how it ended up
> here. There was more code in the framework before relating to TEE
> calls, but then that was moved to the vendor specific code since not
> all restricted heaps are allocated through a TEE.

Yeah. I apologize, as I know how frustrating the contradictory
feedback can be. I don't mean to demotivate. :(

I think folks would very much like to see consolidation around the
various implementations, and I agree!
I just worry that creating the common framework for this concept in a
dmabuf heaps driver is maybe too peripheral/close to userland.

> This was also desirable for the V4L2 pieces since there's going to be
> a V4L2 flag set when using restricted dma_bufs (and it wants to
> validate that), so in order to keep that more generic, there should
> be a higher level concept of restricted dma_bufs that isn't specific
> to a single vendor.  One other thing that would ideally come out of
> this is a cleaner way to check that a dma_buf is restricted or not.

Yeah. If there is a clear meaning to "restricted" here, I think having
a query method on the dmabuf is reasonable.
My only fret is if the meaning is too vague and userland starts
depending on it meaning what it meant for vendor1, but doesn't mean
for vendor2.

So we need some clarity in what "restricted" really means.  For
instance, it being not cpu mappable vs other potential variations like
being cpu mappable, but not cpu accessible.  Or not cpu mappable, but
only mappable between a set of 3 devices (Which 3 devices?! How can we
tell?).

And if there is variation, maybe we need to enumerate the types of
"restricted" buffers so we can be specific when it's queried.

That's where maybe having the framework for this be more central or
closer to the kernel mm code and not just a sub-type of a dmabuf heap
driver might be better?

> The current V4L2 patchset just attaches the dma_buf and then checks if
> the page table is empty, and if so, it's restricted. But now I see
> there's other feedback indicating attaching a restricted dma_buf
> shouldn't even be allowed, so we'll need another strategy for
> detecting them. Ideally there is some function/macro like
> is_dma_buf_restricted(struct dma_buf*) that can indicate that...but we
> haven't come up with a good way to do that yet which doesn't involve
> adding another field to dma_buf or to dma_buf_ops (and if such a thing
> would be fine, then OK...but I had assumed we would get pushback on
> modifying either of those structs).

If there's a need and the best place to put something is in the
dma_buf or dma_buf_ops, that's where it should go.  Folks may
reasonably disagree if it's the best place (there may be yet better
spots for the state to sit in the abstractions), but for stuff going
upstream, there's no reason to try to hack around things or smuggle
state just to avoid changing core structures.
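
To make that concrete, one hypothetical shape such a helper could take
if a field were added to the core struct (nothing like this exists
today; the field and helper names are invented for illustration):

	/* hypothetical: assumes a new 'restricted' flag in struct dma_buf */
	static inline bool is_dma_buf_restricted(struct dma_buf *dmabuf)
	{
		return dmabuf->restricted;
	}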

Re: [PATCH v4 3/7] dma-buf: heaps: restricted_heap: Add private heap ops

2024-01-12 Thread John Stultz
On Fri, Jan 12, 2024 at 3:27 PM Jeffrey Kardatzke  wrote:
> On Fri, Jan 12, 2024 at 2:52 PM John Stultz  wrote:
> > On Fri, Jan 12, 2024 at 1:21 AM Yong Wu  wrote:
> > > diff --git a/drivers/dma-buf/heaps/restricted_heap.h 
> > > b/drivers/dma-buf/heaps/restricted_heap.h
> > > index 443028f6ba3b..ddeaf9805708 100644
> > > --- a/drivers/dma-buf/heaps/restricted_heap.h
> > > +++ b/drivers/dma-buf/heaps/restricted_heap.h
> > > @@ -15,6 +15,18 @@ struct restricted_buffer {
> > >
> > >  struct restricted_heap {
> > > const char  *name;
> > > +
> > > +   const struct restricted_heap_ops *ops;
> > > +};
> > > +
> > > +struct restricted_heap_ops {
> > > +   int (*heap_init)(struct restricted_heap *heap);
> > > +
> > > +   int (*memory_alloc)(struct restricted_heap *heap, struct 
> > > restricted_buffer *buf);
> > > +   void(*memory_free)(struct restricted_heap *heap, struct 
> > > restricted_buffer *buf);
> > > +
> > > +   int (*memory_restrict)(struct restricted_heap *heap, struct 
> > > restricted_buffer *buf);
> > > +   void(*memory_unrestrict)(struct restricted_heap *heap, struct 
> > > restricted_buffer *buf);
> > >  };
> > >
> > >  int restricted_heap_add(struct restricted_heap *rstrd_heap);
> >
> > So, I'm a little worried here, because you're basically turning the
> > restricted_heap dma-buf heap driver into a framework itself.
> > Where this patch is creating a subdriver framework.
> >
> > Part of my hesitancy, is you're introducing this under the dma-buf
> > heaps. For things like CMA, that's more of a core subsystem that has
> > multiple users, and exporting cma buffers via dmabuf heaps is just an
> > additional interface.  What I like about that is the core kernel has
> > to define the semantics for the memory type and then the dmabuf heap
> > is just exporting that well understood type of buffer.
> >
> > But with these restricted buffers, I'm not sure there's yet a well
> > understood set of semantics nor a central abstraction for that which
> > other drivers use directly.
> >
> > I know part of this effort here is to start to centralize all these
> > vendor specific restricted buffer implementations, which I think is
> > great, but I just worry that in doing it in the dmabuf heap interface,
> > it is a bit too user-facing. The likelihood of encoding a particular
> > semantic to the singular "restricted_heap" name seems high.
>
> In patch #5 it has the actual driver implementation for the MTK heap
> that includes the heap name of "restricted_mtk_cm", so there shouldn't
> be a heap named "restricted_heap" that's actually getting exposed. The

Ah, I appreciate that clarification! Indeed, you're right the name is
passed through. Apologies for missing that detail.

> restricted_heap code is a framework, and not a driver itself.  Unless
> I'm missing something in this patchset...but that's the way it's
> *supposed* to be.

So I guess I'm not sure I understand the benefit of the extra
indirection. What then does the restricted_heap.c logic itself
provide?
The dmabuf heaps framework already provides a way to add heap implementations.

thanks
-john


Re: [PATCH v4 3/7] dma-buf: heaps: restricted_heap: Add private heap ops

2024-01-12 Thread John Stultz
On Fri, Jan 12, 2024 at 1:21 AM Yong Wu  wrote:
>
> Add "struct restricted_heap_ops". For the restricted memory, totally there
> are two steps:
> a) memory_alloc: Allocate the buffer in kernel;
> b) memory_restrict: Restrict/Protect/Secure that buffer.
> The memory_alloc is mandatory while memory_restrict is optional since it may
> be part of memory_alloc.
>
> Signed-off-by: Yong Wu 
> ---
>  drivers/dma-buf/heaps/restricted_heap.c | 41 -
>  drivers/dma-buf/heaps/restricted_heap.h | 12 
>  2 files changed, 52 insertions(+), 1 deletion(-)
>

Thanks for sending this out! A thought below.

> diff --git a/drivers/dma-buf/heaps/restricted_heap.h 
> b/drivers/dma-buf/heaps/restricted_heap.h
> index 443028f6ba3b..ddeaf9805708 100644
> --- a/drivers/dma-buf/heaps/restricted_heap.h
> +++ b/drivers/dma-buf/heaps/restricted_heap.h
> @@ -15,6 +15,18 @@ struct restricted_buffer {
>
>  struct restricted_heap {
> const char  *name;
> +
> +   const struct restricted_heap_ops *ops;
> +};
> +
> +struct restricted_heap_ops {
> +   int (*heap_init)(struct restricted_heap *heap);
> +
> +   int (*memory_alloc)(struct restricted_heap *heap, struct 
> restricted_buffer *buf);
> +   void(*memory_free)(struct restricted_heap *heap, struct 
> restricted_buffer *buf);
> +
> +   int (*memory_restrict)(struct restricted_heap *heap, struct 
> restricted_buffer *buf);
> +   void(*memory_unrestrict)(struct restricted_heap *heap, struct 
> restricted_buffer *buf);
>  };
>
>  int restricted_heap_add(struct restricted_heap *rstrd_heap);

So, I'm a little worried here, because you're basically turning the
restricted_heap dma-buf heap driver into a framework itself.
Where this patch is creating a subdriver framework.

Part of my hesitancy, is you're introducing this under the dma-buf
heaps. For things like CMA, that's more of a core subsystem that has
multiple users, and exporting cma buffers via dmabuf heaps is just an
additional interface.  What I like about that is the core kernel has
to define the semantics for the memory type and then the dmabuf heap
is just exporting that well understood type of buffer.

But with these restricted buffers, I'm not sure there's yet a well
understood set of semantics nor a central abstraction for that which
other drivers use directly.

I know part of this effort here is to start to centralize all these
vendor specific restricted buffer implementations, which I think is
great, but I just worry that in doing it in the dmabuf heap interface,
it is a bit too user-facing. The likelihood of encoding a particular
semantic to the singular "restricted_heap" name seems high.

Similarly we might run into systems with multiple types of restricted
buffers (imagine a discrete gpu having one type along with TEE
protected buffers also being used on the same system).

So the one question I have: Why not just have a mediatek specific
restricted_heap dmabuf heap driver?  Since there's already been some
talk of slight semantic differences in various restricted buffer
implementations, should we just start with separately named dmabuf
heaps for each? Maybe consolidating to a common name as more drivers
arrive and we gain a better understanding of the variations of
semantics folks are using?

thanks
-john


Re: [PATCH 3/9] dma-heap: Provide accessors so that in-kernel drivers can allocate dmabufs from specific heaps

2023-09-11 Thread John Stultz
On Mon, Sep 11, 2023 at 3:14 AM Christian König wrote:
> On 11.09.23 04:30, Yong Wu wrote:
> > From: John Stultz 
> >
> > This allows drivers who don't want to create their own
> > DMA-BUF exporter to be able to allocate DMA-BUFs directly
> > from existing DMA-BUF Heaps.
> >
> > There is some concern that the premise of DMA-BUF heaps is
> > that userland knows better about what type of heap memory
> > is needed for a pipeline, so it would likely be best for
> > drivers to import and fill DMA-BUFs allocated by userland
> > instead of allocating one themselves, but this is still
> > up for debate.
>
> The main design goal of having DMA-heaps in the first place is to avoid
> per-driver allocation, and this is not necessarily because userland knows
> better what type of memory it wants.
>
> The background is rather that we generally want to decouple allocation
> from having a device driver connection so that we have better chance
> that multiple devices can work with the same memory.

Yep, very much agreed, and this is what the comment above is trying to describe.

Ideally user-allocated buffers would be used to ensure drivers don't
create buffers with constraints that limit which devices the buffers
might later be shared with.

However, this patch was created as a hold-over from the old ION logic
to help vendors transition to dmabuf heaps, as vendors had situations
where they still wanted to export dmabufs that were not to be
generally shared and folks wanted to avoid duplication of logic
already in existing heaps.  At the time, I never pushed it upstream as
there were no upstream users.  But I think if there is now a potential
upstream user, it's worth having the discussion to better understand
the need.

So I think this patch is a little confusing in this series, as I don't
see much of it actually being used here (though forgive me if I'm
missing it).

Instead, It seems it get used in a separate patch series here:
  https://lore.kernel.org/all/20230911125936.10648-1-yunfei.d...@mediatek.com/

Yong, I appreciate you sending this out! But maybe if the secure heap
submission doesn't depend on this functionality, I might suggest
moving this patch (or at least the majority of it) to be part of the
vcodec series instead?  That way reviewers will have more context for
how the code being added is used?

thanks
-john


Re: [PATCH v2] dma-contiguous: define proper name for global cma region

2023-08-09 Thread John Stultz
On Wed, Aug 9, 2023 at 8:04 AM Pintu Agarwal  wrote:
>
> Hi,
>
> On Thu, 3 Aug 2023 at 23:04, Pintu Agarwal  wrote:
> >
> > Hi,
> >
> > On Wed, 2 Aug 2023 at 15:17, Christoph Hellwig  wrote:
> > >
> > > On Tue, Aug 01, 2023 at 10:39:04PM -0700, John Stultz wrote:
> > > > So, forgive me, I've not had a chance to look into this, but my
> > > > recollection was "reserved" is the name we see on x86, but other names
> > > > are possibly provided via the dts node?
> > >
> > No, I think "reserved" is the name hard-coded (for all arch) in Kernel
> > for global-cma.
> > So, I don't think this is x86 specific. I am checking on arm32 itself.
> > When we call dma_alloc_coherent() we see this in the logs (if the dts
> > region is not present).
> > cma: cma_alloc(cma (ptrval), name: reserved, count 64, align 6)
> > Now, with this change we will see this:
> > cma: cma_alloc(cma (ptrval), name: global-cma-region, count 64, align 6)
> >
> > > Indeed, dma_contiguous_default_area can also be set through
> > > rmem_cma_setup, which then takes the name from DT.
> > >
> > I think this is a different case. If DT entry is present we get this:
> > Reserved memory: created CMA memory pool at 0x9800, name:
> > linux,cma, size 128 MiB
> > cma: cma_alloc(cma (ptrval), name: linux,cma, count 64, align 6)
> >
> > Here we are talking about the default hard-coded name in Kernel code
> > if DT is not defined.
> > So, in one of the boards, this DT entry was not present and it shows
> > as "reserved".
> >
> > > > I believe on the hikey board its "linux,cma" is the name, so forcing
> > > > it to reserved would break that.
> > > >
> > Yes, everywhere in the DT it's defined as "linux,cma".
> > You mean this also should be changed to "linux,cma-global-region"
> > everywhere with this change ?
> >
> > > > Maybe instead add a compat config option to force the cma name (so x86
> > > > can set it to "default" if needed)?
> > >
> > Yes, having it in config is also a good option instead of hard-coding in 
> > Kernel.
> > >
> > > I think we'll just need to leave it as-is.  I wish dma-heaps had never
> > > exposed the name to userspace, but we'll have to live with it now.
> >
> > Can you point me to the userspace utility we are talking about here ?
> > I think we should not worry much about userspace name exposure.
> > I guess it should fetch whatever is declared in Kernel or DTS, right ?
>
> Just to follow-up on this.
> For now, can we change the Kernel hard-coded value from "reserved" to
> "global-cma-region" ?
> Later, for the DTS defined name let it be "linux,cma" or change that
> also to "linux,global-cma-region" ?
>
> Will this make sense ?

Apologies, sorry for not responding to your earlier message, it slipped by.

So, the concern is there may be allocators (like gralloc in Android)
that allocate from the CMA region via the dma-buf heaps interface.

So by changing the name (either hardcoded or DTS names), you'll change
the user-visible heap name, potentially breaking those userland
allocators.

Now, the dmabuf heaps are designed to be like other dynamic devices
(like disks or partitions), which may be different from device to
device. However, changing the name would still be an inconvenience for
folks who have hard-coded that name in their userland allocator which
was designed for a single device.  This would be similar to the old
issue of an existing fstab breaking from the ide (hda) to sata (sda)
driver transition.  Or similar to what folks went through a while back
with network device names changing from eth0 -> ens0 or whatever.

That said, most Android devices historically haven't upreved to new
kernel versions without major userspace changes, so the impact might
be minimal, but that is likely to change in the future so we should be
careful here.

What I'd propose instead is to either leave it alone as Christoph
suggested, or have a build option/boot argument so folks can preserve
the legacy name if they need.

thanks
-john


Re: [PATCH v2] dma-contiguous: define proper name for global cma region

2023-08-01 Thread John Stultz
On Tue, Aug 1, 2023 at 10:18 AM Christoph Hellwig  wrote:
>
> On Tue, Aug 01, 2023 at 10:42:42PM +0530, Pintu Agarwal wrote:
> > > I agree that reserved is not a very useful name.  Unfortuately the
> > > name of the region leaks to userspace through cma_heap.
> > >
> > > So I think we need prep patches to hardcode "reserved" in
> > > add_default_cma_heap first, and then remove the cma_get_name
> > > first.
> >
> > Sorry, but I could not fully understand your comments.
> > Can you please elaborate a little more what changes are required in
> > cma_heap if we change "reserved" to "global-cma-region" ?
>
> Step 1:
>
> Instead of setting exp_info.name to cma_get_name(cma);
> in __add_cma_heap just set it to "reserved", probably by passing a name
> argument.  You can also remove the unused data argument to __add_cma_heap
> and/or just fold that function into the only caller while you're at it.

So, forgive me, I've not had a chance to look into this, but my
recollection was "reserved" is the name we see on x86, but other names
are possibly provided via the dts node?

I believe on the hikey board the name is "linux,cma", so forcing
it to reserved would break that.

Maybe instead add a compat config option to force the cma name (so x86
can set it to "default" if needed)?

thanks
-john
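
For reference, a rough sketch of the step-1 refactor Christoph
describes above (modeled on the cma_heap setup code; details may
differ): pass the name in rather than deriving it from the CMA region.

	static int __add_cma_heap(struct cma *cma, const char *name)
	{
		struct cma_heap *cma_heap;
		struct dma_heap_export_info exp_info;

		cma_heap = kzalloc(sizeof(*cma_heap), GFP_KERNEL);
		if (!cma_heap)
			return -ENOMEM;
		cma_heap->cma = cma;

		exp_info.name = name;	/* was: cma_get_name(cma) */
		exp_info.ops = &cma_heap_ops;
		exp_info.priv = cma_heap;

		cma_heap->heap = dma_heap_add(&exp_info);
		if (IS_ERR(cma_heap->heap)) {
			int ret = PTR_ERR(cma_heap->heap);

			kfree(cma_heap);
			return ret;
		}

		return 0;
	}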


[PATCH] MAINTAINERS: Remove Laura Abbott from DMA-BUF HEAPS FRAMEWORK

2023-06-30 Thread John Stultz
Laura's email address has not been valid for quite a while now,
so I wanted to clean up the reviewer list here.

I reached out to Laura who said it made sense to drop her from
the list, so this patch does that.

I do want to recognize Laura's long-time contribution to this
area and her previous ION maintainership, as this couldn't
have gone upstream without her prior efforts. Many thanks!

Cc: Laura Abbott 
Cc: T.J. Mercier 
Cc: Sumit Semwal 
Cc: Benjamin Gaignard 
Cc: Brian Starkey 
Cc: John Stultz 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: kernel-t...@android.com
Acked-by: Laura Abbott 
Signed-off-by: John Stultz 
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f4e92b968ed7..6b28b59cbdb9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6181,7 +6181,6 @@ F:kernel/dma/
 DMA-BUF HEAPS FRAMEWORK
 M: Sumit Semwal 
 R: Benjamin Gaignard 
-R: Laura Abbott 
 R: Brian Starkey 
 R: John Stultz 
 R: T.J. Mercier 
-- 
2.41.0.255.g8b1d071c50-goog



[PATCH] MAINTAINERS: Add T.J. Mercier as reviewer for DMA-BUF HEAPS FRAMEWORK

2023-06-29 Thread John Stultz
T.J. has been responsible for dma-buf items on the Android team
for a while now, so it would be great to have him on as a reviewer.

Cc: T.J. Mercier 
Cc: Sumit Semwal 
Cc: Benjamin Gaignard 
Cc: Brian Starkey 
Cc: John Stultz 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-...@lists.linaro.org
Cc: kernel-t...@android.com
Signed-off-by: John Stultz 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index cb075f52d97b..f4e92b968ed7 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6184,6 +6184,7 @@ R:Benjamin Gaignard 

 R: Laura Abbott 
 R: Brian Starkey 
 R: John Stultz 
+R: T.J. Mercier 
 L: linux-me...@vger.kernel.org
 L: dri-devel@lists.freedesktop.org
 L: linaro-mm-...@lists.linaro.org (moderated for non-subscribers)
-- 
2.41.0.255.g8b1d071c50-goog



Re: [PATCH] MAINTAINERS: Remove Liam Mark from DMA-BUF HEAPS FRAMEWORK

2023-06-28 Thread John Stultz
On Wed, Jun 28, 2023 at 11:05 AM Jeffrey Hugo  wrote:
>
> @codeaurora.org email addresses are no longer valid and will bounce.
>
> I reached out to Liam about updating his entry under DMA-BUF HEAPS
> FRAMEWORK with an @codeaurora.org address.  His response:
>
> "I am not a maintainer anymore, that should be removed."
>
> Liam currently does not have an email address that can be used to remove
> this entry, so I offered to submit a cleanup on his behalf with Liam's
> consent.
>
> Signed-off-by: Jeffrey Hugo 
> ---
>  MAINTAINERS | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 76b53bafc03c..1781eb0a8dda 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6168,7 +6168,6 @@ F:kernel/dma/
>  DMA-BUF HEAPS FRAMEWORK
>  M: Sumit Semwal 
>  R: Benjamin Gaignard 
> -R: Liam Mark 
>  R: Laura Abbott 
>  R: Brian Starkey 
>  R: John Stultz 
> --

Acked-by: John Stultz 

Though probably worth adding TJ as a reviewer?
-john


Re: [PATCH v2 06/24] dma-buf: system_heap: use vmalloc_array and vcalloc

2023-06-27 Thread John Stultz
On Tue, Jun 27, 2023 at 7:44 AM Julia Lawall  wrote:
>
> Use vmalloc_array and vcalloc to protect against
> multiplication overflows.
>
> The changes were done using the following Coccinelle
> semantic patch:
>
> // 
> @initialize:ocaml@
> @@
>
> let rename alloc =
>   match alloc with
> "vmalloc" -> "vmalloc_array"
>   | "vzalloc" -> "vcalloc"
>   | _ -> failwith "unknown"
>
> @@
> size_t e1,e2;
> constant C1, C2;
> expression E1, E2, COUNT, x1, x2, x3;
> typedef u8;
> typedef __u8;
> type t = {u8,__u8,char,unsigned char};
> identifier alloc = {vmalloc,vzalloc};
> fresh identifier realloc = script:ocaml(alloc) { rename alloc };
> @@
>
> (
>   alloc(x1*x2*x3)
> |
>   alloc(C1 * C2)
> |
>   alloc((sizeof(t)) * (COUNT), ...)
> |
> - alloc((e1) * (e2))
> + realloc(e1, e2)
> |
> - alloc((e1) * (COUNT))
> + realloc(COUNT, e1)
> |
> - alloc((E1) * (E2))
> + realloc(E1, E2)
> )
> // 
>
> Signed-off-by: Julia Lawall 
>
> ---
> v2: Use vmalloc_array and vcalloc instead of array_size.
> This also leaves a multiplication of a constant by a sizeof
> as is.  Two patches are thus dropped from the series.
>
>  drivers/dma-buf/heaps/system_heap.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff -u -p a/drivers/dma-buf/heaps/system_heap.c 
> b/drivers/dma-buf/heaps/system_heap.c
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -221,7 +221,7 @@ static void *system_heap_do_vmap(struct
>  {
> struct sg_table *table = &buffer->sg_table;
> int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
> -   struct page **pages = vmalloc(sizeof(struct page *) * npages);
> +   struct page **pages = vmalloc_array(npages, sizeof(struct page *));
> struct page **tmp = pages;
> struct sg_page_iter piter;
> void *vaddr;

Seems reasonable. Thanks for sending this out!

Acked-by: John Stultz 

thanks
-john


Re: [PATCH 06/26] dma-buf: system_heap: use array_size

2023-06-23 Thread John Stultz
On Fri, Jun 23, 2023 at 2:15 PM Julia Lawall  wrote:
>
> Use array_size to protect against multiplication overflows.
>
> The changes were done using the following Coccinelle semantic patch:
>
> // 
> @@
> size_t e1,e2;
> expression COUNT;
> identifier alloc = {vmalloc,vzalloc,kvmalloc,kvzalloc};
> @@
>
> (
>   alloc(
> -   (e1) * (e2)
> +   array_size(e1, e2)
>   ,...)
> |
>   alloc(
> -   (e1) * (COUNT)
> +   array_size(COUNT, e1)
>   ,...)
> )
> // 
>
> Signed-off-by: Julia Lawall 

Thanks for sending this out!

Acked-by: John Stultz 


Re: [PATCH v3] dma-buf: cma_heap: Check for device max segment size when attaching

2023-03-06 Thread John Stultz
On Mon, Mar 6, 2023 at 8:52 AM Andrew Davis  wrote:
>
> Although there is usually not such a limitation (and when there is it is
> often only because the driver forgot to change the super small default),
> it is still correct here to break scatterlist element into chunks of
> dma_max_mapping_size().

Hey Andrew!
  Thanks for sending this out!

So *why* is it "correct here to break scatterlist element into chunks
of dma_max_mapping_size()"?

> This might cause some issues for users with misbehaving drivers. If
> bisecting has landed you on this commit, make sure your drivers both set
> dma_set_max_seg_size() and are checking for contiguousness correctly.

Why is this change worth the risk? (If this is really likely to break
folks, should we maybe provide warnings initially instead? Maybe
falling back to the old code if we can catch the failure?)

I don't really object to the change, I just want to make sure the commit
message is clearer on why we should make this change and what the
benefit will be, along with the potential downsides.

thanks
-john


Re: DMA-heap driver hints

2023-01-25 Thread John Stultz
Sorry for the delay, this was almost ready to send, but then got
forgotten in my drafts folder.

On Mon, Jan 23, 2023 at 11:15 PM Christian König
 wrote:
> Am 24.01.23 um 06:19 schrieb John Stultz:
> > On Mon, Jan 23, 2023 at 8:29 AM Christian König
> >  wrote:
> >> Am 23.01.23 um 14:55 schrieb Laurent Pinchart:
> >>> - I assume some drivers will be able to support multiple heaps. How do
> >>> you envision this being implemented ?
> >> I don't really see a use case for this.
> >>
> >> We do have some drivers which say: for this use case you can use
> >> whatever you want, but for that use case you need to use specific memory
> >> (scan out on GPUs for example works like this).
> >>
> > [snipping the constraints argument, which I agree with]
> >> What we do have is compatibility between heaps. E.g. a CMA heap is
> >> usually compatible with the system heap or might even be a subset of
> >> another CMA heap. But I wanted to add that as next step to the heaps
> >> framework itself.
> > So the difficult question is how is userland supposed to know which
> > heap is compatible with which?
>
> The heaps should know which other heap they are compatible with.
>
> E.g. the CMA heap should have a link to the system heap because it can
> handle all system memory allocations as well.
>
> If we have a specialized CMA heap (for example for 32bit DMA) it should
> have a link to the general CMA heap.

This is an interesting idea, but it seems to assume a linear or at
least converging "compatibility" order, which I don't think is always
the case.
(For instance, there may be secure heaps which a small set of devices
have access to, but supporting secure memory doesn't imply system
memory for all devices or vice versa).

So I really think being able to express support for multiple heaps
would be important to resolve the majority of these edge cases.

Also, having a single link ordering means the constraints have to
go from the heap that satisfies more constraints to the heap that
satisfies fewer (which is sort of the reverse of how I'd think of
compatibility). That makes the solving logic for userland doable, but
somewhat complex/non-intuitive (as you're searching for the most
"satisfying" heap from the set, which will be one of the starting
points).

Whereas finding the intersection of lists seems a bit more straightforward.
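
ie: the userland side stays pretty dumb, something like this pseudocode
(the helpers are made up, and it assumes per-device heap lists in sysfs):

	heaps_cam = read_heap_links("/sys/.../camera/dma_heaps");  /* {"system", "reserved"} */
	heaps_gpu = read_heap_links("/sys/.../gpu/dma_heaps");     /* {"system"} */
	usable = intersect(heaps_cam, heaps_gpu);                  /* {"system"} */
	/* then open /dev/dma_heap/<any entry in usable> to allocate */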


> > If you have two devices, one that points to heap "foo" and the other
> > points to heap "bar", how does userland know that "foo" satisfies the
> > constraints of "bar" but "bar" doesn't satisfy the constraints of
> > "foo".
> > (foo ="cma",  bar="system")
> >
> > I think it would be much better for device 1 to list "foo" and device
> > 2 to list "foo" and "bar", so you can find that "foo" is the common
> > heap which will solve both devices' needs.
>
> I think that this would be a rather bad idea because then all devices
> need to know about all the possible different heaps they are compatible
> with.

I agree it is somewhat burdensome, but I suspect we'd eventually want
registration helpers to abstract out some of the relationships you
mention above (ie: devices that can handle any system memory will accept
CMA buffers, dma32 buffers, etc). But at least that logic would be
in-kernel and not exposed to userland.
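
ie: something like the below on top of your RFC (untested sketch, the
helper name is invented):

	/*
	 * For devices that can scatter-gather from any system memory:
	 * link all the heaps whose buffers such a device can consume.
	 */
	int dma_heap_link_system_compatible(struct device *dev)
	{
		int ret;

		ret = dma_heap_create_device_link(dev, "system");
		if (ret)
			return ret;
		/* CMA buffers are just contiguous system memory */
		return dma_heap_create_device_link(dev, "reserved");
	}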

> >>> - Devices could have different constraints based on particular
> >>> configurations. For instance, a device may require specific memory
> >>> layout for multi-planar YUV formats only (as in allocating the Y and C
> >>> planes of NV12 from different memory banks). A dynamic API may thus be
> >>> needed (but may also be very painful to use from userspace).
> >> Uff, good to know. But I'm not sure how to expose stuff like that.
> > Yeah. These edge cases are really hard to solve generically.  And
> > single devices that have separate constraints for different uses are
> > also not going to be solvable with a simple linking approach.
> >
> > But I do wonder if a generic solution to all cases is needed
> > (especially if it really isn't possible)? If we leave the option for
> > gralloc-like omniscient device-specific userland policy, those edge
> > cases can be handled by those devices that can't run generic logic.
> > And those devices just won't be able to be supported by generic
> > distros, hopefully motivating future designs to have less odd
> > constraints?
>
> Potentially yes, but I think that

Re: DMA-heap driver hints

2023-01-23 Thread John Stultz
On Mon, Jan 23, 2023 at 8:29 AM Christian König
 wrote:
> Am 23.01.23 um 14:55 schrieb Laurent Pinchart:
> > - I assume some drivers will be able to support multiple heaps. How do
> >you envision this being implemented ?
>
> I don't really see a use case for this.
>
> We do have some drivers which say: for this use case you can use
> whatever you want, but for that use case you need to use specific memory
> (scan out on GPUs for example works like this).
>
[snipping the constraints argument, which I agree with]
>
> What we do have is compatibility between heaps. E.g. a CMA heap is
> usually compatible with the system heap or might even be a subset of
> another CMA heap. But I wanted to add that as next step to the heaps
> framework itself.

So the difficult question is how is userland supposed to know which
heap is compatible with which?

If you have two devices, one that points to heap "foo" and the other
points to heap "bar", how does userland know that "foo" satisfies the
constraints of "bar" but "bar" doesn't satisfy the constraints of
"foo".
(foo ="cma",  bar="system")

I think it would be much better for device 1 to list "foo" and device
2 to list "foo" and "bar", so you can find that "foo" is the common
heap which will solve both devices' needs.


> > - Devices could have different constraints based on particular
> >configurations. For instance, a device may require specific memory
> >layout for multi-planar YUV formats only (as in allocating the Y and C
> >planes of NV12 from different memory banks). A dynamic API may thus be
> >needed (but may also be very painful to use from userspace).
>
> Uff, good to know. But I'm not sure how to expose stuff like that.

Yeah. These edge cases are really hard to solve generically.  And
single devices that have separate constraints for different uses are
also not going to be solvable with a simple linking approach.

But I do wonder if a generic solution to all cases is needed
(especially if it really isn't possible)? If we leave the option for
gralloc-like omniscient device-specific userland policy, those edge
cases can be handled by those devices that can't run generic logic.
And those devices just won't be able to be supported by generic
distros, hopefully motivating future designs to have less odd
constraints?

thanks
-john


Re: DMA-heap driver hints

2023-01-23 Thread John Stultz
On Mon, Jan 23, 2023 at 5:55 AM Laurent Pinchart
 wrote:
>
> Hi Christian,
>
> CC'ing James as I think this is related to his work on the unix device
> memory allocator ([1]).
>
> [1] 
> https://lore.kernel.org/dri-devel/8b555674-1c5b-c791-4547-2ea7c16ae...@nvidia.com/
>
> On Mon, Jan 23, 2023 at 01:37:54PM +0100, Christian König wrote:
> > Hi guys,
> >
> > this is just an RFC! The last time we discussed the DMA-buf coherency
> > problem [1] we concluded that DMA-heap first needs a better way to
> > communicate to userspace which heap to use for a certain device.
> >
> > As far as I know userspace currently just hard codes that information
> > which is certainly not desirable considering that we should have this
> > inside the kernel as well.
> >
> > So what those two patches here do is to first add some
> > dma_heap_create_device_link() and  dma_heap_remove_device_link()
> > function and then demonstrating the functionality with uvcvideo
> > driver.
> >
> > The preferred DMA-heap is represented with a symlink in sysfs between
> > the device and the virtual DMA-heap device node.
>
> I'll start with a few high-level comments/questions:
>
> - Instead of tying drivers to heaps, have you considered a system where
>   a driver would expose constraints, and a heap would then be selected
>   based on those constraints ? A tight coupling between heaps and
>   drivers means downstream patches to drivers in order to use
>   vendor-specific heaps, that sounds painful.

Though, maybe it should be in that case. More motivation to get your
heap (and its users) upstream. :)


>   A constraint-based system would also, I think, be easier to extend
>   with additional constraints in the future.

I think the issue of enumerating and exposing constraints to userland
is just really tough.  While on any one system there is a fixed number
of constraints, it's not clear we could come up with a bounded set for
all systems.
To avoid this back in the ION days, I had proposed an idea of userland
having devices share an opaque constraint cookie, which userland could
mask together between devices and then find a heap that matches the
combined cookie. That would avoid exposing specific constraints to
userland, but the process of using it seemed like such a mess to
explain.
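
(Roughly, the flow would have been something like the below pseudocode,
where none of these helpers ever actually existed:

	u64 cookie = ~0ULL;

	cookie &= get_constraint_cookie(camera_fd);	/* hypothetical */
	cookie &= get_constraint_cookie(codec_fd);	/* hypothetical */
	heap = find_heap_for_cookie(cookie);		/* hypothetical */

so no individual constraint bit would ever have had a defined meaning
to userland.)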

So I think this driver-driven links approach is pretty reasonable. I
do worry we might get situations where the driver's ability to use a
heap depends on some other factor (dts iommu setup maybe?), which the
driver might not know on its own, but I think having the driver
special-case that to resolve it would be doable.


> - I assume some drivers will be able to support multiple heaps. How do
>   you envision this being implemented ?

Yeah. I also agree we need to have multiple heap links.

> - Devices could have different constraints based on particular
>   configurations. For instance, a device may require specific memory
>   layout for multi-planar YUV formats only (as in allocating the Y and C
>   planes of NV12 from different memory banks). A dynamic API may thus be
>   needed (but may also be very painful to use from userspace).

Yeah. While I know folks really don't like the static userspace config
model that Android uses, I do fret that even once we can find a
workable heap, that still won't address finding the ideal heap.

For instance, we might find that the system heap works for a given
pipeline, but because the cpu doesn't use the buffer in one case, the
system-uncached heap is really the best choice for performance. But in
another pipeline with the same devices, if the cpu is reading and
writing the buffer quite a bit, one would want the standard system
heap.

Because userland is the only one who can know the path a buffer will
take, userland is really the best place to choose the ideal allocation
type.
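
And expressing that choice is already trivial for userland, since
allocation is done by opening the specific heap's device node. ie:
something like (error handling omitted, needs linux/dma-heap.h):

	int heap_fd = open("/dev/dma_heap/system-uncached",
			   O_RDONLY | O_CLOEXEC);
	struct dma_heap_allocation_data data = {
		.len = buffer_size,
		.fd_flags = O_RDWR | O_CLOEXEC,
	};

	ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);
	/* data.fd now holds the dmabuf fd */

The hard part is only ever *which* path string to use.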

So while I don't object to this link-based approach just to allow a
generic userland to find a working buffer type for a given set of
devices, I don't think it will be able to replace having
device-specific userland policy (like gralloc), though it's my personal
hope the policy can be formalized into a config file rather than having
device-specific binaries.

thanks
-john


Re: [PATCH 1/2] dma-heap: add device link and unlink functions

2023-01-23 Thread John Stultz
On Mon, Jan 23, 2023 at 4:38 AM Christian König
 wrote:
>
> This allows device drivers to specify a DMA-heap where they want their
> buffers to be allocated from. This information is then exposed as
> sysfs link between the device and the DMA-heap.
>
> Useful for userspace when in need to decide from which provider to
> allocate a buffer.
>
> Signed-off-by: Christian König 

Hey Christian!
  Thanks so much for sending this out and also thanks for including me
(Adding TJ as well)!

This looks like a really interesting approach, but I have a few thoughts below.

> ---
>  drivers/dma-buf/dma-heap.c | 41 ++
>  include/linux/dma-heap.h   | 35 
>  2 files changed, 68 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
> index c9e41e8a9e27..0f7cf713c22f 100644
> --- a/drivers/dma-buf/dma-heap.c
> +++ b/drivers/dma-buf/dma-heap.c
> @@ -31,6 +31,7 @@
>   * @heap_devt  heap device node
>   * @list   list head connecting to list of heaps
>   * @heap_cdev  heap char device
> + * @dev:   heap device in sysfs
>   *
>   * Represents a heap of memory from which buffers can be made.
>   */
> @@ -41,6 +42,7 @@ struct dma_heap {
> dev_t heap_devt;
> struct list_head list;
> struct cdev heap_cdev;
> +   struct device *dev;
>  };
>
>  static LIST_HEAD(heap_list);
> @@ -49,6 +51,33 @@ static dev_t dma_heap_devt;
>  static struct class *dma_heap_class;
>  static DEFINE_XARRAY_ALLOC(dma_heap_minors);
>
> +int dma_heap_create_device_link(struct device *dev, const char *heap)
> +{
> +   struct dma_heap *h;
> +
> +   /* check the name is valid */
> +   mutex_lock(&heap_list_lock);
> +   list_for_each_entry(h, &heap_list, list) {
> +   if (!strcmp(h->name, heap))
> +   break;
> +   }
> +   mutex_unlock(&heap_list_lock);
> +
> +   if (list_entry_is_head(h, &heap_list, list)) {
> +   pr_err("dma_heap: Link to invalid heap requested %s\n", heap);
> +   return -EINVAL;
> +   }
> +
> +   return sysfs_create_link(&dev->kobj, &h->dev->kobj, DEVNAME);
> +}

So, one concern with this (if I'm reading this right) is that it seems
like a single heap link may be insufficient.

There may be multiple heaps that a driver could work with (especially
if the device isn't very constrained), so when sharing a buffer with a
second device that is more constrained we'd have the same problem we
have now where userspace just needs to know which device is the more
constrained one to allocate for.

So it might be useful to have a sysfs "dma_heaps" directory with the
supported heaps linked from there? That way userland could find, across
the various devices, the heap link they have in common.
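
ie: something like (paths made up):

	.../devices/.../camera/dma_heaps/system   -> /sys/class/dma_heap/system
	.../devices/.../camera/dma_heaps/reserved -> /sys/class/dma_heap/reserved
	.../devices/.../codec/dma_heaps/reserved  -> /sys/class/dma_heap/reserved

Then userland just walks each device's dma_heaps/ directory and takes
the intersection ("reserved" in this example).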

This still has the downside that every driver needs to be aware of
every heap, and when new heaps are added, it may take a while before
folks can assess whether a device is compatible and update it, so I
suspect we'll eventually have some loose constraint-based helpers to
register links. But I think this at least moves us in a workable
direction, so again, I'm really glad to see you send this out!

thanks
-john


[RESEND][PATCH] MAINTAINERS: Add Sumit Semwal and Yongqin Liu as reviewers for kirin DRM driver

2023-01-19 Thread John Stultz
I no longer have access to the HiKey boards, so while I'm happy to
review code, I wanted to add Sumit and Yongqin to the reviewers list
so they would get CC'ed on future changes and have a chance to
validate and provide Tested-by: tags.

Cc: Xinliang Liu 
Cc: Tian Tao  
Cc: Yongqin Liu 
Cc: Sumit Semwal 
Cc: kernel-t...@android.com
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 42fc47c6edfd..82df566e6925 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7007,9 +7007,11 @@ F:   drivers/gpu/drm/gma500/
 DRM DRIVERS FOR HISILICON
 M: Xinliang Liu 
 M: Tian Tao  
-R: John Stultz 
 R: Xinwei Kong 
 R: Chen Feng 
+R: Sumit Semwal 
+R: Yongqin Liu 
+R: John Stultz 
 L: dri-devel@lists.freedesktop.org
 S: Maintained
 T: git git://anongit.freedesktop.org/drm/drm-misc
-- 
2.39.1.405.gd4c25cc71f-goog



Re: [PATCH v3] drm: kirin: Enable COMPILE_TEST

2023-01-19 Thread John Stultz
On Thu, Jan 19, 2023 at 11:01 AM Sean Anderson  wrote:
>
> Make various small changes to allow compile-testing on other arches.
> This is helpful to allow testing changes to 32-bit arm drivers in the
> same build.
>
> The primary changes is to use macros for 64-bit divisions and shifts,
> but we also need some other fixes to deal with larger constants and
> differences in includes.
>
> Signed-off-by: Sean Anderson 

No objection from me (though I'm not sure I really see the benefit of
test-building this driver on 32-bit).

I no longer have hardware to test with, so adding YongQin and Amit.

So maybe a tentative Acked-by: John Stultz 

thanks
-john


> ---
>
> Changes in v3:
> - Include io.h for readl/writel
>
> Changes in v2:
> - Use BIT_ULL
>
>  drivers/gpu/drm/hisilicon/kirin/Kconfig |  2 +-
>  drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c| 10 +-
>  drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h|  2 ++
>  drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h |  2 +-
>  4 files changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/hisilicon/kirin/Kconfig 
> b/drivers/gpu/drm/hisilicon/kirin/Kconfig
> index c5265675bf0c..0772f79567ef 100644
> --- a/drivers/gpu/drm/hisilicon/kirin/Kconfig
> +++ b/drivers/gpu/drm/hisilicon/kirin/Kconfig
> @@ -1,7 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  config DRM_HISI_KIRIN
> tristate "DRM Support for Hisilicon Kirin series SoCs Platform"
> -   depends on DRM && OF && ARM64
> +   depends on DRM && OF && (ARM64 || COMPILE_TEST)
> select DRM_KMS_HELPER
> select DRM_GEM_DMA_HELPER
> select DRM_MIPI_DSI
> diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c 
> b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
> index d9978b79828c..1cfeffefd4b4 100644
> --- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
> +++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
> @@ -157,8 +157,8 @@ static u32 dsi_calc_phy_rate(u32 req_kHz, struct 
> mipi_phy_params *phy)
> q_pll = 0x10 >> (7 - phy->hstx_ckg_sel);
>
> temp = f_kHz * (u64)q_pll * (u64)ref_clk_ps;
> -   m_n_int = temp / (u64)10;
> -   m_n = (temp % (u64)10) / (u64)1;
> +   m_n_int = div_u64_rem(temp, 10, &m_n);
> +   m_n /= 1;
>
> if (m_n_int % 2 == 0) {
> if (m_n * 6 >= 50) {
> @@ -229,8 +229,8 @@ static u32 dsi_calc_phy_rate(u32 req_kHz, struct 
> mipi_phy_params *phy)
> phy->pll_fbd_div5f = 1;
> }
>
> -   f_kHz = (u64)10 * (u64)m_pll /
> -   ((u64)ref_clk_ps * (u64)n_pll * (u64)q_pll);
> +   f_kHz = div64_u64((u64)10 * (u64)m_pll,
> + (u64)ref_clk_ps * (u64)n_pll * (u64)q_pll);
>
> if (f_kHz >= req_kHz)
> break;
> @@ -490,7 +490,7 @@ static void dsi_set_mode_timing(void __iomem *base,
> hsa_time = (hsw * lane_byte_clk_kHz) / pixel_clk_kHz;
> hbp_time = (hbp * lane_byte_clk_kHz) / pixel_clk_kHz;
> tmp = (u64)htot * (u64)lane_byte_clk_kHz;
> -   hline_time = DIV_ROUND_UP(tmp, pixel_clk_kHz);
> +   hline_time = DIV64_U64_ROUND_UP(tmp, pixel_clk_kHz);
>
> /* all specified in byte-lane clocks */
> writel(hsa_time, base + VID_HSA_TIME);
> diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h 
> b/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
> index d79fc031e53d..a87d1135856f 100644
> --- a/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
> +++ b/drivers/gpu/drm/hisilicon/kirin/dw_dsi_reg.h
> @@ -7,6 +7,8 @@
>  #ifndef __DW_DSI_REG_H__
>  #define __DW_DSI_REG_H__
>
> +#include 
> +
>  #define MASK(x)(BIT(x) - 1)
>
>  /*
> diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h 
> b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
> index be9e789c2d04..36f923cc7594 100644
> --- a/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
> +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_ade_reg.h
> @@ -10,7 +10,7 @@
>  /*
>   * ADE Registers
>   */
> -#define MASK(x)(BIT(x) - 1)
> +#define MASK(x)(BIT_ULL(x) - 1)
>
>  #define ADE_CTRL   0x0004
>  #define FRM_END_START_OFST 0
> --
> 2.35.1.1320.gc452695387.dirty
>


Re: [PATCH] drm: kirin: Fix missing clk_disable_unprepare in ade_power_up()

2022-12-02 Thread John Stultz
On Fri, Dec 2, 2022 at 12:22 AM Shang XiaoJing  wrote:
>
> The clk_disable_unprepare() should be called in the error handling of
> ade_power_up(), as should reset_control_assert().
>
> Fixes: 783ad972c9a0 ("drm/hisilicon: Add crtc driver for ADE")
> Signed-off-by: Shang XiaoJing 

Looks reasonable to me. Thanks for sending this out!
CC'ing YongQin and Sumit as they have hardware to test against.

Acked-by: John Stultz 


> ---
>  drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c 
> b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
> index 871f79a6b17e..439e87923bcf 100644
> --- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
> +++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
> @@ -229,12 +229,15 @@ static int ade_power_up(struct ade_hw_ctx *ctx)
> ret = reset_control_deassert(ctx->reset);
> if (ret) {
> DRM_ERROR("failed to deassert reset\n");
> +   clk_disable_unprepare(ctx->media_noc_clk);
> return ret;
> }
>
> ret = clk_prepare_enable(ctx->ade_core_clk);
> if (ret) {
> DRM_ERROR("failed to enable ade_core_clk (%d)\n", ret);
> +   reset_control_assert(ctx->reset);
> +   clk_disable_unprepare(ctx->media_noc_clk);
> return ret;
> }
>
> --
> 2.17.1
>


[RESEND][PATCH] MAINTAINERS: Add Sumit Semwal and Yongqin Liu as reviewers for kirin DRM driver

2022-11-23 Thread John Stultz
I no longer have access to the HiKey boards, so while I'm happy to
review code, I wanted to add Sumit and Yongqin to the reviewers list
so they would get CC'ed on future changes and have a chance to
validate and provide Tested-by: tags.

Cc: Xinliang Liu 
Cc: Tian Tao  
Cc: Yongqin Liu 
Cc: Sumit Semwal 
Cc: kernel-t...@android.com
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2585e7edc335..9f26e6b74ef4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6919,9 +6919,11 @@ F:   drivers/gpu/drm/gma500/
 DRM DRIVERS FOR HISILICON
 M: Xinliang Liu 
 M: Tian Tao  
-R: John Stultz 
 R: Xinwei Kong 
 R: Chen Feng 
+R: Sumit Semwal 
+R: Yongqin Liu 
+R: John Stultz 
 L: dri-devel@lists.freedesktop.org
 S: Maintained
 T: git git://anongit.freedesktop.org/drm/drm-misc
-- 
2.38.1.584.g0f3c55d4c2-goog



Re: [PATCH v4] dma-buf: fix racing conflict of dma_heap_add()

2022-11-23 Thread John Stultz
On Mon, Nov 21, 2022 at 10:24 AM Christian König
 wrote:
>
> Hi Dawei,
>
> from the technical description, coding style etc.. it looks clean to me,
> but I'm the completely wrong person to ask for a background check.
>
> I have a high level understanding of how dma-heaps work, but not a
> single line of this code is from me.
>
> Feel free to add my Acked-by, but Laura, John and others do you have any
> opinion?

No objection from me.
Thanks Dawei for submitting this improvement!

Acked-by: John Stultz 

thanks
-john


Re: [PATCH v3] dma-buf: cma_heap: Remove duplicated 'by' in comment

2022-10-28 Thread John Stultz
On Thu, Oct 27, 2022 at 11:55 PM Mark-PK Tsai  wrote:
>
> Remove duplicated 'by' from comment in cma_heap_allocate().
>
> Signed-off-by: Mark-PK Tsai 

Thanks for sending this and going through a few iterations!

Acked-by: John Stultz 

-john


[PATCH] MAINTAINERS: Add Sumit Semwal and Yongqin Liu as reviewers for kirin DRM driver

2022-09-26 Thread John Stultz
I no longer have access to the HiKey boards, so while I'm happy to
review code, I wanted to add Sumit and Yongqin to the reviewers list
so they would get CC'ed on future changes and have a chance to
validate and provide Tested-by: tags.

Cc: Xinliang Liu 
Cc: Tian Tao  
Cc: Yongqin Liu 
Cc: Sumit Semwal 
Cc: kernel-t...@android.com
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 MAINTAINERS | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f5ca4aefd184..11027cf9b670 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6875,9 +6875,11 @@ F:   drivers/gpu/drm/gma500/
 DRM DRIVERS FOR HISILICON
 M: Xinliang Liu 
 M: Tian Tao  
-R: John Stultz 
 R: Xinwei Kong 
 R: Chen Feng 
+R: Sumit Semwal 
+R: Yongqin Liu 
+R: John Stultz 
 L: dri-devel@lists.freedesktop.org
 S: Maintained
 T: git git://anongit.freedesktop.org/drm/drm-misc
-- 
2.37.3.998.g577e59143f-goog



Re: [RFC v2 6/6] android: binder: Add a buffer flag to relinquish ownership of fds

2022-02-14 Thread John Stultz
On Mon, Feb 14, 2022 at 12:19 PM Todd Kjos  wrote:
> On Mon, Feb 14, 2022 at 11:29 AM Suren Baghdasaryan  wrote:
> > On Mon, Feb 14, 2022 at 10:33 AM Todd Kjos  wrote:
> > >
> > > Since we are creating a new gpu cgroup abstraction, couldn't this
> > > "transfer" be done in userspace by the target instead of in the kernel
> > > driver? Then this patch would reduce to just a flag on the buffer
> > > object.
> >
> > Are you suggesting to have a userspace accessible cgroup interface for
> > transferring buffer charges and the target process to use that
> > interface for requesting the buffer to be charged to its cgroup?
>
> Well, I'm asking why we need to do these cgroup-ish actions in the
> kernel when it seems more natural to do it in userspace.
>

In case it's useful, here's some additional context from the Linux
Plumbers discussions last fall:

Daniel Stone outlines some concerns with the cgroup userland handling
for accounting:
  https://youtu.be/3OqllZONTiQ?t=3430

And the binder ownership transfer bit was suggested here by Daniel Vetter:
  https://youtu.be/3OqllZONTiQ?t=3730

thanks
-john


Re: [PATCH] dma-buf: heaps: Fix potential spectre v1 gadget

2022-01-31 Thread John Stultz
On Sat, Jan 29, 2022 at 7:06 AM Jordy Zomer  wrote:
>
> It appears like nr could be a Spectre v1 gadget as it's supplied by a
> user and used as an array index. Prevent the contents
> of kernel memory from being leaked to userspace via speculative
> execution by using array_index_nospec.
>
> Signed-off-by: Jordy Zomer 
> ---
>  drivers/dma-buf/dma-heap.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
> index 56bf5ad01ad5..8f5848aa144f 100644
> --- a/drivers/dma-buf/dma-heap.c
> +++ b/drivers/dma-buf/dma-heap.c
> @@ -14,6 +14,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -135,6 +136,7 @@ static long dma_heap_ioctl(struct file *file, unsigned 
> int ucmd,
> if (nr >= ARRAY_SIZE(dma_heap_ioctl_cmds))
> return -EINVAL;
>
> +   nr = array_index_nospec(nr, ARRAY_SIZE(dma_heap_ioctl_cmds));
> /* Get the kernel ioctl cmd that matches */
>     kcmd = dma_heap_ioctl_cmds[nr];

Thanks for submitting this! It looks sane to me.

Acked-by: John Stultz 

thanks
-john


Re: [PATCH v4] dma-buf: system_heap: Add a size check for allocation

2022-01-19 Thread John Stultz
On Wed, Jan 19, 2022 at 7:34 PM  wrote:
>
> From: Guangming 
>
> Add a size check for allocation since the allocation size should
> always be less than the total DRAM size on the system heap.
> And it can prevent consuming too much time for invalid allocations.
>
> Signed-off-by: Guangming 
> ---
>  drivers/dma-buf/heaps/system_heap.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c 
> b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..bd6f255620e2 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -347,6 +347,13 @@ static struct dma_buf *system_heap_allocate(struct 
> dma_heap *heap,
> struct page *page, *tmp_page;
> int i, ret = -ENOMEM;
>
> +   /*
> +* Size check. The "len" should be less than totalram since 
> system_heap
> +* memory is comes from system. Adding check here can prevent 
> consuming
> +* too much time for invalid allocations.
> +*/
> +   if (len >> PAGE_SHIFT > totalram_pages())
> +   return -EINVAL;

Thanks so much for revising and sending this along! It looks good to me.

Acked-by: John Stultz 

thanks again
-john


Re: [PATCH v3] dma-buf: dma-heap: Add a size check for allocation

2022-01-19 Thread John Stultz
On Wed, Jan 19, 2022 at 1:58 AM Guangming.Cao
 wrote:
> On Fri, 2022-01-14 at 17:17 -0800, John Stultz wrote:
> > If the max value is per-heap, why not enforce that value in the
> > per-heap allocation function?
> >
> > Moving the check to the heap alloc seems simpler to me than
> > adding complexity to the infrastructure to add a heap max_size
> > callback. Is there some other use for the callback that you envision?
> >
>
> If you think the max value is per-heap, why not add an optional
> callback for dma-heap to solve this issue (preventing consuming too much
> time for a doomed-to-fail allocation)? If the dma-heap doesn't have a
> special size check, just use the default value (totalram) in the dma-heap
> framework to do the size check.

As the totalram default isn't correct for all heaps (or necessarily
even most heaps), those heaps would need to implement the callback.

I'm just not sure adding complexity to the framework to address this
is useful. Instead of an additional check in the allocation function,
heap implementers will need to assess if the default logic in the
framework is correct, and then possibly implement the callback.

> Yes, for the Linux dma-heaps, only system-heap needs it, so adding it in
> system heap is the simplest. However, there are many vendor dma-heaps
> like system-heap which won't be uploaded to the Linux codebase and may
> have the same limitation; all these heaps would need to add the same check.

My worry is that without seeing these vendor heaps, this is a bit of a
theoretical concern. We don't have the data on how common this is.
I very much hope that vendors can start submitting their heaps
upstream (along with drivers that benefit from the heaps). Then we can
really assess what makes the most sense for the community maintained
code.


> I just think it's boring. However, if you think discussing these absent
> cases based on current Linux code is meaningless, I also agree to it.

So, as a rule, the upstream kernel doesn't create/maintain logic to
accommodate out of tree code.

Now, I agree there is the potential for some duplication in the checks
in the allocation logic, but until it affects the upstream kernel,
community maintainers can't really make an appropriate evaluation.

As a counter-example, if the allocation is in some extreme hotpath,
adding an extra un-inlinable function-pointer traversal for the size
callback may actually have a negative impact. This isn't likely, but
again, if we cannot demonstrate it one way or the other against the
upstream tree, we can't figure out what the best solution might be.


> So, to summarize, if you still think adding it in system_heap.c is
> better, I also agree and I will update the patch to add it in
> system_heap.c

I think this is the best solution for now. As this is not part of a
userland ABI, we can always change it in the future once we see the
need.

thanks
-john


Re: [PATCH v3] dma-buf: dma-heap: Add a size check for allocation

2022-01-14 Thread John Stultz
On Fri, Jan 14, 2022 at 4:04 AM Guangming.Cao
 wrote:
>
> On Fri, 2022-01-14 at 08:16 +0100, Christian König wrote:
> > Am 14.01.22 um 00:26 schrieb John Stultz:
> > > On Thu, Jan 13, 2022 at 5:05 AM Christian König
> > >  wrote:
> > > > Am 13.01.22 um 14:00 schrieb Ruhl, Michael J:
> > > > > > -Original Message-
> > > > > > From: dri-devel  On
> > > > > > Behalf Of
> > > > > > Ruhl, Michael J
> > > > > > > -Original Message-
> > > > > > > From: dri-devel 
> > > > > > > On Behalf Of
> > > > > > > guangming@mediatek.com
> > > > > > > +   /*
> > > > > > > +* Invalid size check. The "len" should be less than
> > > > > > > totalram.
> > > > > > > +*
> > > > > > > +* Without this check, once the invalid size allocation
> > > > > > > runs on a process
> > > > > > > that
> > > > > > > +* can't be killed by OOM flow(such as "gralloc" on
> > > > > > > Android devices), it
> > > > > > > will
> > > > > > > +* cause a kernel exception, and to make matters worse,
> > > > > > > we can't find
> > > > > > > who are using
> > > > > > > +* so many memory with "dma_buf_debug_show" since the
> > > > > > > relevant
> > > > > > > dma-buf hasn't exported.
> > > > > > > +*/
> > > > > > > +   if (len >> PAGE_SHIFT > totalram_pages())
> > > > > >
> > > > > > If your "heap" is from cma, is this still a valid check?
> > > > >
> > > > > And thinking a bit further, if I create a heap from something
> > > > > else (say device memory),
> > > > > you will need to be able to figure out the maximum allowable
> > > > > check for the specific
> > > > > heap.
> > > > >
> > > > > Maybe the heap needs a callback for max size?
> Yes, I agree with this solution.
> If the dma-heap framework supports this via a callback, it seems
> clearer than adding a limitation in the dma-heap framework, since each
> heap may have a different limitation.
> If you prefer adding a callback, I can update this patch and add the
> totalram limitation to the system dma-heap.

If the max value is per-heap, why not enforce that value in the
per-heap allocation function?

Moving the check to the heap alloc seems simpler to me than
adding complexity to the infrastructure to add a heap max_size
callback. Is there some other use for the callback that you envision?

thanks
-john


Re: [PATCH v3] dma-buf: dma-heap: Add a size check for allocation

2022-01-13 Thread John Stultz
On Thu, Jan 13, 2022 at 5:05 AM Christian König
 wrote:
> Am 13.01.22 um 14:00 schrieb Ruhl, Michael J:
> >> -Original Message-
> >> From: dri-devel  On Behalf Of
> >> Ruhl, Michael J
> >>> -Original Message-
> >>> From: dri-devel  On Behalf Of
> >>> guangming@mediatek.com
> >>> +   /*
> >>> +* Invalid size check. The "len" should be less than totalram.
> >>> +*
> >>> +* Without this check, once the invalid size allocation runs on a 
> >>> process
> >>> that
> >>> +* can't be killed by OOM flow(such as "gralloc" on Android devices), 
> >>> it
> >>> will
> >>> +* cause a kernel exception, and to make matters worse, we can't find
> >>> who are using
> >>> +* so many memory with "dma_buf_debug_show" since the relevant
> >>> dma-buf hasn't exported.
> >>> +*/
> >>> +   if (len >> PAGE_SHIFT > totalram_pages())
> >> If your "heap" is from cma, is this still a valid check?
> > And thinking a bit further, if I create a heap from something else (say 
> > device memory),
> > you will need to be able to figure out the maximum allowable check for the 
> > specific
> > heap.
> >
> > Maybe the heap needs a callback for max size?
>
> Well, we currently maintain a separate allocator and don't use dma-heap,
> but yes, we have systems with 16GiB of device memory and only 8GiB of
> system memory, so the check here is certainly not correct.

Good point.

> In general I would rather let the system run into -ENOMEM or -EINVAL
> from the allocator instead.

Probably the simpler solution is to push the allocation check to the
heap driver, rather than doing it at the top level here.

For CMA or other contiguous heaps, letting the allocator fail is fast
enough. For noncontiguous buffers, like the system heap, the
allocation can burn a lot of time and consume a lot of memory (causing
other trouble) before a large allocation might naturally fail.
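
ie: for the system heap it could be as simple as (untested):

	/* at the top of system_heap_allocate() */
	if (len >> PAGE_SHIFT > totalram_pages())
		return ERR_PTR(-EINVAL);

while a contiguous heap could pick whatever bound makes sense for it,
or skip the check entirely and just let the allocator fail.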

thanks
-john


Re: [PATCH] dma-buf: cma_heap: Fix mutex locking section

2022-01-03 Thread John Stultz
On Mon, Jan 3, 2022 at 11:36 PM Weizhao Ouyang  wrote:
>
> Fix cma_heap_buffer mutex locking critical section to protect vmap_cnt
> and vaddr.
>
> Fixes: a5d2d29e24be ("dma-buf: heaps: Move heap-helper logic into the 
> cma_heap implementation")
> Signed-off-by: Weizhao Ouyang 

Looks good to me!  Thanks so much for sending this in!

Acked-by: John Stultz 

thanks again
-john


Re: [PATCH v2] dma-buf: dma-heap: Add a size check for allocation

2022-01-03 Thread John Stultz
On Mon, Dec 27, 2021 at 1:52 AM  wrote:
>
> From: Guangming 
>

Thanks for submitting this!

> Add a size check for allcation since the allocation size is

nit: "allocation" above.

> always less than the total DRAM size.

In general, it might be good to add more context to the commit message
to better answer *why* this change is needed rather than what the
change is doing.  ie: What negative thing happens without this change?
And how does this change avoid or improve that?


> Signed-off-by: Guangming 
> Signed-off-by: jianjiao zeng 
> ---
> v2: 1. update size limitation as total_dram page size.
> 2. update commit message
> ---
>  drivers/dma-buf/dma-heap.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
> index 56bf5ad01ad5..e39d2be98d69 100644
> --- a/drivers/dma-buf/dma-heap.c
> +++ b/drivers/dma-buf/dma-heap.c
> @@ -55,6 +55,8 @@ static int dma_heap_buffer_alloc(struct dma_heap *heap, 
> size_t len,
> struct dma_buf *dmabuf;
> int fd;
>
> +   if (len / PAGE_SIZE > totalram_pages())
> +   return -EINVAL;

This seems sane. I know ION used to have a 1/2-of-memory cap to
avoid unnecessary memory pressure from crazy allocations.
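
(If memory serves, that was just a check at the top of ion_alloc()
along the lines of:

	if (len / PAGE_SIZE > totalram_pages / 2)
		return ERR_PTR(-EINVAL);

back when totalram_pages was still a plain variable.)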

Could you send again with an improved commit message?

thanks
-john


Re: [PATCH RESEND] dma-buf: heaps: Fix mutex lock area and generalize struct dma_heap_attachment

2022-01-03 Thread John Stultz
On Tue, Dec 28, 2021 at 11:09 PM Weizhao Ouyang  wrote:
>
> Fix cma_heap_buffer mutex lock area to protect vmap_cnt and vaddr. And
> move struct dma_heap_attachment to dma-heap.h so that vendor dma heaps
> can use it, the same behaviour as struct dma_buf_attachment.
>

Hey!
  Thanks for submitting this patch! Apologies for the slow reply (was
out for the holidays).

This patch combines two changes in one, so they need to be split up.
The locking change looks sane, but moving the dma_heap_attachment may
need some extra justification, as changing upstream code just to
support out-of-tree code isn't usually done (if there were some benefit
to the in-tree code, that would be fine though).

I'd also be eager to try to get the vendor heap to be merged, assuming
we can also merge an upstream user for it.

thanks
-john


Re: [PATCH v4] dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow

2021-11-29 Thread John Stultz
On Thu, Nov 25, 2021 at 11:48 PM  wrote:
>
> From: Guangming 
>
> In the previous version, 'sg_table.nents' was used to traverse the
> sg_table in the pages free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg' to the number of
> entries created in the DMA address space.
> So using 'sg_table.nents' in the pages free flow can cause some pages to
> never be freed.
>
> Here we should use sg_table.orig_nents to free the pages, and using the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if 
> available")
> Signed-off-by: Guangming 
> Reviewed-by: Robin Murphy 
> Cc:  # 5.11.*

Thanks so much for catching this and sending in all the revisions!
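
With this, the release path frees pages with the same view of the table
that the allocation created, roughly:

	for_each_sgtable_sg(table, sg, i) {
		struct page *page = sg_page(sg);

		__free_pages(page, compound_order(page));
	}

so every page from the original allocation gets freed regardless of how
the table was dma-mapped in between.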

Reviewed-by: John Stultz 


Re: [PATCH] lontium-lt9611: check a different register bit for HDMI sensing

2021-11-16 Thread John Stultz
On Tue, Nov 16, 2021 at 6:07 PM Peter Collingbourne  wrote:
>
> It has been observed that with certain monitors such as the HP Z27n,
> the register 0x825e reads a value of 0x79 when the HDMI cable is
> connected and 0x78 when it is disconnected, i.e. bit 0 appears
> to correspond to the HDMI connection status and bit 2 is never
> set. Therefore, change the driver to check bit 0 instead of bit 2.
>
> Signed-off-by: Peter Collingbourne 
> Link: 
> https://linux-review.googlesource.com/id/I7e76411127e1ce4988a3f6d0c8ba5f1c3d880c23
> ---
> N.B. I don't currently have easy access to a monitor that works
> with the existing driver, so it would be great if people with
> monitors that currently work could test this patch to make sure
> that it doesn't introduce any regressions. Otherwise I will change
> it to check both bits.

So very interesting! I gave this a spin with my monitors and it works fine.

Vinod: Can you check the datasheet to see if the wrong bit is being used here?

thanks
-john


Re: [PATCH v6 20/21] drm/kirin: dsi: Adjust probe order

2021-10-27 Thread John Stultz
On Mon, Oct 25, 2021 at 8:16 AM Maxime Ripard  wrote:
>
> Without proper care and an agreement between how DSI hosts and devices
> drivers register their MIPI-DSI entities and potential components, we can
> end up in a situation where the drivers can never probe.
>
> Most drivers were taking evasive maneuvers to try to workaround this,
> but not all of them were following the same conventions, resulting in
> various incompatibilities between DSI hosts and devices.
>
> Now that we have a sequence agreed upon and documented, let's convert
> kirin to it.
>
> Tested-by: John Stultz 
> Signed-off-by: Maxime Ripard 

For this patch, and any others in this series folks see fit:
   Acked-by: John Stultz 

thanks
-john


Re: [PATCH v2] dma-buf: heaps: init heaps in subsys_initcall

2021-10-21 Thread John Stultz
On Thu, Oct 21, 2021 at 6:49 PM Shuosheng Huang
 wrote:
>
> Some built-in modules will fail to use a dma-buf heap to allocate
> memory if the heap drivers are initialized too late.
> To fix this issue, move the initialization of the dma-buf heap drivers
> to subsys_initcall(), which is called earlier.

Hey! Thanks so much for sending this out! I appreciate it!

So the change looks pretty straightforward to me; however, the
rationale for it is where we hit problems.
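
(ie: presumably just swapping the initcall level for each heap:

	-module_init(system_heap_create);
	+subsys_initcall(system_heap_create);

which is simple enough on its own.)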

With the upstream kernel, there are not yet any modules that directly
allocate from dmabuf heaps. So in the context of the upstream kernel,
the reasoning doesn't make much sense.

Now, I know folks have their own drivers that want to allocate from
dmabuf heaps, but those haven't been submitted upstream yet.
So maybe can you submit those patches that need this along with this
change so it would make sense as part of a patch series? It would be
trivial to justify including this patch then.

thanks
-john


Re: [PATCH] drm/msm/devfreq: Restrict idle clamping to a618 for now

2021-10-18 Thread John Stultz
On Mon, Oct 18, 2021 at 8:31 AM Rob Clark  wrote:
>
> From: Rob Clark 
>
> Until we better understand the stability issues caused by frequent
> frequency changes, let's limit them to a618.
>
> Signed-off-by: Rob Clark 
> ---
> Caleb/John, I think this should help as a workaround for the power
> instability issues on a630.. could you give it a try?

While I hit it fairly often, I can't reliably reproduce the crash, but
in limited testing this seems ok to me.
I've not hit the crash so far, nor seen any other negative side
effects over 5.14.

So for what that's worth:
Tested-by: John Stultz 

Caleb has better luck tripping this issue right away, so they can
hopefully provide a more assured response.

thanks
-john


Re: [PATCH v2 2/2] qcom_scm: hide Kconfig symbol

2021-10-11 Thread John Stultz
On Thu, Oct 7, 2021 at 8:10 AM Arnd Bergmann  wrote:
>
> From: Arnd Bergmann 
>
> Now that SCM can be a loadable module, we have to add another
> dependency to avoid link failures when ipa or adreno-gpu are
> built-in:
>
> aarch64-linux-ld: drivers/net/ipa/ipa_main.o: in function `ipa_probe':
> ipa_main.c:(.text+0xfc4): undefined reference to `qcom_scm_is_available'
>
> ld.lld: error: undefined symbol: qcom_scm_is_available
> >>> referenced by adreno_gpu.c
> >>>   gpu/drm/msm/adreno/adreno_gpu.o:(adreno_zap_shader_load) in 
> >>> archive drivers/built-in.a
>
> This can happen when CONFIG_ARCH_QCOM is disabled and we don't select
> QCOM_MDT_LOADER, but some other module selects QCOM_SCM. Ideally we'd
> use a similar dependency here to what we have for QCOM_RPROC_COMMON,
> but that causes dependency loops from other things selecting QCOM_SCM.
>
> This appears to be an endless problem, so try something different this
> time:
>
>  - CONFIG_QCOM_SCM becomes a hidden symbol that nothing 'depends on'
>but that is simply selected by all of its users
>
>  - All the stubs in include/linux/qcom_scm.h can go away
>
>  - arm-smccc.h needs to provide a stub for __arm_smccc_smc() to
>allow compile-testing QCOM_SCM on all architectures.
>
>  - To avoid a circular dependency chain involving RESET_CONTROLLER
>and PINCTRL_SUNXI, drop the 'select RESET_CONTROLLER' statement.
>According to my testing this still builds fine, and the QCOM
>platform selects this symbol already.
>
> Acked-by: Kalle Valo 
> Acked-by: Alex Elder 
> Signed-off-by: Arnd Bergmann 
> ---
> Changes in v2:
> - fix the iommu dependencies

Hey Arnd,
   Thanks again so much for working out these details. Also my
apologies, as Bjorn asked me to test this patch, but I wasn't able
to get to it before it landed.  Unfortunately I've hit an issue that
is keeping the db845c from booting with this.

> diff --git a/drivers/iommu/arm/arm-smmu/Makefile 
> b/drivers/iommu/arm/arm-smmu/Makefile
> index e240a7bcf310..b0cc01aa20c9 100644
> --- a/drivers/iommu/arm/arm-smmu/Makefile
> +++ b/drivers/iommu/arm/arm-smmu/Makefile
> @@ -1,4 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
>  obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
> -arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o arm-smmu-qcom.o
> +arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-nvidia.o
> +arm_smmu-$(CONFIG_ARM_SMMU_QCOM) += arm-smmu-qcom.o
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
> b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index 9f465e146799..2c25cce38060 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -215,7 +215,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct 
> arm_smmu_device *smmu)
> of_device_is_compatible(np, "nvidia,tegra186-smmu"))
> return nvidia_smmu_impl_init(smmu);
>
> -   smmu = qcom_smmu_impl_init(smmu);
> +   if (IS_ENABLED(CONFIG_ARM_SMMU_QCOM))
> +   smmu = qcom_smmu_impl_init(smmu);
>
> if (of_device_is_compatible(np, "marvell,ap806-smmu-500"))
> smmu->impl = &mrvl_mmu500_impl;


The problem with these two chunks is that there is currently no
CONFIG_ARM_SMMU_QCOM option. :)

Was that something you intended to add in the patch?

I'm working up a Kconfig patch to do so, so I'll send that out in a
second here, but let me know if you already have that somewhere (I
suspect you implemented it and just forgot to add the change to the
commit), as I'm sure your Kconfig help text will be better than mine.
:)
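
For reference, the stub I'm testing with locally is roughly:

config ARM_SMMU_QCOM
	def_tristate y
	depends on ARM_SMMU && ARCH_QCOM
	select QCOM_SCM
	help
	  When running on a Qualcomm platform that has the custom
	  variant of the ARM SMMU, this needs to be built into the
	  SMMU driver.

just so the Makefile/impl hunks above have something to key off of.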

Again, I'm so sorry I didn't get over to testing your patch before
seeing this here!

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Wed, Sep 29, 2021 at 4:20 PM Rob Clark  wrote:
> On Wed, Sep 29, 2021 at 2:51 PM John Stultz  wrote:
> > On Wed, Sep 29, 2021 at 2:32 PM John Stultz  wrote:
> > > On Wed, Sep 29, 2021 at 2:27 PM John Stultz  
> > > wrote:
> > > > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> > > > > The best practice to avoid those issues is to register its functions 
> > > > > only after
> > > > > all its dependencies are live. We also shouldn't wait any longer than 
> > > > > we should
> > > > > to play nice with the other components that are waiting for us, so in 
> > > > > our case
> > > > > that would mean moving the DSI device registration to the bridge 
> > > > > probe.
> > > > >
> > > > > I also had a look at all the DSI hosts, and it seems that exynos, 
> > > > > kirin and msm
> > > > > would be affected by this and wouldn't probe anymore after those 
> > > > > changes.
> > > > > Exynos and kirin seem to be simple enough for a mechanical change
> > > > > (that still
> > > > > requires testing), but the changes in msm seemed to be far more
> > > > > important
> > > > > and I wasn't comfortable doing them.
> > > >
> > > >
> > > > Hey Maxime,
> > > >   Sorry for taking so long to get to this, but now that plumbers is
> > > > over I've had a chance to check it out on kirin
> > > >
> > > > Rob Clark pointed me to his branch with some fixups here:
> > > >
> > > > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > > >
> > > > But trying to boot hikey with that, I see the following loop 
> > > > indefinitely:
> > > > [4.632132] adv7511 2-0039: supply avdd not found, using dummy 
> > > > regulator
> > > > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy 
> > > > regulator
> > > > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy 
> > > > regulator
> > > > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy 
> > > > regulator
> > > > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy 
> > > > regulator
> > > > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy 
> > > > regulator
> > > > [4.681898] adv7511 2-0039: failed to find dsi host
> > >
> > > I just realized Rob's tree is missing the kirin patch. My apologies!
> > > I'll retest and let you know.
> >
> > Ok, just retested including the kirin patch and unfortunately I'm
> > still seeing the same thing.  :(
> >
> > Will dig a bit and let you know when I find more.
>
> Did you have a chance to test it on anything using drm/msm with DSI
> panels?  That would at least confirm that I didn't miss anything in
> the drm/msm patch to swap the dsi-host vs bridge ordering..

Not a dsi panel, but for what it's worth, I did just test the series
with the db845c w/ HDMI using the lt9611 bridge.
So at least that is looking ok.

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Wed, Sep 29, 2021 at 2:51 PM John Stultz  wrote:
>
> On Wed, Sep 29, 2021 at 2:32 PM John Stultz  wrote:
> > On Wed, Sep 29, 2021 at 2:27 PM John Stultz  wrote:
> > > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> > > > The best practice to avoid those issues is to register its functions 
> > > > only after
> > > > all its dependencies are live. We also shouldn't wait any longer than 
> > > > we should
> > > > to play nice with the other components that are waiting for us, so in 
> > > > our case
> > > > that would mean moving the DSI device registration to the bridge probe.
> > > >
> > > > I also had a look at all the DSI hosts, and it seems that exynos, kirin 
> > > > and msm
> > > > would be affected by this and wouldn't probe anymore after those 
> > > > changes.
> > > > Exynos and kirin seem to be simple enough for a mechanical change
> > > > (that still
> > > > requires testing), but the changes in msm seemed to be far more
> > > > important
> > > > and I wasn't comfortable doing them.
> > >
> > >
> > > Hey Maxime,
> > >   Sorry for taking so long to get to this, but now that plumbers is
> > > over I've had a chance to check it out on kirin
> > >
> > > Rob Clark pointed me to his branch with some fixups here:
> > >
> > > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > >
> > > But trying to boot hikey with that, I see the following loop indefinitely:
> > > [4.632132] adv7511 2-0039: supply avdd not found, using dummy 
> > > regulator
> > > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy 
> > > regulator
> > > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy 
> > > regulator
> > > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy 
> > > regulator
> > > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy 
> > > regulator
> > > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy 
> > > regulator
> > > [4.681898] adv7511 2-0039: failed to find dsi host
> >
> > I just realized Rob's tree is missing the kirin patch. My apologies!
> > I'll retest and let you know.
>
> Ok, just retested including the kirin patch and unfortunately I'm
> still seeing the same thing.  :(
>
> Will dig a bit and let you know when I find more.

Hey Maxime!
  I chased down the issue. The dsi probe code was still calling
drm_of_find_panel_or_bridge() and requiring it to succeed in order to
probe.

I've moved the logic that looks for the bridge into the bridge_init
and with that it seems to work.
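
For anyone following along, the shape of the fix is roughly the
following (a minimal sketch with hypothetical names and placeholder
port/endpoint numbers; the actual change is in the link below):

#include <drm/drm_bridge.h>
#include <drm/drm_of.h>

/* Sketch only: do the bridge lookup at bridge-init time instead of
 * in the dsi probe path, so probe no longer fails when the bridge
 * hasn't bound yet. "struct kirin_dsi" and the port/endpoint values
 * here are placeholders, not the real driver types.
 */
static int kirin_dsi_bridge_init(struct kirin_dsi *dsi)
{
	struct drm_bridge *bridge;
	int ret;

	ret = drm_of_find_panel_or_bridge(dsi->dev->of_node, 1, 0,
					  NULL, &bridge);
	if (ret)
		return ret;

	return drm_bridge_attach(dsi->encoder, bridge, NULL, 0);
}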

Feel free (assuming it looks ok) to fold this change into your kirin patch:
  
https://git.linaro.org/people/john.stultz/android-dev.git/commit/?id=4a35ccc4d7a53f68d6d93da3b47e232a7c75b91d

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Wed, Sep 29, 2021 at 4:20 PM Rob Clark  wrote:
> On Wed, Sep 29, 2021 at 2:51 PM John Stultz  wrote:
> > On Wed, Sep 29, 2021 at 2:32 PM John Stultz  wrote:
> > > On Wed, Sep 29, 2021 at 2:27 PM John Stultz  
> > > wrote:
> > > > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> > > > > The best practice to avoid those issues is to register its functions 
> > > > > only after
> > > > > all its dependencies are live. We also shouldn't wait any longer than 
> > > > > we should
> > > > > to play nice with the other components that are waiting for us, so in 
> > > > > our case
> > > > > that would mean moving the DSI device registration to the bridge 
> > > > > probe.
> > > > >
> > > > > I also had a look at all the DSI hosts, and it seems that exynos, 
> > > > > kirin and msm
> > > > > would be affected by this and wouldn't probe anymore after those 
> > > > > changes.
> > > > > Exynos and kirin seems to be simple enough for a mechanical change 
> > > > > (that still
> > > > > requires to be tested), but the changes in msm seemed to be far more 
> > > > > important
> > > > > and I wasn't confortable doing them.
> > > >
> > > >
> > > > Hey Maxime,
> > > >   Sorry for taking so long to get to this, but now that plumbers is
> > > > over I've had a chance to check it out on kirin
> > > >
> > > > Rob Clark pointed me to his branch with some fixups here:
> > > >
> > > > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> > > >
> > > > But trying to boot hikey with that, I see the following loop 
> > > > indefinitely:
> > > > [4.632132] adv7511 2-0039: supply avdd not found, using dummy 
> > > > regulator
> > > > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy 
> > > > regulator
> > > > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy 
> > > > regulator
> > > > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy 
> > > > regulator
> > > > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy 
> > > > regulator
> > > > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy 
> > > > regulator
> > > > [4.681898] adv7511 2-0039: failed to find dsi host
> > >
> > > I just realized Rob's tree is missing the kirin patch. My apologies!
> > > I'll retest and let you know.
> >
> > Ok, just retested including the kirin patch and unfortunately I'm
> > still seeing the same thing.  :(
> >
> > Will dig a bit and let you know when I find more.
>
> Did you have a chance to test it on anything using drm/msm with DSI
> panels?  That would at least confirm that I didn't miss anything in
> the drm/msm patch to swap the dsi-host vs bridge ordering..

I believe Amit (cc'ed) tried to give it a run on his pocof1, but had
some troubles getting it working against a kernel that wasn't
suffering other regressions at the moment.

Amit/Caleb: Any chance one of you could try again w/ these merged to 5.15-rc3?

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Wed, Sep 29, 2021 at 2:32 PM John Stultz  wrote:
> On Wed, Sep 29, 2021 at 2:27 PM John Stultz  wrote:
> > On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> > > The best practice to avoid those issues is to register its functions only 
> > > after
> > > all its dependencies are live. We also shouldn't wait any longer than we 
> > > should
> > > to play nice with the other components that are waiting for us, so in our 
> > > case
> > > that would mean moving the DSI device registration to the bridge probe.
> > >
> > > I also had a look at all the DSI hosts, and it seems that exynos, kirin 
> > > and msm
> > > would be affected by this and wouldn't probe anymore after those changes.
> > > Exynos and kirin seems to be simple enough for a mechanical change (that 
> > > still
> > > requires to be tested), but the changes in msm seemed to be far more 
> > > important
> > > and I wasn't confortable doing them.
> >
> >
> > Hey Maxime,
> >   Sorry for taking so long to get to this, but now that plumbers is
> > over I've had a chance to check it out on kirin
> >
> > Rob Clark pointed me to his branch with some fixups here:
> >
> > https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
> >
> > But trying to boot hikey with that, I see the following loop indefinitely:
> > [4.632132] adv7511 2-0039: supply avdd not found, using dummy regulator
> > [4.638961] adv7511 2-0039: supply dvdd not found, using dummy regulator
> > [4.645741] adv7511 2-0039: supply pvdd not found, using dummy regulator
> > [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy regulator
> > [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy regulator
> > [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy regulator
> > [4.681898] adv7511 2-0039: failed to find dsi host
>
> I just realized Rob's tree is missing the kirin patch. My apologies!
> I'll retest and let you know.

Ok, just retested including the kirin patch and unfortunately I'm
still seeing the same thing.  :(

Will dig a bit and let you know when I find more.

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Wed, Sep 29, 2021 at 2:27 PM John Stultz  wrote:
>
> On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
> >
> > We've encountered an issue with the RaspberryPi DSI panel that prevented the
> > whole display driver from probing.
> >
> > The issue is described in detail in the commit 7213246a803f ("drm/vc4: dsi:
> > Only register our component once a DSI device is attached"), but the basic 
> > idea
> > is that since the panel is probed through i2c, there's no synchronization
> > between its probe and the registration of the MIPI-DSI host it's attached 
> > to.
> >
> > We initially moved the component framework registration to the MIPI-DSI Host
> > attach hook to make sure we register our component only when we have a DSI
> > device attached to our MIPI-DSI host, and then use lookup our DSI device in 
> > our
> > bind hook.
> >
> > However, all the DSI bridges controlled through i2c are only registering 
> > their
> > associated DSI device in their bridge attach hook, meaning with our change
> > above, we never got that far, and therefore ended up in the same situation 
> > than
> > the one we were trying to fix for panels.
> >
> > The best practice to avoid those issues is to register its functions only 
> > after
> > all its dependencies are live. We also shouldn't wait any longer than we 
> > should
> > to play nice with the other components that are waiting for us, so in our 
> > case
> > that would mean moving the DSI device registration to the bridge probe.
> >
> > I also had a look at all the DSI hosts, and it seems that exynos, kirin and 
> > msm
> > would be affected by this and wouldn't probe anymore after those changes.
> > Exynos and kirin seems to be simple enough for a mechanical change (that 
> > still
> > requires to be tested), but the changes in msm seemed to be far more 
> > important
> > and I wasn't confortable doing them.
>
>
> Hey Maxime,
>   Sorry for taking so long to get to this, but now that plumbers is
> over I've had a chance to check it out on kirin
>
> Rob Clark pointed me to his branch with some fixups here:
>
> https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework
>
> But trying to boot hikey with that, I see the following loop indefinitely:
> [4.632132] adv7511 2-0039: supply avdd not found, using dummy regulator
> [4.638961] adv7511 2-0039: supply dvdd not found, using dummy regulator
> [4.645741] adv7511 2-0039: supply pvdd not found, using dummy regulator
> [4.652483] adv7511 2-0039: supply a2vdd not found, using dummy regulator
> [4.659342] adv7511 2-0039: supply v3p3 not found, using dummy regulator
> [4.666086] adv7511 2-0039: supply v1p2 not found, using dummy regulator
> [4.681898] adv7511 2-0039: failed to find dsi host

I just realized Rob's tree is missing the kirin patch. My apologies!
I'll retest and let you know.

thanks
-john


Re: [PATCH v4 00/24] drm/bridge: Make panel and bridge probe order consistent

2021-09-29 Thread John Stultz
On Fri, Sep 10, 2021 at 3:12 AM Maxime Ripard  wrote:
>
> We've encountered an issue with the RaspberryPi DSI panel that prevented the
> whole display driver from probing.
>
> The issue is described in detail in the commit 7213246a803f ("drm/vc4: dsi:
> Only register our component once a DSI device is attached"), but the basic 
> idea
> is that since the panel is probed through i2c, there's no synchronization
> between its probe and the registration of the MIPI-DSI host it's attached to.
>
> We initially moved the component framework registration to the MIPI-DSI Host
> attach hook to make sure we register our component only when we have a DSI
> device attached to our MIPI-DSI host, and then use lookup our DSI device in 
> our
> bind hook.
>
> However, all the DSI bridges controlled through i2c are only registering their
> associated DSI device in their bridge attach hook, meaning with our change
> above, we never got that far, and therefore ended up in the same situation 
> than
> the one we were trying to fix for panels.
>
> The best practice to avoid those issues is to register its functions only 
> after
> all its dependencies are live. We also shouldn't wait any longer than we 
> should
> to play nice with the other components that are waiting for us, so in our case
> that would mean moving the DSI device registration to the bridge probe.
>
> I also had a look at all the DSI hosts, and it seems that exynos, kirin and 
> msm
> would be affected by this and wouldn't probe anymore after those changes.
> Exynos and kirin seems to be simple enough for a mechanical change (that still
> requires to be tested), but the changes in msm seemed to be far more important
> and I wasn't confortable doing them.


Hey Maxime,
  Sorry for taking so long to get to this, but now that plumbers is
over I've had a chance to check it out on kirin

Rob Clark pointed me to his branch with some fixups here:
   
https://gitlab.freedesktop.org/robclark/msm/-/commits/for-mripard/bridge-rework

But trying to boot hikey with that, I see the following loop indefinitely:
[4.632132] adv7511 2-0039: supply avdd not found, using dummy regulator
[4.638961] adv7511 2-0039: supply dvdd not found, using dummy regulator
[4.645741] adv7511 2-0039: supply pvdd not found, using dummy regulator
[4.652483] adv7511 2-0039: supply a2vdd not found, using dummy regulator
[4.659342] adv7511 2-0039: supply v3p3 not found, using dummy regulator
[4.666086] adv7511 2-0039: supply v1p2 not found, using dummy regulator
[4.681898] adv7511 2-0039: failed to find dsi host
[4.688836] adv7511 2-0039: supply avdd not found, using dummy regulator
[4.695724] adv7511 2-0039: supply dvdd not found, using dummy regulator
[4.702583] adv7511 2-0039: supply pvdd not found, using dummy regulator
[4.709369] adv7511 2-0039: supply a2vdd not found, using dummy regulator
[4.716232] adv7511 2-0039: supply v3p3 not found, using dummy regulator
[4.722972] adv7511 2-0039: supply v1p2 not found, using dummy regulator
[4.738720] adv7511 2-0039: failed to find dsi host

I'll have to dig a bit to figure out what's going wrong, but wanted to
give you the heads up that there seems to be a problem.

thanks
-john


[PATCH] dma-buf: system_heap: Avoid warning on mid-order allocations

2021-09-08 Thread John Stultz
When trying to do mid-order allocations, set __GFP_NOWARN to
avoid warning messages if the allocation fails, as we will
still fall back to single page allocations in that case.
This is similar to what we already do for large order
allocations.
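
For context, the fallback this refers to is the existing allocation
loop in system_heap.c, which (paraphrased) looks like:

/* Paraphrase of the existing system_heap.c fallback: if a mid-order
 * (64K) allocation fails, we silently retry with the next smaller
 * order, ending at order 0 (4K) which may still warn and reclaim.
 */
static struct page *alloc_largest_available(unsigned long size,
					    unsigned int max_order)
{
	struct page *page;
	int i;

	for (i = 0; i < NUM_ORDERS; i++) {
		if (size < (PAGE_SIZE << orders[i]))
			continue;
		if (max_order < orders[i])
			continue;

		page = alloc_pages(order_flags[i], orders[i]);
		if (!page)
			continue; /* fall back to the next smaller order */
		return page;
	}
	return NULL;
}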

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Simon Ser 
Cc: James Jones 
Cc: Leo Yan 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/system_heap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index 23a7e74ef966..f57a39ddd063 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -40,11 +40,12 @@ struct dma_heap_attachment {
bool mapped;
 };
 
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN)
 #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
| __GFP_NORETRY) & ~__GFP_RECLAIM) \
| __GFP_COMP)
-#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
-static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP};
 /*
  * The selection of the orders used for allocation (1MB, 64K, 4K) is designed
  * to match with the sizes often found in IOMMUs. Using order 4 pages instead
-- 
2.25.1



Re: [PATCH] drm/msm: Disable frequency clamping on a630

2021-09-03 Thread John Stultz
On Thu, Jul 29, 2021 at 1:49 PM Rob Clark  wrote:
> On Thu, Jul 29, 2021 at 1:28 PM Caleb Connolly
>  wrote:
> > On 29/07/2021 21:24, Rob Clark wrote:
> > > On Thu, Jul 29, 2021 at 1:06 PM Caleb Connolly
> > >  wrote:
> > >>
> > >> Hi Rob,
> > >>
> > >> I've done some more testing! It looks like before that patch ("drm/msm: 
> > >> Devfreq tuning") the GPU would never get above
> > >> the second frequency in the OPP table (342MHz) (at least, not in 
> > >> glxgears). With the patch applied it would more
> > >> aggressively jump up to the max frequency which seems to be unstable at 
> > >> the default regulator voltages.
> > >
> > > *ohh*, yeah, ok, that would explain it
> > >
> > >> Hacking the pm8005 s1 regulator (which provides VDD_GFX) up to 0.988v 
> > >> (instead of the stock 0.516v) makes the GPU stable
> > >> at the higher frequencies.
> > >>
> > >> Applying this patch reverts the behaviour, and the GPU never goes above 
> > >> 342MHz in glxgears, losing ~30% performance in
> > >> glxgear.
> > >>
> > >> I think (?) that enabling CPR support would be the proper solution to 
> > >> this - that would ensure that the regulators run
> > >> at the voltage the hardware needs to be stable.
> > >>
> > >> Is hacking the voltage higher (although ideally not quite that high) an 
> > >> acceptable short term solution until we have
> > >> CPR? Or would it be safer to just not make use of the higher frequencies 
> > >> on a630 for now?
> > >>
> > >
> > > tbh, I'm not sure about the regulator stuff and CPR.. Bjorn is already
> > > on CC and I added sboyd, maybe one of them knows better.
> > >
> > > In the short term, removing the higher problematic OPPs from dts might
> > > be a better option than this patch (which I'm dropping), since there
> > > is nothing stopping other workloads from hitting higher OPPs.
> > Oh yeah that sounds like a more sensible workaround than mine .
> > >
> > > I'm slightly curious why I didn't have problems at higher OPPs on my
> > > c630 laptop (sdm850)
> > Perhaps you won the sillicon lottery - iirc sdm850 is binned for higher 
> > clocks as is out of the factory.
> >
> > Would it be best to drop the OPPs for all devices? Or just those affected? 
> > I guess it's possible another c630 might
> > crash where yours doesn't?
>
> I've not heard any reports of similar issues from the handful of other
> folks with c630's on #aarch64-laptops.. but I can't really say if that
> is luck or not.
>
> Maybe just remove it for affected devices?  But I'll defer to Bjorn.

Just as another datapoint: I was marveling at how suddenly smooth
the UI was performing on db845c, and Caleb pointed me at the "drm/msm:
Devfreq tuning" patch as the likely cause of the improvement. Then,
mid-discussion, my board crashed into USB crash mode:
[  146.157696][C0] adreno 500.gpu: CP | AHB bus error
[  146.163303][C0] adreno 500.gpu: CP | AHB bus error
[  146.168837][C0] adreno 500.gpu: RBBM | ATB bus overflow
[  146.174960][C0] adreno 500.gpu: CP | HW fault | status=0x
[  146.181917][C0] adreno 500.gpu: CP | AHB bus error
[  146.187547][C0] adreno 500.gpu: CP illegal instruction error
[  146.194009][C0] adreno 500.gpu: CP | AHB bus error
[  146.308909][T9] Internal error: synchronous external abort:
9610 [#1] PREEMPT SMP
[  146.317150][T9] Modules linked in:
[  146.320941][T9] CPU: 3 PID: 9 Comm: kworker/u16:1 Tainted: G
W 5.14.0-mainline-06795-g42b258c2275c #24
[  146.331974][T9] Hardware name: Thundercomm Dragonboar
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.2.0-00371-SDM845LZB-1
S - IMAGE_VARIANT_STRING=SDM845LA
S - OEM_IMAGE_VERSION_STRING=TSBJ-FA-PC-02170

So Caleb sent me to this thread. :)

I'm still trying to trip it again, but it does seem like db845c is
also seeing some stability issues with Linus' HEAD.

thanks
-john


[RFC][PATCH] dma-buf: system_heap: Avoid warning on mid-order allocations

2021-08-25 Thread John Stultz
When trying to do mid-order allocations, set __GFP_NOWARN to
avoid warning messages if the allocation fails, as we will
still fall back to single page allocations in that case.
This is similar to what we already do for large order
allocations.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: Leo Yan 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/system_heap.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index 23a7e74ef966..f57a39ddd063 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -40,11 +40,12 @@ struct dma_heap_attachment {
bool mapped;
 };
 
+#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
+#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN)
 #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
| __GFP_NORETRY) & ~__GFP_RECLAIM) \
| __GFP_COMP)
-#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
-static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
+static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP};
 /*
  * The selection of the orders used for allocation (1MB, 64K, 4K) is designed
  * to match with the sizes often found in IOMMUs. Using order 4 pages instead
-- 
2.25.1



Re: [PATCH] dma-buf: heaps: Set allocation limit for system heap

2021-08-02 Thread John Stultz
On Thu, Jul 22, 2021 at 12:07 PM Hridya Valsaraju  wrote:
> This patch limits the size of total memory that can be requested in a
> single allocation from the system heap. This would prevent a
> buggy/malicious client from depleting system memory by requesting for an
> extremely large allocation which might destabilize the system.
>
> The limit is set to half the size of the device's total RAM which is the
> same as what was set by the deprecated ION system heap.
>
> Signed-off-by: Hridya Valsaraju 

Seems sane to me, unless folks have better suggestions for allocation limits.

Reviewed-by: John Stultz 

thanks
-john


Re: [PATCH 3/3] dma-buf: nuke SW_SYNC debugfs files

2021-07-29 Thread John Stultz
On Thu, Jul 29, 2021 at 12:24 AM Daniel Vetter  wrote:
>
> On Thu, Jul 29, 2021 at 09:03:30AM +0200, Christian König wrote:
> > As we now knew controlling dma_fence synchronization from userspace is
> > extremely dangerous and can not only deadlock drivers but trivially also the
> > whole kernel memory management.
> >
> > Entirely remove this option. We now have in kernel unit tests to exercise 
> > the
> > dma_fence framework and it's containers.
> >
> > Signed-off-by: Christian König 
>
> There's also igts for this, and Android heavily uses this. So I'm not sure
> we can just nuke it.

Eeeeh... I don't think that's actually the case anymore. As of
android12-5.10 CONFIG_SW_SYNC is not turned on.
Further, Android is disabling debugfs in their kernels as it exposes
too much to userland.

That said, there still are some references to it:
  
https://cs.android.com/android/platform/superproject/+/master:system/core/libsync/sync.c;l=416

But it looks like the actual users are only kselftest and igt?

Adding Alistair, Hridya and Sandeep in case they have more context.

thanks
-john


Re: [PATCH] drm/msm: Fix display fault handling

2021-07-26 Thread John Stultz
On Wed, Jul 7, 2021 at 10:57 AM Rob Clark  wrote:
>
> From: Rob Clark 
>
> It turns out that when the display is enabled by the bootloader, we can
> get some transient iommu faults from the display.  Which doesn't go over
> too well when we install a fault handler that is gpu specific.  To avoid
> this, defer installing the fault handler until we get around to setting
> up per-process pgtables (which is adreno_smmu specific).  The arm-smmu
> fallback error reporting is sufficient for reporting display related
> faults (and in fact was all we had prior to f8f934c180f629bb927a04fd90d)
>
> Reported-by: Dmitry Baryshkov 
> Reported-by: Yassine Oudjana 
> Fixes: 2a574cc05d38 ("drm/msm: Improve the a6xx page fault handler")
> Signed-off-by: Rob Clark 
> Tested-by: John Stultz 
> ---

Hey folks!
  Just wanted to follow up on this, as it's still missing from
5.14-rc3 and is critical for resolving a boot regression on db845c. Is
there anything keeping this from heading upstream?

thanks
-john


Re: page pools, was Re: [PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-07-07 Thread John Stultz
On Wed, Jul 7, 2021 at 12:15 AM Christoph Hellwig  wrote:
>
> On Wed, Jul 07, 2021 at 09:10:26AM +0200, Christian König wrote:
> > Well, the original code all this is based on already had the comment that
> > this really belong into the page allocator.
> >
> > The key point is traditionally only GPUs used uncached and write-combined
> > memory in such large quantities that having a pool for them makes sense.
> >
> > Because of this we had this separately to reduce the complexity in the page
> > allocator to handle another set of complexity of allocation types.
> >
> > For the upside, for those use cases it means huge performance improvements
> > for those drivers. See the numbers John provided in the cover letter.
> >
> > But essentially at least I would be totally fine moving this into the page
> > allocator, but moving it outside of TTM already helps with this goal. So
> > this patch set is certainly a step into the right direction.
>
> Unless I'm badly misreading the patch and this series there is nothing
> about cache attributes in this code.  It just allocates pages, zeroes
> them, eventually hands them out to a consumer and registers a shrinker
> for its freelist.
>
> If OTOH it actually dealt with cachability that should be documented
> in the commit log and probably also the naming of the implementation.

So the cache attributes are still managed in the ttm_pool code. This
series just pulls the pool/shrinker management out into common code so
we can use it in the dmabuf heap code without duplicating things.

thanks
-john


Re: page pools, was Re: [PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-07-07 Thread John Stultz
On Tue, Jul 6, 2021 at 11:38 PM Christoph Hellwig  wrote:
> On Wed, Jun 30, 2021 at 01:34:17AM +0000, John Stultz wrote:
> > This adds a shrinker controlled page pool, extracted
> > out of the ttm_pool logic, and abstracted out a bit
> > so it can be used by other non-ttm drivers.
>
> Can you explain in detail why you need a differnt page pool over the one
> maintained by the page allocator?  Fragmenting the memory into all kinds
> of pools has lots of downsides, so the upsides need to be explained in
> detail.

So, as Christian mentioned, on the TTM side it's useful, as they are
trying to avoid TLB flushes when changing caching attributes.

For the dmabuf system heap purposes, the main benefit is moving the
page zeroing to the free path, rather than the allocation path. This
on its own doesn't save much, but allows us to defer frees (and thus
the zeroing) to the background, which can get that work out of the hot
path.
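
Concretely, the zeroing happens when a page is given back to the pool
(mirroring what ttm_pool_type_give() does today); a rough sketch, with
the pool's lock/list field names assumed:

/* Sketch: clear pages on the free path, so a later allocation that
 * hits the pool can skip __GFP_ZERO work entirely. The "lock" and
 * "pages" fields are assumed names for the pool internals.
 */
static void page_pool_give(struct drm_page_pool *pool, struct page *p)
{
	unsigned int i, num_pages = 1 << pool->order;

	for (i = 0; i < num_pages; i++) {
		if (PageHighMem(p))
			clear_highpage(p + i);
		else
			clear_page(page_address(p + i));
	}

	spin_lock(&pool->lock);
	list_add(&p->lru, &pool->pages);
	spin_unlock(&pool->lock);
}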

thanks
-john


Re: [PATCH v5 0/5] iommu/arm-smmu: adreno-smmu page fault handling

2021-07-06 Thread John Stultz
On Sun, Jul 4, 2021 at 11:16 AM Rob Clark  wrote:
>
> I suspect you are getting a dpu fault, and need:
>
> https://lore.kernel.org/linux-arm-msm/CAF6AEGvTjTUQXqom-xhdh456tdLscbVFPQ+iud1H1gHc8A2=h...@mail.gmail.com/
>
> I suppose Bjorn was expecting me to send that patch

If it's helpful, I applied that and it got the db845c booting mainline
again for me (along with some reverts for a separate ext4 shrinker
crash).
Tested-by: John Stultz 

thanks
-john


Re: [PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-07-06 Thread John Stultz
On Tue, Jul 6, 2021 at 2:15 PM Daniel Vetter  wrote:
>
> On Tue, Jul 6, 2021 at 11:04 PM John Stultz  wrote:
> > On Wed, Jun 30, 2021 at 11:52 PM Christian König
> >  wrote:
> > >
> > > Am 01.07.21 um 00:24 schrieb John Stultz:
> > > > On Wed, Jun 30, 2021 at 2:10 AM Christian König
> > > >  wrote:
> > > >> Am 30.06.21 um 03:34 schrieb John Stultz:
> > > >>> +static unsigned long page_pool_size; /* max size of the pool */
> > > >>> +
> > > >>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page 
> > > >>> pool");
> > > >>> +module_param(page_pool_size, ulong, 0644);
> > > >>> +
> > > >>> +static atomic_long_t nr_managed_pages;
> > > >>> +
> > > >>> +static struct mutex shrinker_lock;
> > > >>> +static struct list_head shrinker_list;
> > > >>> +static struct shrinker mm_shrinker;
> > > >>> +
> > > >>> +/**
> > > >>> + * drm_page_pool_set_max - Sets maximum size of all pools
> > > >>> + *
> > > >>> + * Sets the maximum number of pages allows in all pools.
> > > >>> + * This can only be set once, and the first caller wins.
> > > >>> + */
> > > >>> +void drm_page_pool_set_max(unsigned long max)
> > > >>> +{
> > > >>> + if (!page_pool_size)
> > > >>> + page_pool_size = max;
> > > >>> +}
> > > >>> +
> > > >>> +/**
> > > >>> + * drm_page_pool_get_max - Maximum size of all pools
> > > >>> + *
> > > >>> + * Return the maximum number of pages allows in all pools
> > > >>> + */
> > > >>> +unsigned long drm_page_pool_get_max(void)
> > > >>> +{
> > > >>> + return page_pool_size;
> > > >>> +}
> > > >> Well in general I don't think it is a good idea to have getters/setters
> > > >> for one line functionality, similar applies to locking/unlocking the
> > > >> mutex below.
> > > >>
> > > >> Then in this specific case what those functions do is to aid
> > > >> initializing the general pool manager and that in turn should 
> > > >> absolutely
> > > >> not be exposed.
> > > >>
> > > >> The TTM pool manager exposes this as function because initializing the
> > > >> pool manager is done in one part of the module and calculating the
> > > >> default value for the pages in another one. But that is not something I
> > > >> would like to see here.
> > > > So, I guess I'm not quite clear on what you'd like to see...
> > > >
> > > > Part of what I'm balancing here is the TTM subsystem normally sets a
> > > > global max size, whereas the old ION pool didn't have caps (instead
> > > > just relying on the shrinker when needed).
> > > > So I'm trying to come up with a solution that can serve both uses. So
> > > > I've got this drm_page_pool_set_max() function to optionally set the
> > > > maximum value, which is called in the TTM initialization path or set
> > > > the boot argument. But for systems that use the dmabuf system heap,
> > > > but don't use TTM, no global limit is enforced.
> > >
> > > Yeah, exactly that's what I'm trying to prevent.
> > >
> > > See if we have the same functionality used by different use cases we
> > > should not have different behavior depending on what drivers are loaded.
> > >
> > > Is it a problem if we restrict the ION pool to 50% of system memory as
> > > well? If yes than I would rather drop the limit from TTM and only rely
> > > on the shrinker there as well.
> >
> > Would having the default value as a config option (still overridable
> > via boot argument) be an acceptable solution?
>
> We're also trying to get ttm over to the shrinker model, and a first
> cut of that even landed, but didn't really work out yet. So maybe just
> aiming for the shrinker? I do agree this should be consistent across
> the board, otherwise we're just sharing code but not actually sharing
> functionality, which is a recipe for disaster because one side will
> end up breaking the other side's use-case.

Fair enough, maybe it would be best to remove the default limit, but
leave the logic so it can still be set via the boot argument?

thanks
-john


Re: [PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-07-06 Thread John Stultz
On Wed, Jun 30, 2021 at 11:52 PM Christian König
 wrote:
>
> Am 01.07.21 um 00:24 schrieb John Stultz:
> > On Wed, Jun 30, 2021 at 2:10 AM Christian König
> >  wrote:
> >> Am 30.06.21 um 03:34 schrieb John Stultz:
> >>> +static unsigned long page_pool_size; /* max size of the pool */
> >>> +
> >>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> >>> +module_param(page_pool_size, ulong, 0644);
> >>> +
> >>> +static atomic_long_t nr_managed_pages;
> >>> +
> >>> +static struct mutex shrinker_lock;
> >>> +static struct list_head shrinker_list;
> >>> +static struct shrinker mm_shrinker;
> >>> +
> >>> +/**
> >>> + * drm_page_pool_set_max - Sets maximum size of all pools
> >>> + *
> >>> + * Sets the maximum number of pages allows in all pools.
> >>> + * This can only be set once, and the first caller wins.
> >>> + */
> >>> +void drm_page_pool_set_max(unsigned long max)
> >>> +{
> >>> + if (!page_pool_size)
> >>> + page_pool_size = max;
> >>> +}
> >>> +
> >>> +/**
> >>> + * drm_page_pool_get_max - Maximum size of all pools
> >>> + *
> >>> + * Return the maximum number of pages allows in all pools
> >>> + */
> >>> +unsigned long drm_page_pool_get_max(void)
> >>> +{
> >>> + return page_pool_size;
> >>> +}
> >> Well in general I don't think it is a good idea to have getters/setters
> >> for one line functionality, similar applies to locking/unlocking the
> >> mutex below.
> >>
> >> Then in this specific case what those functions do is to aid
> >> initializing the general pool manager and that in turn should absolutely
> >> not be exposed.
> >>
> >> The TTM pool manager exposes this as function because initializing the
> >> pool manager is done in one part of the module and calculating the
> >> default value for the pages in another one. But that is not something I
> >> would like to see here.
> > So, I guess I'm not quite clear on what you'd like to see...
> >
> > Part of what I'm balancing here is the TTM subsystem normally sets a
> > global max size, whereas the old ION pool didn't have caps (instead
> > just relying on the shrinker when needed).
> > So I'm trying to come up with a solution that can serve both uses. So
> > I've got this drm_page_pool_set_max() function to optionally set the
> > maximum value, which is called in the TTM initialization path or set
> > the boot argument. But for systems that use the dmabuf system heap,
> > but don't use TTM, no global limit is enforced.
>
> Yeah, exactly that's what I'm trying to prevent.
>
> See if we have the same functionality used by different use cases we
> should not have different behavior depending on what drivers are loaded.
>
> Is it a problem if we restrict the ION pool to 50% of system memory as
> well? If yes than I would rather drop the limit from TTM and only rely
> on the shrinker there as well.

Would having the default value as a config option (still overridable
via boot argument) be an acceptable solution?

Thanks again for the feedback!

thanks
-john


Re: [PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-06-30 Thread John Stultz
On Wed, Jun 30, 2021 at 2:10 AM Christian König
 wrote:
> Am 30.06.21 um 03:34 schrieb John Stultz:
> > +static unsigned long page_pool_size; /* max size of the pool */
> > +
> > +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> > +module_param(page_pool_size, ulong, 0644);
> > +
> > +static atomic_long_t nr_managed_pages;
> > +
> > +static struct mutex shrinker_lock;
> > +static struct list_head shrinker_list;
> > +static struct shrinker mm_shrinker;
> > +
> > +/**
> > + * drm_page_pool_set_max - Sets maximum size of all pools
> > + *
> > + * Sets the maximum number of pages allows in all pools.
> > + * This can only be set once, and the first caller wins.
> > + */
> > +void drm_page_pool_set_max(unsigned long max)
> > +{
> > + if (!page_pool_size)
> > + page_pool_size = max;
> > +}
> > +
> > +/**
> > + * drm_page_pool_get_max - Maximum size of all pools
> > + *
> > + * Return the maximum number of pages allows in all pools
> > + */
> > +unsigned long drm_page_pool_get_max(void)
> > +{
> > + return page_pool_size;
> > +}
>
> Well in general I don't think it is a good idea to have getters/setters
> for one line functionality, similar applies to locking/unlocking the
> mutex below.
>
> Then in this specific case what those functions do is to aid
> initializing the general pool manager and that in turn should absolutely
> not be exposed.
>
> The TTM pool manager exposes this as function because initializing the
> pool manager is done in one part of the module and calculating the
> default value for the pages in another one. But that is not something I
> would like to see here.

So, I guess I'm not quite clear on what you'd like to see...

Part of what I'm balancing here is the TTM subsystem normally sets a
global max size, whereas the old ION pool didn't have caps (instead
just relying on the shrinker when needed).
So I'm trying to come up with a solution that can serve both uses. So
I've got this drm_page_pool_set_max() function to optionally set the
maximum value, which is called in the TTM initialization path, or set
via the boot argument. But for systems that use the dmabuf system heap,
but don't use TTM, no global limit is enforced.

Your earlier suggestion to have it as an argument to the
drm_page_pool_shrinker_init() didn't make much sense to me, as then we
may have multiple subsystems trying to initialize the pool shrinker,
which doesn't seem ideal. So I'm not sure what would be preferred.

I guess we could have subsystems allocate their own pool managers with
their own shrinkers and per-manager-size-limits? But that also feels
like a fair increase in complexity, for I'm not sure how much benefit.

> > +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)
> > +{
> > + unsigned int i, num_pages = 1 << pool->order;
> > +
> > + /* Make sure we won't grow larger then the max pool size */
> > + if (page_pool_size &&
> > +((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
> > + pool->free(pool, p);
> > + return;
> > + }
>
> That is not a good idea. See how ttm_pool_free() does this.
>
> First the page is given back to the pool, then all pools are shrinked if
> they are above the global limit.

Ok, initially it seemed like wasteful overhead to add the page to the
pool and then immediately shrink the pool, so just freeing the given
page directly when the pool was full seemed more straightforward.
But on consideration here I do realize having most-recently-used hot
pages in the pool would be good for performance, so I'll switch that
back. Thanks for pointing this out!
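
For reference, the ordering being switched back to (per
ttm_pool_free()) is give-first, then shrink the global set back under
the limit; a sketch, where drm_page_pool_shrink_one() is a
hypothetical helper standing in for the shrinker walk:

/* Sketch: pool the still-hot page first, then shrink all pools back
 * under the global page_pool_size cap if we went over it.
 */
void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)
{
	page_pool_give(pool, p); /* zero the page and list it */

	while (page_pool_size &&
	       drm_page_pool_get_total() > page_pool_size)
		drm_page_pool_shrink_one();
}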

Thanks again so much for the review and feedback!
-john


[PATCH v9 5/5] dma-buf: system_heap: Add deferred freeing to the system heap

2021-06-29 Thread John Stultz
Utilize the deferred free helper library in the system heap.

This provides a nice performance bump and puts the
system heap performance on par with ION.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Rework deferred-free api to use reason enum as suggested by
  Suren Baghdasaryan
* Rework for deferred-free api change to use nr_pages rather
  than size as suggested by Suren Baghdasaryan
v8:
* Reworked to drop buffer zeroing logic, as the drm_page_pool now
  handles that.
---
 drivers/dma-buf/heaps/Kconfig   |  1 +
 drivers/dma-buf/heaps/system_heap.c | 28 ++--
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index 7e28934e0def..10632ccfb4a5 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -5,6 +5,7 @@ config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
select DRM_PAGE_POOL
+   select DMABUF_HEAPS_DEFERRED_FREE
help
  Choose this option to enable the system dmabuf heap. The system heap
  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index 85ceca2ed61d..8a0170b0427e 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -22,6 +22,7 @@
 #include 
 
 #include 
+#include "deferred-free-helper.h"
 
 static struct dma_heap *sys_heap;
 
@@ -33,6 +34,7 @@ struct system_heap_buffer {
struct sg_table sg_table;
int vmap_cnt;
void *vaddr;
+   struct deferred_freelist_item deferred_free;
 };
 
 struct dma_heap_attachment {
@@ -290,27 +292,41 @@ static void system_heap_free_pages(struct drm_page_pool 
*pool, struct page *p)
__free_pages(p, pool->order);
 }
 
-static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+static void system_heap_buf_free(struct deferred_freelist_item *item,
+enum df_reason reason)
 {
-   struct system_heap_buffer *buffer = dmabuf->priv;
+   struct system_heap_buffer *buffer;
struct sg_table *table;
struct scatterlist *sg;
int i, j;
 
+   buffer = container_of(item, struct system_heap_buffer, deferred_free);
table = &buffer->sg_table;
for_each_sg(table->sgl, sg, table->nents, i) {
struct page *page = sg_page(sg);
 
-   for (j = 0; j < NUM_ORDERS; j++) {
-   if (compound_order(page) == orders[j])
-   break;
+   if (reason == DF_UNDER_PRESSURE) {
+   __free_pages(page, compound_order(page));
+   } else {
+   for (j = 0; j < NUM_ORDERS; j++) {
+   if (compound_order(page) == orders[j])
+   break;
+   }
+   drm_page_pool_add(&pools[j], page);
}
-   drm_page_pool_add(&pools[j], page);
}
sg_free_table(table);
kfree(buffer);
 }
 
+static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+   struct system_heap_buffer *buffer = dmabuf->priv;
+   int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
+
+   deferred_free(&buffer->deferred_free, system_heap_buf_free, npages);
+}
+
 static const struct dma_buf_ops system_heap_buf_ops = {
.attach = system_heap_attach,
.detach = system_heap_detach,
-- 
2.25.1



[PATCH v9 4/5] dma-buf: heaps: Add deferred-free-helper library code

2021-06-29 Thread John Stultz
This patch provides infrastructure for deferring buffer frees.

This is a feature ION provided which, when used with some form
of a page pool, provides a nice performance boost in an
allocation microbenchmark. The reason it helps is that it allows
the page-zeroing to be done out of the normal allocation/free
path, and pushed off to a kthread.

As not all heaps will find this useful, it's implemented as
an optional helper library that heaps can utilize.
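
A heap consumes the helper roughly like so (hedged sketch; the real
consumer is the system heap patch later in this series, and
release_buffer_pages() is a made-up stand-in for heap-specific work):

/* Sketch: embed a freelist item in the buffer, then hand the actual
 * free work off to the deferred-free kthread via a callback.
 */
struct my_heap_buffer {
	/* ... heap-specific state ... */
	struct deferred_freelist_item deferred_free;
};

static void my_heap_buf_free(struct deferred_freelist_item *item,
			     enum df_reason reason)
{
	struct my_heap_buffer *buffer =
		container_of(item, struct my_heap_buffer, deferred_free);

	/* Under shrinker pressure, skip pooling and free pages directly. */
	release_buffer_pages(buffer, reason == DF_UNDER_PRESSURE);
	kfree(buffer);
}

static void my_heap_release(struct my_heap_buffer *buffer, size_t nr_pages)
{
	deferred_free(&buffer->deferred_free, my_heap_buf_free, nr_pages);
}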

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Fix sleep in atomic issue from using a mutex, by switching
  to a spinlock as Reported-by: kernel test robot 
* Cleanup API to use a reason enum for clarity and add some documentation
  comments as suggested by Suren Baghdasaryan.
v3:
* Minor tweaks so it can be built as a module
* A few small fixups suggested by Daniel Mentz
v4:
* Tweak from Daniel Mentz to make sure the shrinker
  count/freed values are tracked in pages not bytes
v5:
* Fix up page count tracking as suggested by Suren Baghdasaryan
v7:
* Rework accounting to use nr_pages rather then size, as suggested
  by Suren Baghdasaryan
---
 drivers/dma-buf/heaps/Kconfig|   3 +
 drivers/dma-buf/heaps/Makefile   |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 
 4 files changed, 197 insertions(+)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index f19bf1f82bc2..7e28934e0def 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,3 +1,6 @@
+config DMABUF_HEAPS_DEFERRED_FREE
+   tristate
+
 config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..4e7839875615 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_DMABUF_HEAPS_DEFERRED_FREE) += deferred-free-helper.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)  += system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.c 
b/drivers/dma-buf/heaps/deferred-free-helper.c
new file mode 100644
index ..e19c8b68dfeb
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.c
@@ -0,0 +1,138 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Deferred dmabuf freeing helper
+ *
+ * Copyright (C) 2020 Linaro, Ltd.
+ *
+ * Based on the ION page pool code
+ * Copyright (C) 2011 Google, Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "deferred-free-helper.h"
+
+static LIST_HEAD(free_list);
+static size_t list_nr_pages;
+wait_queue_head_t freelist_waitqueue;
+struct task_struct *freelist_task;
+static DEFINE_SPINLOCK(free_list_lock);
+
+void deferred_free(struct deferred_freelist_item *item,
+  void (*free)(struct deferred_freelist_item*,
+   enum df_reason),
+  size_t nr_pages)
+{
+   unsigned long flags;
+
+   INIT_LIST_HEAD(&item->list);
+   item->nr_pages = nr_pages;
+   item->free = free;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   list_add(&item->list, &free_list);
+   list_nr_pages += nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+   wake_up(&freelist_waitqueue);
+}
+EXPORT_SYMBOL_GPL(deferred_free);
+
+static size_t free_one_item(enum df_reason reason)
+{
+   unsigned long flags;
+   size_t nr_pages;
+   struct deferred_freelist_item *item;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   if (list_empty(&free_list)) {
+   spin_unlock_irqrestore(&free_list_lock, flags);
+   return 0;
+   }
+   item = list_first_entry(&free_list, struct deferred_freelist_item, 
list);
+   list_del(&item->list);
+   nr_pages = item->nr_pages;
+   list_nr_pages -= nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+
+   item->free(item, reason);
+   return nr_pages;
+}
+
+static unsigned long get_freelist_nr_pages(void)
+{
+   unsigned long nr_pages;
+   unsigned long flags;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   nr_pages = list_nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+ 

[PATCH v9 2/5] drm: ttm_pool: Rework ttm_pool to use drm_page_pool

2021-06-29 Thread John Stultz
This patch reworks the ttm_pool logic to utilize the recently
added drm_page_pool code.

This adds drm_page_pool structures to the ttm_pool_type
structures, and then removes all the ttm_pool_type shrinker
logic (as its handled in the drm_page_pool shrinker).

NOTE: There is one mismatch in the interfaces I'm not totally
happy with. The ttm_pool tracks all of its pooled pages across
a number of different pools, and tries to keep this size under
the specified page_pool_size value. With the drm_page_pool,
there may be other users, however there is still one global
shrinker list of pools. So we can't easily reduce the ttm
pool under the ttm specified size without potentially doing
a lot of shrinking to other non-ttm pools. So either we can:
  1) Try to split it so each user of drm_page_pools manages its
 own pool shrinking.
  2) Push the max value into the drm_page_pool, and have it
 manage shrinking to fit under that global max. Then share
 those size/max values out so the ttm_pool debug output
 can have more context.

I've taken the second path in this patch set, but wanted to call
it out so folks could look closely.

Thoughts would be greatly appreciated here!

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v7:
* Major refactoring to use drm_page_pools inside the
  ttm_pool_type structure. This allows us to use container_of to
  get the needed context to free a page. This also means less
  code is changed overall.
v8:
* Reworked to use the new cleanly rewritten drm_page_pool logic
v9:
* Renamed functions, and dropped duplicative order tracking, as
  suggested by ChristianK
* Use new *_(un)lock_shrinker() hooks to fix atomic calculations
  for debugfs
---
 drivers/gpu/drm/Kconfig|   1 +
 drivers/gpu/drm/ttm/ttm_pool.c | 167 ++---
 include/drm/ttm/ttm_pool.h |  14 +--
 3 files changed, 33 insertions(+), 149 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 52d9ba92b35e..6be5344c009c 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -183,6 +183,7 @@ config DRM_PAGE_POOL
 config DRM_TTM
tristate
depends on DRM && MMU
+   select DRM_PAGE_POOL
help
  GPU memory management subsystem for devices with multiple
  GPU memory types. Will be enabled automatically if a device driver
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index cb38b1a17b09..7ae647bce551 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -40,6 +40,7 @@
 #include 
 #endif
 
+#include 
 #include 
 #include 
 #include 
@@ -70,10 +71,6 @@ static struct ttm_pool_type global_uncached[MAX_ORDER];
 static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
 static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
 
-static struct mutex shrinker_lock;
-static struct list_head shrinker_list;
-static struct shrinker mm_shrinker;
-
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
unsigned int order)
@@ -158,6 +155,15 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum 
ttm_caching caching,
kfree(dma);
 }
 
+static void ttm_pool_free_callback(struct drm_page_pool *subpool,
+  struct page *p)
+{
+   struct ttm_pool_type *pt;
+
+   pt = container_of(subpool, struct ttm_pool_type, subpool);
+   return ttm_pool_free_page(pt->pool, pt->caching, subpool->order, p);
+}
+
 /* Apply a new caching to an array of pages */
 static int ttm_pool_apply_caching(struct page **first, struct page **last,
  enum ttm_caching caching)
@@ -219,66 +225,20 @@ static void ttm_pool_unmap(struct ttm_pool *pool, 
dma_addr_t dma_addr,
   DMA_BIDIRECTIONAL);
 }
 
-/* Give pages into a specific pool_type */
-static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
-{
-   unsigned int i, num_pages = 1 << pt->order;
-
-   for (i = 0; i < num_pages; ++i) {
-   if (PageHighMem(p))
-   clear_highpage(p + i);
-   else
-   clear_page(page_address(p + i));
-   }
-
-   spin_lock(&pt->lock);
-   list_add(&p->lru, &pt->pages);
-   spin_unlock(&pt->lock);
-   atomic_long_add(1 << pt->order, &allocated_pages);
-}
-
-/* Take pages from a specific pool_type, return NULL when nothing avail

[PATCH v9 3/5] dma-buf: system_heap: Add drm pagepool support to system heap

2021-06-29 Thread John Stultz
Utilize the drm pagepool code to speed up allocation
performance.

This is similar to the ION pagepool usage, but tries to
utilize generic code instead of a custom implementation.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Fix build issue caused by selecting PAGE_POOL w/o NET
  as Reported-by: kernel test robot 
v3:
* Simplify the page zeroing logic a bit by using kmap_atomic
  instead of vmap as suggested by Daniel Mentz
v5:
* Shift away from networking page pool completely to
  dmabuf page pool implementation
v6:
* Switch again to using the drm_page_pool code shared w/
  ttm_pool
v7:
* Slight rework for drm_page_pool changes
v8:
* Rework to use the rewritten drm_page_pool logic
* Drop explicit buffer zeroing, as the drm_page_pool handles that
v9:
* Fix compiler warning Reported-by: kernel test robot 
---
 drivers/dma-buf/heaps/Kconfig   |  1 +
 drivers/dma-buf/heaps/system_heap.c | 26 +++---
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..f19bf1f82bc2 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,6 +1,7 @@
 config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
+   select DRM_PAGE_POOL
help
  Choose this option to enable the system dmabuf heap. The system heap
  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c 
b/drivers/dma-buf/heaps/system_heap.c
index f57a39ddd063..85ceca2ed61d 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+
 static struct dma_heap *sys_heap;
 
 struct system_heap_buffer {
@@ -54,6 +56,7 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, 
LOW_ORDER_GFP};
  */
 static const unsigned int orders[] = {8, 4, 0};
 #define NUM_ORDERS ARRAY_SIZE(orders)
+struct drm_page_pool pools[NUM_ORDERS];
 
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
@@ -282,18 +285,27 @@ static void system_heap_vunmap(struct dma_buf *dmabuf, 
struct dma_buf_map *map)
dma_buf_map_clear(map);
 }
 
+static void system_heap_free_pages(struct drm_page_pool *pool, struct page *p)
+{
+   __free_pages(p, pool->order);
+}
+
 static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
 {
struct system_heap_buffer *buffer = dmabuf->priv;
struct sg_table *table;
struct scatterlist *sg;
-   int i;
+   int i, j;
 
table = &buffer->sg_table;
for_each_sg(table->sgl, sg, table->nents, i) {
struct page *page = sg_page(sg);
 
-   __free_pages(page, compound_order(page));
+   for (j = 0; j < NUM_ORDERS; j++) {
+   if (compound_order(page) == orders[j])
+   break;
+   }
+   drm_page_pool_add(&pools[j], page);
}
sg_free_table(table);
kfree(buffer);
@@ -324,7 +336,9 @@ static struct page *alloc_largest_available(unsigned long 
size,
if (max_order < orders[i])
continue;
 
-   page = alloc_pages(order_flags[i], orders[i]);
+   page = drm_page_pool_remove(&pools[i]);
+   if (!page)
+   page = alloc_pages(order_flags[i], orders[i]);
if (!page)
continue;
return page;
@@ -425,6 +439,12 @@ static const struct dma_heap_ops system_heap_ops = {
 static int system_heap_create(void)
 {
struct dma_heap_export_info exp_info;
+   int i;
+
+   for (i = 0; i < NUM_ORDERS; i++) {
+   drm_page_pool_init(&pools[i], orders[i],
+  system_heap_free_pages);
+   }
 
exp_info.name = "system";
exp_info.ops = &system_heap_ops;
-- 
2.25.1



[PATCH v9 1/5] drm: Add a sharable drm page-pool implementation

2021-06-29 Thread John Stultz
This adds a shrinker controlled page pool, extracted
out of the ttm_pool logic, and abstracted out a bit
so it can be used by other non-ttm drivers.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v8:
* Completely rewritten from scratch, using only the
  ttm_pool logic so it can be dual licensed.
v9:
* Add Kerneldoc comments similar to tmm implementation
  as suggested by ChristianK
* Mark some functions static as suggested by ChristianK
* Fix locking issue ChristianK pointed out
* Add methods to block the shrinker so users can
  make atomic calculations across multiple pools, as
  suggested by ChristianK
* Fix up Kconfig dependency issue as Reported-by:
  kernel test robot 
---
 drivers/gpu/drm/Kconfig |   3 +
 drivers/gpu/drm/Makefile|   2 +
 drivers/gpu/drm/page_pool.c | 297 
 include/drm/page_pool.h |  68 +
 4 files changed, 370 insertions(+)
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 3c16bd1afd87..52d9ba92b35e 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -177,6 +177,9 @@ config DRM_DP_CEC
  Note: not all adapters support this feature, and even for those
  that do support this they often do not hook up the CEC pin.
 
+config DRM_PAGE_POOL
+   bool
+
 config DRM_TTM
tristate
depends on DRM && MMU
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 5279db4392df..affa4ca3a08e 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -39,6 +39,8 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
 drm_ttm_helper-y := drm_gem_ttm_helper.o
 obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
 
+drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
+
 drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
drm_dsc.o drm_probe_helper.o \
drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
new file mode 100644
index ..c07bbe3afc32
--- /dev/null
+++ b/drivers/gpu/drm/page_pool.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Sharable page pool implementation
+ *
+ * Extracted from drivers/gpu/drm/ttm/ttm_pool.c
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ * Copyright 2021 Linaro Ltd.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Christian König, John Stultz
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+static unsigned long page_pool_size; /* max size of the pool */
+
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
+module_param(page_pool_size, ulong, 0644);
+
+static atomic_long_t nr_managed_pages;
+
+static struct mutex shrinker_lock;
+static struct list_head shrinker_list;
+static struct shrinker mm_shrinker;
+
+/**
+ * drm_page_pool_set_max - Sets maximum size of all pools
+ *
+ * Sets the maximum number of pages allowed in all pools.
+ * This can only be set once, and the first caller wins.
+ */
+void drm_page_pool_set_max(unsigned long max)
+{
+   if (!page_pool_size)
+   page_pool_size = max;
+}
+
+/**
+ * drm_page_pool_get_max - Maximum size of all pools
+ *
+ * Return the maximum number of pages allowed in all pools
+ */
+unsigned long drm_page_pool_get_max(void)
+{
+   return page_pool_size;
+}
+
+/**
+ * drm_page_pool_get_total - Current size of all pools
+ *
+ * Return the number of pages in all pools

[PATCH v9 0/5] Generic page pool & deferred freeing for system dmabuf hea

2021-06-29 Thread John Stultz
After an unfortunately long pause (covid work-schedule burnout),
I wanted to revive and resubmit this series.

As before, the point of this series is trying to add both a page
pool as well as deferred-freeing to the DMA-BUF system heap to
improve allocation performance (so that it can match or beat the
old ION system heap's performance).

The combination of the page pool along with deferred freeing
allows us to offload page-zeroing out of the allocation hot
path. This was done originally with ION and this patch series
allows the DMA-BUF system heap to match ION's system heap
allocation performance in a simple microbenchmark [1] (ION
re-added to the kernel for comparison, running on an x86 vm
image):

./dmabuf-heap-bench -i 0 1 system
Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
-
dmabuf heap: alloc 4096 bytes 5000 times in 79314244 ns  15862 ns/call
ion heap:alloc 4096 bytes 5000 times in 107390769 ns 21478 ns/call
dmabuf heap: alloc 1048576 bytes 5000 times in 259083419 ns  51816 ns/call
ion heap:alloc 1048576 bytes 5000 times in 340497344 ns  68099 ns/call
dmabuf heap: alloc 8388608 bytes 5000 times in 2603105563 ns 520621 ns/call
ion heap:alloc 8388608 bytes 5000 times in 3613592860 ns 722718 ns/call
dmabuf heap: alloc 33554432 bytes 5000 times in 12212492979 ns   2442498 ns/call
ion heap:alloc 33554432 bytes 5000 times in 14584157792 ns   2916831 ns/call


Daniel didn't like earlier attempts to re-use the network
page-pool code to achieve this, and suggested the ttm_pool be
used instead, so this series pulls the page pool functionality
out of the ttm_pool logic and creates a generic page pool
that can be shared.

New in v9:
* Tried to address Christian König's feedback on the page pool
  changes (Kerneldoc, static functions, locking issues, duplicative
  order tracking)
* Fix up Kconfig dependency issue as Reported-by:
  kernel test robot 
* Fix compiler warning Reported-by:
  kernel test robot 

I know Christian had some less specific feedback on the deferred free
work that I'd like to revisit, but I wanted to restart the discussion
with this new series, rather than trying to dredge up and reply to
a ~4mo old thread.

Input would be greatly appreciated. Testing as well, as I don't
have any development hardware that utilizes the ttm pool.

Thanks
-john

[1] 
https://android.googlesource.com/platform/system/memory/libdmabufheap/+/refs/heads/master/tests/dmabuf_heap_bench.c
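For reference, below is a minimal sketch (not part of the series) of
the allocation call the benchmark above times in a loop. It uses only
the existing dma-heap uapi from <linux/dma-heap.h>; error handling is
trimmed, and the heap node name assumes the "system" heap registered
by this series:

#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

/* Allocate one dma-buf from the system heap, return its fd (or -1). */
static int alloc_system_dmabuf(unsigned long len)
{
	struct dma_heap_allocation_data data = {
		.len = len,
		.fd_flags = O_RDWR | O_CLOEXEC,
	};
	int heap_fd, ret;

	heap_fd = open("/dev/dma_heap/system", O_RDONLY | O_CLOEXEC);
	if (heap_fd < 0)
		return -1;

	ret = ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data);
	close(heap_fd);
	if (ret < 0)
		return -1;

	/* close(data.fd) frees the buffer (deferred, with this series) */
	return data.fd;
}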

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (5):
  drm: Add a sharable drm page-pool implementation
  drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  dma-buf: system_heap: Add drm pagepool support to system heap
  dma-buf: heaps: Add deferred-free-helper library code
  dma-buf: system_heap: Add deferred freeing to the system heap

 drivers/dma-buf/heaps/Kconfig|   5 +
 drivers/dma-buf/heaps/Makefile   |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 138 +
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 
 drivers/dma-buf/heaps/system_heap.c  |  46 ++-
 drivers/gpu/drm/Kconfig  |   4 +
 drivers/gpu/drm/Makefile |   2 +
 drivers/gpu/drm/page_pool.c  | 297 +++
 drivers/gpu/drm/ttm/ttm_pool.c   | 167 ++-
 include/drm/page_pool.h  |  68 +
 include/drm/ttm/ttm_pool.h   |  14 +-
 11 files changed, 643 insertions(+), 154 deletions(-)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

-- 
2.25.1



Re: [PATCH v4 22/24] drm/msm/dsi: remove temp data from global pll structure

2021-06-11 Thread John Stultz
On Fri, Jun 11, 2021 at 2:01 AM Dmitry Baryshkov
 wrote:
>
> Hi,
>
> On Fri, 11 Jun 2021 at 10:07, John Stultz  wrote:
> >
> > On Wed, Mar 31, 2021 at 3:58 AM Dmitry Baryshkov
> >  wrote:
> > >
> > > The 7nm, 10nm and 14nm drivers would store interim data used during
> > > VCO/PLL rate setting in the global dsi_pll_Nnm structure. Move this data
> > > structures to the onstack storage. While we are at it, drop
> > > unused/static 'config' data, unused config fields, etc.
> > >
> > > Signed-off-by: Dmitry Baryshkov 
> > > Reviewed-by: Abhinav Kumar 
> > > Tested-by: Stephen Boyd  # on sc7180 lazor
> >
> > Hey Dmitry,
> >   Just wanted to give you a heads up.  Peter Collingbourne reported
> > today that his db845c wasn't booting to display for him on his 4k
> > monitor. It works fine on a 1080p screen, and while 4k isn't supported
> > (yet?),  normally the board should fall back to 1080p when connected
> > to a 4k monitor.  I was able to reproduce this myself and I see the
> > errors below[1].
>
> It looks like I made a mistake testing these patches with the splash
> screen disabled.
> Stephen Boyd has proposed a fix few days ago (will be included into
> the 5.13). Could you check that it fixes the problem for you?
>
> https://lore.kernel.org/linux-arm-msm/20210608195519.125561-1-swb...@chromium.org/

Ah! This does seem to fix it! Thank you so much for pointing it out!

thanks
-john


Re: [PATCH v4 22/24] drm/msm/dsi: remove temp data from global pll structure

2021-06-11 Thread John Stultz
On Wed, Mar 31, 2021 at 3:58 AM Dmitry Baryshkov
 wrote:
>
> The 7nm, 10nm and 14nm drivers would store interim data used during
> VCO/PLL rate setting in the global dsi_pll_Nnm structure. Move this data
> structures to the onstack storage. While we are at it, drop
> unused/static 'config' data, unused config fields, etc.
>
> Signed-off-by: Dmitry Baryshkov 
> Reviewed-by: Abhinav Kumar 
> Tested-by: Stephen Boyd  # on sc7180 lazor

Hey Dmitry,
  Just wanted to give you a heads up.  Peter Collingbourne reported
today that his db845c wasn't booting to display for him on his 4k
monitor. It works fine on a 1080p screen, and while 4k isn't supported
(yet?),  normally the board should fall back to 1080p when connected
to a 4k monitor.  I was able to reproduce this myself and I see the
errors below[1].

I dug back and found that things were working ok on v5.12 w/ the
recently merged commit d1a97648ae028 ("drm/bridge: lt9611: Fix
handling of 4k panels"), and started digging around.

Seeing a bunch of changes to the
drivers/gpu/drm/msm/dsi/phy/dsi_phy_10nm.c file, I tried reverting a
chunk of the changes since 5.12 to that, and that got it working
again. I've narrowed it down to this change -
001d8dc33875 ("drm/msm/dsi: remove temp data from global pll
structure") upstream (also reverting following 6e2ad9c3bfca and
36c5dde5fdf0 first - but it's reverting this change that actually makes
it work again).

I've not managed to really look into the change to see what might be
going wrong yet (it's late and I'm about to crash), but I wanted to
give you a heads up. If you have any ideas for me to try I'm happy to
give them a go.

thanks
-john

[1]:
[   19.846857] msm_dsi_phy ae94400.dsi-phy:
[drm:dsi_pll_10nm_vco_prepare] *ERROR* DSI PLL(0) lock failed,
status=0x
[   19.857925] msm_dsi_phy ae94400.dsi-phy:
[drm:dsi_pll_10nm_vco_prepare] *ERROR* PLL(0) lock failed
[   19.866978] dsi_link_clk_enable_6g: Failed to enable dsi byte clk
[   19.873124] msm_dsi_host_power_on: failed to enable link clocks. ret=-110
[   19.879987] dsi_mgr_bridge_pre_enable: power on host 0 failed, -110
[   19.886309] Turning OFF PHY while PLL is on
[   20.415019] lt9611 10-003b: video check: hactive_a=0, hactive_b=0,
vactive=0, v_total=0, h_total_sysclk=0
[   20.481062] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
[   20.489306] [drm:dpu_kms_wait_for_commit_done:453] [dpu error]wait
for commit done returned -110
[   20.513031] [drm:dpu_encoder_frame_done_timeout:2161] [dpu
error]enc31 frame done timeout
[   20.553059] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
[   20.561300] [drm:dpu_kms_wait_for_commit_done:453] [dpu error]wait
for commit done returned -110
[   20.625054] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
[   20.633299] [drm:dpu_kms_wait_for_commit_done:453] [dpu error]wait
for commit done returned -110
[   20.657033] [drm:dpu_encoder_frame_done_timeout:2161] [dpu
error]enc31 frame done timeout
[   20.697065] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
[   20.705316] [drm:dpu_kms_wait_for_commit_done:453] [dpu error]wait
for commit done returned -110
[   20.769066] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
[   20.777330] [drm:dpu_kms_wait_for_commit_done:453] [dpu error]wait
for commit done returned -110
[   20.801035] [drm:dpu_encoder_frame_done_timeout:2161] [dpu
error]enc31 frame done timeout
[   20.845049] [drm:dpu_encoder_phys_vid_wait_for_commit_done:528]
[dpu error]vblank timeout
...


Re: [PATCH v5 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation

2021-05-21 Thread John Stultz
On Fri, May 21, 2021 at 2:40 AM Lee Jones  wrote:
> On Tue, 10 Nov 2020 at 03:49, John Stultz  wrote:
>> This series reworks the system heap to use sgtables, and then
>> consolidates the pagelist method from the heap-helpers into the
>> CMA heap. After which the heap-helpers logic is removed (as it
>> is unused). I'd still like to find a better way to avoid some of
>> the logic duplication in implementing the entire dma_buf_ops
>> handlers per heap. But unfortunately that code is tied somewhat
>> to how the buffer's memory is tracked. As more heaps show up I
>> think we'll have a better idea how to best share code, so for
>> now I think this is ok.
>>
>> After this, the series introduces an optimization that
>> Ørjan Eide implemented for ION that avoids calling sync on
>> attachments that don't have a mapping.
>>
>> Next, an optimization to use larger order pages for the system
>> heap. This change brings us closer to the current performance
>> of the ION allocation code (though there still is a gap due
>> to ION using a mix of deferred-freeing and page pools, I'll be
>> looking at integrating those eventually).
>>
>> Finally, a reworked version of my uncached system heap
>> implementation I was submitting a few weeks back. Since it
>> duplicated a lot of the now reworked system heap code, I
>> realized it would be much simpler to add the functionality to
>> the system_heap implementation itself.
>>
>> While not improving the core allocation performance, the
>> uncached heap allocations do result in *much* improved
>> performance on HiKey960 as it avoids a lot of flushing and
>> invalidating buffers that the cpu doesn't touch often.
>>
>
>
> John, did this ever make it past v5?  I don't see a follow-up.

So most of these have landed upstream already.

The one exception is the system-uncached heap implementation, as
DanielV wanted a use case where it was beneficial to a device with an
open driver.
Unfortunately this hasn't been trivial to show with the open gpu
devices I have, but taking Nicolas Dufresne's note, we're looking to
enable v4l2 integration in AOSP on db845c, so we can hopefully show
some benefit there. The HAL integration work has been taking some time
to get working though.

So it's a bit blocked on that for now.

thanks
-john


Re: [PATCH] drm/msm/dpu: Delete bonkers code

2021-04-30 Thread John Stultz
On Fri, Apr 30, 2021 at 10:14 AM Rob Clark  wrote:
>
> From: Rob Clark 
>
> dpu_crtc_atomic_flush() was directly poking its attached planes in a
> code path that ended up in dpu_plane_atomic_update(), even if the plane
> was not involved in the current atomic update.  While a bit dubious,
> this worked before because plane->state would always point to something
> valid.  But now using drm_atomic_get_new_plane_state() we could get a
> NULL state pointer instead, leading to:
>
>[   20.873273] Call trace:
>[   20.875740]  dpu_plane_atomic_update+0x5c/0xed0
>[   20.880311]  dpu_plane_restore+0x40/0x88
>[   20.884266]  dpu_crtc_atomic_flush+0xf4/0x208
>[   20.888660]  drm_atomic_helper_commit_planes+0x150/0x238
>[   20.894014]  msm_atomic_commit_tail+0x1d4/0x7a0
>[   20.898579]  commit_tail+0xa4/0x168
>[   20.902102]  drm_atomic_helper_commit+0x164/0x178
>[   20.906841]  drm_atomic_commit+0x54/0x60
>[   20.910798]  drm_atomic_connector_commit_dpms+0x10c/0x118
>[   20.916236]  drm_mode_obj_set_property_ioctl+0x1e4/0x440
>[   20.921588]  drm_connector_property_set_ioctl+0x60/0x88
>[   20.926852]  drm_ioctl_kernel+0xd0/0x120
>[   20.930807]  drm_ioctl+0x21c/0x478
>[   20.934235]  __arm64_sys_ioctl+0xa8/0xe0
>[   20.938193]  invoke_syscall+0x64/0x130
>[   20.941977]  el0_svc_common.constprop.3+0x5c/0xe0
>[   20.946716]  do_el0_svc+0x80/0xa0
>[   20.950058]  el0_svc+0x20/0x30
>[   20.953145]  el0_sync_handler+0x88/0xb0
>[   20.957014]  el0_sync+0x13c/0x140
>
> The reason for the codepath seems dubious; the atomic suspend/resume
> helpers should handle the power-collapse case.  If not, the CRTC's
> atomic_check() should be adding the planes to the atomic update.

Thanks! This patch gets things booting again!

Tested-by: John Stultz 

thanks
-john


[PATCH v8 5/5] dma-buf: system_heap: Add deferred freeing to the system heap

2021-03-04 Thread John Stultz
Utilize the deferred free helper library in the system heap.

This provides a nice performance bump and puts the
system heap performance on par with ION.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Rework deferred-free api to use reason enum as suggested by
  Suren Baghdasaryan
* Rework for deferred-free api change to use nr_pages rather
  than size as suggested by Suren Baghdasaryan
v8:
* Reworked to drop buffer zeroing logic, as the drm_page_pool now
  handles that.
---
 drivers/dma-buf/heaps/Kconfig   |  1 +
 drivers/dma-buf/heaps/system_heap.c | 28 ++--
 2 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index 7e28934e0def..10632ccfb4a5 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -5,6 +5,7 @@ config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
select DRM_PAGE_POOL
+   select DMABUF_HEAPS_DEFERRED_FREE
help
  Choose this option to enable the system dmabuf heap. The system heap
  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 006271881d85..c753c82fd9f1 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -22,6 +22,7 @@
 #include 
 
 #include 
+#include "deferred-free-helper.h"
 
 static struct dma_heap *sys_heap;
 
@@ -33,6 +34,7 @@ struct system_heap_buffer {
struct sg_table sg_table;
int vmap_cnt;
void *vaddr;
+   struct deferred_freelist_item deferred_free;
 };
 
 struct dma_heap_attachment {
@@ -290,27 +292,41 @@ static unsigned long system_heap_free_pages(struct drm_page_pool *pool, struct p
return 1 << pool->order;
 }
 
-static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+static void system_heap_buf_free(struct deferred_freelist_item *item,
+enum df_reason reason)
 {
-   struct system_heap_buffer *buffer = dmabuf->priv;
+   struct system_heap_buffer *buffer;
struct sg_table *table;
struct scatterlist *sg;
int i, j;
 
+   buffer = container_of(item, struct system_heap_buffer, deferred_free);
table = &buffer->sg_table;
for_each_sg(table->sgl, sg, table->nents, i) {
struct page *page = sg_page(sg);
 
-   for (j = 0; j < NUM_ORDERS; j++) {
-   if (compound_order(page) == orders[j])
-   break;
+   if (reason == DF_UNDER_PRESSURE) {
+   __free_pages(page, compound_order(page));
+   } else {
+   for (j = 0; j < NUM_ORDERS; j++) {
+   if (compound_order(page) == orders[j])
+   break;
+   }
+   drm_page_pool_add(&pools[j], page);
}
-   drm_page_pool_add(&pools[j], page);
}
sg_free_table(table);
kfree(buffer);
 }
 
+static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+   struct system_heap_buffer *buffer = dmabuf->priv;
+   int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
+
+   deferred_free(&buffer->deferred_free, system_heap_buf_free, npages);
+}
+
 static const struct dma_buf_ops system_heap_buf_ops = {
.attach = system_heap_attach,
.detach = system_heap_detach,
-- 
2.25.1



[PATCH v8 4/5] dma-buf: system_heap: Add drm pagepool support to system heap

2021-03-04 Thread John Stultz
Utilize the drm pagepool code to speed up allocation
performance.

This is similar to the ION pagepool usage, but tries to
utilize generic code instead of a custom implementation.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Fix build issue caused by selecting PAGE_POOL w/o NET
  as Reported-by: kernel test robot 
v3:
* Simplify the page zeroing logic a bit by using kmap_atomic
  instead of vmap as suggested by Daniel Mentz
v5:
* Shift away from networking page pool completely to
  dmabuf page pool implementation
v6:
* Switch again to using the drm_page_pool code shared w/
  ttm_pool
v7:
* Slight rework for drm_page_pool changes
v8:
* Rework to use the rewritten drm_page_pool logic
* Drop explicit buffer zeroing, as the drm_page_pool handles that
---
 drivers/dma-buf/heaps/Kconfig   |  1 +
 drivers/dma-buf/heaps/system_heap.c | 27 ---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index f7aef8bc7119..7e28934e0def 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -4,6 +4,7 @@ config DMABUF_HEAPS_DEFERRED_FREE
 config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
+   select DRM_PAGE_POOL
help
  Choose this option to enable the system dmabuf heap. The system heap
  is backed by pages from the buddy allocator. If in doubt, say Y.
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 29e49ac17251..006271881d85 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+
 static struct dma_heap *sys_heap;
 
 struct system_heap_buffer {
@@ -53,6 +55,7 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
  */
 static const unsigned int orders[] = {8, 4, 0};
 #define NUM_ORDERS ARRAY_SIZE(orders)
+struct drm_page_pool pools[NUM_ORDERS];
 
 static struct sg_table *dup_sg_table(struct sg_table *table)
 {
@@ -281,18 +284,28 @@ static void system_heap_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
dma_buf_map_clear(map);
 }
 
+static unsigned long system_heap_free_pages(struct drm_page_pool *pool, struct page *p)
+{
+   __free_pages(p, pool->order);
+   return 1 << pool->order;
+}
+
 static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
 {
struct system_heap_buffer *buffer = dmabuf->priv;
struct sg_table *table;
struct scatterlist *sg;
-   int i;
+   int i, j;
 
table = &buffer->sg_table;
for_each_sg(table->sgl, sg, table->nents, i) {
struct page *page = sg_page(sg);
 
-   __free_pages(page, compound_order(page));
+   for (j = 0; j < NUM_ORDERS; j++) {
+   if (compound_order(page) == orders[j])
+   break;
+   }
+   drm_page_pool_add(&pools[j], page);
}
sg_free_table(table);
kfree(buffer);
@@ -323,7 +336,9 @@ static struct page *alloc_largest_available(unsigned long size,
if (max_order < orders[i])
continue;
 
-   page = alloc_pages(order_flags[i], orders[i]);
+   page = drm_page_pool_remove(&pools[i]);
+   if (!page)
+   page = alloc_pages(order_flags[i], orders[i]);
if (!page)
continue;
return page;
@@ -423,6 +438,12 @@ static const struct dma_heap_ops system_heap_ops = {
 static int system_heap_create(void)
 {
struct dma_heap_export_info exp_info;
+   int i;
+
+   for (i = 0; i < NUM_ORDERS; i++) {
+   drm_page_pool_init(&pools[i], orders[i],
+  system_heap_free_pages);
+   }
 
exp_info.name = "system";
exp_info.ops = &system_heap_ops;
-- 
2.25.1



[PATCH v8 3/5] dma-buf: heaps: Add deferred-free-helper library code

2021-03-04 Thread John Stultz
This patch provides infrastructure for deferring buffer frees.

This is a feature ION provided which, when used with some form
of a page pool, provides a nice performance boost in an
allocation microbenchmark. The reason it helps is that it allows
the page-zeroing to be done out of the normal allocation/free
path, and pushed off to a kthread.

As not all heaps will find this useful, it's implemented as
an optional helper library that heaps can utilize.
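As a quick illustration (not from this patch; the example_* names are
hypothetical), a heap would embed a deferred_freelist_item in its
buffer, move its real teardown into a callback, and queue that from
its dma-buf release op:

#include <linux/dma-buf.h>
#include <linux/mm.h>
#include <linux/slab.h>

#include "deferred-free-helper.h"

struct example_buffer {
	size_t len;
	struct deferred_freelist_item deferred_free;
	/* pages, sg_table, etc. */
};

static void example_buf_free(struct deferred_freelist_item *item,
			     enum df_reason reason)
{
	struct example_buffer *buffer =
		container_of(item, struct example_buffer, deferred_free);

	/* On DF_UNDER_PRESSURE a heap would skip any pooling and
	 * release pages immediately; otherwise it can recycle them.
	 */
	kfree(buffer);
}

static void example_dma_buf_release(struct dma_buf *dmabuf)
{
	struct example_buffer *buffer = dmabuf->priv;

	deferred_free(&buffer->deferred_free, example_buf_free,
		      PAGE_ALIGN(buffer->len) / PAGE_SIZE);
}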

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Fix sleep in atomic issue from using a mutex, by switching
  to a spinlock as Reported-by: kernel test robot 
* Cleanup API to use a reason enum for clarity and add some documentation
  comments as suggested by Suren Baghdasaryan.
v3:
* Minor tweaks so it can be built as a module
* A few small fixups suggested by Daniel Mentz
v4:
* Tweak from Daniel Mentz to make sure the shrinker
  count/freed values are tracked in pages not bytes
v5:
* Fix up page count tracking as suggested by Suren Baghdasaryan
v7:
* Rework accounting to use nr_pages rather than size, as suggested
  by Suren Baghdasaryan
---
 drivers/dma-buf/heaps/Kconfig|   3 +
 drivers/dma-buf/heaps/Makefile   |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 
 4 files changed, 197 insertions(+)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..f7aef8bc7119 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -1,3 +1,6 @@
+config DMABUF_HEAPS_DEFERRED_FREE
+   tristate
+
 config DMABUF_HEAPS_SYSTEM
bool "DMA-BUF System Heap"
depends on DMABUF_HEAPS
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..4e7839875615 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_DMABUF_HEAPS_DEFERRED_FREE) += deferred-free-helper.o
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)  += system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
diff --git a/drivers/dma-buf/heaps/deferred-free-helper.c b/drivers/dma-buf/heaps/deferred-free-helper.c
new file mode 100644
index ..e19c8b68dfeb
--- /dev/null
+++ b/drivers/dma-buf/heaps/deferred-free-helper.c
@@ -0,0 +1,138 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Deferred dmabuf freeing helper
+ *
+ * Copyright (C) 2020 Linaro, Ltd.
+ *
+ * Based on the ION page pool code
+ * Copyright (C) 2011 Google, Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "deferred-free-helper.h"
+
+static LIST_HEAD(free_list);
+static size_t list_nr_pages;
+wait_queue_head_t freelist_waitqueue;
+struct task_struct *freelist_task;
+static DEFINE_SPINLOCK(free_list_lock);
+
+void deferred_free(struct deferred_freelist_item *item,
+  void (*free)(struct deferred_freelist_item*,
+   enum df_reason),
+  size_t nr_pages)
+{
+   unsigned long flags;
+
+   INIT_LIST_HEAD(&item->list);
+   item->nr_pages = nr_pages;
+   item->free = free;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   list_add(&item->list, &free_list);
+   list_nr_pages += nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+   wake_up(&freelist_waitqueue);
+}
+EXPORT_SYMBOL_GPL(deferred_free);
+
+static size_t free_one_item(enum df_reason reason)
+{
+   unsigned long flags;
+   size_t nr_pages;
+   struct deferred_freelist_item *item;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   if (list_empty(&free_list)) {
+   spin_unlock_irqrestore(&free_list_lock, flags);
+   return 0;
+   }
item = list_first_entry(&free_list, struct deferred_freelist_item, list);
+   list_del(&item->list);
+   nr_pages = item->nr_pages;
+   list_nr_pages -= nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+
+   item->free(item, reason);
+   return nr_pages;
+}
+
+static unsigned long get_freelist_nr_pages(void)
+{
+   unsigned long nr_pages;
+   unsigned long flags;
+
+   spin_lock_irqsave(&free_list_lock, flags);
+   nr_pages = list_nr_pages;
+   spin_unlock_irqrestore(&free_list_lock, flags);
+ 

[PATCH v8 2/5] drm: ttm_pool: Rework ttm_pool to use drm_page_pool

2021-03-04 Thread John Stultz
This patch reworks the ttm_pool logic to utilize the recently
added drm_page_pool code.

This adds drm_page_pool structures to the ttm_pool_type
structures, and then removes all the ttm_pool_type shrinker
logic (as its handled in the drm_page_pool shrinker).

NOTE: There is one mismatch in the interfaces I'm not totally
happy with. The ttm_pool tracks all of its pooled pages across
a number of different pools, and tries to keep this size under
the specified page_pool_size value. With the drm_page_pool,
there may other users, however there is still one global
shrinker list of pools. So we can't easily reduce the ttm
pool under the ttm specified size without potentially doing
a lot of shrinking to other non-ttm pools. So either we can:
  1) Try to split it so each user of drm_page_pools manages its
 own pool shrinking.
  2) Push the max value into the drm_page_pool, and have it
 manage shrinking to fit under that global max. Then share
 those size/max values out so the ttm_pool debug output
 can have more context.

I've taken the second path in this patch set, but wanted to call
it out so folks could look closely.

Thoughts would be greatly appreciated here!
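To make the tradeoff concrete, here is a rough sketch (the example_*
functions are hypothetical; the drm_page_pool_* calls are the ones
patch 1 adds) of how a pool user interacts with the shared cap under
option 2:

#include <linux/seq_file.h>

#include <drm/page_pool.h>

static void example_pool_user_init(unsigned long limit_pages)
{
	/* First caller wins; later callers leave the cap untouched. */
	drm_page_pool_set_max(limit_pages);
}

static void example_pool_user_debugfs(struct seq_file *m)
{
	/* Global numbers, shared with any other drm_page_pool users. */
	seq_printf(m, "pooled pages: %lu / max %lu\n",
		   drm_page_pool_get_total(), drm_page_pool_get_max());
}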

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v7:
* Major refactoring to use drm_page_pools inside the
  ttm_pool_type structure. This allows us to use container_of to
  get the needed context to free a page. This also means less
  code is changed overall.
v8:
* Reworked to use the new cleanly rewritten drm_page_pool logic
---
 drivers/gpu/drm/Kconfig|   1 +
 drivers/gpu/drm/ttm/ttm_pool.c | 156 ++---
 include/drm/ttm/ttm_pool.h |   6 +-
 3 files changed, 31 insertions(+), 132 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7cbcecb8f7df..a6cbdb63f6c7 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -184,6 +184,7 @@ config DRM_PAGE_POOL
 config DRM_TTM
tristate
depends on DRM && MMU
+   select DRM_PAGE_POOL
help
  GPU memory management subsystem for devices with multiple
  GPU memory types. Will be enabled automatically if a device driver
diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 6e27cb1bf48b..f74ea801d7ab 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -39,6 +39,7 @@
 #include 
 #endif
 
+#include 
 #include 
 #include 
 #include 
@@ -68,8 +69,6 @@ static struct ttm_pool_type global_dma32_write_combined[MAX_ORDER];
 static struct ttm_pool_type global_dma32_uncached[MAX_ORDER];
 
 static struct mutex shrinker_lock;
-static struct list_head shrinker_list;
-static struct shrinker mm_shrinker;
 
 /* Allocate pages of size 1 << order with the given gfp_flags */
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
@@ -125,8 +124,9 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 }
 
 /* Reset the caching and pages of size 1 << order */
-static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
-  unsigned int order, struct page *p)
+static unsigned long ttm_pool_free_page(struct ttm_pool *pool,
+   enum ttm_caching caching,
+   unsigned int order, struct page *p)
 {
unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
struct ttm_pool_dma *dma;
@@ -142,7 +142,7 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
 
if (!pool || !pool->use_dma_alloc) {
__free_pages(p, order);
-   return;
+   return 1UL << order;
}
 
if (order)
@@ -153,6 +153,16 @@ static void ttm_pool_free_page(struct ttm_pool *pool, enum ttm_caching caching,
dma_free_attrs(pool->dev, (1UL << order) * PAGE_SIZE, vaddr, dma->addr,
   attr);
kfree(dma);
+   return 1UL << order;
+}
+
+static unsigned long ttm_subpool_free_page(struct drm_page_pool *subpool,
+  struct page *p)
+{
+   struct ttm_pool_type *pt;
+
+   pt = container_of(subpool, struct ttm_pool_type, subpool);
+   return ttm_pool_free_page(pt->pool, pt->caching, pt->order, p);
 }
 
 /* Apply a new caching to an array of pages */
@@ -216,40 +226,6 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
   DMA_BIDIRECTIONAL);
 }
 
-/* Give pages into a specific pool_type */

[PATCH v8 1/5] drm: Add a sharable drm page-pool implementation

2021-03-04 Thread John Stultz
This adds a shrinker controlled page pool, extracted
out of the ttm_pool logic, and abstracted out a bit
so it can be used by other non-ttm drivers.

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v8:
* Completely rewritten from scratch, using only the
  ttm_pool logic so it can be dual licensed.
---
 drivers/gpu/drm/Kconfig |   4 +
 drivers/gpu/drm/Makefile|   2 +
 drivers/gpu/drm/page_pool.c | 214 
 include/drm/page_pool.h |  65 +++
 4 files changed, 285 insertions(+)
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index e392a90ca687..7cbcecb8f7df 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -177,6 +177,10 @@ config DRM_DP_CEC
  Note: not all adapters support this feature, and even for those
  that do support this they often do not hook up the CEC pin.
 
+config DRM_PAGE_POOL
+   bool
+   depends on DRM
+
 config DRM_TTM
tristate
depends on DRM && MMU
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 926adef289db..2dc7b2fe3fe5 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -39,6 +39,8 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
 drm_ttm_helper-y := drm_gem_ttm_helper.o
 obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
 
+drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
+
 drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
drm_dsc.o drm_probe_helper.o \
drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
new file mode 100644
index ..a60b954cfe0f
--- /dev/null
+++ b/drivers/gpu/drm/page_pool.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0 OR MIT
+/*
+ * Sharable page pool implementation
+ *
+ * Extracted from drivers/gpu/drm/ttm/ttm_pool.c
+ * Copyright 2020 Advanced Micro Devices, Inc.
+ * Copyright 2021 Linaro Ltd.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Christian König, John Stultz
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+static unsigned long page_pool_size;
+
+MODULE_PARM_DESC(page_pool_size, "Number of pages in the WC/UC/DMA pool");
+module_param(page_pool_size, ulong, 0644);
+
+static atomic_long_t allocated_pages;
+
+static struct mutex shrinker_lock;
+static struct list_head shrinker_list;
+static struct shrinker mm_shrinker;
+
+void drm_page_pool_set_max(unsigned long max)
+{
+   if (!page_pool_size)
+   page_pool_size = max;
+}
+
+unsigned long drm_page_pool_get_max(void)
+{
+   return page_pool_size;
+}
+
+unsigned long drm_page_pool_get_total(void)
+{
+   return atomic_long_read(&allocated_pages);
+}
+
+unsigned long drm_page_pool_get_size(struct drm_page_pool *pool)
+{
+   unsigned long size;
+
+   spin_lock(&pool->lock);
+   size = pool->page_count;
+   spin_unlock(&pool->lock);
+   return size;
+}
+
+/* Give pages into a specific pool */
+void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)
+{
+   unsigned int i, num_pages = 1 << pool->order;
+
+   for (i = 0; i < num_pages; ++i) {
+   if (PageHighMem(p))
+   clear_highpage(p + i);
+   else
+   clear_page(page_address(p + i));
+   }
+
+   spin_lock(&pool->lock);
+  

[PATCH v8 0/5] Generic page pool & deferred freeing for system dmabuf heap

2021-03-04 Thread John Stultz
Apologies for letting so much time pass since the last revision!

The point of this series is trying to add both deferred-freeing
logic as well as a page pool to the DMA-BUF system heap to improve
allocation performance.

This is desired, as the combination of deferred freeing along
with the page pool allows us to offload page-zeroing out of
the allocation hot path. This was done originally with ION
and this patch series allows the DMA-BUF system heap to match
ION's system heap allocation performance in a simple
microbenchmark [1] (ION re-added to the kernel for comparison,
running on an x86 vm image):

./dmabuf-heap-bench -i 0 1 system
Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
-
dmabuf heap: alloc 4096 bytes 5000 times in 88092722 ns  17618 ns/call
ion heap:alloc 4096 bytes 5000 times in 103043547 ns 20608 ns/call
dmabuf heap: alloc 1048576 bytes 5000 times in 252416639 ns  50483 ns/call
ion heap:alloc 1048576 bytes 5000 times in 358190744 ns  71638 ns/call
dmabuf heap: alloc 8388608 bytes 5000 times in 2854351310 ns 570870 ns/call
ion heap:alloc 8388608 bytes 5000 times in 3676328905 ns 735265 ns/call
dmabuf heap: alloc 33554432 bytes 5000 times in 13208119197 ns   2641623 ns/call
ion heap:alloc 33554432 bytes 5000 times in 15306975287 ns   3061395 ns/call


Daniel didn't like earlier attempts to re-use the network
page-pool code to achieve this, and suggested the ttm_pool be
used instead, so this series pulls the page pool functionality
out of the ttm_pool logic and creates a generic page pool
that can be shared.

New in v7 (never submitted):
* Reworked how I integrated the page pool with the ttm logic
  to use container_of to avoid allocating structures per page.

New in v8:
* Due to the dual license requirement from Christian König
  I completely threw away the earlier shared page pool
  implementation (which had evolved from ion code), and
  rewrote it using just the ttm_pool logic. My apologies
  for any previously reviewed issues that I've reintroduced
  in doing so.

Input would be greatly appreciated. Testing as well, as I don't
have any development hardware that utilizes the ttm pool.

thanks
-john

[1] 
https://android.googlesource.com/platform/system/memory/libdmabufheap/+/refs/heads/master/tests/dmabuf_heap_bench.c

Cc: Daniel Vetter 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (5):
  drm: Add a sharable drm page-pool implementation
  drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  dma-buf: heaps: Add deferred-free-helper library code
  dma-buf: system_heap: Add drm pagepool support to system heap
  dma-buf: system_heap: Add deferred freeing to the system heap

 drivers/dma-buf/heaps/Kconfig|   5 +
 drivers/dma-buf/heaps/Makefile   |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 138 
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 +
 drivers/dma-buf/heaps/system_heap.c  |  47 +++-
 drivers/gpu/drm/Kconfig  |   5 +
 drivers/gpu/drm/Makefile |   2 +
 drivers/gpu/drm/page_pool.c  | 214 +++
 drivers/gpu/drm/ttm/ttm_pool.c   | 156 +++---
 include/drm/page_pool.h  |  65 ++
 include/drm/ttm/ttm_pool.h   |   6 +-
 11 files changed, 557 insertions(+), 137 deletions(-)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

-- 
2.25.1



[RESEND][PATCH v2 2/2] dma-buf: heaps: Fix the name used when exporting dmabufs to be the actual heap name

2021-03-01 Thread John Stultz
By default dma_buf_export() sets the exporter name to be
KBUILD_MODNAME. Unfortunately this may not be identical to the
string used as the heap name (ie: "system" vs "system_heap").

This can cause some minor confusion with tooling, and there is
the future potential where multiple heap types may be exported
by the same module (but would all have the same name).

So to avoid all this, set the exporter exp_name to the heap name.
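As a quick illustration of how this shows up to tooling (a sketch, not
part of the patch; it assumes the standard dma-buf fdinfo output, which
includes an "exp_name:" line), userspace can check the exporter name of
a dma-buf fd like so:

#include <stdio.h>
#include <string.h>

static void print_exp_name(int dmabuf_fd)
{
	char path[64], line[256];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", dmabuf_fd);
	f = fopen(path, "r");
	if (!f)
		return;
	/* With this patch the line reads e.g. "exp_name: system" */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "exp_name:", 9))
			fputs(line, stdout);
	fclose(f);
}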

Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/cma_heap.c| 1 +
 drivers/dma-buf/heaps/system_heap.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 5d64eccd21d6..0c05b79870f9 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -339,6 +339,7 @@ static struct dma_buf *cma_heap_allocate(struct dma_heap *heap,
buffer->pagecount = pagecount;
 
/* create the dmabuf */
+   exp_info.exp_name = dma_heap_get_name(heap);
exp_info.ops = &cma_heap_buf_ops;
exp_info.size = buffer->len;
exp_info.flags = fd_flags;
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 29e49ac17251..23a7e74ef966 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -390,6 +390,7 @@ static struct dma_buf *system_heap_allocate(struct dma_heap *heap,
}
 
/* create the dmabuf */
+   exp_info.exp_name = dma_heap_get_name(heap);
exp_info.ops = &system_heap_buf_ops;
exp_info.size = buffer->len;
exp_info.flags = fd_flags;
-- 
2.25.1



[RESEND][PATCH v2 1/2] dma-buf: dma-heap: Provide accessor to get heap name

2021-03-01 Thread John Stultz
It can be useful to access the name for the heap,
so provide an accessor to do so.

Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
v2:
* Make sure to use "const char *" as Reported-by: kernel test robot 

---
 drivers/dma-buf/dma-heap.c | 12 
 include/linux/dma-heap.h   |  9 +
 2 files changed, 21 insertions(+)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 6b5db954569f..56bf5ad01ad5 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -202,6 +202,18 @@ void *dma_heap_get_drvdata(struct dma_heap *heap)
return heap->priv;
 }
 
+/**
+ * dma_heap_get_name() - get heap name
+ * @heap: DMA-Heap to retrieve the name of
+ *
+ * Returns:
+ * The char* for the heap name.
+ */
+const char *dma_heap_get_name(struct dma_heap *heap)
+{
+   return heap->name;
+}
+
 struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
 {
struct dma_heap *heap, *h, *err_ret;
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
index 5bc5c946af58..0c05561cad6e 100644
--- a/include/linux/dma-heap.h
+++ b/include/linux/dma-heap.h
@@ -50,6 +50,15 @@ struct dma_heap_export_info {
  */
 void *dma_heap_get_drvdata(struct dma_heap *heap);
 
+/**
+ * dma_heap_get_name() - get heap name
+ * @heap: DMA-Heap to retrieve the name of
+ *
+ * Returns:
+ * The char* for the heap name.
+ */
+const char *dma_heap_get_name(struct dma_heap *heap);
+
 /**
  * dma_heap_add - adds a heap to dmabuf heaps
  * @exp_info:  information needed to register this heap
-- 
2.25.1



Re: [PATCH] dma-buf: heaps: Set VM_PFNMAP in mmap for system and cma heaps

2021-03-01 Thread John Stultz
On Sat, Feb 27, 2021 at 1:44 AM Christoph Hellwig  wrote:
>
> On Fri, Feb 26, 2021 at 08:36:55AM +0100, Daniel Vetter wrote:
> > Also given that both deal with struct page there's a ton of divergence
> > between these two that doesn't make much sense. Maybe could even share
> > the code fully, aside from how you allocate the struct pages.
>
> I've been saying that since the code was first submitted.  Once pages
> are allocated from CMA they should be treated not different from normal
> pages.
>
> Please take a look at how the DMA contigous allocator manages to share
> all code for handling CMA vs alloc_pages pages.

I'll take a look at that! Thanks for the pointer!
-john


Re: [PATCH v2] drm/bridge: lt9611: Fix handling of 4k panels

2021-03-01 Thread John Stultz
On Thu, Jan 21, 2021 at 1:50 AM Robert Foss  wrote:
> +Sam Ravnborg
>
> I think this patch is ready to get pulled into the drm-misc tree.
>
> On Thu, 17 Dec 2020 at 15:09, Robert Foss  wrote:
> >
> > 4k requires two dsi pipes, so don't report MODE_OK when only a
> > single pipe is configured. But rather report MODE_PANEL to
> > signal that requirements of the panel are not being met.

Hey All, I just wanted to follow up on this patch as it seems like it
missed 5.12?

Just wanted to make sure it didn't slip through the cracks.

thanks
-john


[PATCH] dma-buf: heaps: Set VM_PFNMAP in mmap for system and cma heaps

2021-02-25 Thread John Stultz
Per discussion and patches here:
  
https://lore.kernel.org/dri-devel/20210223105951.912577-1-daniel.vet...@ffwll.ch/

Daniel is planning on making VM_PFNMAP required on dmabufs.

Thus, to avoid the warn_on noise, set VM_PFNMAP in the
system and cma heaps' mmap handlers.

Cc: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Christian Koenig 
Cc: Sumit Semwal 
Cc: Liam Mark 
Cc: Chris Goldsworthy 
Cc: Laura Abbott 
Cc: Brian Starkey 
Cc: Hridya Valsaraju 
Cc: Suren Baghdasaryan 
Cc: Sandeep Patil 
Cc: Daniel Mentz 
Cc: Ørjan Eide 
Cc: Robin Murphy 
Cc: Ezequiel Garcia 
Cc: Simon Ser 
Cc: James Jones 
Cc: linux-me...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: John Stultz 
---
 drivers/dma-buf/heaps/cma_heap.c| 1 +
 drivers/dma-buf/heaps/system_heap.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 364fc2f3e499..34bc3987f942 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -185,6 +185,7 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
 
vma->vm_ops = &dma_heap_vm_ops;
vma->vm_private_data = buffer;
+   vma->vm_flags |= VM_PFNMAP;
 
return 0;
 }
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 3548b20cb98c..8995e3cbfcaf 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -228,8 +228,10 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
return ret;
addr += PAGE_SIZE;
if (addr >= vma->vm_end)
-   return 0;
+   break;
}
+
+   vma->vm_flags |= VM_PFNMAP;
return 0;
 }
 
-- 
2.25.1



Re: [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap

2021-02-25 Thread John Stultz
On Tue, Feb 23, 2021 at 3:00 AM Daniel Vetter  wrote:
>
> tldr; DMA buffers aren't normal memory, expecting that you can use
> them like that (like calling get_user_pages works, or that they're
> accounted like any other normal memory) cannot be guaranteed.
>
> Since some userspace only runs on integrated devices, where all
> buffers are actually all resident system memory, there's a huge
> temptation to assume that a struct page is always present and useable
> like for any more pagecache backed mmap. This has the potential to
> result in a uapi nightmare.
>
> To stop this gap, require that DMA buffer mmaps are VM_PFNMAP, which
> blocks get_user_pages and all the other struct page based
> infrastructure for everyone. In spirit this is the uapi counterpart to
> the kernel-internal CONFIG_DMABUF_DEBUG.
>
> Motivated by a recent patch which wanted to swich the system dma-buf
> heap to vm_insert_page instead of vm_insert_pfn.
>
> v2:
>
> Jason brought up that we also want to guarantee that all ptes have the
> pte_special flag set, to catch fast get_user_pages (on architectures
> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>
> From auditing the various functions to insert pfn pte entries
> (vm_insert_pfn_prot, remap_pfn_range and all its callers like
> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> this should be the correct flag to check for.
>
> References: 
> https://lore.kernel.org/lkml/cakmk7uhi+mg0z0humnt13qccvuturvjpcr0njrl12k-wbwz...@mail.gmail.com/
> Acked-by: Christian König 
> Cc: Jason Gunthorpe 
> Cc: Suren Baghdasaryan 
> Cc: Matthew Wilcox 
> Cc: John Stultz 
> Signed-off-by: Daniel Vetter 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> ---
>  drivers/dma-buf/dma-buf.c | 15 +--
>  1 file changed, 13 insertions(+), 2 deletions(-)


So I gave this a spin in a few of my environments, and with the
current dmabuf heaps it spews a lot of warnings.

I'm testing some simple fixes to add:
vma->vm_flags |= VM_PFNMAP;

to the dmabuf heap mmap ops, which we might want to queue alongside this.

So assuming those can land together.
Acked-by: John Stultz 

thanks
-john

