Re: [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap

2020-10-07 Thread John Stultz
On Mon, Oct 5, 2020 at 6:45 AM Christoph Hellwig  wrote:
>
> How is this going to deal with VIVT caches?

Hrm. That's a good question. I'm not sure I totally have my head
around it, but I guess we could make sure to call
invalidate_kernel_vmap_range() in begin_cpu_access() and
flush_kernel_vmap_range() in end_cpu_access() rather than exiting out
early as we do now?

Unless you have better guidance?

Worst case we could check CONFIG_CPU_CACHE_VIVT and not register the
system-uncached heap.
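
A rough sketch of the ordering suggested above, written as a userspace mock
rather than real heap code. invalidate_kernel_vmap_range() and
flush_kernel_vmap_range() are existing kernel helpers, but they are stubbed
here so the call order can be exercised; struct sketch_buffer, its fields and
the sketch_* function names are hypothetical and only illustrate where the
calls would sit relative to the early exit.

```c
#include <stddef.h>

/* Userspace stubs standing in for the kernel helpers of the same name. */
static int invalidate_calls;
static int flush_calls;

static void invalidate_kernel_vmap_range(void *vaddr, int size)
{
	(void)vaddr;
	(void)size;
	invalidate_calls++;
}

static void flush_kernel_vmap_range(void *vaddr, int size)
{
	(void)vaddr;
	(void)size;
	flush_calls++;
}

/* Hypothetical buffer state, not the real system heap structure. */
struct sketch_buffer {
	void *vaddr;	/* kernel vmap address, NULL when not vmapped */
	int len;
	int uncached;
};

static int sketch_begin_cpu_access(struct sketch_buffer *buf)
{
	/* Drop stale lines for the vmap alias before the CPU reads. */
	if (buf->vaddr)
		invalidate_kernel_vmap_range(buf->vaddr, buf->len);

	if (buf->uncached)
		return 0;	/* the early exit now happens after the invalidate */

	/* cached heap: per-attachment sync for the CPU would go here */
	return 0;
}

static int sketch_end_cpu_access(struct sketch_buffer *buf)
{
	/* Write back what the CPU dirtied through the vmap alias. */
	if (buf->vaddr)
		flush_kernel_vmap_range(buf->vaddr, buf->len);

	if (buf->uncached)
		return 0;

	/* cached heap: per-attachment sync for the device would go here */
	return 0;
}
```

The only point of the sketch is that the vmap maintenance runs before the
uncached early return instead of being skipped by it.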

thanks
-john
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: linux-next: build failure after merge of the drm-misc tree

2020-10-07 Thread Stephen Rothwell
Hi all,

On Thu, 8 Oct 2020 14:09:03 +1100 Stephen Rothwell wrote:
>
> After merging the drm-misc tree, today's linux-next build (x86_64
> allmodconfig) failed like this:

In file included from include/linux/clk.h:13,
                 from drivers/gpu/drm/ingenic/ingenic-drm-drv.c:10:
drivers/gpu/drm/ingenic/ingenic-drm-drv.c: In function 'ingenic_drm_update_palette':
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'?
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |   ^~~
include/linux/kernel.h:47:33: note: in definition of macro 'ARRAY_SIZE'
   47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
  | ^~~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'?
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |   ^~~
include/linux/kernel.h:47:48: note: in definition of macro 'ARRAY_SIZE'
   47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
  |^~~
In file included from include/linux/bits.h:22,
                 from include/linux/bitops.h:5,
                 from drivers/gpu/drm/ingenic/ingenic-drm.h:10,
                 from drivers/gpu/drm/ingenic/ingenic-drm-drv.c:7:
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'?
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |   ^~~
include/linux/build_bug.h:16:62: note: in definition of macro 'BUILD_BUG_ON_ZERO'
   16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
  |  ^
include/linux/compiler.h:224:46: note: in expansion of macro '__same_type'
  224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
  |  ^~~
include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array'
   47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
  |   ^~~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE'
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |  ^~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'?
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |   ^~~
include/linux/build_bug.h:16:62: note: in definition of macro 'BUILD_BUG_ON_ZERO'
   16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
  |  ^
include/linux/compiler.h:224:46: note: in expansion of macro '__same_type'
  224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
  |  ^~~
include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array'
   47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
  |   ^~~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE'
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |  ^~
include/linux/build_bug.h:16:51: error: bit-field '<anonymous>' width not an integer constant
   16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
  |   ^
include/linux/compiler.h:224:28: note: in expansion of macro 'BUILD_BUG_ON_ZERO'
  224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
  |^
include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array'
   47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
  |   ^~~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE'
  448 |  for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) {
  |  ^~
drivers/gpu/drm/ingenic/ingenic-drm-drv.c:453:9: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'?
  453 |   priv->dma_hwdescs->palette[i] = color;
  |
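
For anyone wondering why a single missing member produces this many
diagnostics: ARRAY_SIZE() mentions its argument several times (twice directly
and twice more inside __must_be_array()), so the compiler complains once per
use, and the array check then fails as well. The userspace reduction below
copies the three macro bodies quoted in the log; the __same_type definition is
written out with GCC's __builtin_types_compatible_p, and struct demo_hwdescs
and palette_entries() are made-up names used only for illustration.

```c
#include <stddef.h>

/* GCC/Clang builtin; evaluates to 1 when the two types are compatible. */
#define __same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Negative bit-field width (a compile error) when e is non-zero, else 0. */
#define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))

/* An array and a pointer to its first element are different types. */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))

#define ARRAY_SIZE(arr) \
	(sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))

/* Hypothetical stand-in for a descriptor block holding a palette. */
struct demo_hwdescs {
	unsigned short palette[256];
};

static size_t palette_entries(const struct demo_hwdescs *d)
{
	/* Compiles because d->palette is a real array member. */
	return ARRAY_SIZE(d->palette);
}
```

Passing a pointer, or a member that does not exist, makes every one of those
uses fail to compile, which matches the shape of the log above.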

Re: [PATCH 3/5] drm/vmwgfx: add a move callback.

2020-10-07 Thread Dave Airlie
On Thu, 8 Oct 2020 at 13:41, Zack Rusin  wrote:
>
>
> > On Oct 5, 2020, at 20:06, Dave Airlie  wrote:
> >
> > From: Dave Airlie 
> >
> > This just copies the fallback to vmwgfx, I'm going to iterate on this
> > a bit until it's not the same as the fallback path.
> >
> > Signed-off-by: Dave Airlie 
>
> What are your plans for it? i.e. how is it going to be different?

Initial plan is to put move_notify inside the move callback, then
eventually get rid of the ttm bind/unbind callback and let the driver
do that itself if needed.

I've got most of it in a branch (and I posted a 45 patch series a week
or two ago), but I need to rebase and clean it up for reposting.

Dave.


Re: [PATCH v3 19/20] drm/tegra: Implement new UAPI

2020-10-07 Thread kernel test robot
Hi Mikko,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tegra-drm/drm/tegra/for-next]
[also build test ERROR on tegra/for-next linus/master v5.9-rc8 next-20201007]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
base:   git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next
config: arm64-randconfig-r004-20201008 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/6a3b3d79ce4488695cc0745edd19015fc2220d97
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
        git checkout 6a3b3d79ce4488695cc0745edd19015fc2220d97
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All error/warnings (new ones prefixed by >>):

   In file included from drivers/gpu/drm/tegra/uapi/uapi.c:12:
>> drivers/gpu/drm/tegra/uapi/../drm.h:84:1: error: attempted to randomize userland API struct tegra_drm_client_ops
  84 | };
 | ^
>> drivers/gpu/drm/tegra/uapi/uapi.c:62:5: warning: no previous prototype for 'close_channel_ctx' [-Wmissing-prototypes]
  62 | int close_channel_ctx(int id, void *p, void *data)
 | ^
--
   In file included from drivers/gpu/drm/tegra/uapi/submit.c:18:
>> drivers/gpu/drm/tegra/uapi/../drm.h:84:1: error: attempted to randomize userland API struct tegra_drm_client_ops
  84 | };
 | ^

vim +84 drivers/gpu/drm/tegra/uapi/../drm.h

d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22  74  
53fa7f7204c97d drivers/gpu/host1x/drm/drm.h Thierry Reding  2013-09-24  75  struct tegra_drm_client_ops {
53fa7f7204c97d drivers/gpu/host1x/drm/drm.h Thierry Reding  2013-09-24  76      int (*open_channel)(struct tegra_drm_client *client,
c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding  2013-09-26  77                          struct tegra_drm_context *context);
c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding  2013-09-26  78      void (*close_channel)(struct tegra_drm_context *context);
c40f0f1afcb1dc drivers/gpu/drm/tegra/drm.h  Thierry Reding  2013-10-10  79      int (*is_addr_reg)(struct device *dev, u32 class, u32 offset);
0f563a4bf66e51 drivers/gpu/drm/tegra/drm.h  Dmitry Osipenko 2017-06-15  80      int (*is_valid_class)(u32 class);
c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding  2013-09-26  81      int (*submit)(struct tegra_drm_context *context,
d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22  82                    struct drm_tegra_submit *args, struct drm_device *drm,
d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22  83                    struct drm_file *file);
d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 @84  };
d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22  85  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




[git pull] drm nouveau fixes for 5.9 final

2020-10-07 Thread Dave Airlie
Hi Linus,

Karol found two last minute nouveau fixes, they both fix crashes, the
TTM one follows what other drivers do already, and the other is for
bailing on load on unrecognised chipsets.

Thanks,
Dave.

drm-fixes-2020-10-08:
drm nouveau fixes for 5.9 final

nouveau:
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets
The following changes since commit 86fdf61e71046618f6f499542cee12f2348c523c:

  Merge tag 'drm-misc-fixes-2020-10-01' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2020-10-06
12:38:28 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2020-10-08

for you to fetch changes up to d10285a25e29f13353bbf7760be8980048c1ef2f:

  drm/nouveau/mem: guard against NULL pointer access in mem_del
(2020-10-07 15:33:09 +1000)


drm nouveau fixes for 5.9 final

nouveau:
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets


Karol Herbst (2):
  drm/nouveau/device: return error for unknown chipsets
  drm/nouveau/mem: guard against NULL pointer access in mem_del

 drivers/gpu/drm/nouveau/nouveau_mem.c | 2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 1 +
 2 files changed, 3 insertions(+)


Re: [PATCH 3/5] drm/vmwgfx: add a move callback.

2020-10-07 Thread Zack Rusin


> On Oct 5, 2020, at 20:06, Dave Airlie  wrote:
> 
> From: Dave Airlie 
> 
> This just copies the fallback to vmwgfx, I'm going to iterate on this
> a bit until it's not the same as the fallback path.
> 
> Signed-off-by: Dave Airlie 

What are your plans for it? i.e. how is it going to be different?

Reviewed-by: Zack Rusin 



Re: [PATCH 2/5] drm/vmwgfx: move null mem checks outside move notifies

2020-10-07 Thread Zack Rusin

> On Oct 5, 2020, at 20:06, Dave Airlie  wrote:
> 
> From: Dave Airlie 
> 
> Both fns checked mem == NULL, just move the check outside.
> 
> Signed-off-by: Dave Airlie 

That’s a nice cleanup.

Reviewed-by: Zack Rusin 


linux-next: build failure after merge of the drm-misc tree

2020-10-07 Thread Stephen Rothwell
Hi all,

After merging the drm-misc tree, today's linux-next build (x86_64
allmodconfig) failed like this:

I noticed that the ingenic driver revert I had been waiting for appeared
in the drm-misc tree, so I removed the BROKEN dependency for it, but it
produced the above errors, so I have marked it BROKEN again.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH 07/13] mm: close race in generic_access_phys

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:

Way back it was a reasonable assumptions that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
   ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carvetouts to


s/carvetouts/carveouts/


   cma regions. This means if we miss the unmap the pfn might contain
   pagecache or anon memory (well anything allocated with GFP_MOVEABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
   iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
   ("/dev/mem: Revoke mappings when a driver claims the region")


Thanks for putting these references into the log, it's very helpful.
...

diff --git a/mm/memory.c b/mm/memory.c
index fcfc4ca36eba..8d467e23b44e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4873,28 +4873,68 @@ int follow_phys(struct vm_area_struct *vma,
return ret;
  }
  
+/**

+ * generic_access_phys - generic implementation for iomem mmap access
+ * @vma: the vma to access
+ * @addr: userspace addres, not relative offset within @vma
+ * @buf: buffer to read/write
+ * @len: length of transfer
+ * @write: set to FOLL_WRITE when writing, otherwise reading
+ *
+ * This is a generic implementation for &vm_operations_struct.access for an
+ * iomem mapping. This callback is used by access_process_vm() when the @vma is
+ * not page based.
+ */
  int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
void *buf, int len, int write)
  {
resource_size_t phys_addr;
unsigned long prot = 0;
void __iomem *maddr;
+   pte_t *ptep, pte;
+   spinlock_t *ptl;
int offset = addr & (PAGE_SIZE-1);
+   int ret = -EINVAL;
+
+   if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+   return -EINVAL;
+
+retry:
+   if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+   return -EINVAL;
+   pte = *ptep;
+   pte_unmap_unlock(ptep, ptl);
  
-	if (follow_phys(vma, addr, write, &prot, &phys_addr))

+   prot = pgprot_val(pte_pgprot(pte));
+   phys_addr = (resource_size_t)pte_pfn(pte) << PAGE_SHIFT;
+
+   if ((write & FOLL_WRITE) && !pte_write(pte))
return -EINVAL;
  
  	maddr = ioremap_prot(phys_addr, PAGE_ALIGN(len + offset), prot);

if (!maddr)
return -ENOMEM;
  
+	if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))

+   goto out_unmap;
+
+   if (pte_same(pte, *ptep)) {



The ioremap area is something I'm sorta new to, so a newbie question:
is it possible for the same pte to already be there, ever? If so, we'd
be stuck in an infinite loop here.  I'm sure that's not the case, but
it's not yet obvious to me why it's impossible. Resource reservations
maybe?
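
To make the loop question concrete, here is a minimal userspace sketch of the
general "snapshot, do the slow setup unlocked, then re-check" pattern that the
hunk appears to follow. It is not the patch's code (the rest of the hunk is
elided above), and struct fake_pte, simulate_concurrent_move() and
access_with_revalidate() are hypothetical names. In this sketch an unchanged
entry is the exit condition, and a retry only happens when the entry changed
while the lock was dropped.

```c
#include <stdint.h>

/* Hypothetical stand-in for a page table entry guarded by a lock. */
struct fake_pte {
	uint64_t val;
	int moves_left;	/* how many more simulated buffer moves will occur */
};

/* Simulates another context changing the mapping while we work unlocked. */
static void simulate_concurrent_move(struct fake_pte *p)
{
	if (p->moves_left > 0) {
		p->moves_left--;
		p->val += 0x1000;
	}
}

/*
 * Returns the entry value the access was validated against and stores the
 * number of attempts in *tries.
 */
static uint64_t access_with_revalidate(struct fake_pte *p, int *tries)
{
	uint64_t snap;

	*tries = 0;
	for (;;) {
		(*tries)++;
		snap = p->val;			/* 1. snapshot (under the lock in real code) */
		simulate_concurrent_move(p);	/* 2. slow setup with the lock dropped */
		if (snap == p->val)		/* 3. re-check with the lock re-taken */
			return snap;		/*    unchanged: safe to do the access */
		/* changed underneath us: discard the setup and try again */
	}
}
```

With two simulated moves the function returns on the third attempt; with none
it returns on the first, so termination depends on the entry eventually
staying put between snapshot and re-check.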


thanks,
--
John Hubbard
NVIDIA


Re: [PATCH v3 09/20] gpu: host1x: DMA fences and userspace fence creation

2020-10-07 Thread kernel test robot
Hi Mikko,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tegra-drm/drm/tegra/for-next]
[also build test WARNING on tegra/for-next linus/master v5.9-rc8 next-20201007]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
base:   git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next
config: arm64-randconfig-r004-20201008 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/c4f5ec983027f2b19e6854a362e23a79e1630100
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
        git checkout c4f5ec983027f2b19e6854a362e23a79e1630100
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> drivers/gpu/host1x/fence.c:105:6: warning: no previous prototype for 'host1x_fence_signal' [-Wmissing-prototypes]
 105 | void host1x_fence_signal(struct host1x_syncpt_fence *f)
 |  ^~~

vim +/host1x_fence_signal +105 drivers/gpu/host1x/fence.c

   104  
 > 105  void host1x_fence_signal(struct host1x_syncpt_fence *f)
   106  {
   107  if (atomic_xchg(&f->signaling, 1))
   108  return;
   109  
   110  /*
   111   * Cancel pending timeout work - if it races, it will
   112   * not get 'f->signaling' and return.
   113   */
   114  cancel_delayed_work_sync(&f->timeout_work);
   115  
   116  host1x_intr_put_ref(f->sp->host, f->sp->id, f->waiter_ref);
   117  
   118  dma_fence_signal(&f->base);
   119  dma_fence_put(&f->base);
   120  }
   121  
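
For reference, -Wmissing-prototypes fires when a non-static function is
defined without a declaration in scope. The usual ways to address it are
sketched below with made-up names (struct sketch_fence and both functions are
hypothetical, not host1x code): either make a prototype visible before the
definition, normally through a shared header, or give the function internal
linkage with static if nothing outside the file calls it.

```c
/* Hypothetical fence type used only for this illustration. */
struct sketch_fence {
	int signaled;
};

/*
 * Option 1: a prototype visible before the definition. In a real driver
 * this line would live in a header included by both definer and callers.
 */
void sketch_fence_signal(struct sketch_fence *f);

void sketch_fence_signal(struct sketch_fence *f)
{
	f->signaled = 1;
}

/* Option 2: a helper used only in this file gets internal linkage. */
static int sketch_fence_is_signaled(const struct sketch_fence *f)
{
	return f->signaled;
}
```

Which option applies to the function flagged above depends on whether other
files are meant to call it, which the report alone does not say.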

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH] drm/fourcc: Add AXBXGXRX106106106106 format

2020-10-07 Thread Joe Perches
On Wed, 2020-10-07 at 10:27 +0100, Matteo Franchin wrote:
> Add ABGR format with 10-bit components packed in 64-bit per pixel.
> This format can be used to handle
> VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 on little-endian
> architectures.

trivial note:

> diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
[]
> @@ -202,6 +202,7 @@ const struct drm_format_info *__drm_format_info(u32 
> format)
>   { .format = DRM_FORMAT_XBGR16161616F,   .depth = 0,  
> .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
>   { .format = DRM_FORMAT_ARGB16161616F,   .depth = 0,  
> .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true 
> },
>   { .format = DRM_FORMAT_ABGR16161616F,   .depth = 0,  
> .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true 
> },
> + { .format = DRM_FORMAT_AXBXGXRX106106106106,.depth = 0,  
> .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true 
> },

My preference is to separate this into 2 lines so every
column, including .depth, still visually aligns.

+   { .format = DRM_FORMAT_AXBXGXRX106106106106,
+   .depth = 0,  .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },




Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Dan Williams
On Wed, Oct 7, 2020 at 3:23 PM Dan Williams  wrote:
>
> On Wed, Oct 7, 2020 at 12:49 PM Daniel Vetter  wrote:
> >
> > On Wed, Oct 7, 2020 at 9:33 PM Dan Williams  
> > wrote:
> > >
> > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter  
> > > wrote:
> > > >
> > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > > > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > > > the default for all driver uses.
> > > >
> > > > Except there's two more ways to access pci bars: sysfs and proc mmap
> > > > support. Let's plug that hole.
> > >
> > > Ooh, yes, lets.
> > >
> > > > For revoke_devmem() to work we need to link our vma into the same
> > > > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > > > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > > > at ->open time, but that's a bit tricky here with all the entry points
> > > > and arch code. So instead create a fake file and adjust vma->vm_file.
> > >
> > > I don't think you want to share the devmem inode for this, this should
> > > be based off the sysfs inode which I believe there is already only one
> > > instance per resource. In contrast /dev/mem can have multiple inodes
> > > because anyone can just mknod a new character device file, the same
> > > problem does not exist for sysfs.
> >
> > But then I need to find the right one, plus I also need to find the
> > right one for the procfs side. That gets messy, and I already have no
> > idea how to really test this. Shared address_space is the same trick
> > we're using in drm (where we have multiple things all pointing to the
> > same underlying resources, through different files), and it gets the
> > job done. So that's why I figured the shared address_space is the
> > cleaner solution since then unmap_mapping_range takes care of
> > iterating over all vma for us. I guess I could reimplement that logic
> > with our own locking and everything in revoke_devmem, but feels a bit
> > silly. But it would also solve the problem of having mutliple
> > different mknod of /dev/kmem with different address_space behind them.
> > Also because of how remap_pfn_range works, all these vma do use the
> > same pgoff already anyway.
>
> True, remap_pfn_range() makes sure that ->pgoff is an absolute
> physical address offset for all use cases. So you might be able to
> just point proc_bus_pci_open() at the shared devmem address space. For
> sysfs it's messier. I think you would need to somehow get the inode
> from kernfs_fop_open() to adjust its address space, but only if the
> bin_file will ultimately be used for PCI memory.

To me this seems like a new sysfs_create_bin_file() flavor that
registers the file with the common devmem address_space.


Re: [PATCH v3 06/20] gpu: host1x: Cleanup and refcounting for syncpoints

2020-10-07 Thread kernel test robot
Hi Mikko,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tegra-drm/drm/tegra/for-next]
[also build test ERROR on tegra/for-next linus/master v5.9-rc8 next-20201007]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
base:   git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next
config: arm-allyesconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/3721bf9ddd2b05fe12b3512999f77351ae839d08
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
        git checkout 3721bf9ddd2b05fe12b3512999f77351ae839d08
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   drivers/staging/media/tegra-video/vi.c: In function 'tegra_channel_cleanup':
>> drivers/staging/media/tegra-video/vi.c:621:2: error: implicit declaration of function 'host1x_syncpt_free'; did you mean 'host1x_syncpt_read'? [-Werror=implicit-function-declaration]
 621 |  host1x_syncpt_free(chan->mw_ack_sp);
 |  ^~
 |  host1x_syncpt_read
   cc1: some warnings being treated as errors

vim +621 drivers/staging/media/tegra-video/vi.c

3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  616  
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  617  static void tegra_channel_cleanup(struct tegra_vi_channel *chan)
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  618  {
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  619      v4l2_ctrl_handler_free(&chan->ctrl_handler);
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  620      media_entity_cleanup(&chan->video.entity);
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 @621      host1x_syncpt_free(chan->mw_ack_sp);
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  622      host1x_syncpt_free(chan->frame_start_sp);
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  623      mutex_destroy(&chan->video_lock);
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  624  }
3d8a97eabef0883 Sowjanya Komatineni 2020-05-04  625  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Dan Williams
On Wed, Oct 7, 2020 at 12:49 PM Daniel Vetter  wrote:
>
> On Wed, Oct 7, 2020 at 9:33 PM Dan Williams  wrote:
> >
> > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter  
> > wrote:
> > >
> > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > > the default for all driver uses.
> > >
> > > Except there's two more ways to access pci bars: sysfs and proc mmap
> > > support. Let's plug that hole.
> >
> > Ooh, yes, lets.
> >
> > > For revoke_devmem() to work we need to link our vma into the same
> > > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > > at ->open time, but that's a bit tricky here with all the entry points
> > > and arch code. So instead create a fake file and adjust vma->vm_file.
> >
> > I don't think you want to share the devmem inode for this, this should
> > be based off the sysfs inode which I believe there is already only one
> > instance per resource. In contrast /dev/mem can have multiple inodes
> > because anyone can just mknod a new character device file, the same
> > problem does not exist for sysfs.
>
> But then I need to find the right one, plus I also need to find the
> right one for the procfs side. That gets messy, and I already have no
> idea how to really test this. Shared address_space is the same trick
> we're using in drm (where we have multiple things all pointing to the
> same underlying resources, through different files), and it gets the
> job done. So that's why I figured the shared address_space is the
> cleaner solution since then unmap_mapping_range takes care of
> iterating over all vma for us. I guess I could reimplement that logic
> with our own locking and everything in revoke_devmem, but feels a bit
> silly. But it would also solve the problem of having mutliple
> different mknod of /dev/kmem with different address_space behind them.
> Also because of how remap_pfn_range works, all these vma do use the
> same pgoff already anyway.

True, remap_pfn_range() makes sure that ->pgoff is an absolute
physical address offset for all use cases. So you might be able to
just point proc_bus_pci_open() at the shared devmem address space. For
sysfs it's messier. I think you would need to somehow get the inode
from kernfs_fop_open() to adjust its address space, but only if the
bin_file will ultimately be used for PCI memory.


Re: [PATCH 06/13] media: videobuf2: Move frame_vector into media subsystem

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:

It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR
symbol from all over the tree (well just one place, somehow omap media
driver still had this in its Kconfig, despite not using it).

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Figa 
Cc: Mauro Carvalho Chehab 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Daniel Vetter 
---


Failed to spot any problems here. :)

Reviewed-by: John Hubbard 

thanks,
--
John Hubbard
NVIDIA


  drivers/media/common/videobuf2/Kconfig|  1 -
  drivers/media/common/videobuf2/Makefile   |  1 +
  .../media/common/videobuf2}/frame_vector.c|  2 +
  drivers/media/platform/omap/Kconfig   |  1 -
  include/linux/mm.h| 42 ---
  include/media/videobuf2-core.h| 42 +++
  mm/Kconfig|  3 --
  mm/Makefile   |  1 -
  8 files changed, 45 insertions(+), 48 deletions(-)
  rename {mm => drivers/media/common/videobuf2}/frame_vector.c (99%)

diff --git a/drivers/media/common/videobuf2/Kconfig 
b/drivers/media/common/videobuf2/Kconfig
index edbc99ebba87..d2223a12c95f 100644
--- a/drivers/media/common/videobuf2/Kconfig
+++ b/drivers/media/common/videobuf2/Kconfig
@@ -9,7 +9,6 @@ config VIDEOBUF2_V4L2
  
  config VIDEOBUF2_MEMOPS

tristate
-   select FRAME_VECTOR
  
  config VIDEOBUF2_DMA_CONTIG

tristate
diff --git a/drivers/media/common/videobuf2/Makefile 
b/drivers/media/common/videobuf2/Makefile
index 77bebe8b202f..54306f8d096c 100644
--- a/drivers/media/common/videobuf2/Makefile
+++ b/drivers/media/common/videobuf2/Makefile
@@ -1,5 +1,6 @@
  # SPDX-License-Identifier: GPL-2.0
  videobuf2-common-objs := videobuf2-core.o
+videobuf2-common-objs += frame_vector.o
  
  ifeq ($(CONFIG_TRACEPOINTS),y)

videobuf2-common-objs += vb2-trace.o
diff --git a/mm/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
similarity index 99%
rename from mm/frame_vector.c
rename to drivers/media/common/videobuf2/frame_vector.c
index 39db520a51dc..b95f4f371681 100644
--- a/mm/frame_vector.c
+++ b/drivers/media/common/videobuf2/frame_vector.c
@@ -8,6 +8,8 @@
  #include 
  #include 
  
+#include 

+
  /**
   * get_vaddr_frames() - map virtual addresses to pfns
   * @start:starting user address
diff --git a/drivers/media/platform/omap/Kconfig 
b/drivers/media/platform/omap/Kconfig
index f73b5893220d..de16de46c0f4 100644
--- a/drivers/media/platform/omap/Kconfig
+++ b/drivers/media/platform/omap/Kconfig
@@ -12,6 +12,5 @@ config VIDEO_OMAP2_VOUT
depends on VIDEO_V4L2
select VIDEOBUF2_DMA_CONTIG
select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3
-   select FRAME_VECTOR
help
  V4L2 Display driver support for OMAP2/3 based boards.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 16b799a0522c..acd60fbf1a5a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1743,48 +1743,6 @@ int account_locked_vm(struct mm_struct *mm, unsigned 
long pages, bool inc);
  int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
struct task_struct *task, bool bypass_rlim);
  
-/* Container for pinned pfns / pages */

-struct frame_vector {
-   unsigned int nr_allocated;  /* Number of frames we have space for */
-   unsigned int nr_frames; /* Number of frames stored in ptrs array */
-   bool got_ref;   /* Did we pin pages by getting page ref? */
-   bool is_pfns;   /* Does array contain pages or pfns? */
-   void *ptrs[];   /* Array of pinned pfns / pages. Use
-* pfns_vector_pages() or pfns_vector_pfns()
-* for access */
-};
-
-struct frame_vector *frame_vector_create(unsigned int nr_frames);
-void frame_vector_destroy(struct frame_vector *vec);
-int get_vaddr_frames(unsigned long start, unsigned int nr_pfns,
-unsigned int gup_flags, struct frame_vector *vec);
-void put_vaddr_frames(struct frame_vector *vec);
-int frame_vector_to_pages(struct frame_vector *vec);
-void frame_vector_to_pfns(struct frame_vector *vec);
-
-static inline unsigned int frame_vector_count(struct frame_vector *vec)
-{
-   return vec->nr_frames;
-}
-
-static inline struct page **frame_vector_pages(struct frame_vector *vec)
-{
-   if (vec->is_pfns) {
-   int err = frame_vector_to_pages(vec);
-
-   if (err)
-   return ERR_PTR(err);
-   }
-   return (struct page **)(vec->ptrs);
-}
-
-static inline unsigned long *frame_vector_pfns(struct frame_vector *vec)
-{
-   if (!vec->is

Re: [PATCH 1/2] drm/i915/dpcd_bl: Skip testing control capability with force DPCD quirk

2020-10-07 Thread Lyude Paul
Hi! I thought this patch rang a bell: we actually already had some discussion
about this, since there are a couple of other systems this was causing issues for.
Unfortunately, it seems like that patch never got sent out. Satadru?

(If I don't hear back from them soon, I'll just send out a patch for this
myself.)

JFYI - the proper fix here is to just drop the
DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP check from the code entirely. As long as
the backlight supports AUX_SET_CAP, that should be enough for us to control it.


On Wed, 2020-10-07 at 14:58 +0800, Kai-Heng Feng wrote:
> HP DreamColor panel needs to be controlled via AUX interface. However,
> it has both DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP and
> DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP set, so it fails to pass
> intel_dp_aux_display_control_capable() test.
> 
> Skip the test if the panel has force DPCD quirk.
> 
> Signed-off-by: Kai-Heng Feng 
> ---
>  drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> index acbd7eb66cbe..acf2e1c65290 100644
> --- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> +++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c
> @@ -347,9 +347,13 @@ int intel_dp_aux_init_backlight_funcs(struct intel_connector *intel_connector)
>   struct intel_panel *panel = &intel_connector->panel;
>   struct intel_dp *intel_dp = enc_to_intel_dp(intel_connector->encoder);
>   struct drm_i915_private *i915 = dp_to_i915(intel_dp);
> + bool force_dpcd;
> +
> + force_dpcd = drm_dp_has_quirk(&intel_dp->desc, intel_dp->edid_quirks,
> +   DP_QUIRK_FORCE_DPCD_BACKLIGHT);
>  
>   if (i915->params.enable_dpcd_backlight == 0 ||
> - !intel_dp_aux_display_control_capable(intel_connector))
> + (!force_dpcd && !intel_dp_aux_display_control_capable(intel_connector)))
>   return -ENODEV;
>  
>   /*
> @@ -358,9 +362,7 @@ int intel_dp_aux_init_backlight_funcs(struct intel_connector *intel_connector)
>*/
>   if (i915->vbt.backlight.type !=
>   INTEL_BACKLIGHT_VESA_EDP_AUX_INTERFACE &&
> - i915->params.enable_dpcd_backlight != 1 &&
> - !drm_dp_has_quirk(&intel_dp->desc, intel_dp->edid_quirks,
> -   DP_QUIRK_FORCE_DPCD_BACKLIGHT)) {
> + i915->params.enable_dpcd_backlight != 1 && !force_dpcd) {
>   drm_info(&i915->drm,
>"Panel advertises DPCD backlight support, but "
>"VBT disagrees. If your backlight controls "
-- 
Sincerely,
  Lyude Paul (she/her)
  Software Engineer at Red Hat

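
An illustrative sketch, not part of the thread: the difference between the capability test that the patch works around and the relaxed test suggested in the reply can be modelled in standalone C. The names CAP_PWM_PIN and CAP_AUX_SET and all three functions are hypothetical, the bit positions are placeholders, and the real intel_dp_aux_display_control_capable() may look at more than these two bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two DPCD capability bits named in the
 * thread. The bit positions are placeholders for illustration only. */
#define CAP_PWM_PIN (1u << 0)
#define CAP_AUX_SET (1u << 1)

/* Test as described in the patch: AUX control only counts when the
 * PWM pin capability is NOT advertised as well. */
static bool aux_capable_current(uint8_t caps)
{
	return (caps & CAP_AUX_SET) && !(caps & CAP_PWM_PIN);
}

/* Test as suggested in the reply: AUX_SET_CAP alone is enough. */
static bool aux_capable_proposed(uint8_t caps)
{
	return (caps & CAP_AUX_SET) != 0;
}

/* A panel that advertises both bits, as the patch description says the
 * HP DreamColor panel does: rejected by the first test, accepted by
 * the second. */
static void check_dreamcolor_case(void)
{
	uint8_t caps = CAP_AUX_SET | CAP_PWM_PIN;

	assert(!aux_capable_current(caps));
	assert(aux_capable_proposed(caps));
}
```

Under this model the force-DPCD quirk would no longer be needed just to get past the capability test, which appears to be the point of the reply.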


Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 11:37 PM John Hubbard  wrote:
>
> On 10/7/20 2:32 PM, Daniel Vetter wrote:
> > On Wed, Oct 7, 2020 at 10:33 PM John Hubbard  wrote:
> >>
> >> On 10/7/20 9:44 AM, Daniel Vetter wrote:
> ...
> >>> @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
> >>>dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
> >>>  DMA_BIDIRECTIONAL, 0);
> >>>
> >>> - pages = frame_vector_pages(g2d_userptr->vec);
> >>> - if (!IS_ERR(pages)) {
> >>> - int i;
> >>> + for (i = 0; i < g2d_userptr->npages; i++)
> >>> + set_page_dirty_lock(g2d_userptr->pages[i]);
> >>>
> >>> - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
> >>> - set_page_dirty_lock(pages[i]);
> >>> - }
> >>> - put_vaddr_frames(g2d_userptr->vec);
> >>> - frame_vector_destroy(g2d_userptr->vec);
> >>> + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
> >>> + kvfree(g2d_userptr->pages);
> >>
> >> You can avoid writing your own loop, and just simplify the whole thing down to
> >> two lines:
> >>
> >>  unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages,
> >>                              true);
> >>  kvfree(g2d_userptr->pages);
> >
> > Oh nice, this is neat. I'll also roll it out in the habanalabs patch,
> > that has the same thing. Well almost, it only uses set_page_dirty, not
> > the _lock variant. But I have no idea whether that matters or not?
>
>
> It matters. And invariably, call sites that use set_page_dirty() instead
> of set_page_dirty_lock() were already wrong.  Which is why I never had to
> provide anything like "unpin_user_pages_dirty (not locked)".
>
> Although in habanalabs case, I just reviewed patch 3 and I think they *were*
> correctly using set_page_dirty_lock()...

Yeah I mixed that up with some other code I read, habanalabs is using
_lock. I have seen a pile of gup/pup code though that only uses
set_page_dirty. And looking around I did not really parse the comment
above set_page_dirty(). I guess just using the _lock variant shouldn't
hurt too much. I've found a comment though from the infiniband umem
notifier that it's sometimes called with the page locked, and
sometimes not, so life is complicated there. But how it avoids races I
didn't understand.
-Daniel


--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers

2020-10-07 Thread John Hubbard

On 10/7/20 2:32 PM, Daniel Vetter wrote:
> On Wed, Oct 7, 2020 at 10:33 PM John Hubbard  wrote:
>>
>> On 10/7/20 9:44 AM, Daniel Vetter wrote:
...
>>> @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
>>>  	dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
>>>  			  DMA_BIDIRECTIONAL, 0);
>>>
>>> -	pages = frame_vector_pages(g2d_userptr->vec);
>>> -	if (!IS_ERR(pages)) {
>>> -		int i;
>>> +	for (i = 0; i < g2d_userptr->npages; i++)
>>> +		set_page_dirty_lock(g2d_userptr->pages[i]);
>>>
>>> -		for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
>>> -			set_page_dirty_lock(pages[i]);
>>> -	}
>>> -	put_vaddr_frames(g2d_userptr->vec);
>>> -	frame_vector_destroy(g2d_userptr->vec);
>>> +	unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
>>> +	kvfree(g2d_userptr->pages);
>>
>> You can avoid writing your own loop, and just simplify the whole thing down to
>> two lines:
>>
>>  unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages,
>>                              true);
>>  kvfree(g2d_userptr->pages);
>
> Oh nice, this is neat. I'll also roll it out in the habanalabs patch,
> that has the same thing. Well almost, it only uses set_page_dirty, not
> the _lock variant. But I have no idea whether that matters or not?

It matters. And invariably, call sites that use set_page_dirty() instead
of set_page_dirty_lock() were already wrong.  Which is why I never had to
provide anything like "unpin_user_pages_dirty (not locked)".

Although in habanalabs case, I just reviewed patch 3 and I think they *were*
correctly using set_page_dirty_lock()...

thanks,
--
John Hubbard
NVIDIA
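
An illustrative sketch, not part of the thread: the simplification suggested above can be checked against a userspace model. Everything prefixed mock_ below, plus release_open_coded() and release_with_helper(), is hypothetical and only imitates the shape of the kernel calls named in the review; the real helpers operate on struct page and take locks that this model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock page carrying only the state discussed in the thread. */
struct mock_page {
	int pincount;
	bool dirty;
};

static void mock_set_page_dirty_lock(struct mock_page *p)
{
	p->dirty = true;
}

static void mock_unpin_user_pages(struct mock_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(pages[i]->pincount > 0);
		pages[i]->pincount--;
	}
}

/* Model of the combined helper: optionally dirty each page, then unpin it. */
static void mock_unpin_user_pages_dirty_lock(struct mock_page **pages,
					     size_t n, bool make_dirty)
{
	for (size_t i = 0; i < n; i++) {
		if (make_dirty)
			mock_set_page_dirty_lock(pages[i]);
		assert(pages[i]->pincount > 0);
		pages[i]->pincount--;
	}
}

/* Release path as written in the patch: explicit loop, then unpin. */
static void release_open_coded(struct mock_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		mock_set_page_dirty_lock(pages[i]);
	mock_unpin_user_pages(pages, n);
}

/* Release path as suggested in the review: one call, no local loop. */
static void release_with_helper(struct mock_page **pages, size_t n)
{
	mock_unpin_user_pages_dirty_lock(pages, n, true);
}
```

In this model both release paths leave every page dirty and unpinned, so the loop counter in the caller becomes unnecessary, which matches the earlier remark that the `int i` declaration can be dropped.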


Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 10:33 PM John Hubbard  wrote:
>
> On 10/7/20 9:44 AM, Daniel Vetter wrote:
> > All we need are a pages array, pin_user_pages_fast can give us that
> > directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Jason Gunthorpe 
> > Cc: Inki Dae 
> > Cc: Joonyoung Shim 
> > Cc: Seung-Woo Kim 
> > Cc: Kyungmin Park 
> > Cc: Kukjin Kim 
> > Cc: Krzysztof Kozlowski 
> > Cc: Andrew Morton 
> > Cc: John Hubbard 
> > Cc: Jérôme Glisse 
> > Cc: Jan Kara 
> > Cc: Dan Williams 
> > Cc: linux...@kvack.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-samsung-...@vger.kernel.org
> > Cc: linux-me...@vger.kernel.org
> > ---
> >   drivers/gpu/drm/exynos/Kconfig  |  1 -
> >   drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 -
> >   2 files changed, 22 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
> > index 6417f374b923..43257ef3c09d 100644
> > --- a/drivers/gpu/drm/exynos/Kconfig
> > +++ b/drivers/gpu/drm/exynos/Kconfig
> > @@ -88,7 +88,6 @@ comment "Sub-drivers"
> >   config DRM_EXYNOS_G2D
> >   bool "G2D"
> >   depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST
> > - select FRAME_VECTOR
> >   help
> > Choose this option if you want to use Exynos G2D for DRM.
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > index 967a5cdc120e..c83f6faac9de 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> > @@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr {
> >   dma_addr_t  dma_addr;
> >   unsigned long   userptr;
> >   unsigned long   size;
> > - struct frame_vector *vec;
> > + struct page **pages;
> > + unsigned int npages;
> >   struct sg_table *sgt;
> >   atomic_t refcount;
> >   bool in_pool;
> > @@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
> >   bool force)
> >   {
> >   struct g2d_cmdlist_userptr *g2d_userptr = obj;
> > - struct page **pages;
> > + int i;
>
> The above line can also be deleted, see below.
>
> >
> >   if (!obj)
> >   return;
> > @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
> >   dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
> > DMA_BIDIRECTIONAL, 0);
> >
> > - pages = frame_vector_pages(g2d_userptr->vec);
> > - if (!IS_ERR(pages)) {
> > - int i;
> > + for (i = 0; i < g2d_userptr->npages; i++)
> > + set_page_dirty_lock(g2d_userptr->pages[i]);
> >
> > - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
> > - set_page_dirty_lock(pages[i]);
> > - }
> > - put_vaddr_frames(g2d_userptr->vec);
> > - frame_vector_destroy(g2d_userptr->vec);
> > + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
> > + kvfree(g2d_userptr->pages);
>
> You can avoid writing your own loop, and just simplify the whole thing down to
> two lines:
>
> unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages,
> true);
> kvfree(g2d_userptr->pages);

Oh nice, this is neat. I'll also roll it out in the habanalabs patch,
that has the same thing. Well almost, it only uses set_page_dirty, not
the _lock variant. But I have no idea whether that matters or not?
-Daniel

>
>
> >
> >   if (!g2d_userptr->out_of_list)
> >   list_del_init(&g2d_userptr->list);
> > @@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
> >   offset = userptr & ~PAGE_MASK;
> >   end = PAGE_ALIGN(userptr + size);
> >   npages = (end - start) >> PAGE_SHIFT;
> > - g2d_userptr->vec = frame_vector_create(npages);
> > - if (!g2d_userptr->vec) {
> > + g2d_userptr->pages = kvmalloc_array(npages, sizeof(*g2d_userptr->pages),
> > + GFP_KERNEL);
> > + if (!g2d_userptr->pages) {
> >   ret = -ENOMEM;
> >   goto err_free;
> >   }
> >
> > - ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
> > - g2d_userptr->vec);
> > + ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
> > +   g2d_userptr->pages);
> >   if (ret != npages) {
> >   DRM_DEV_ERROR(g2d->dev,
> > "failed to get user pages from userptr.\n");
> >   if (ret < 0)
> > - goto err_destroy_framevec;
> > - ret = -EFAULT;
> > - goto err_put_framevec;
> > - }
> > - if (frame_vector_to_pages(g2d_userptr->

Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 11:13 PM John Hubbard  wrote:
>
> On 10/7/20 9:44 AM, Daniel Vetter wrote:
> > This is used by media/videobuf2 for persistent dma mappings, not just
> > for a single dma operation and then freed again, so needs
> > FOLL_LONGTERM.
> >
> > Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> > locking issues. Rework the code to pull the pup path out from the
> > mmap_sem critical section as suggested by Jason.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Jason Gunthorpe 
> > Cc: Pawel Osciak 
> > Cc: Marek Szyprowski 
> > Cc: Kyungmin Park 
> > Cc: Tomasz Figa 
> > Cc: Mauro Carvalho Chehab 
> > Cc: Andrew Morton 
> > Cc: John Hubbard 
> > Cc: Jérôme Glisse 
> > Cc: Jan Kara 
> > Cc: Dan Williams 
> > Cc: linux...@kvack.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-samsung-...@vger.kernel.org
> > Cc: linux-me...@vger.kernel.org
> > ---
> >   mm/frame_vector.c | 36 +++-
> >   1 file changed, 11 insertions(+), 25 deletions(-)
> >
> > diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> > index 10f82d5643b6..39db520a51dc 100644
> > --- a/mm/frame_vector.c
> > +++ b/mm/frame_vector.c
> > @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >   struct vm_area_struct *vma;
> >   int ret = 0;
> >   int err;
> > - int locked;
> >
> >   if (nr_frames == 0)
> >   return 0;
> > @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >
> >   start = untagged_addr(start);
> >
> > + ret = pin_user_pages_fast(start, nr_frames,
> > +   FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> > +   (struct page **)(vec->ptrs));
> > + if (ret > 0) {
> > + vec->got_ref = true;
> > + vec->is_pfns = false;
> > + goto out_unlocked;
> > + }
>
> This part looks good, and changing to _fast is a potential performance
> improvement, too.
>
> > +
> >   mmap_read_lock(mm);
> > - locked = 1;
> >   vma = find_vma_intersection(mm, start, start + 1);
> >   if (!vma) {
> >   ret = -EFAULT;
> >   goto out;
> >   }
> >
> > - /*
> > -  * While get_vaddr_frames() could be used for transient (kernel
> > -  * controlled lifetime) pinning of memory pages all current
> > -  * users establish long term (userspace controlled lifetime)
> > -  * page pinning. Treat get_vaddr_frames() like
> > -  * get_user_pages_longterm() and disallow it for filesystem-dax
> > -  * mappings.
> > -  */
> > - if (vma_is_fsdax(vma)) {
> > - ret = -EOPNOTSUPP;
> > - goto out;
> > - }
>
> Are you sure we don't need to check vma_is_fsdax() anymore?

Since FOLL_LONGTERM checks for this and can only return struct page
backed memory, and explicitly excludes VM_IO | VM_PFNMAP, I was assuming
this is not needed for follow_pfn. And the get_user_pages_locked this
used back then didn't have the same check, which is why it was added (and
FOLL_LONGTERM still doesn't work for the _locked versions, as you
pointed out on the last round of this discussion).

But now that you're asking, I have no idea whether fsdax vma can also
be of VM_IO | VM_PFNMAP type. I'm not seeing that set anywhere in
fs/dax.c, but that says nothing :-)

Dan, you added this check originally, do we need it for VM_SPECIAL vmas too?

Thanks, Daniel

>
> > -
> > - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> > - vec->got_ref = true;
> > - vec->is_pfns = false;
> > - ret = pin_user_pages_locked(start, nr_frames,
> > - gup_flags, (struct page **)(vec->ptrs), &locked);
> > - goto out;
> > - }
> > -
> >   vec->got_ref = false;
> >   vec->is_pfns = true;
> >   do {
> > @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >   vma = find_vma_intersection(mm, start, start + 1);
> >   } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
> >   out:
> > - if (locked)
> > - mmap_read_unlock(mm);
> > + mmap_read_unlock(mm);
> > +out_unlocked:
> >   if (!ret)
> >   ret = -EFAULT;
> >   if (ret > 0)
> >
>
> All of the error handling still looks accurate there.
>
> thanks,
> --
> John Hubbard
> NVIDIA



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:
> This is used by media/videobuf2 for persistent dma mappings, not just
> for a single dma operation and then freed again, so needs
> FOLL_LONGTERM.
>
> Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> locking issues. Rework the code to pull the pup path out from the
> mmap_sem critical section as suggested by Jason.
>
> Signed-off-by: Daniel Vetter
> Cc: Jason Gunthorpe
> Cc: Pawel Osciak
> Cc: Marek Szyprowski
> Cc: Kyungmin Park
> Cc: Tomasz Figa
> Cc: Mauro Carvalho Chehab
> Cc: Andrew Morton
> Cc: John Hubbard
> Cc: Jérôme Glisse
> Cc: Jan Kara
> Cc: Dan Williams
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> ---
>   mm/frame_vector.c | 36 +++-
>   1 file changed, 11 insertions(+), 25 deletions(-)
>
> diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> index 10f82d5643b6..39db520a51dc 100644
> --- a/mm/frame_vector.c
> +++ b/mm/frame_vector.c
> @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
>  	struct vm_area_struct *vma;
>  	int ret = 0;
>  	int err;
> -	int locked;
>
>  	if (nr_frames == 0)
>  		return 0;
> @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
>
>  	start = untagged_addr(start);
>
> +	ret = pin_user_pages_fast(start, nr_frames,
> +				  FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> +				  (struct page **)(vec->ptrs));
> +	if (ret > 0) {
> +		vec->got_ref = true;
> +		vec->is_pfns = false;
> +		goto out_unlocked;
> +	}

This part looks good, and changing to _fast is a potential performance
improvement, too.

> +
>  	mmap_read_lock(mm);
> -	locked = 1;
>  	vma = find_vma_intersection(mm, start, start + 1);
>  	if (!vma) {
>  		ret = -EFAULT;
>  		goto out;
>  	}
>
> -	/*
> -	 * While get_vaddr_frames() could be used for transient (kernel
> -	 * controlled lifetime) pinning of memory pages all current
> -	 * users establish long term (userspace controlled lifetime)
> -	 * page pinning. Treat get_vaddr_frames() like
> -	 * get_user_pages_longterm() and disallow it for filesystem-dax
> -	 * mappings.
> -	 */
> -	if (vma_is_fsdax(vma)) {
> -		ret = -EOPNOTSUPP;
> -		goto out;
> -	}

Are you sure we don't need to check vma_is_fsdax() anymore?

> -
> -	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> -		vec->got_ref = true;
> -		vec->is_pfns = false;
> -		ret = pin_user_pages_locked(start, nr_frames,
> -			gup_flags, (struct page **)(vec->ptrs), &locked);
> -		goto out;
> -	}
> -
>  	vec->got_ref = false;
>  	vec->is_pfns = true;
>  	do {
> @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
>  		vma = find_vma_intersection(mm, start, start + 1);
>  	} while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
>  out:
> -	if (locked)
> -		mmap_read_unlock(mm);
> +	mmap_read_unlock(mm);
> +out_unlocked:
>  	if (!ret)
>  		ret = -EFAULT;
>  	if (ret > 0)
>

All of the error handling still looks accurate there.

thanks,
--
John Hubbard
NVIDIA


Re: [PATCH 04/13] misc/habana: Use FOLL_LONGTERM for userptr

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:
> These are persistent, not just for the duration of a dma operation.
>
> Signed-off-by: Daniel Vetter
> Cc: Jason Gunthorpe
> Cc: Andrew Morton
> Cc: John Hubbard
> Cc: Jérôme Glisse
> Cc: Jan Kara
> Cc: Dan Williams
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Oded Gabbay
> Cc: Omer Shpigelman
> Cc: Ofir Bitton
> Cc: Tomer Tayar
> Cc: Moti Haimovski
> Cc: Daniel Vetter
> Cc: Greg Kroah-Hartman
> Cc: Pawel Piskorski
> ---
>   drivers/misc/habanalabs/common/memory.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
> index ef89cfa2f95a..94bef8faa82a 100644
> --- a/drivers/misc/habanalabs/common/memory.c
> +++ b/drivers/misc/habanalabs/common/memory.c
> @@ -1288,7 +1288,8 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size,
>  		return -ENOMEM;
>  	}
>
> -	rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
> +	rc = pin_user_pages_fast(start, npages,
> +				 FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
>  				 userptr->pages);
>
>  	if (rc != npages) {
>

Again, from a pin_user_pages_fast() point of view, and not being at all familiar
with the habana driver (but their use of this really does seem clearly _LONGTERM!):

Reviewed-by: John Hubbard 

thanks,
--
John Hubbard
NVIDIA


Re: [PATCH 02/13] drm/exynos: Use FOLL_LONGTERM for g2d cmdlists

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:
> The exynos g2d interface is very unusual, but it looks like the
> userptr objects are persistent. Hence they need FOLL_LONGTERM.
>
> Signed-off-by: Daniel Vetter
> Cc: Jason Gunthorpe
> Cc: Inki Dae
> Cc: Joonyoung Shim
> Cc: Seung-Woo Kim
> Cc: Kyungmin Park
> Cc: Kukjin Kim
> Cc: Krzysztof Kozlowski
> Cc: Andrew Morton
> Cc: John Hubbard
> Cc: Jérôme Glisse
> Cc: Jan Kara
> Cc: Dan Williams
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> ---
>   drivers/gpu/drm/exynos/exynos_drm_g2d.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index c83f6faac9de..514fd000feb1 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -478,7 +478,8 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
>  		goto err_free;
>  	}
>
> -	ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
> +	ret = pin_user_pages_fast(start, npages,
> +				  FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
>  				  g2d_userptr->pages);
>  	if (ret != npages) {
>  		DRM_DEV_ERROR(g2d->dev,
>

Looks good from a pin_user_pages_fast() point of view. I'm of course not an
exynos developer, so we still need a look from one of those, ideally, but:

Reviewed-by: John Hubbard 

thanks,
--
John Hubbard
NVIDIA


Re: [PATCH 03/13] misc/habana: Stop using frame_vector helpers

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:
...
> @@ -1414,15 +1410,10 @@ void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr)
>  			userptr->sgt->nents,
>  			userptr->dir);
>
> -	pages = frame_vector_pages(userptr->vec);
> -	if (!IS_ERR(pages)) {
> -		int i;
> -
> -		for (i = 0; i < frame_vector_count(userptr->vec); i++)
> -			set_page_dirty_lock(pages[i]);
> -	}
> -	put_vaddr_frames(userptr->vec);
> -	frame_vector_destroy(userptr->vec);
> +	for (i = 0; i < userptr->npages; i++)
> +		set_page_dirty_lock(userptr->pages[i]);
> +	unpin_user_pages(userptr->pages, userptr->npages);
> +	kvfree(userptr->pages);

Same thing here as in patch 1: you can further simplify by using
unpin_user_pages_dirty_lock().

>
>  	list_del(&userptr->job_node);
>


thanks,
--
John Hubbard
NVIDIA


Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers

2020-10-07 Thread John Hubbard

On 10/7/20 9:44 AM, Daniel Vetter wrote:
> All we need are a pages array, pin_user_pages_fast can give us that
> directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.
>
> Signed-off-by: Daniel Vetter
> Cc: Jason Gunthorpe
> Cc: Inki Dae
> Cc: Joonyoung Shim
> Cc: Seung-Woo Kim
> Cc: Kyungmin Park
> Cc: Kukjin Kim
> Cc: Krzysztof Kozlowski
> Cc: Andrew Morton
> Cc: John Hubbard
> Cc: Jérôme Glisse
> Cc: Jan Kara
> Cc: Dan Williams
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> ---
>   drivers/gpu/drm/exynos/Kconfig  |  1 -
>   drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 -
>   2 files changed, 22 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
> index 6417f374b923..43257ef3c09d 100644
> --- a/drivers/gpu/drm/exynos/Kconfig
> +++ b/drivers/gpu/drm/exynos/Kconfig
> @@ -88,7 +88,6 @@ comment "Sub-drivers"
>   config DRM_EXYNOS_G2D
>  	bool "G2D"
>  	depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST
> -	select FRAME_VECTOR
>  	help
>  	  Choose this option if you want to use Exynos G2D for DRM.
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 967a5cdc120e..c83f6faac9de 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr {
>  	dma_addr_t		dma_addr;
>  	unsigned long		userptr;
>  	unsigned long		size;
> -	struct frame_vector	*vec;
> +	struct page		**pages;
> +	unsigned int		npages;
>  	struct sg_table		*sgt;
>  	atomic_t		refcount;
>  	bool			in_pool;
> @@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
>  					bool force)
>   {
>  	struct g2d_cmdlist_userptr *g2d_userptr = obj;
> -	struct page **pages;
> +	int i;

The above line can also be deleted, see below.

>
>  	if (!obj)
>  		return;
> @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
>  	dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
>  			  DMA_BIDIRECTIONAL, 0);
>
> -	pages = frame_vector_pages(g2d_userptr->vec);
> -	if (!IS_ERR(pages)) {
> -		int i;
> +	for (i = 0; i < g2d_userptr->npages; i++)
> +		set_page_dirty_lock(g2d_userptr->pages[i]);
>
> -		for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
> -			set_page_dirty_lock(pages[i]);
> -	}
> -	put_vaddr_frames(g2d_userptr->vec);
> -	frame_vector_destroy(g2d_userptr->vec);
> +	unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
> +	kvfree(g2d_userptr->pages);

You can avoid writing your own loop, and just simplify the whole thing down to
two lines:

	unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages,
				    true);
	kvfree(g2d_userptr->pages);

>
>  	if (!g2d_userptr->out_of_list)
>  		list_del_init(&g2d_userptr->list);
> @@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
>  	offset = userptr & ~PAGE_MASK;
>  	end = PAGE_ALIGN(userptr + size);
>  	npages = (end - start) >> PAGE_SHIFT;
> -	g2d_userptr->vec = frame_vector_create(npages);
> -	if (!g2d_userptr->vec) {
> +	g2d_userptr->pages = kvmalloc_array(npages, sizeof(*g2d_userptr->pages),
> +					    GFP_KERNEL);
> +	if (!g2d_userptr->pages) {
>  		ret = -ENOMEM;
>  		goto err_free;
>  	}
>
> -	ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
> -		g2d_userptr->vec);
> +	ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
> +				  g2d_userptr->pages);
>  	if (ret != npages) {
>  		DRM_DEV_ERROR(g2d->dev,
>  			      "failed to get user pages from userptr.\n");
>  		if (ret < 0)
> -			goto err_destroy_framevec;
> -		ret = -EFAULT;
> -		goto err_put_framevec;
> -	}
> -	if (frame_vector_to_pages(g2d_userptr->vec) < 0) {
> +			goto err_destroy_pages;
> +		npages = ret;
>  		ret = -EFAULT;
> -		goto err_put_framevec;
> +		goto err_unpin_pages;
>  	}
> +	g2d_userptr->npages = npages;
>
>  	sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
>  	if (!sgt) {
>  		ret = -ENOMEM;
> -		goto err_put_framevec;
> +		goto err_unpin_pages;
>  	}
>
>  	ret = sg_alloc_table_from_pages(sgt,
> -					frame_vector_pages(g2d_userptr->vec),
> +					g2d_userptr->pages,
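
An illustrative sketch, not part of the thread: the page-count arithmetic in the quoted g2d_userptr_get_dma_addr() hunk can be tried out in userspace. The SKETCH_ macros and the span_ helpers are hypothetical, a 4 KiB page size is assumed, and the rounding-down of the start address is an assumption because that line is not part of the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u	/* assumed: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Round an address up to the next page boundary. */
static uint64_t sketch_page_align(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
}

/* Offset of the user pointer inside its first page. */
static uint64_t span_offset(uint64_t userptr)
{
	return userptr & ~SKETCH_PAGE_MASK;
}

/* Number of pages touched by the byte range [userptr, userptr + size). */
static uint64_t span_npages(uint64_t userptr, uint64_t size)
{
	uint64_t start, end;

	assert(size > 0);
	start = userptr & SKETCH_PAGE_MASK;	/* assumed: round down */
	end = sketch_page_align(userptr + size);
	return (end - start) >> SKETCH_PAGE_SHIFT;
}
```

A 32-byte buffer that starts 16 bytes before a page boundary spans two pages in this model, which is why the pages array is sized from the aligned range rather than from the size alone.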

Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 9:33 PM Dan Williams  wrote:
>
> On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter  wrote:
> >
> > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > the region") /dev/kmem zaps ptes when the kernel requests exclusive
> > access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > the default for all driver uses.
> >
> > Except there's two more ways to access pci bars: sysfs and proc mmap
> > support. Let's plug that hole.
>
> Ooh, yes, let's.
>
> > For revoke_devmem() to work we need to link our vma into the same
> > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > at ->open time, but that's a bit tricky here with all the entry points
> > and arch code. So instead create a fake file and adjust vma->vm_file.
>
> I don't think you want to share the devmem inode for this; this should
> be based off the sysfs inode, of which I believe there is already only one
> instance per resource. In contrast, /dev/mem can have multiple inodes
> because anyone can just mknod a new character device file; the same
> problem does not exist for sysfs.

But then I need to find the right one, plus I also need to find the
right one for the procfs side. That gets messy, and I already have no
idea how to really test this. Shared address_space is the same trick
we're using in drm (where we have multiple things all pointing to the
same underlying resources, through different files), and it gets the
job done. So that's why I figured the shared address_space is the
cleaner solution since then unmap_mapping_range takes care of
iterating over all vma for us. I guess I could reimplement that logic
with our own locking and everything in revoke_devmem, but that feels a bit
silly. But it would also solve the problem of having multiple
different mknod of /dev/kmem with different address_space behind them.
Also because of how remap_pfn_range works, all these vma do use the
same pgoff already anyway.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 11/13] mm: add unsafe_follow_pfn

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 9:00 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 08:10:34PM +0200, Daniel Vetter wrote:
> > On Wed, Oct 7, 2020 at 7:36 PM Jason Gunthorpe  wrote:
> > >
> > > On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote:
> > > > Way back it was a reasonable assumption that iomem mappings never
> > > > change the pfn range they point at. But this has changed:
> > > >
> > > > - gpu drivers dynamically manage their memory nowadays, invalidating
> > > > ptes with unmap_mapping_range when buffers get moved
> > > >
> > > > - contiguous dma allocations have moved from dedicated carveouts to
> > > > cma regions. This means if we miss the unmap the pfn might contain
> > > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
> > > >
> > > > - even /dev/mem now invalidates mappings when the kernel requests that
> > > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
> > > > ("/dev/mem: Revoke mappings when a driver claims the region")
> > > >
> > > > Accessing pfns obtained from ptes without holding all the locks is
> > > > therefore no longer a good idea.
> > > >
> > > > Unfortunately there's some users where this is not fixable (like v4l
> > > > userptr of iomem mappings) or involves a pile of work (vfio type1
> > > > iommu). For now annotate these as unsafe and splat appropriately.
> > > >
> > > > This patch adds an unsafe_follow_pfn, which later patches will then
> > > > roll out to all appropriate places.
> > > >
> > > > Signed-off-by: Daniel Vetter 
> > > > Cc: Jason Gunthorpe 
> > > > Cc: Kees Cook 
> > > > Cc: Dan Williams 
> > > > Cc: Andrew Morton 
> > > > Cc: John Hubbard 
> > > > Cc: Jérôme Glisse 
> > > > Cc: Jan Kara 
> > > > Cc: Dan Williams 
> > > > Cc: linux...@kvack.org
> > > > Cc: linux-arm-ker...@lists.infradead.org
> > > > Cc: linux-samsung-...@vger.kernel.org
> > > > Cc: linux-me...@vger.kernel.org
> > > > Cc: k...@vger.kernel.org
> > > >  include/linux/mm.h |  2 ++
> > > >  mm/memory.c| 32 +++-
> > > >  mm/nommu.c | 17 +
> > > >  security/Kconfig   | 13 +
> > > >  4 files changed, 63 insertions(+), 1 deletion(-)
> > >
> > > Makes sense to me.
> > >
> > > I wonder if we could change the original follow_pfn to require the
> > > ptep and then lockdep_assert_held() it against the page table lock?
> >
> > The safe variant with the pagetable lock is follow_pte_pmd. The only
> > way to make follow_pfn safe is if you have an mmu notifier and
> > corresponding retry logic. That is not covered by lockdep (it would
> > splat if we annotate the retry side), so I'm not sure how you'd check
> > for that?
>
> Right OK.
>
> > Checking for ptep lock doesn't work here, since the one leftover safe
> > user of this (kvm) doesn't need that at all, because it has the mmu
> > notifier.
>
> Ah, so a better name and/or function kdoc for follow_pfn is probably a
> good idea in this patch as well.

I did change that already to mention that you need an mmu notifier,
and that follow_pte_pmd or unsafe_follow_pfn are the alternatives.
Do you want more or something else here?

Note that I left the kerneldoc for the nommu.c case unchanged, since
without an mmu all bets are off anyway.

> > So I think we're as good as it gets, since I really have no idea how
> > to make sure follow_pfn callers do have an mmu notifier registered.
>
> Yah, can't be done. Most mmu notifier users should be using
> hmm_range_fault anyhow, kvm is really very special here.

We could pass an mmu notifier to follow_pfn and check that it has a
registration for vma->vm_mm, but that feels like overkill when kvm is
the only legit user for this.

> > I've followed the few other CONFIG_STRICT_FOO I've seen, which are all
> > explicit enables and default to "do not break uapi, damn the
> > (security) bugs". Which is I think how this should be done. It is in
> > the security section though, so hopefully competent distros will
> > enable this all.
>
> I thought the strict ones were more general and less clear security
> worries, not bugs like this.
>
> This is "allow a user triggerable use after free bug to exist in the
> kernel"

Since at most you get at GFP_MOVEABLE stuff I'm not sure you can use
this to pull the kernel over the table. Maybe best way is if you get a
gpu pagetable somehow into your pfn and then use that to access
arbitrary stuff, but there's still an iommu. I think leveraging this is
going to be very tricky, and pretty much has to be device or driver
specific somehow.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Dan Williams
On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter  wrote:
>
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/mem zaps ptes when the kernel requests exclusive
> access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.
>
> Except there's two more ways to access pci bars: sysfs and proc mmap
> support. Let's plug that hole.

Ooh, yes, let's.

>
> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> at ->open time, but that's a bit tricky here with all the entry points
> and arch code. So instead create a fake file and adjust vma->vm_file.

I don't think you want to share the devmem inode for this, this should
be based off the sysfs inode which I believe there is already only one
instance per resource. In contrast /dev/mem can have multiple inodes
because anyone can just mknod a new character device file, the same
problem does not exist for sysfs.


Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 8:41 PM Bjorn Helgaas  wrote:
>
> Capitalize subject, like other patches in this series and previous
> drivers/pci history.
>
> On Wed, Oct 07, 2020 at 06:44:23PM +0200, Daniel Vetter wrote:
> > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> > the region") /dev/mem zaps ptes when the kernel requests exclusive
> > access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> > the default for all driver uses.
> >
> > Except there's two more ways to access pci bars: sysfs and proc mmap
> > support. Let's plug that hole.
>
> s/pci/PCI/ in commit logs and comments.
>
> > For revoke_devmem() to work we need to link our vma into the same
> > address_space, with consistent vma->vm_pgoff. ->pgoff is already
> > adjusted, because that's how (io_)remap_pfn_range works, but for the
> > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> > at ->open time, but that's a bit tricky here with all the entry points
> > and arch code. So instead create a fake file and adjust vma->vm_file.
> >
> > Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that
> > seems to be a subset of architectures support STRICT_DEVMEM, so we
> > should be good.
> >
> > The only difference in access checks left is that sysfs pci mmap does
> > not check for CAP_RAWIO. But I think that makes some sense compared to
> > /dev/mem and proc, where one file gives you access to everything and
> > no ownership applies.
>
> > --- a/drivers/char/mem.c
> > +++ b/drivers/char/mem.c
> > @@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t 
> > offset, int orig)
> >  }
> >
> >  static struct inode *devmem_inode;
> > +static struct vfsmount *devmem_vfs_mount;
> >
> >  #ifdef CONFIG_IO_STRICT_DEVMEM
> >  void revoke_devmem(struct resource *res)
> > @@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res)
> >
> >   unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 
> > 1);
> >  }
> > +
> > +struct file *devmem_getfile(void)
> > +{
> > + struct file *file;
> > +
> > + file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem",
> > +  O_RDWR, &kmem_fops);
> > + if (IS_ERR(file))
> > + return NULL;
> > +
> > + file->f_mapping = devmem_indoe->i_mapping;
>
> "devmem_indoe"?  Obviously not compiled, I guess?

Yeah apologies, I forgot to compile this with CONFIG_IO_STRICT_DEVMEM
set. The entire series is more of an rfc about the overall problem really, I
also need to figure out how to even test this somehow. I guess there's
nothing really ready made here?
-Daniel

> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct 
> > resource *base,
> >
> >  #ifdef CONFIG_IO_STRICT_DEVMEM
> >  void revoke_devmem(struct resource *res);
> > +struct file *devm_getfile(void);
> >  #else
> >  static inline void revoke_devmem(struct resource *res) { };
> > +static inline struct file *devmem_getfile(void) { return NULL; };
>
> I guess these names are supposed to match?
>
> >  #endif
> >
> >  #endif /* __ASSEMBLY__ */
> > --
> > 2.28.0
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 09/13] PCI: obey iomem restrictions for procfs mmap

2020-10-07 Thread Bjorn Helgaas
On Wed, Oct 07, 2020 at 06:44:22PM +0200, Daniel Vetter wrote:
> There's three ways to access pci bars from userspace: /dev/mem, sysfs
> files, and the old proc interface. Two check against
> iomem_is_exclusive, proc never did. And with CONFIG_IO_STRICT_DEVMEM,
> this starts to matter, since we don't want random userspace having
> access to pci bars while a driver is loaded and using it.
> 
> Fix this.

Please mention *how* you're fixing this.  I know you can sort of
deduce it from the first paragraph, but it's easy to save readers the
trouble.

s/pci/PCI/
s/bars/BARs/
Capitalize subject to match other patches.

> References: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
> Signed-off-by: Daniel Vetter 
> Cc: Jason Gunthorpe 
> Cc: Kees Cook 
> Cc: Dan Williams 
> Cc: Andrew Morton 
> Cc: John Hubbard 
> Cc: Jérôme Glisse 
> Cc: Jan Kara 
> Cc: Dan Williams 
> Cc: linux...@kvack.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-samsung-...@vger.kernel.org
> Cc: linux-me...@vger.kernel.org
> Cc: Bjorn Helgaas 
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/pci/proc.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
> index d35186b01d98..3a2f90beb4cb 100644
> --- a/drivers/pci/proc.c
> +++ b/drivers/pci/proc.c
> @@ -274,6 +274,11 @@ static int proc_bus_pci_mmap(struct file *file, struct 
> vm_area_struct *vma)
>   else
>   return -EINVAL;
>   }
> +
> + if (dev->resource[i].flags & IORESOURCE_MEM &&
> + iomem_is_exclusive(dev->resource[i].start))
> + return -EINVAL;
> +
>   ret = pci_mmap_page_range(dev, i, vma,
> fpriv->mmap_state, write_combine);
>   if (ret < 0)
> -- 
> 2.28.0
> 


Re: [PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Bjorn Helgaas
Capitalize subject, like other patches in this series and previous
drivers/pci history.

On Wed, Oct 07, 2020 at 06:44:23PM +0200, Daniel Vetter wrote:
> Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
> the region") /dev/mem zaps ptes when the kernel requests exclusive
> access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
> the default for all driver uses.
> 
> Except there's two more ways to access pci bars: sysfs and proc mmap
> support. Let's plug that hole.

s/pci/PCI/ in commit logs and comments.

> For revoke_devmem() to work we need to link our vma into the same
> address_space, with consistent vma->vm_pgoff. ->pgoff is already
> adjusted, because that's how (io_)remap_pfn_range works, but for the
> mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
> at ->open time, but that's a bit tricky here with all the entry points
> and arch code. So instead create a fake file and adjust vma->vm_file.
> 
> Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that
> seems to be a subset of architectures support STRICT_DEVMEM, so we
> should be good.
> 
> The only difference in access checks left is that sysfs pci mmap does
> not check for CAP_RAWIO. But I think that makes some sense compared to
> /dev/mem and proc, where one file gives you access to everything and
> no ownership applies.

> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t 
> offset, int orig)
>  }
>  
>  static struct inode *devmem_inode;
> +static struct vfsmount *devmem_vfs_mount;
>  
>  #ifdef CONFIG_IO_STRICT_DEVMEM
>  void revoke_devmem(struct resource *res)
> @@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res)
>  
>   unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 
> 1);
>  }
> +
> +struct file *devmem_getfile(void)
> +{
> + struct file *file;
> +
> + file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem",
> +  O_RDWR, &kmem_fops);
> + if (IS_ERR(file))
> + return NULL;
> +
> + file->f_mapping = devmem_indoe->i_mapping;

"devmem_indoe"?  Obviously not compiled, I guess?

> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct resource 
> *base,
>  
>  #ifdef CONFIG_IO_STRICT_DEVMEM
>  void revoke_devmem(struct resource *res);
> +struct file *devm_getfile(void);
>  #else
>  static inline void revoke_devmem(struct resource *res) { };
> +static inline struct file *devmem_getfile(void) { return NULL; };

I guess these names are supposed to match?

>  #endif
>  
>  #endif /* __ASSEMBLY__ */
> -- 
> 2.28.0
> 


Re: [PATCH 13/13] vfio/type1: Mark follow_pfn as unsafe

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 7:39 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 06:44:26PM +0200, Daniel Vetter wrote:
> > The code seems to stuff these pfns into iommu pts (or something like
> > that, I didn't follow), but there's no mmu_notifier to ensure that
> > access is synchronized with pte updates.
> >
> > Hence mark these as unsafe. This means that with
> > CONFIG_STRICT_FOLLOW_PFN, these will be rejected.
> >
> > Real fix is to wire up an mmu_notifier ... somehow. Probably means any
> > invalidate is a fatal fault for this vfio device, but then this
> > shouldn't ever happen if userspace is reasonable.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Jason Gunthorpe 
> > Cc: Kees Cook 
> > Cc: Dan Williams 
> > Cc: Andrew Morton 
> > Cc: John Hubbard 
> > Cc: Jérôme Glisse 
> > Cc: Jan Kara 
> > Cc: Dan Williams 
> > Cc: linux...@kvack.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-samsung-...@vger.kernel.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: Alex Williamson 
> > Cc: Cornelia Huck 
> > Cc: k...@vger.kernel.org
> > ---
> >  drivers/vfio/vfio_iommu_type1.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vfio/vfio_iommu_type1.c 
> > b/drivers/vfio/vfio_iommu_type1.c
> > index 5fbf0c1f7433..a4d53f3d0a35 100644
> > --- a/drivers/vfio/vfio_iommu_type1.c
> > +++ b/drivers/vfio/vfio_iommu_type1.c
> > @@ -421,7 +421,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, 
> > struct mm_struct *mm,
> >  {
> >   int ret;
> >
> > - ret = follow_pfn(vma, vaddr, pfn);
> > + ret = unsafe_follow_pfn(vma, vaddr, pfn);
> >   if (ret) {
> >   bool unlocked = false;
> >
> > @@ -435,7 +435,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, 
> > struct mm_struct *mm,
> >   if (ret)
> >   return ret;
> >
> > - ret = follow_pfn(vma, vaddr, pfn);
> > + ret = unsafe_follow_pfn(vma, vaddr, pfn);
> >   }
>
> This is actually being commonly used, so it needs fixing.
>
> When I talked to Alex about this last we had worked out a patch series
> that adds a test on vm_ops that the vma came from vfio in the first
> place. The VMA's created by VFIO are 'safe' as the PTEs are never changed.

Hm, but wouldn't that need the semi-nasty vma_open trick to make sure
that vma doesn't untimely disappear? Or is the idea to look up the
underlying vfio object, and refcount that directly?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 11/13] mm: add unsafe_follow_pfn

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 7:36 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote:
> > Way back it was a reasonable assumption that iomem mappings never
> > change the pfn range they point at. But this has changed:
> >
> > - gpu drivers dynamically manage their memory nowadays, invalidating
> > ptes with unmap_mapping_range when buffers get moved
> >
> > - contiguous dma allocations have moved from dedicated carveouts to
> > cma regions. This means if we miss the unmap the pfn might contain
> > pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
> >
> > - even /dev/mem now invalidates mappings when the kernel requests that
> > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
> > ("/dev/mem: Revoke mappings when a driver claims the region")
> >
> > Accessing pfns obtained from ptes without holding all the locks is
> > therefore no longer a good idea.
> >
> > Unfortunately there's some users where this is not fixable (like v4l
> > userptr of iomem mappings) or involves a pile of work (vfio type1
> > iommu). For now annotate these as unsafe and splat appropriately.
> >
> > This patch adds an unsafe_follow_pfn, which later patches will then
> > roll out to all appropriate places.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: Jason Gunthorpe 
> > Cc: Kees Cook 
> > Cc: Dan Williams 
> > Cc: Andrew Morton 
> > Cc: John Hubbard 
> > Cc: Jérôme Glisse 
> > Cc: Jan Kara 
> > Cc: Dan Williams 
> > Cc: linux...@kvack.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-samsung-...@vger.kernel.org
> > Cc: linux-me...@vger.kernel.org
> > Cc: k...@vger.kernel.org
> > ---
> >  include/linux/mm.h |  2 ++
> >  mm/memory.c| 32 +++-
> >  mm/nommu.c | 17 +
> >  security/Kconfig   | 13 +
> >  4 files changed, 63 insertions(+), 1 deletion(-)
>
> Makes sense to me.
>
> I wonder if we could change the original follow_pfn to require the
> ptep and then lockdep_assert_held() it against the page table lock?

The safe variant with the pagetable lock is follow_pte_pmd. The only
way to make follow_pfn safe is if you have an mmu notifier and
corresponding retry logic. That is not covered by lockdep (it would
splat if we annotate the retry side), so I'm not sure how you'd check
for that?

Checking for ptep lock doesn't work here, since the one leftover safe
user of this (kvm) doesn't need that at all, because it has the mmu
notifier.

Also follow_pte_pmd will splat with lockdep if you get it wrong, since
the function leaves you with the right ptlock lock when it returns. If
you forget to unlock that, lockdep will complain.

So I think we're as good as it gets, since I really have no idea how
to make sure follow_pfn callers do have an mmu notifier registered.
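
For readers who have not seen the "mmu notifier plus retry" pattern
referred to above, here is a minimal userspace model of the idea. It is
only a sketch: every name in it (sketch_mapping, sketch_read_begin and so
on) is hypothetical and not a kernel API, and the sequence counter stands
in for whatever the notifier callbacks of a real user would maintain. A
real user would also have to consume the pfn under a lock that the
invalidation side takes; the model only shows why a racing lookup has to
be retried.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct sketch_mapping {
	atomic_ulong invalidate_seq;	/* odd while an invalidation is running */
	atomic_ulong pfn;		/* where the "pte" currently points */
};

static void sketch_mapping_init(struct sketch_mapping *m, unsigned long pfn)
{
	atomic_init(&m->invalidate_seq, 0);
	atomic_init(&m->pfn, pfn);
}

/* Invalidation side: what the notifier callbacks would bracket. */
static void sketch_mapping_move(struct sketch_mapping *m, unsigned long new_pfn)
{
	atomic_fetch_add(&m->invalidate_seq, 1);	/* begin: seq becomes odd */
	atomic_store(&m->pfn, new_pfn);
	atomic_fetch_add(&m->invalidate_seq, 1);	/* end: seq is even again */
}

static unsigned long sketch_read_begin(struct sketch_mapping *m)
{
	return atomic_load(&m->invalidate_seq);
}

/* True if a pfn sampled since sketch_read_begin() may be stale. */
static bool sketch_read_retry(struct sketch_mapping *m, unsigned long seq)
{
	return (seq & 1UL) || atomic_load(&m->invalidate_seq) != seq;
}

/* Lookup side: keep retrying until no invalidation raced with us. */
static unsigned long sketch_lookup_pfn(struct sketch_mapping *m)
{
	unsigned long seq, pfn;

	do {
		seq = sketch_read_begin(m);
		pfn = atomic_load(&m->pfn);
	} while (sketch_read_retry(m, seq));

	return pfn;
}

static void sketch_retry_demo(void)
{
	struct sketch_mapping m;
	unsigned long seq;

	sketch_mapping_init(&m, 100);
	seq = sketch_read_begin(&m);
	sketch_mapping_move(&m, 200);		/* buffer moved under the reader */
	assert(sketch_read_retry(&m, seq));	/* so the reader has to retry */
	assert(sketch_lookup_pfn(&m) == 200);
}
```

Nothing in this pattern is a lock in the lockdep sense, which is the
reason given above for why a lockdep_assert_held() style check cannot
cover it.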

> > +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
> > + unsigned long *pfn)
> > +{
> > +#ifdef CONFIG_STRICT_FOLLOW_PFN
> > + pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n");
>
> Wonder if we can print something useful here, like the current
> PID/process name?

Yeah adding comm/pid here makes sense.

> > diff --git a/security/Kconfig b/security/Kconfig
> > index 7561f6f99f1d..48945402e103 100644
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -230,6 +230,19 @@ config STATIC_USERMODEHELPER_PATH
> > If you wish for all usermode helper programs to be disabled,
> > specify an empty string here (i.e. "").
> >
> > +config STRICT_FOLLOW_PFN
> > + bool "Disable unsafe use of follow_pfn"
> > + depends on MMU
>
> I would probably invert this CONFIG_ALLOW_UNSAFE_FOLLOW_PFN
> default n

I've followed the few other CONFIG_STRICT_FOO I've seen, which are all
explicit enables and default to "do not break uapi, damn the
(security) bugs". Which is I think how this should be done. It is in
the security section though, so hopefully competent distros will
enable this all.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 07/13] mm: close race in generic_access_phys

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 7:27 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 06:44:20PM +0200, Daniel Vetter wrote:
> > Way back it was a reasonable assumption that iomem mappings never
> > change the pfn range they point at. But this has changed:
> >
> > - gpu drivers dynamically manage their memory nowadays, invalidating
> >   ptes with unmap_mapping_range when buffers get moved
> >
> > - contiguous dma allocations have moved from dedicated carveouts to
> >   cma regions. This means if we miss the unmap the pfn might contain
> >   pagecache or anon memory (well anything allocated with GFP_MOVEABLE)
> >
> > - even /dev/mem now invalidates mappings when the kernel requests that
> >   iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
> >   ("/dev/mem: Revoke mappings when a driver claims the region")
> >
> > Accessing pfns obtained from ptes without holding all the locks is
> > therefore no longer a good idea. Fix this.
> >
> > Since ioremap might need to manipulate pagetables too we need to drop
> > the pt lock and have a retry loop if we raced.
> >
> > While at it, also add kerneldoc and improve the comment for the
> > vma_ops->access function. It's for accessing, not for moving the
> > memory from iomem to system memory, as the old comment seemed to
> > suggest.
> >
> > References: 28b2ee20c7cb ("access_process_vm device memory infrastructure")
> > Cc: Jason Gunthorpe 
> > Cc: Dan Williams 
> > Cc: Kees Cook 
> > Cc: Rik van Riel 
> > Cc: Benjamin Herrensmidt 
> > Cc: Dave Airlie 
> > Cc: Hugh Dickins 
> > Cc: Andrew Morton 
> > Cc: John Hubbard 
> > Cc: Jérôme Glisse 
> > Cc: Jan Kara 
> > Cc: Dan Williams 
> > Cc: linux...@kvack.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-samsung-...@vger.kernel.org
> > Cc: linux-me...@vger.kernel.org
> > Signed-off-by: Daniel Vetter 
> > ---
> >  include/linux/mm.h |  3 ++-
> >  mm/memory.c| 44 ++--
> >  2 files changed, 44 insertions(+), 3 deletions(-)
>
> This does seem to solve the race with revoke_devmem(), but it is really ugly.
>
> It would be much nicer to wrap a rwsem around this access and the unmap.
>
> Any place using it has a nice linear translation from vm_off to pfn,
> so I don't think there is such a good reason to use follow_pte in
> the first place.
>
> ie why not the helper be this:
>
>  int generic_access_phys(unsigned long pfn, unsigned long pgprot,
>   void *buf, size_t len, bool write)
>
> Then something like dev/mem would compute pfn and obtain the lock:
>
> dev_access(struct vm_area_struct *vma, unsigned long addr, void *buf,
>            int len, int write)
> {
>  cpu_addr = vma->vm_pgoff*PAGE_SIZE + (addr - vma->vm_start);
>
>  /* FIXME: Has to be over each page of len */
>  if (!devmem_is_allowed_access(PHYS_PFN(cpu_addr/4096)))
>return -EPERM;
>
>  down_read(&mem_sem);
>  generic_access_phys(cpu_addr/4096, pgprot_val(vma->vm_page_prot),
>  buf, len, write);
>  up_read(&mem_sem);
> }
>
> The other cases looked simpler because they don't revoke, here the
> mmap_sem alone should be enough protection, they would just need to
> provide the linear translation to pfn.
>
> What do you think?

I think it'd fix the bug, until someone wires ->access up for
drivers/gpu, or the next subsystem. This is also just for ptrace, so
we really don't care when we stall the vm badly and other silly
things. So I figured the somewhat ugly, but fully generic solution is
the better one, so that people who want to be able to ptrace
read/write their iomem mmaps can just sprinkle this wherever they feel
like.

But yeah if we go with the most minimal fix, i.e. only trying to fix the
current users, then your thing should work and is simpler. But it
leaves the door open for future problems.
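
The linear vm_pgoff to pfn translation from the quoted proposal,
including the per-page walk its FIXME asks for, could look roughly like
the userspace sketch below. All names are hypothetical stand-ins
(sketch_vma only mirrors the two vm_area_struct fields used, and a 4 KiB
page size is assumed); it illustrates the arithmetic, not the locking.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used here. */
struct sketch_vma {
	unsigned long vm_start;	/* page-aligned start of the mapping */
	unsigned long vm_pgoff;	/* pfn that vm_start maps to */
};

/* Linear translation of an address inside the vma to a pfn. */
static unsigned long sketch_addr_to_pfn(const struct sketch_vma *vma,
					unsigned long addr)
{
	return vma->vm_pgoff + ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT);
}

/*
 * Call check() once for every page touched by [addr, addr + len).
 * Returns 0, or the first non-zero value returned by check().
 */
static int sketch_for_each_pfn(const struct sketch_vma *vma,
			       unsigned long addr, size_t len,
			       int (*check)(unsigned long pfn, void *ctx),
			       void *ctx)
{
	unsigned long pfn, last;
	int ret;

	assert(len > 0 && addr >= vma->vm_start);

	last = sketch_addr_to_pfn(vma, addr + len - 1);
	for (pfn = sketch_addr_to_pfn(vma, addr); pfn <= last; pfn++) {
		ret = check(pfn, ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example callback: count the pages instead of checking permissions. */
static int sketch_count_pfn(unsigned long pfn, void *ctx)
{
	(void)pfn;
	(*(int *)ctx)++;
	return 0;
}
```

A permission check in the style of the quoted devmem_is_allowed_access()
call would be passed in place of sketch_count_pfn, so that an access
straddling a page boundary gets both pages checked.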
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v3 09/20] gpu: host1x: DMA fences and userspace fence creation

2020-10-07 Thread Mikko Perttunen
Add an implementation of dma_fences based on syncpoints. Syncpoint
interrupts are used to signal fences. Additionally, after
software signaling has been enabled, a 30 second timeout is started.
If the syncpoint threshold is not reached within this period,
the fence is signalled with an -ETIMEDOUT error code. This is to
allow fences that would never reach their syncpoint threshold to
be cleaned up.

Additionally, add a new /dev/host1x IOCTL for creating sync_file
file descriptors backed by syncpoint fences.

Signed-off-by: Mikko Perttunen 
---
v3:
* Move declaration of host1x_fence_extract to public header
---
 drivers/gpu/host1x/Makefile |   1 +
 drivers/gpu/host1x/fence.c  | 207 
 drivers/gpu/host1x/fence.h  |  13 +++
 drivers/gpu/host1x/intr.c   |   9 ++
 drivers/gpu/host1x/intr.h   |   2 +
 drivers/gpu/host1x/uapi.c   | 106 ++
 include/linux/host1x.h  |   4 +
 7 files changed, 342 insertions(+)
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers/gpu/host1x/fence.h

diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile
index 882f928d75e1..a48af2cefae1 100644
--- a/drivers/gpu/host1x/Makefile
+++ b/drivers/gpu/host1x/Makefile
@@ -10,6 +10,7 @@ host1x-y = \
debug.o \
mipi.o \
uapi.o \
+   fence.o \
hw/host1x01.o \
hw/host1x02.o \
hw/host1x04.o \
diff --git a/drivers/gpu/host1x/fence.c b/drivers/gpu/host1x/fence.c
new file mode 100644
index ..400da6c1ab48
--- /dev/null
+++ b/drivers/gpu/host1x/fence.c
@@ -0,0 +1,207 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Syncpoint dma_fence implementation
+ *
+ * Copyright (c) 2020, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "intr.h"
+#include "syncpt.h"
+
+DEFINE_SPINLOCK(lock);
+
+struct host1x_syncpt_fence {
+   struct dma_fence base;
+
+   atomic_t signaling;
+
+   struct host1x_syncpt *sp;
+   u32 threshold;
+
+   struct host1x_waitlist *waiter;
+   void *waiter_ref;
+
+   struct delayed_work timeout_work;
+};
+
+static const char *syncpt_fence_get_driver_name(struct dma_fence *f)
+{
+   return "host1x";
+}
+
+static const char *syncpt_fence_get_timeline_name(struct dma_fence *f)
+{
+   return "syncpoint";
+}
+
+static bool syncpt_fence_enable_signaling(struct dma_fence *f)
+{
+   struct host1x_syncpt_fence *sf =
+   container_of(f, struct host1x_syncpt_fence, base);
+   int err;
+
+   if (host1x_syncpt_is_expired(sf->sp, sf->threshold))
+   return false;
+
+   dma_fence_get(f);
+
+   /*
+* The dma_fence framework requires the fence driver to keep a
+* reference to any fences for which 'enable_signaling' has been
+* called (and that have not been signalled).
+* 
+* We provide a userspace API to create arbitrary syncpoint fences,
+* so we cannot normally guarantee that all fences get signalled.
+* As such, setup a timeout, so that long-lasting fences will get
+* reaped eventually.
+*/
+   schedule_delayed_work(&sf->timeout_work, msecs_to_jiffies(30000));
+
+   err = host1x_intr_add_action(sf->sp->host, sf->sp, sf->threshold,
+HOST1X_INTR_ACTION_SIGNAL_FENCE, f,
+sf->waiter, &sf->waiter_ref);
+   if (err) {
+   cancel_delayed_work_sync(&sf->timeout_work);
+   dma_fence_put(f);
+   return false;
+   }
+
+   /* intr framework takes ownership of waiter */
+   sf->waiter = NULL;
+
+   /*
+* The fence may get signalled at any time after the above call,
+* so we need to initialize all state used by signalling
+* before it.
+*/
+
+   return true;
+}
+
+static void syncpt_fence_release(struct dma_fence *f)
+{
+   struct host1x_syncpt_fence *sf =
+   container_of(f, struct host1x_syncpt_fence, base);
+
+   if (sf->waiter)
+   kfree(sf->waiter);
+
+   dma_fence_free(f);
+}
+
+const struct dma_fence_ops syncpt_fence_ops = {
+   .get_driver_name = syncpt_fence_get_driver_name,
+   .get_timeline_name = syncpt_fence_get_timeline_name,
+   .enable_signaling = syncpt_fence_enable_signaling,
+   .release = syncpt_fence_release,
+};
+
+void host1x_fence_signal(struct host1x_syncpt_fence *f)
+{
+   if (atomic_xchg(&f->signaling, 1))
+   return;
+
+   /*
+* Cancel pending timeout work - if it races, it will
+* not get 'f->signaling' and return.
+*/
+   cancel_delayed_work_sync(&f->timeout_work);
+
+   host1x_intr_put_ref(f->sp->host, f->sp->id, f->waiter_ref);
+
+   dma_fence_signal(&f->base);
+   dma_fence_put(&f->base);
+}
+
+static void do_fence_timeout(struct work_struct *work)
+{
+   struct delayed_work *dwork = (struct delayed_work 
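
The way host1x_fence_signal() above uses atomic_xchg() to decide between
the interrupt path and the timeout path can be modelled in userspace with
C11 atomics. This is only a sketch under assumptions: the struct and
function names are hypothetical, and the timeout handler's behaviour is
inferred from the commit message (signal with -ETIMEDOUT), since the
patch text is cut off before that function.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the fence state; not the driver's structure. */
struct sketch_fence {
	atomic_int signaling;	/* 0 until one path has claimed the fence */
	int error;		/* 0, or a negative errno value */
	int times_signalled;
};

static void sketch_fence_init(struct sketch_fence *f)
{
	atomic_init(&f->signaling, 0);
	f->error = 0;
	f->times_signalled = 0;
}

/* Exactly one caller ever sees the old value 0 and wins the claim. */
static bool sketch_fence_claim(struct sketch_fence *f)
{
	return atomic_exchange(&f->signaling, 1) == 0;
}

/* Syncpoint threshold reached: signal without an error. */
static void sketch_fence_signal(struct sketch_fence *f)
{
	if (!sketch_fence_claim(f))
		return;
	f->times_signalled++;
}

/* Timeout expired first: signal with -ETIMEDOUT (assumed behaviour). */
static void sketch_fence_timeout(struct sketch_fence *f)
{
	if (!sketch_fence_claim(f))
		return;
	f->error = -ETIMEDOUT;
	f->times_signalled++;
}

static void sketch_fence_demo(void)
{
	struct sketch_fence f;

	sketch_fence_init(&f);
	sketch_fence_signal(&f);
	sketch_fence_timeout(&f);	/* loses the claim, does nothing */
	assert(f.times_signalled == 1 && f.error == 0);
}
```

Whichever path runs second returns early, so the fence is signalled and
its reference dropped only once.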

[PATCH v3 07/20] gpu: host1x: Introduce UAPI header

2020-10-07 Thread Mikko Perttunen
Add the userspace interface header, specifying interfaces
for allocating and accessing syncpoints from userspace,
and for creating sync_file based fences based on syncpoint
thresholds.

Signed-off-by: Mikko Perttunen 
---
 include/uapi/linux/host1x.h | 134 
 1 file changed, 134 insertions(+)
 create mode 100644 include/uapi/linux/host1x.h

diff --git a/include/uapi/linux/host1x.h b/include/uapi/linux/host1x.h
new file mode 100644
index ..9c8fb9425cb2
--- /dev/null
+++ b/include/uapi/linux/host1x.h
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright (c) 2020 NVIDIA Corporation */
+
+#ifndef _UAPI__LINUX_HOST1X_H
+#define _UAPI__LINUX_HOST1X_H
+
+#include 
+#include 
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+struct host1x_allocate_syncpoint {
+   /**
+* @fd: [out]
+*
+* New file descriptor representing the allocated syncpoint.
+*/
+   __s32 fd;
+
+   __u32 reserved[3];
+};
+
+struct host1x_syncpoint_info {
+   /**
+* @id: [out]
+*
+* System-global ID of the syncpoint.
+*/
+   __u32 id;
+
+   __u32 reserved[3];
+};
+
+struct host1x_syncpoint_increment {
+   /**
+* @count: [in]
+*
+* Number of times to increment the syncpoint. The syncpoint can
+* be observed at in-between values, but each increment is atomic.
+*/
+   __u32 count;
+};
+
+struct host1x_read_syncpoint {
+   /**
+* @id: [in]
+*
+* ID of the syncpoint to read.
+*/
+   __u32 id;
+
+   /**
+* @value: [out]
+*
+* Current value of the syncpoint.
+*/
+   __u32 value;
+};
+
+struct host1x_create_fence {
+   /**
+* @id: [in]
+*
+* ID of the syncpoint to create a fence for.
+*/
+   __u32 id;
+
+   /**
+* @threshold: [in]
+*
+* When the syncpoint reaches this value, the fence will be signaled.
+* The syncpoint is considered to have reached the threshold when the
+* following condition is true:
+*
+*  ((value - threshold) & 0x80000000U) == 0U
+*
+*/
+   __u32 threshold;
+
+   /**
+* @fence_fd: [out]
+*
+* New sync_file file descriptor containing the created fence.
+*/
+   __s32 fence_fd;
+
+   __u32 reserved[1];
+};
+
+struct host1x_fence_extract_fence {
+   __u32 id;
+   __u32 threshold;
+};
+
+struct host1x_fence_extract {
+   /**
+* @fence_fd: [in]
+*
+* sync_file file descriptor
+*/
+   __s32 fence_fd;
+
+   /**
+* @num_fences: [in,out]
+*
+* In: size of the `fences_ptr` array counted in elements.
+* Out: required size of the `fences_ptr` array counted in elements.
+*/
+   __u32 num_fences;
+
+   /**
+* @fences_ptr: [in]
+*
+* Pointer to array of `struct host1x_fence_extract_fence`.
+*/
+   __u64 fences_ptr;
+
+   __u32 reserved[2];
+};
+
+#define HOST1X_IOCTL_ALLOCATE_SYNCPOINT  _IOWR('X', 0x00, struct host1x_allocate_syncpoint)
+#define HOST1X_IOCTL_READ_SYNCPOINT      _IOR ('X', 0x01, struct host1x_read_syncpoint)
+#define HOST1X_IOCTL_CREATE_FENCE        _IOWR('X', 0x02, struct host1x_create_fence)
+#define HOST1X_IOCTL_SYNCPOINT_INFO      _IOWR('X', 0x03, struct host1x_syncpoint_info)
+#define HOST1X_IOCTL_SYNCPOINT_INCREMENT _IOWR('X', 0x04, struct host1x_syncpoint_increment)
+#define HOST1X_IOCTL_FENCE_EXTRACT       _IOWR('X', 0x05, struct host1x_fence_extract)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif
-- 
2.28.0
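
The threshold condition documented for host1x_create_fence is a
wraparound-safe comparison of 32-bit counters. A small userspace sketch
of it follows; it assumes the mask is the 32-bit sign bit (0x80000000U),
and the helper names are hypothetical, not part of the proposed UAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True once 'value' has reached 'threshold', treating both as points on
 * a wrapping 32-bit counter: the unsigned difference has its top bit
 * clear when value is at most 2^31 - 1 increments past the threshold.
 */
static bool sketch_syncpt_reached(uint32_t value, uint32_t threshold)
{
	return ((uint32_t)(value - threshold) & 0x80000000U) == 0U;
}

static void sketch_syncpt_selfcheck(void)
{
	assert(sketch_syncpt_reached(10, 5));		/* already past */
	assert(sketch_syncpt_reached(5, 5));		/* exactly reached */
	assert(!sketch_syncpt_reached(5, 10));		/* still pending */
	/* Counter wrapped past 0xffffffff: threshold counts as reached. */
	assert(sketch_syncpt_reached(2, 0xfffffff0U));
}
```

A plain `value >= threshold` would report the last case as pending,
which is why the condition is expressed through the difference.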



[PATCH v3 11/20] gpu: host1x: Add job release callback

2020-10-07 Thread Mikko Perttunen
Add a callback field to the job structure, to be called just before
the job is to be freed. This allows the job's submitter to clean
up any of its own state, like decrement runtime PM refcounts.

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/host1x/job.c | 3 +++
 include/linux/host1x.h   | 4 
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index e4f16fc899b0..acf322beb56c 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
 {
struct host1x_job *job = container_of(ref, struct host1x_job, ref);
 
+   if (job->release)
+   job->release(job);
+
if (job->waiter)
host1x_intr_put_ref(job->syncpt->host, job->syncpt->id,
job->waiter);
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index fb62cc8b77dd..d7070fd65833 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -265,6 +265,10 @@ struct host1x_job {
 
/* Fast-forward syncpoint increments on job timeout */
bool syncpt_recovery;
+
+   /* Callback called when job is freed */
+   void (*release)(struct host1x_job *job);
+   void *user_data;
 };
 
 struct host1x_job *host1x_job_alloc(struct host1x_channel *ch,
-- 
2.28.0



[PATCH v3 15/20] drm/tegra: Add new UAPI to header

2020-10-07 Thread Mikko Perttunen
Update the tegra_drm.h UAPI header, adding the new proposed UAPI.
The old staging UAPI is left in for now, with minor modifications
to avoid name collisions.

Signed-off-by: Mikko Perttunen 
---
v3:
* Remove timeout field
* Inline the syncpt_incrs array to the submit structure
* Remove WRITE_RELOC (it is now implicit)
---
 include/uapi/drm/tegra_drm.h | 420 ---
 1 file changed, 393 insertions(+), 27 deletions(-)

diff --git a/include/uapi/drm/tegra_drm.h b/include/uapi/drm/tegra_drm.h
index c4df3c3668b3..9588d5e3308f 100644
--- a/include/uapi/drm/tegra_drm.h
+++ b/include/uapi/drm/tegra_drm.h
@@ -1,24 +1,5 @@
-/*
- * Copyright (c) 2012-2013, NVIDIA CORPORATION.  All rights reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- */
+/* SPDX-License-Identifier: MIT */
+/* Copyright (c) 2012-2020 NVIDIA Corporation */
 
 #ifndef _UAPI_TEGRA_DRM_H_
 #define _UAPI_TEGRA_DRM_H_
@@ -29,6 +10,8 @@
 extern "C" {
 #endif
 
+/* TegraDRM legacy UAPI. Only enabled with STAGING */
+
 #define DRM_TEGRA_GEM_CREATE_TILED (1 << 0)
 #define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1)
 
@@ -644,13 +627,13 @@ struct drm_tegra_gem_get_flags {
__u32 flags;
 };
 
-#define DRM_TEGRA_GEM_CREATE   0x00
-#define DRM_TEGRA_GEM_MMAP 0x01
+#define DRM_TEGRA_GEM_CREATE_LEGACY 0x00
+#define DRM_TEGRA_GEM_MMAP_LEGACY  0x01
 #define DRM_TEGRA_SYNCPT_READ  0x02
 #define DRM_TEGRA_SYNCPT_INCR  0x03
 #define DRM_TEGRA_SYNCPT_WAIT  0x04
-#define DRM_TEGRA_OPEN_CHANNEL 0x05
-#define DRM_TEGRA_CLOSE_CHANNEL 0x06
+#define DRM_TEGRA_OPEN_CHANNEL 0x05
+#define DRM_TEGRA_CLOSE_CHANNEL 0x06
 #define DRM_TEGRA_GET_SYNCPT   0x07
 #define DRM_TEGRA_SUBMIT   0x08
 #define DRM_TEGRA_GET_SYNCPT_BASE  0x09
@@ -659,8 +642,8 @@ struct drm_tegra_gem_get_flags {
 #define DRM_TEGRA_GEM_SET_FLAGS 0x0c
 #define DRM_TEGRA_GEM_GET_FLAGS 0x0d
 
-#define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create)
-#define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap)
+#define DRM_IOCTL_TEGRA_GEM_CREATE_LEGACY DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE_LEGACY, struct drm_tegra_gem_create)
+#define DRM_IOCTL_TEGRA_GEM_MMAP_LEGACY DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP_LEGACY, struct drm_tegra_gem_mmap)
 #define DRM_IOCTL_TEGRA_SYNCPT_READ DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_READ, struct drm_tegra_syncpt_read)
 #define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr)
 #define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait)
@@ -674,6 +657,389 @@ struct drm_tegra_gem_get_flags {
 #define DRM_IOCTL_TEGRA_GEM_SET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_FLAGS, struct drm_tegra_gem_set_flags)
 #define DRM_IOCTL_TEGRA_GEM_GET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_FLAGS, struct drm_tegra_gem_get_flags)
 
+/* New TegraDRM UAPI */
+
+struct drm_tegra_channel_open {
+   /**
+* @host1x_class: [in]
+*
+* Host1x class of the engine that will be programmed using this
+* channel.
+*/
+   __u32 host1x_class;
+
+   /**
+* @flags: [in]
+*
+* Flags.
+*/
+   __u32 flags;
+
+   /**
+* @channel_ctx: [out]
+*
+* Opaque identifier corresponding to the opened channel.
+*/
+   __u32 channel_ctx;
+
+   /**
+* @hardware_version: [out]
+*
+* Version of the engine hardware. This can be used by userspace
+* to determine how the engine needs to be programmed.
+*/
+   __u32 hardware_version;
+
+   __u32 reserved[2];
+};
+
+struct drm_t

[PATCH v3 18/20] drm/tegra: Allocate per-engine channel in core code

2020-10-07 Thread Mikko Perttunen
To avoid duplication, allocate the per-engine shared channel in the
core code instead. Once MLOCKs are implemented on Host1x side, we
can also update this to avoid allocating a shared channel when
MLOCKs are enabled.

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/drm/tegra/drm.c | 11 +++
 drivers/gpu/drm/tegra/drm.h |  4 
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 7437c67924aa..7124b0b0154b 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -887,6 +887,14 @@ static struct drm_driver tegra_drm_driver = {
 int tegra_drm_register_client(struct tegra_drm *tegra,
  struct tegra_drm_client *client)
 {
+   /*
+* When MLOCKs are implemented, change to allocate a shared channel
+* only when MLOCKs are disabled.
+*/
+   client->shared_channel = host1x_channel_request(&client->base);
+   if (!client->shared_channel)
+   return -EBUSY;
+
mutex_lock(&tegra->clients_lock);
list_add_tail(&client->list, &tegra->clients);
client->drm = tegra;
@@ -903,6 +911,9 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
client->drm = NULL;
mutex_unlock(&tegra->clients_lock);
 
+   if (client->shared_channel)
+   host1x_channel_put(client->shared_channel);
+
return 0;
 }
 
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index b25443255be6..3fc42fd97911 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -86,8 +86,12 @@ struct tegra_drm_client {
struct list_head list;
struct tegra_drm *drm;
 
+   /* Set by driver */
unsigned int version;
const struct tegra_drm_client_ops *ops;
+
+   /* Set by TegraDRM core */
+   struct host1x_channel *shared_channel;
 };
 
 static inline struct tegra_drm_client *
-- 
2.28.0



[PATCH v3 19/20] drm/tegra: Implement new UAPI

2020-10-07 Thread Mikko Perttunen
Implement the new UAPI, and bump the TegraDRM major version.

Signed-off-by: Mikko Perttunen 
---
v3:
* Remove WRITE_RELOC. Relocations are now patched implicitly
  when patching is needed.
* Directly call PM runtime APIs on devices instead of using
  power_on/power_off callbacks.
* Remove incorrect mutex unlock in tegra_drm_ioctl_channel_open
* Use XA_FLAGS_ALLOC1 instead of XA_FLAGS_ALLOC
* Accommodate removal of the timeout field and inlining of the
  syncpt_incrs array.
* Copy entire user arrays at a time instead of going through
  elements one-by-one.
* Implement waiting on DMA reservations.
* Split out gather_bo implementation into a separate file.
* Fix length parameter passed to sg_init_one in gather_bo.
* Cosmetic cleanup.
---
 drivers/gpu/drm/tegra/Makefile |   3 +
 drivers/gpu/drm/tegra/drm.c|  46 +-
 drivers/gpu/drm/tegra/drm.h|   5 +
 drivers/gpu/drm/tegra/uapi.h   |  63 +++
 drivers/gpu/drm/tegra/uapi/gather_bo.c |  86 
 drivers/gpu/drm/tegra/uapi/gather_bo.h |  22 +
 drivers/gpu/drm/tegra/uapi/submit.c| 675 +
 drivers/gpu/drm/tegra/uapi/submit.h|  17 +
 drivers/gpu/drm/tegra/uapi/uapi.c  | 326 
 9 files changed, 1225 insertions(+), 18 deletions(-)
 create mode 100644 drivers/gpu/drm/tegra/uapi.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/uapi.c

diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index d6cf202414f0..059322e88943 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -3,6 +3,9 @@ ccflags-$(CONFIG_DRM_TEGRA_DEBUG) += -DDEBUG
 
 tegra-drm-y := \
drm.o \
+   uapi/uapi.o \
+   uapi/submit.o \
+   uapi/gather_bo.o \
gem.o \
fb.o \
dp.o \
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 7124b0b0154b..88226dd0fd88 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -20,24 +20,20 @@
 #include 
 #include 
 
+#include "uapi.h"
 #include "drm.h"
 #include "gem.h"
 
 #define DRIVER_NAME "tegra"
 #define DRIVER_DESC "NVIDIA Tegra graphics"
 #define DRIVER_DATE "20120330"
-#define DRIVER_MAJOR 0
+#define DRIVER_MAJOR 1
 #define DRIVER_MINOR 0
 #define DRIVER_PATCHLEVEL 0
 
 #define CARVEOUT_SZ SZ_64M
 #define CDMA_GATHER_FETCHES_MAX_NB 16383
 
-struct tegra_drm_file {
-   struct idr contexts;
-   struct mutex lock;
-};
-
 static int tegra_atomic_check(struct drm_device *drm,
  struct drm_atomic_state *state)
 {
@@ -90,7 +86,8 @@ static int tegra_drm_open(struct drm_device *drm, struct drm_file *filp)
if (!fpriv)
return -ENOMEM;
 
-   idr_init(&fpriv->contexts);
+   idr_init(&fpriv->legacy_contexts);
+   xa_init_flags(&fpriv->contexts, XA_FLAGS_ALLOC1);
mutex_init(&fpriv->lock);
filp->driver_priv = fpriv;
 
@@ -432,7 +429,7 @@ static int tegra_client_open(struct tegra_drm_file *fpriv,
if (err < 0)
return err;
 
-   err = idr_alloc(&fpriv->contexts, context, 1, 0, GFP_KERNEL);
+   err = idr_alloc(&fpriv->legacy_contexts, context, 1, 0, GFP_KERNEL);
if (err < 0) {
client->ops->close_channel(context);
return err;
@@ -487,13 +484,13 @@ static int tegra_close_channel(struct drm_device *drm, void *data,
 
mutex_lock(&fpriv->lock);
 
-   context = idr_find(&fpriv->contexts, args->context);
+   context = idr_find(&fpriv->legacy_contexts, args->context);
if (!context) {
err = -EINVAL;
goto unlock;
}
 
-   idr_remove(&fpriv->contexts, context->id);
+   idr_remove(&fpriv->legacy_contexts, context->id);
tegra_drm_context_free(context);
 
 unlock:
@@ -512,7 +509,7 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data,
 
mutex_lock(&fpriv->lock);
 
-   context = idr_find(&fpriv->contexts, args->context);
+   context = idr_find(&fpriv->legacy_contexts, args->context);
if (!context) {
err = -ENODEV;
goto unlock;
@@ -541,7 +538,7 @@ static int tegra_submit(struct drm_device *drm, void *data,
 
mutex_lock(&fpriv->lock);
 
-   context = idr_find(&fpriv->contexts, args->context);
+   context = idr_find(&fpriv->legacy_contexts, args->context);
if (!context) {
err = -ENODEV;
goto unlock;
@@ -566,7 +563,7 @@ static int tegra_get_syncpt_base(struct drm_device *drm, void *data,
 
mutex_lock(&fpriv->lock);
 
-   context = idr_find(&fpriv->contexts, args->context);
+   context = idr_find(&fpriv->legacy_contexts, args->context);
if (!context) {
err = -EN

[PATCH v3 14/20] gpu: host1x: Reserve VBLANK syncpoints at initialization

2020-10-07 Thread Mikko Perttunen
On T20-T148 chips, the bootloader can set up a boot splash
screen with DC configured to increment syncpoints 26/27
at VBLANK. Because of this, we shouldn't allow these syncpoints
to be allocated until DC has been reset and will no longer
increment them in the background.

As such, on these chips, reserve those two syncpoints at
initialization, and only mark them free once the DC
driver has indicated it's safe to do so.

Signed-off-by: Mikko Perttunen 
---
v3:
* New patch
---
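A minimal userspace sketch of the reservation scheme described above,
assuming a bare counter in place of the kernel's kref and a made-up
table size of 32. The names below are hypothetical; in particular the
body of syncpt_release_vblank_reservation() is a guess at the intent
and not the driver's implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_PTS 32            /* hypothetical syncpoint count */
#define VBLANK0_SYNCPT 26
#define VBLANK1_SYNCPT 27

struct syncpt {
	unsigned int id;
	unsigned int refcount;   /* 0 means free, like kref_read() == 0 */
};

struct host {
	struct syncpt syncpt[NB_PTS];
	bool reserve_vblank_syncpts;
};

static void host_init(struct host *host, bool reserve_vblank_syncpts)
{
	for (unsigned int i = 0; i < NB_PTS; i++) {
		host->syncpt[i].id = i;
		host->syncpt[i].refcount = 0;
	}
	host->reserve_vblank_syncpts = reserve_vblank_syncpts;
	if (reserve_vblank_syncpts) {
		/* Hold a reference so the allocator skips these two. */
		host->syncpt[VBLANK0_SYNCPT].refcount = 1;
		host->syncpt[VBLANK1_SYNCPT].refcount = 1;
	}
}

/* Return the first syncpoint with no references, or NULL if none is free. */
static struct syncpt *syncpt_alloc(struct host *host)
{
	for (unsigned int i = 0; i < NB_PTS; i++) {
		if (host->syncpt[i].refcount == 0) {
			host->syncpt[i].refcount = 1;
			return &host->syncpt[i];
		}
	}
	return NULL;
}

/* Called by the display driver once DC no longer increments the syncpoint. */
static void syncpt_release_vblank_reservation(struct host *host,
					      unsigned int id)
{
	assert(id == VBLANK0_SYNCPT || id == VBLANK1_SYNCPT);
	if (host->reserve_vblank_syncpts)
		host->syncpt[id].refcount = 0;
}
```

Treating "has a reference" as "not allocatable" lets the reservation
reuse the allocator's existing free check instead of adding a second flag.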
 drivers/gpu/drm/tegra/dc.c  |  6 ++
 drivers/gpu/host1x/dev.c|  6 ++
 drivers/gpu/host1x/dev.h|  6 ++
 drivers/gpu/host1x/syncpt.c | 34 +-
 include/linux/host1x.h  |  3 +++
 5 files changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index efb41c10dad4..0b23e0922c25 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2031,6 +2031,12 @@ static int tegra_dc_init(struct host1x_client *client)
struct drm_plane *cursor = NULL;
int err;
 
+   /*
+* DC has been reset by now, so VBLANK syncpoint can be released
+* for general use.
+*/
+   host1x_syncpt_release_vblank_reservation(client, 26 + dc->pipe);
+
/*
 * XXX do not register DCs with no window groups because we cannot
 * assign a primary plane to them, which in turn will cause KMS to
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index 641317d23828..8b50fbb22846 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -77,6 +77,7 @@ static const struct host1x_info host1x01_info = {
.has_hypervisor = false,
.num_sid_entries = 0,
.sid_table = NULL,
+   .reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x02_info = {
@@ -91,6 +92,7 @@ static const struct host1x_info host1x02_info = {
.has_hypervisor = false,
.num_sid_entries = 0,
.sid_table = NULL,
+   .reserve_vblank_syncpts = true,
 };
 
 static const struct host1x_info host1x04_info = {
@@ -105,6 +107,7 @@ static const struct host1x_info host1x04_info = {
.has_hypervisor = false,
.num_sid_entries = 0,
.sid_table = NULL,
+   .reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_info host1x05_info = {
@@ -119,6 +122,7 @@ static const struct host1x_info host1x05_info = {
.has_hypervisor = false,
.num_sid_entries = 0,
.sid_table = NULL,
+   .reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra186_sid_table[] = {
@@ -142,6 +146,7 @@ static const struct host1x_info host1x06_info = {
.has_hypervisor = true,
.num_sid_entries = ARRAY_SIZE(tegra186_sid_table),
.sid_table = tegra186_sid_table,
+   .reserve_vblank_syncpts = false,
 };
 
 static const struct host1x_sid_entry tegra194_sid_table[] = {
@@ -165,6 +170,7 @@ static const struct host1x_info host1x07_info = {
.has_hypervisor = true,
.num_sid_entries = ARRAY_SIZE(tegra194_sid_table),
.sid_table = tegra194_sid_table,
+   .reserve_vblank_syncpts = false,
 };
 
 static const struct of_device_id host1x_of_match[] = {
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 7b8b7e20e32b..e360bc4a25f6 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -102,6 +102,12 @@ struct host1x_info {
bool has_hypervisor; /* has hypervisor registers */
unsigned int num_sid_entries;
const struct host1x_sid_entry *sid_table;
+   /*
+* On T20-T148, the boot chain may set up DC to increment syncpoints
+* 26/27 on VBLANK. As such we cannot use these syncpoints until
+* the display driver disables VBLANK increments.
+*/
+   bool reserve_vblank_syncpts;
 };
 
 struct host1x {
diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 99d31932eb34..d0be7bdbc6c9 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -52,7 +52,7 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 
mutex_lock(&host->syncpt_mutex);
 
-   for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++)
+   for (i = 0; i < host->info->nb_pts && kref_read(&sp->ref); i++, sp++)
;
 
if (i >= host->info->nb_pts)
@@ -359,6 +359,11 @@ int host1x_syncpt_init(struct host1x *host)
if (!host->nop_sp)
return -ENOMEM;
 
+   if (host->info->reserve_vblank_syncpts) {
+   kref_init(&host->syncpt[26].ref);
+   kref_init(&host->syncpt[27].ref);
+   }
+
return 0;
 }
 
@@ -545,3 +550,30 @@ u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base)
return base->id;
 }
 EXPORT_SYMBOL(host1x_syncpt_base_id);
+
+static void do_nothing(struct kref *ref)
+{
+}
+
+/**
+ * host1x_syncpt_release_vblank_reservation() - Make VBLANK syncpoint
+ *  

[PATCH v3 04/20] gpu: host1x: Remove cancelled waiters immediately

2020-10-07 Thread Mikko Perttunen
Before this patch, cancelled waiters would only be cleaned up
once their threshold value was reached. Make host1x_intr_put_ref
process the cancellation immediately to fix this.

Signed-off-by: Mikko Perttunen 
---
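A minimal userspace sketch of the cancellation path described above,
using C11 atomics in place of the kernel's atomic_cmpxchg(). The struct
layout, the on_list flag and the bare refcount are assumptions made for
illustration; the enum values are stand-ins and the locking is omitted:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum waitlist_state { WLS_PENDING, WLS_REMOVED, WLS_CANCELLED, WLS_HANDLED };

struct waiter {
	atomic_int state;
	bool on_list;          /* stands in for list membership */
	unsigned int refcount; /* one ref for the list, one for the caller */
};

static void waiter_put(struct waiter *w)
{
	assert(w->refcount > 0);
	w->refcount--;
}

static void intr_put_ref(struct waiter *w)
{
	int expected = WLS_PENDING;

	/* Only a still-pending waiter can be cancelled. */
	atomic_compare_exchange_strong(&w->state, &expected, WLS_CANCELLED);

	/* In the driver this part runs under the syncpoint's intr lock. */
	expected = WLS_CANCELLED;
	if (atomic_compare_exchange_strong(&w->state, &expected, WLS_HANDLED)) {
		w->on_list = false; /* list_del() */
		waiter_put(w);      /* drop the list's reference */
	}

	waiter_put(w);              /* drop the caller's reference */
}
```

A waiter that was already handled fails both compare-and-exchange steps,
so only the caller's reference is dropped in that case.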
 drivers/gpu/host1x/intr.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 9245add23b5d..5d328d20ce6d 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -247,13 +247,17 @@ void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
struct host1x_waitlist *waiter = ref;
struct host1x_syncpt *syncpt;
 
-   while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
-  WLS_REMOVED)
-   schedule();
+   atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
 
syncpt = host->syncpt + id;
-   (void)process_wait_list(host, syncpt,
-   host1x_syncpt_load(host->syncpt + id));
+
+   spin_lock(&syncpt->intr.lock);
+   if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
+   WLS_CANCELLED) {
+   list_del(&waiter->list);
+   kref_put(&waiter->refcount, waiter_release);
+   }
+   spin_unlock(&syncpt->intr.lock);
 
kref_put(&waiter->refcount, waiter_release);
 }
-- 
2.28.0



[PATCH v3 03/20] gpu: host1x: Show number of pending waiters in debugfs

2020-10-07 Thread Mikko Perttunen
Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/host1x/debug.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 3eee4318b158..2d06a7406b3b 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+   struct list_head *pos;
unsigned int i;
 
host1x_debug_output(o, " syncpts \n");
@@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
u32 max = host1x_syncpt_read_max(m->syncpt + i);
u32 min = host1x_syncpt_load(m->syncpt + i);
+   unsigned int waiters = 0;
 
-   if (!min && !max)
+   spin_lock(&m->syncpt[i].intr.lock);
+   list_for_each(pos, &m->syncpt[i].intr.wait_head)
+   waiters++;
+   spin_unlock(&m->syncpt[i].intr.lock);
+
+   if (!min && !max && !waiters)
continue;
 
-   host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-   i, m->syncpt[i].name, min, max);
+   host1x_debug_output(o,
+   "id %u (%s) min %d max %d (%d waiters)\n",
+   i, m->syncpt[i].name, min, max, waiters);
}
 
for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
-- 
2.28.0



[PATCH v3 05/20] gpu: host1x: Use HW-equivalent syncpoint expiration check

2020-10-07 Thread Mikko Perttunen
Make syncpoint expiration checks always use the same logic used by
the hardware. This ensures that there are no race conditions that
could occur because of the hardware triggering a syncpoint interrupt
and then the driver disagreeing.

One situation where this could occur is if a job incremented a
syncpoint too many times -- then the hardware would trigger an
interrupt, but the driver would assume that a syncpoint value
greater than the syncpoint's max value is in the future, and not
clean up the job.

Signed-off-by: Mikko Perttunen 
---
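A minimal userspace sketch comparing the two checks on the
over-increment case from the commit message, assuming 32-bit syncpoint
values and a mask of bit 31 (0x80000000U). The function names are
hypothetical and the values are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old check for non-client-managed syncpoints, as removed by this patch. */
static bool is_expired_old(uint32_t current_val, uint32_t future_val,
			   uint32_t thresh)
{
	return future_val - thresh >= current_val - thresh;
}

/* New check: expired iff (current - thresh) mod 2^32 has bit 31 clear. */
static bool is_expired_hw(uint32_t current_val, uint32_t thresh)
{
	return ((current_val - thresh) & 0x80000000U) == 0U;
}

/* A job that over-incremented: max is 10, but the value already reads 11. */
static void over_increment_example(void)
{
	assert(!is_expired_old(11, 10, 10)); /* old: still "in the future" */
	assert(is_expired_hw(11, 10));       /* HW-equivalent: expired */
}
```

The new check depends only on the current value and the threshold, so it
cannot disagree with the hardware because of a stale or exceeded max value.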
 drivers/gpu/host1x/syncpt.c | 51 ++---
 1 file changed, 2 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 5982fdf64e1c..9ca0d852e32f 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -306,59 +306,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
 bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
 {
u32 current_val;
-   u32 future_val;
 
smp_rmb();
 
current_val = (u32)atomic_read(&sp->min_val);
-   future_val = (u32)atomic_read(&sp->max_val);
-
-   /* Note the use of unsigned arithmetic here (mod 1<<32).
-*
-* c = current_val = min_val = the current value of the syncpoint.
-* t = thresh                = the value we are checking
-* f = future_val  = max_val = the value c will reach when all
-*                             outstanding increments have completed.
-*
-* Note that c always chases f until it reaches f.
-*
-* Dtf = (f - t)
-* Dtc = (c - t)
-*
-*  Consider all cases:
-*
-*  A) .c..t..f.    Dtf < Dtc   need to wait
-*  B) .c.f..t..    Dtf > Dtc   expired
-*  C) ..t..c.f.    Dtf > Dtc   expired (Dct very large)
-*
-*  Any case where f==c: always expired (for any t).    Dtf == Dcf
-*  Any case where t==c: always expired (for any f).    Dtf >= Dtc (because Dtc==0)
-*  Any case where t==f!=c: always wait.                Dtf <  Dtc (because Dtf==0, Dtc!=0)
-*
-*  Other cases:
-*
-*  A) .t..f..c.    Dtf < Dtc   need to wait
-*  A) .f..c..t.    Dtf < Dtc   need to wait
-*  A) .f..t..c.    Dtf > Dtc   expired
-*
-*   So:
-* Dtf >= Dtc implies EXPIRED   (return true)
-* Dtf <  Dtc implies WAIT  (return false)
-*
-* Note: If t is expired then we *cannot* wait on it. We would wait
-* forever (hang the system).
-*
-* Note: do NOT get clever and remove the -thresh from both sides. It
-* is NOT the same.
-*
-* If future value is zero, we have a client managed sync point. In that
-* case we do a direct comparison.
-*/
-   if (!host1x_syncpt_client_managed(sp))
-   return future_val - thresh >= current_val - thresh;
-   else
-   return (s32)(current_val - thresh) >= 0;
+
+   return ((current_val - thresh) & 0x80000000U) == 0U;
 }
 
 int host1x_syncpt_init(struct host1x *host)
-- 
2.28.0



[PATCH v3 06/20] gpu: host1x: Cleanup and refcounting for syncpoints

2020-10-07 Thread Mikko Perttunen
Add reference counting for allocated syncpoints to allow keeping
them allocated while jobs are referencing them. Additionally,
clean up various places using syncpoint IDs to use host1x_syncpt
pointers instead.

Signed-off-by: Mikko Perttunen 
---
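A minimal userspace sketch of the refcounting model described above,
assuming a bare counter in place of the kernel's kref and a made-up
table of four syncpoints. The helper names are hypothetical, and
returning NULL for an unallocated ID is an assumption of this sketch:

```c
#include <assert.h>
#include <stddef.h>

#define NB_PTS 4 /* hypothetical table size */

struct syncpt {
	unsigned int id;
	unsigned int refcount; /* 0 means the syncpoint is free */
};

static struct syncpt syncpts[NB_PTS];

/* Allocate the first free syncpoint; the caller owns the initial reference. */
static struct syncpt *syncpt_alloc(void)
{
	for (unsigned int i = 0; i < NB_PTS; i++) {
		if (syncpts[i].refcount == 0) {
			syncpts[i].id = i;
			syncpts[i].refcount = 1;
			return &syncpts[i];
		}
	}
	return NULL;
}

/* Look up an allocated syncpoint by ID and take a reference on it. */
static struct syncpt *syncpt_get_by_id(unsigned int id)
{
	if (id >= NB_PTS || syncpts[id].refcount == 0)
		return NULL;
	syncpts[id].refcount++;
	return &syncpts[id];
}

/* Drop a reference; the slot becomes reusable when the count hits zero. */
static void syncpt_put(struct syncpt *sp)
{
	assert(sp != NULL && sp->refcount > 0);
	sp->refcount--;
}
```

A job that took its own reference keeps the syncpoint allocated even
after the original owner has dropped its reference.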
 drivers/gpu/drm/tegra/dc.c |  4 +-
 drivers/gpu/drm/tegra/drm.c| 17 ---
 drivers/gpu/drm/tegra/gr2d.c   |  4 +-
 drivers/gpu/drm/tegra/gr3d.c   |  4 +-
 drivers/gpu/drm/tegra/vic.c|  4 +-
 drivers/gpu/host1x/cdma.c  | 11 ++---
 drivers/gpu/host1x/dev.h   |  7 ++-
 drivers/gpu/host1x/hw/cdma_hw.c|  2 +-
 drivers/gpu/host1x/hw/channel_hw.c | 10 ++--
 drivers/gpu/host1x/hw/debug_hw.c   |  2 +-
 drivers/gpu/host1x/job.c   |  5 +-
 drivers/gpu/host1x/syncpt.c| 75 +++---
 drivers/gpu/host1x/syncpt.h|  3 ++
 include/linux/host1x.h |  8 ++--
 14 files changed, 99 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index 9a0b3240bc58..efb41c10dad4 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -2127,7 +2127,7 @@ static int tegra_dc_init(struct host1x_client *client)
drm_plane_cleanup(primary);
 
host1x_client_iommu_detach(client);
-   host1x_syncpt_free(dc->syncpt);
+   host1x_syncpt_put(dc->syncpt);
 
return err;
 }
@@ -2152,7 +2152,7 @@ static int tegra_dc_exit(struct host1x_client *client)
}
 
host1x_client_iommu_detach(client);
-   host1x_syncpt_free(dc->syncpt);
+   host1x_syncpt_put(dc->syncpt);
 
return 0;
 }
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index ba9d1c3e7cac..ceea9db341f0 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -171,7 +171,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
struct drm_tegra_syncpt syncpt;
struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
struct drm_gem_object **refs;
-   struct host1x_syncpt *sp;
+   struct host1x_syncpt *sp = NULL;
struct host1x_job *job;
unsigned int num_refs;
int err;
@@ -298,8 +298,8 @@ int tegra_drm_submit(struct tegra_drm_context *context,
goto fail;
}
 
-   /* check whether syncpoint ID is valid */
-   sp = host1x_syncpt_get(host1x, syncpt.id);
+   /* Syncpoint ref will be dropped on job release. */
+   sp = host1x_syncpt_get_by_id(host1x, syncpt.id);
if (!sp) {
err = -ENOENT;
goto fail;
@@ -308,7 +308,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
job->is_addr_reg = context->client->ops->is_addr_reg;
job->is_valid_class = context->client->ops->is_valid_class;
job->syncpt_incrs = syncpt.incrs;
-   job->syncpt_id = syncpt.id;
+   job->syncpt = sp;
job->timeout = 1;
 
if (args->timeout && args->timeout < 1)
@@ -327,6 +327,9 @@ int tegra_drm_submit(struct tegra_drm_context *context,
args->fence = job->syncpt_end;
 
 fail:
+   if (sp)
+   host1x_syncpt_put(sp);
+
while (num_refs--)
drm_gem_object_put(refs[num_refs]);
 
@@ -380,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data,
struct drm_tegra_syncpt_read *args = data;
struct host1x_syncpt *sp;
 
-   sp = host1x_syncpt_get(host, args->id);
+   sp = host1x_syncpt_get_by_id_noref(host, args->id);
if (!sp)
return -EINVAL;
 
@@ -395,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data,
struct drm_tegra_syncpt_incr *args = data;
struct host1x_syncpt *sp;
 
-   sp = host1x_syncpt_get(host1x, args->id);
+   sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
if (!sp)
return -EINVAL;
 
@@ -409,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data,
struct drm_tegra_syncpt_wait *args = data;
struct host1x_syncpt *sp;
 
-   sp = host1x_syncpt_get(host1x, args->id);
+   sp = host1x_syncpt_get_by_id_noref(host1x, args->id);
if (!sp)
return -EINVAL;
 
diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c
index 1a0d3ba6e525..d857a99b21a7 100644
--- a/drivers/gpu/drm/tegra/gr2d.c
+++ b/drivers/gpu/drm/tegra/gr2d.c
@@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client)
 detach:
host1x_client_iommu_detach(client);
 free:
-   host1x_syncpt_free(client->syncpts[0]);
+   host1x_syncpt_put(client->syncpts[0]);
 put:
host1x_channel_put(gr2d->channel);
return err;
@@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client)
return err;
 
host1x_client_iommu_detach(client);
-   host1x_syncpt_free(client->syncpts[0]);
+   host1x_syncpt_put(client->syncpts[0]);
host1x_channe

[PATCH v3 02/20] gpu: host1x: Allow syncpoints without associated client

2020-10-07 Thread Mikko Perttunen
Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen 
---
v3:
* Clean up host1x_syncpt_alloc signature to allow specifying
  a name for the syncpoint.
* Export the function.
---
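A minimal userspace sketch of the naming step, assuming snprintf() and
malloc() in place of the kernel's kasprintf(). syncpt_full_name() is a
hypothetical helper that only mirrors the "%u-%s" format used in the
patch:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Build the "<id>-<name>" label; the caller frees the returned string. */
static char *syncpt_full_name(unsigned int id, const char *name)
{
	char *full_name;
	int len;

	assert(name != NULL);

	/* First pass computes the length, second pass writes the string. */
	len = snprintf(NULL, 0, "%u-%s", id, name);
	if (len < 0)
		return NULL;

	full_name = malloc((size_t)len + 1);
	if (!full_name)
		return NULL;

	snprintf(full_name, (size_t)len + 1, "%u-%s", id, name);
	return full_name;
}
```

With this format, the syncpoint allocated in host1x_syncpt_init() with
the name "reserved-nop" would be labelled with its ID followed by
"-reserved-nop".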
 drivers/gpu/host1x/syncpt.c | 22 ++
 drivers/gpu/host1x/syncpt.h |  1 -
 include/linux/host1x.h  |  3 +++
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index fce7892d5137..5982fdf64e1c 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -42,13 +42,13 @@ static void host1x_syncpt_base_free(struct host1x_syncpt_base *base)
base->requested = false;
 }
 
-static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
-struct host1x_client *client,
-unsigned long flags)
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+ unsigned long flags,
+ const char *name)
 {
struct host1x_syncpt *sp = host->syncpt;
+   char *full_name;
unsigned int i;
-   char *name;
 
mutex_lock(&host->syncpt_mutex);
 
@@ -64,13 +64,11 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
goto unlock;
}
 
-   name = kasprintf(GFP_KERNEL, "%02u-%s", sp->id,
-client ? dev_name(client->dev) : NULL);
-   if (!name)
+   full_name = kasprintf(GFP_KERNEL, "%u-%s", sp->id, name);
+   if (!full_name)
goto free_base;
 
-   sp->client = client;
-   sp->name = name;
+   sp->name = full_name;
 
if (flags & HOST1X_SYNCPT_CLIENT_MANAGED)
sp->client_managed = true;
@@ -87,6 +85,7 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
mutex_unlock(&host->syncpt_mutex);
return NULL;
 }
+EXPORT_SYMBOL(host1x_syncpt_alloc);
 
 /**
  * host1x_syncpt_id() - retrieve syncpoint ID
@@ -401,7 +400,7 @@ int host1x_syncpt_init(struct host1x *host)
host1x_hw_syncpt_enable_protection(host);
 
/* Allocate sync point to use for clearing waits for expired fences */
-   host->nop_sp = host1x_syncpt_alloc(host, NULL, 0);
+   host->nop_sp = host1x_syncpt_alloc(host, 0, "reserved-nop");
if (!host->nop_sp)
return -ENOMEM;
 
@@ -423,7 +422,7 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 {
struct host1x *host = dev_get_drvdata(client->host->parent);
 
-   return host1x_syncpt_alloc(host, client, flags);
+   return host1x_syncpt_alloc(host, flags, dev_name(client->dev));
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
@@ -447,7 +446,6 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
host1x_syncpt_base_free(sp->base);
kfree(sp->name);
sp->base = NULL;
-   sp->client = NULL;
sp->name = NULL;
sp->client_managed = false;
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 8e1d04dacaa0..3aa6b25b1b9c 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -33,7 +33,6 @@ struct host1x_syncpt {
const char *name;
bool client_managed;
struct host1x *host;
-   struct host1x_client *client;
struct host1x_syncpt_base *base;
 
/* interrupt data */
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index f711fc0154f4..099eff8a06d2 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -154,6 +154,9 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
unsigned long flags);
 void host1x_syncpt_free(struct host1x_syncpt *sp);
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+ unsigned long flags,
+ const char *name);
 
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
-- 
2.28.0



[PATCH v3 10/20] gpu: host1x: Add no-recovery mode

2020-10-07 Thread Mikko Perttunen
Add a new property for jobs to enable or disable recovery i.e.
CPU increments of syncpoints to max value on job timeout. This
allows for a more solid model for hanged jobs, where userspace
doesn't need to guess if a syncpoint increment happened because
the job completed, or because job timeout was triggered.

On job timeout, we stop the channel, NOP all future jobs on the
channel using the same syncpoint, mark the syncpoint as locked
and resume the channel from the next job, if any.

Future jobs are NOPed because, without the CPU increments, the
value of the syncpoint is no longer synchronized, and any waiters
would become confused if a future job incremented the syncpoint.
The syncpoint is marked locked to ensure that no future job can
increment it either, until the application has recognized the
situation and reallocated the syncpoint.
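
The timeout handling described above can be sketched in plain C outside
the kernel. This is only a simplified model under stated assumptions, not
the driver code: struct job, struct syncpt and cancel_after_timeout() are
hypothetical names, and the real implementation additionally patches the
push buffer and runs under the CDMA lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct syncpt {
	bool locked;	/* set once a job using this syncpoint timed out */
};

struct job {
	struct syncpt *syncpt;
	bool cancelled;	/* job is skipped (NOPed) instead of executed */
};

/*
 * Model of no-recovery timeout handling: the job at index 'failed' has
 * timed out. Lock its syncpoint, cancel it, and cancel every later job
 * in the queue that uses the same syncpoint. Jobs on other syncpoints
 * are left alone. Requires failed < n. Returns the number of jobs
 * cancelled.
 */
size_t cancel_after_timeout(struct job *queue, size_t n, size_t failed)
{
	struct syncpt *sp = queue[failed].syncpt;
	size_t cancelled = 1;
	size_t i;

	sp->locked = true;
	queue[failed].cancelled = true;

	for (i = failed + 1; i < n; i++) {
		if (queue[i].syncpt != sp)
			continue;

		queue[i].cancelled = true;
		cancelled++;
	}

	return cancelled;
}
```

Note that no CPU increment of the syncpoint appears anywhere in the
sketch; that is the difference from the recovery path.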

Signed-off-by: Mikko Perttunen 
---
v3:
* Move 'locked' check inside CDMA lock to prevent race
* Add clarifying comment to NOP-patching code
---
 drivers/gpu/drm/tegra/drm.c|  1 +
 drivers/gpu/host1x/cdma.c  | 58 ++
 drivers/gpu/host1x/hw/channel_hw.c |  2 +-
 drivers/gpu/host1x/job.c   |  4 +++
 drivers/gpu/host1x/syncpt.c|  2 ++
 drivers/gpu/host1x/syncpt.h| 12 +++
 include/linux/host1x.h |  9 +
 7 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index ceea9db341f0..7437c67924aa 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -197,6 +197,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
job->client = client;
job->class = client->class;
job->serialize = true;
+   job->syncpt_recovery = true;
 
/*
 * Track referenced BOs so that they can be unreferenced after the
diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c
index 6e6ca774f68d..bd151c3a2a5f 100644
--- a/drivers/gpu/host1x/cdma.c
+++ b/drivers/gpu/host1x/cdma.c
@@ -312,10 +312,6 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
bool signal = false;
struct host1x_job *job, *n;
 
-   /* If CDMA is stopped, queue is cleared and we can return */
-   if (!cdma->running)
-   return;
-
/*
 * Walk the sync queue, reading the sync point registers as necessary,
 * to consume as many sync queue entries as possible without blocking
@@ -324,7 +320,8 @@ static void update_cdma_locked(struct host1x_cdma *cdma)
struct host1x_syncpt *sp = job->syncpt;
 
/* Check whether this syncpt has completed, and bail if not */
-   if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) {
+   if (!host1x_syncpt_is_expired(sp, job->syncpt_end) &&
+   !job->cancelled) {
/* Start timer on next pending syncpt */
if (job->timeout)
cdma_start_timer_locked(cdma, job);
@@ -413,8 +410,11 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma,
else
restart_addr = cdma->last_pos;
 
+   if (!job)
+   goto resume;
+
/* do CPU increments for the remaining syncpts */
-   if (job) {
+   if (job->syncpt_recovery) {
dev_dbg(dev, "%s: perform CPU incr on pending buffers\n",
__func__);
 
@@ -433,8 +433,44 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma,
 
dev_dbg(dev, "%s: finished sync_queue modification\n",
__func__);
+   } else {
+   struct host1x_job *failed_job = job;
+
+   host1x_job_dump(dev, job);
+
+   host1x_syncpt_set_locked(job->syncpt);
+   failed_job->cancelled = true;
+
+   list_for_each_entry_continue(job, &cdma->sync_queue, list) {
+   unsigned int i;
+
+   if (job->syncpt != failed_job->syncpt)
+   continue;
+
+   for (i = 0; i < job->num_slots; i++) {
+   unsigned int slot = (job->first_get/8 + i) %
+   HOST1X_PUSHBUFFER_SLOTS;
+   u32 *mapped = cdma->push_buffer.mapped;
+
+   /*
+* Overwrite opcodes with 0 word writes
+* to offset 0xbad. This does nothing but
+* has an easily detected signature in debug
+* traces.
+*/
+   mapped[2*slot+0] = 0x1bad;
+   mapped[2*slot+1] = 0x1bad;
+   }
+
+   job->cancelled = true;
+   }
+
+   wmb

[PATCH v3 20/20] drm/tegra: Add job firewall

2020-10-07 Thread Mikko Perttunen
Add a firewall that validates jobs before submission to ensure
they don't do anything they aren't allowed to do, like accessing
memory they should not access.

The firewall is functionality-wise a copy of the firewall already
implemented in gpu/host1x. It is copied here as it makes more
sense for it to live on the DRM side, as it is only needed for
userspace job submissions, and generally the data it needs to
do its job is easier to access here.

In the future, the other implementation will be removed.
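
As a rough illustration of the kind of check the firewall performs, the
sketch below validates register writes against a list of allowed address
ranges. It is a standalone model, not the patch itself: struct mapping,
addr_is_allowed(), check_writes() and example_is_addr_reg() are
hypothetical names, and the inclusive range end is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping: an inclusive IOVA range the job may access. */
struct mapping {
	uint32_t iova;
	uint32_t iova_end;
};

/* Return true if 'addr' lies inside any of the 'n' allowed mappings. */
bool addr_is_allowed(const struct mapping *maps, size_t n, uint32_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (addr >= maps[i].iova && addr <= maps[i].iova_end)
			return true;
	}

	return false;
}

/* Hypothetical engine callback: only register 0x10 holds an address. */
bool example_is_addr_reg(uint32_t offset)
{
	return offset == 0x10;
}

/*
 * Check 'count' consecutive register writes starting at register
 * 'offset'. Only registers that 'is_addr_reg' reports as address
 * registers have their value checked against the mappings.
 * Returns 0 if every write is allowed and -1 otherwise.
 */
int check_writes(const struct mapping *maps, size_t n,
		 bool (*is_addr_reg)(uint32_t offset),
		 uint32_t offset, const uint32_t *values, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (is_addr_reg(offset + (uint32_t)i) &&
		    !addr_is_allowed(maps, n, values[i]))
			return -1;
	}

	return 0;
}
```

Writes to non-address registers pass through unchecked, so the cost of
validation is concentrated on the registers that can reach memory.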

Signed-off-by: Mikko Perttunen 
---
v3:
* New patch
---
 drivers/gpu/drm/tegra/Makefile|   1 +
 drivers/gpu/drm/tegra/uapi/firewall.c | 197 ++
 drivers/gpu/drm/tegra/uapi/submit.c   |   4 +
 drivers/gpu/drm/tegra/uapi/submit.h   |   3 +
 4 files changed, 205 insertions(+)
 create mode 100644 drivers/gpu/drm/tegra/uapi/firewall.c

diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 059322e88943..4e3295f436f1 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -5,6 +5,7 @@ tegra-drm-y := \
drm.o \
uapi/uapi.o \
uapi/submit.o \
+   uapi/firewall.o \
uapi/gather_bo.o \
gem.o \
fb.o \
diff --git a/drivers/gpu/drm/tegra/uapi/firewall.c b/drivers/gpu/drm/tegra/uapi/firewall.c
new file mode 100644
index ..a9c5b71bc235
--- /dev/null
+++ b/drivers/gpu/drm/tegra/uapi/firewall.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2010-2020 NVIDIA Corporation */
+
+#include "../drm.h"
+#include "../uapi.h"
+
+#include "submit.h"
+
+struct tegra_drm_firewall {
+   struct tegra_drm_submit_data *submit;
+   struct tegra_drm_client *client;
+   u32 *data;
+   u32 pos;
+   u32 end;
+};
+
+static int fw_next(struct tegra_drm_firewall *fw, u32 *word)
+{
+   if (fw->pos == fw->end)
+   return -EINVAL;
+
+   *word = fw->data[fw->pos++];
+
+   return 0;
+}
+
+static bool fw_check_addr_valid(struct tegra_drm_firewall *fw, u32 offset)
+{
+   u32 i;
+
+   for (i = 0; i < fw->submit->num_used_mappings; i++) {
+   struct tegra_drm_mapping *m = fw->submit->used_mappings[i].mapping;
+
+   if (offset >= m->iova && offset <= m->iova_end)
+   return true;
+   }
+
+   return false;
+}
+
+static int fw_check_reg(struct tegra_drm_firewall *fw, u32 offset)
+{
+   bool is_addr;
+   u32 word;
+   int err;
+
+   err = fw_next(fw, &word);
+   if (err)
+   return err;
+
+   if (!fw->client->ops->is_addr_reg)
+   return 0;
+
+   is_addr = fw->client->ops->is_addr_reg(
+   fw->client->base.dev, fw->client->base.class, offset);
+
+   if (!is_addr)
+   return 0;
+
+   if (!fw_check_addr_valid(fw, word))
+   return -EINVAL;
+
+   return 0;
+}
+
+static int fw_check_regs_seq(struct tegra_drm_firewall *fw, u32 offset,
+u32 count, bool incr)
+{
+   u32 i;
+
+   for (i = 0; i < count; i++) {
+   if (fw_check_reg(fw, offset))
+   return -EINVAL;
+
+   if (incr)
+   offset++;
+   }
+
+   return 0;
+}
+
+static int fw_check_regs_mask(struct tegra_drm_firewall *fw, u32 offset,
+ u16 mask)
+{
+   unsigned long bmask = mask;
+   unsigned int bit;
+
+   for_each_set_bit(bit, &bmask, 16) {
+   if (fw_check_reg(fw, offset+bit))
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int fw_check_regs_imm(struct tegra_drm_firewall *fw, u32 offset)
+{
+   bool is_addr;
+
+   is_addr = fw->client->ops->is_addr_reg(fw->client->base.dev,
+  fw->client->base.class, offset);
+   if (is_addr)
+   return -EINVAL;
+
+   return 0;
+}
+
+enum {
+HOST1X_OPCODE_SETCLASS  = 0x00,
+HOST1X_OPCODE_INCR  = 0x01,
+HOST1X_OPCODE_NONINCR   = 0x02,
+HOST1X_OPCODE_MASK  = 0x03,
+HOST1X_OPCODE_IMM   = 0x04,
+HOST1X_OPCODE_RESTART   = 0x05,
+HOST1X_OPCODE_GATHER= 0x06,
+HOST1X_OPCODE_SETSTRMID = 0x07,
+HOST1X_OPCODE_SETAPPID  = 0x08,
+HOST1X_OPCODE_SETPYLD   = 0x09,
+HOST1X_OPCODE_INCR_W= 0x0a,
+HOST1X_OPCODE_NONINCR_W = 0x0b,
+HOST1X_OPCODE_GATHER_W  = 0x0c,
+HOST1X_OPCODE_RESTART_W = 0x0d,
+HOST1X_OPCODE_EXTEND= 0x0e,
+};
+
+int tegra_drm_fw_validate(struct tegra_drm_client *client, u32 *data, u32 start,
+ u32 words, struct tegra_drm_submit_data *submit)
+{
+   struct tegra_drm_firewall fw = {
+   .submit = submit,
+   .client = client,
+   .data = data,
+   .pos = start,
+   .end = start+words,
+   };
+   bool payl

[PATCH v3 12/20] gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer

2020-10-07 Thread Mikko Perttunen
Add support for inserting syncpoint waits in the CDMA pushbuffer.
These waits need to be done in HOST1X class, while gathers submitted
by the application execute in engine class.

Support is added by converting the job's gather list into a command
list that can include both gathers and waits. When the job is
submitted, these commands are pushed as the appropriate opcodes
on the CDMA pushbuffer.
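
The command list idea can be modelled as a tagged union. The sketch
below is illustrative only: the type and function names are hypothetical,
and the per-command push counts (two for a wait, one for a gather) are an
assumption read off the submit loop in this patch, ignoring the case
where a gather address cannot be pushed at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct gather {
	uint32_t words;		/* length of the gather in words */
	uint64_t addr;		/* DMA address of the gather */
};

struct wait {
	uint32_t id;		/* syncpoint to wait on */
	uint32_t threshold;	/* value to wait for */
};

/* One entry of the command list: either a gather or a wait. */
struct cmd {
	bool is_wait;
	union {
		struct gather gather;
		struct wait wait;
	};
};

struct cmd cmd_gather(uint32_t words, uint64_t addr)
{
	struct cmd c = { 0 };

	c.gather.words = words;
	c.gather.addr = addr;
	return c;
}

struct cmd cmd_wait(uint32_t id, uint32_t threshold)
{
	struct cmd c = { 0 };

	c.is_wait = true;
	c.wait.id = id;
	c.wait.threshold = threshold;
	return c;
}

/*
 * Count push operations needed for the list: a wait takes two (switch to
 * the host class and wait, then switch back to the engine class), a
 * gather takes one.
 */
size_t count_pushes(const struct cmd *cmds, size_t n)
{
	size_t pushes = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pushes += cmds[i].is_wait ? 2 : 1;

	return pushes;
}
```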

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/host1x/hw/channel_hw.c | 51 +++
 drivers/gpu/host1x/hw/debug_hw.c   |  9 +++-
 drivers/gpu/host1x/job.c   | 67 +-
 drivers/gpu/host1x/job.h   | 14 +++
 include/linux/host1x.h |  5 ++-
 5 files changed, 105 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c
index bf21512e5078..d88a32f73f5e 100644
--- a/drivers/gpu/host1x/hw/channel_hw.c
+++ b/drivers/gpu/host1x/hw/channel_hw.c
@@ -55,31 +55,46 @@ static void submit_gathers(struct host1x_job *job)
 #endif
unsigned int i;
 
-   for (i = 0; i < job->num_gathers; i++) {
-   struct host1x_job_gather *g = &job->gathers[i];
-   dma_addr_t addr = g->base + g->offset;
-   u32 op2, op3;
+   for (i = 0; i < job->num_cmds; i++) {
+   struct host1x_job_cmd *cmd = &job->cmds[i];
 
-   op2 = lower_32_bits(addr);
-   op3 = upper_32_bits(addr);
+   if (cmd->is_wait) {
+   /* TODO use modern wait */
+   host1x_cdma_push(cdma,
+host1x_opcode_setclass(HOST1X_CLASS_HOST1X,
+   host1x_uclass_wait_syncpt_r(), 1),
+host1x_class_host_wait_syncpt(cmd->wait.id,
+   cmd->wait.threshold));
+   host1x_cdma_push(
+   cdma, host1x_opcode_setclass(job->class, 0, 0),
+   HOST1X_OPCODE_NOP);
+   } else {
+   struct host1x_job_gather *g = &cmd->gather;
 
-   trace_write_gather(cdma, g->bo, g->offset, g->words);
+   dma_addr_t addr = g->base + g->offset;
+   u32 op2, op3;
 
-   if (op3 != 0) {
+   op2 = lower_32_bits(addr);
+   op3 = upper_32_bits(addr);
+
+   trace_write_gather(cdma, g->bo, g->offset, g->words);
+
+   if (op3 != 0) {
 #if HOST1X_HW >= 6
-   u32 op1 = host1x_opcode_gather_wide(g->words);
-   u32 op4 = HOST1X_OPCODE_NOP;
+   u32 op1 = host1x_opcode_gather_wide(g->words);
+   u32 op4 = HOST1X_OPCODE_NOP;
 
-   host1x_cdma_push_wide(cdma, op1, op2, op3, op4);
+   host1x_cdma_push_wide(cdma, op1, op2, op3, op4);
 #else
-   dev_err(dev, "invalid gather for push buffer %pad\n",
-   &addr);
-   continue;
+   dev_err(dev, "invalid gather for push buffer %pad\n",
+   &addr);
+   continue;
 #endif
-   } else {
-   u32 op1 = host1x_opcode_gather(g->words);
+   } else {
+   u32 op1 = host1x_opcode_gather(g->words);
 
-   host1x_cdma_push(cdma, op1, op2);
+   host1x_cdma_push(cdma, op1, op2);
+   }
}
}
 }
@@ -126,7 +141,7 @@ static int channel_submit(struct host1x_job *job)
struct host1x *host = dev_get_drvdata(ch->dev->parent);
 
trace_host1x_channel_submit(dev_name(ch->dev),
-   job->num_gathers, job->num_relocs,
+   job->num_cmds, job->num_relocs,
job->syncpt->id, job->syncpt_incrs);
 
/* before error checks, return current max */
diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c
index ceb48229d14b..35952fd5597e 100644
--- a/drivers/gpu/host1x/hw/debug_hw.c
+++ b/drivers/gpu/host1x/hw/debug_hw.c
@@ -208,10 +208,15 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma)
job->first_get, job->timeout,
job->num_slots, job->num_unpins);
 
-   for (i = 0; i < job->num_gathers; i++) {
-   struct host1x_job_gather *g = &job->gathers[i];
+   for (i = 0; i < job->num_cmds; i++) {
+   struct host1x_job_gather *g;
u32 *mapped;
 
+   if (job->cmds[i].is_wait)
+ 

[PATCH v3 16/20] drm/tegra: Boot VIC during runtime PM resume

2020-10-07 Thread Mikko Perttunen
With the new UAPI implementation, engines are powered on and off
when there are active jobs, and the core code handles channel
allocation. To accommodate that, boot the engine as part of
runtime PM instead of using the open_channel callback, which is
not used by the new submit path.
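
The resulting resume/suspend flow can be sketched as a small userspace
model. Everything here is hypothetical (struct engine, its flags and the
fail_boot test knob); it only illustrates that booting happens on every
resume and that a failed boot unwinds the earlier steps in reverse order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical engine state; each flag models one hardware resource. */
struct engine {
	bool clk_on;
	bool reset_deasserted;
	bool running;
	bool fail_boot;		/* test knob: make the boot step fail */
};

int engine_boot(struct engine *e)
{
	if (e->fail_boot)
		return -1;

	e->running = true;
	return 0;
}

/* Power up and boot; undo the power-up steps if booting fails. */
int engine_runtime_resume(struct engine *e)
{
	int err;

	e->clk_on = true;
	e->reset_deasserted = true;

	err = engine_boot(e);
	if (err < 0)
		goto assert_reset;

	return 0;

assert_reset:
	e->reset_deasserted = false;
	e->clk_on = false;
	return err;
}

/* Power down; the next resume boots the engine again from scratch. */
void engine_runtime_suspend(struct engine *e)
{
	e->reset_deasserted = false;
	e->clk_on = false;
	e->running = false;
}
```

Because every successful resume ends with a booted engine, the model
needs no separate flag to remember whether a boot has already happened.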

Signed-off-by: Mikko Perttunen 
---
v3:
* runtime_get/put is now done directly from submit path, so no
  callbacks are added
* Reworded.
---
 drivers/gpu/drm/tegra/vic.c | 114 +---
 1 file changed, 53 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index cb476da59adc..5d2ad125dca3 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -29,7 +29,6 @@ struct vic_config {
 
 struct vic {
struct falcon falcon;
-   bool booted;
 
void __iomem *regs;
struct tegra_drm_client client;
@@ -52,48 +51,6 @@ static void vic_writel(struct vic *vic, u32 value, unsigned int offset)
writel(value, vic->regs + offset);
 }
 
-static int vic_runtime_resume(struct device *dev)
-{
-   struct vic *vic = dev_get_drvdata(dev);
-   int err;
-
-   err = clk_prepare_enable(vic->clk);
-   if (err < 0)
-   return err;
-
-   usleep_range(10, 20);
-
-   err = reset_control_deassert(vic->rst);
-   if (err < 0)
-   goto disable;
-
-   usleep_range(10, 20);
-
-   return 0;
-
-disable:
-   clk_disable_unprepare(vic->clk);
-   return err;
-}
-
-static int vic_runtime_suspend(struct device *dev)
-{
-   struct vic *vic = dev_get_drvdata(dev);
-   int err;
-
-   err = reset_control_assert(vic->rst);
-   if (err < 0)
-   return err;
-
-   usleep_range(2000, 4000);
-
-   clk_disable_unprepare(vic->clk);
-
-   vic->booted = false;
-
-   return 0;
-}
-
 static int vic_boot(struct vic *vic)
 {
 #ifdef CONFIG_IOMMU_API
@@ -103,9 +60,6 @@ static int vic_boot(struct vic *vic)
void *hdr;
int err = 0;
 
-   if (vic->booted)
-   return 0;
-
 #ifdef CONFIG_IOMMU_API
if (vic->config->supports_sid && spec) {
u32 value;
@@ -153,8 +107,6 @@ static int vic_boot(struct vic *vic)
return err;
}
 
-   vic->booted = true;
-
return 0;
 }
 
@@ -308,35 +260,76 @@ static int vic_load_firmware(struct vic *vic)
return err;
 }
 
-static int vic_open_channel(struct tegra_drm_client *client,
-   struct tegra_drm_context *context)
+
+static int vic_runtime_resume(struct device *dev)
 {
-   struct vic *vic = to_vic(client);
+   struct vic *vic = dev_get_drvdata(dev);
int err;
 
-   err = pm_runtime_get_sync(vic->dev);
+   err = clk_prepare_enable(vic->clk);
if (err < 0)
return err;
 
+   usleep_range(10, 20);
+
+   err = reset_control_deassert(vic->rst);
+   if (err < 0)
+   goto disable;
+
+   usleep_range(10, 20);
+
err = vic_load_firmware(vic);
if (err < 0)
-   goto rpm_put;
+   goto assert;
 
err = vic_boot(vic);
if (err < 0)
-   goto rpm_put;
+   goto assert;
+
+   return 0;
+
+assert:
+   reset_control_assert(vic->rst);
+disable:
+   clk_disable_unprepare(vic->clk);
+   return err;
+}
+
+static int vic_runtime_suspend(struct device *dev)
+{
+   struct vic *vic = dev_get_drvdata(dev);
+   int err;
+
+   err = reset_control_assert(vic->rst);
+   if (err < 0)
+   return err;
+
+   usleep_range(2000, 4000);
+
+   clk_disable_unprepare(vic->clk);
+
+   return 0;
+}
+
+static int vic_open_channel(struct tegra_drm_client *client,
+   struct tegra_drm_context *context)
+{
+   struct vic *vic = to_vic(client);
+   int err;
+
+   err = pm_runtime_get_sync(vic->dev);
+   if (err < 0) {
+   pm_runtime_put(vic->dev);
+   return err;
+   }
 
context->channel = host1x_channel_get(vic->channel);
if (!context->channel) {
-   err = -ENOMEM;
-   goto rpm_put;
+   pm_runtime_put(vic->dev);
+   return -ENOMEM;
}
 
return 0;
-
-rpm_put:
-   pm_runtime_put(vic->dev);
-   return err;
 }
 
 static void vic_close_channel(struct tegra_drm_context *context)
@@ -344,7 +337,6 @@ static void vic_close_channel(struct tegra_drm_context *context)
struct vic *vic = to_vic(context->client);
 
host1x_channel_put(context->channel);
-
pm_runtime_put(vic->dev);
 }
 
-- 
2.28.0



[PATCH v3 00/20] Host1x/TegraDRM UAPI

2020-10-07 Thread Mikko Perttunen
Hi all,

here's the third revision of the Host1x/TegraDRM UAPI proposal.
The open issues from RFCv2 should be resolved now, so I'm
dropping the RFC tag. The series has still only been tested on Tegra186,
so I'm hoping people with devices based on other chips can test it
out.

The test suite[1] has been updated for the changes in this revision,
and also includes tests for the newly added DMA reservation support.
If there are no further issues with the UAPI definition, I'll
look at porting other userspace next - hoping for some help with that
as well since most of it is for chips I don't have easy access to.

The series can also be found at
https://github.com/cyndis/linux/commits/work/host1x-uapi-v3.

Older versions:
v1: https://www.spinics.net/lists/linux-tegra/msg51000.html
v2: https://www.spinics.net/lists/linux-tegra/msg53061.html

Thank you,
Mikko

[1] https://github.com/cyndis/uapi-test

Mikko Perttunen (20):
  gpu: host1x: Use different lock classes for each client
  gpu: host1x: Allow syncpoints without associated client
  gpu: host1x: Show number of pending waiters in debugfs
  gpu: host1x: Remove cancelled waiters immediately
  gpu: host1x: Use HW-equivalent syncpoint expiration check
  gpu: host1x: Cleanup and refcounting for syncpoints
  gpu: host1x: Introduce UAPI header
  gpu: host1x: Implement /dev/host1x device node
  gpu: host1x: DMA fences and userspace fence creation
  gpu: host1x: Add no-recovery mode
  gpu: host1x: Add job release callback
  gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
  gpu: host1x: Reset max value when freeing a syncpoint
  gpu: host1x: Reserve VBLANK syncpoints at initialization
  drm/tegra: Add new UAPI to header
  drm/tegra: Boot VIC during runtime PM resume
  drm/tegra: Set resv fields when importing/exporting GEMs
  drm/tegra: Allocate per-engine channel in core code
  drm/tegra: Implement new UAPI
  drm/tegra: Add job firewall

 drivers/gpu/drm/tegra/Makefile |   4 +
 drivers/gpu/drm/tegra/dc.c |  10 +-
 drivers/gpu/drm/tegra/drm.c|  75 ++-
 drivers/gpu/drm/tegra/drm.h|   9 +
 drivers/gpu/drm/tegra/gem.c|   2 +
 drivers/gpu/drm/tegra/gr2d.c   |   4 +-
 drivers/gpu/drm/tegra/gr3d.c   |   4 +-
 drivers/gpu/drm/tegra/uapi.h   |  63 +++
 drivers/gpu/drm/tegra/uapi/firewall.c  | 197 +++
 drivers/gpu/drm/tegra/uapi/gather_bo.c |  86 
 drivers/gpu/drm/tegra/uapi/gather_bo.h |  22 +
 drivers/gpu/drm/tegra/uapi/submit.c| 679 +
 drivers/gpu/drm/tegra/uapi/submit.h|  20 +
 drivers/gpu/drm/tegra/uapi/uapi.c  | 326 
 drivers/gpu/drm/tegra/vic.c| 118 ++---
 drivers/gpu/host1x/Makefile|   2 +
 drivers/gpu/host1x/bus.c   |   7 +-
 drivers/gpu/host1x/cdma.c  |  69 ++-
 drivers/gpu/host1x/debug.c |  14 +-
 drivers/gpu/host1x/dev.c   |  15 +
 drivers/gpu/host1x/dev.h   |  16 +-
 drivers/gpu/host1x/fence.c | 207 
 drivers/gpu/host1x/fence.h |  13 +
 drivers/gpu/host1x/hw/cdma_hw.c|   2 +-
 drivers/gpu/host1x/hw/channel_hw.c |  63 ++-
 drivers/gpu/host1x/hw/debug_hw.c   |  11 +-
 drivers/gpu/host1x/intr.c  |  23 +-
 drivers/gpu/host1x/intr.h  |   2 +
 drivers/gpu/host1x/job.c   |  79 ++-
 drivers/gpu/host1x/job.h   |  14 +
 drivers/gpu/host1x/syncpt.c| 185 ---
 drivers/gpu/host1x/syncpt.h|  16 +-
 drivers/gpu/host1x/uapi.c  | 382 ++
 drivers/gpu/host1x/uapi.h  |  22 +
 include/linux/host1x.h |  47 +-
 include/uapi/drm/tegra_drm.h   | 420 ++-
 include/uapi/linux/host1x.h| 134 +
 37 files changed, 3076 insertions(+), 286 deletions(-)
 create mode 100644 drivers/gpu/drm/tegra/uapi.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/firewall.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/uapi.c
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers/gpu/host1x/fence.h
 create mode 100644 drivers/gpu/host1x/uapi.c
 create mode 100644 drivers/gpu/host1x/uapi.h
 create mode 100644 include/uapi/linux/host1x.h

-- 
2.28.0



[PATCH v3 08/20] gpu: host1x: Implement /dev/host1x device node

2020-10-07 Thread Mikko Perttunen
Add the /dev/host1x device node, implementing the following
functionality:

- Reading syncpoint values
- Allocating syncpoints (providing syncpoint FDs)
- Incrementing syncpoints (based on syncpoint FD)

Signed-off-by: Mikko Perttunen 
---
v3:
* Pass process name as syncpoint name when allocating
  syncpoint.
---
 drivers/gpu/host1x/Makefile |   1 +
 drivers/gpu/host1x/dev.c|   9 ++
 drivers/gpu/host1x/dev.h|   3 +
 drivers/gpu/host1x/uapi.c   | 276 
 drivers/gpu/host1x/uapi.h   |  22 +++
 include/linux/host1x.h  |   2 +
 6 files changed, 313 insertions(+)
 create mode 100644 drivers/gpu/host1x/uapi.c
 create mode 100644 drivers/gpu/host1x/uapi.h

diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile
index 096017b8789d..882f928d75e1 100644
--- a/drivers/gpu/host1x/Makefile
+++ b/drivers/gpu/host1x/Makefile
@@ -9,6 +9,7 @@ host1x-y = \
job.o \
debug.o \
mipi.o \
+   uapi.o \
hw/host1x01.o \
hw/host1x02.o \
hw/host1x04.o \
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index d0ebb70e2fdd..641317d23828 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -461,6 +461,12 @@ static int host1x_probe(struct platform_device *pdev)
goto deinit_syncpt;
}
 
+   err = host1x_uapi_init(&host->uapi, host);
+   if (err) {
+   dev_err(&pdev->dev, "failed to initialize uapi\n");
+   goto deinit_intr;
+   }
+
host1x_debug_init(host);
 
if (host->info->has_hypervisor)
@@ -480,6 +486,8 @@ static int host1x_probe(struct platform_device *pdev)
host1x_unregister(host);
 deinit_debugfs:
host1x_debug_deinit(host);
+   host1x_uapi_deinit(&host->uapi);
+deinit_intr:
host1x_intr_deinit(host);
 deinit_syncpt:
host1x_syncpt_deinit(host);
@@ -501,6 +509,7 @@ static int host1x_remove(struct platform_device *pdev)
 
host1x_unregister(host);
host1x_debug_deinit(host);
+   host1x_uapi_deinit(&host->uapi);
host1x_intr_deinit(host);
host1x_syncpt_deinit(host);
reset_control_assert(host->rst);
diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h
index 63010ae37a97..7b8b7e20e32b 100644
--- a/drivers/gpu/host1x/dev.h
+++ b/drivers/gpu/host1x/dev.h
@@ -17,6 +17,7 @@
 #include "intr.h"
 #include "job.h"
 #include "syncpt.h"
+#include "uapi.h"
 
 struct host1x_syncpt;
 struct host1x_syncpt_base;
@@ -143,6 +144,8 @@ struct host1x {
struct list_head list;
 
struct device_dma_parameters dma_parms;
+
+   struct host1x_uapi uapi;
 };
 
 void host1x_hypervisor_writel(struct host1x *host1x, u32 r, u32 v);
diff --git a/drivers/gpu/host1x/uapi.c b/drivers/gpu/host1x/uapi.c
new file mode 100644
index ..4747d8de132e
--- /dev/null
+++ b/drivers/gpu/host1x/uapi.c
@@ -0,0 +1,276 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * /dev/host1x syncpoint interface
+ *
+ * Copyright (c) 2020, NVIDIA Corporation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "dev.h"
+#include "syncpt.h"
+#include "uapi.h"
+
+#include 
+
+static int syncpt_file_release(struct inode *inode, struct file *file)
+{
+   struct host1x_syncpt *sp = file->private_data;
+
+   host1x_syncpt_put(sp);
+
+   return 0;
+}
+
+static int syncpt_file_ioctl_info(struct host1x_syncpt *sp, void __user *data)
+{
+   struct host1x_syncpoint_info args;
+   unsigned long copy_err;
+
+   copy_err = copy_from_user(&args, data, sizeof(args));
+   if (copy_err)
+   return -EFAULT;
+
+   if (args.reserved[0] || args.reserved[1] || args.reserved[2])
+   return -EINVAL;
+
+   args.id = sp->id;
+
+   copy_err = copy_to_user(data, &args, sizeof(args));
+   if (copy_err)
+   return -EFAULT;
+
+   return 0;
+}
+
+static int syncpt_file_ioctl_incr(struct host1x_syncpt *sp, void __user *data)
+{
+   struct host1x_syncpoint_increment args;
+   unsigned long copy_err;
+   u32 i;
+
+   copy_err = copy_from_user(&args, data, sizeof(args));
+   if (copy_err)
+   return -EFAULT;
+
+   for (i = 0; i < args.count; i++) {
+   host1x_syncpt_incr(sp);
+   if (signal_pending(current))
+   return -EINTR;
+   }
+
+   return 0;
+}
+
+static long syncpt_file_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
+{
+   void __user *data = (void __user *)arg;
+   long err;
+
+   switch (cmd) {
+   case HOST1X_IOCTL_SYNCPOINT_INFO:
+   err = syncpt_file_ioctl_info(file->private_data, data);
+   break;
+
+   case HOST1X_IOCTL_SYNCPOINT_INCREMENT:
+   err = syncpt_file_ioctl_incr(file->private_data, data);
+   break;
+
+   default:
+   err = -ENOTTY;
+   }
+
+

[PATCH v3 13/20] gpu: host1x: Reset max value when freeing a syncpoint

2020-10-07 Thread Mikko Perttunen
With job recovery becoming optional, syncpoints may have a mismatch
between their value and max value when freed. As such, when freeing,
set the max value to the current value of the syncpoint so that it
is in a sane state for the next user.
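
To make the mismatch concrete, here is a minimal model of a syncpoint
with a current value and a max value. The struct and function names are
hypothetical and the hardware counter is replaced by a plain field; the
sketch only shows why resetting max to the current value on release
leaves nothing pending for the next user.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical syncpoint model. */
struct syncpt {
	uint32_t value;		/* current value (a hardware counter in the driver) */
	uint32_t max_val;	/* highest value submitted jobs are expected to reach */
	bool locked;
};

/* Increments that were promised by submitted jobs but never happened. */
uint32_t syncpt_pending(const struct syncpt *sp)
{
	return sp->max_val - sp->value;
}

/* On release, forget promised increments and unlock the syncpoint. */
void syncpt_release(struct syncpt *sp)
{
	sp->max_val = sp->value;
	sp->locked = false;
}
```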

Signed-off-by: Mikko Perttunen 
---
v3:
* Use host1x_syncpt_read instead of read_min to ensure syncpoint
  value is current.
---
 drivers/gpu/host1x/syncpt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 8d658e5f7db2..99d31932eb34 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -385,6 +385,7 @@ static void syncpt_release(struct kref *ref)
 {
struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
+   atomic_set(&sp->max_val, host1x_syncpt_read(sp));
sp->locked = false;
 
mutex_lock(&sp->host->syncpt_mutex);
-- 
2.28.0



[PATCH v3 17/20] drm/tegra: Set resv fields when importing/exporting GEMs

2020-10-07 Thread Mikko Perttunen
To allow sharing of implicit fences when exporting/importing dma_buf
objects, set the 'resv' fields when importing or exporting GEM
objects.

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/drm/tegra/gem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 723df142a981..4a8acd4724bd 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -423,6 +423,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
}
 
bo->gem.import_attach = attach;
+   bo->gem.resv = buf->resv;
 
return bo;
 
@@ -675,6 +676,7 @@ struct dma_buf *tegra_gem_prime_export(struct drm_gem_object *gem,
exp_info.size = gem->size;
exp_info.flags = flags;
exp_info.priv = gem;
+   exp_info.resv = gem->resv;
 
return drm_gem_dmabuf_export(gem->dev, &exp_info);
 }
-- 
2.28.0



[PATCH v3 01/20] gpu: host1x: Use different lock classes for each client

2020-10-07 Thread Mikko Perttunen
To avoid false lockdep warnings, give each client lock a different
lock class, passed from the initialization site by macro.
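
The macro trick can be tried out in userspace. The sketch below is an
analogy, not the kernel API: struct lock_key stands in for the kernel's
lock class key, client_register_with_key() is a hypothetical name, and
the statement expression is a GNU C extension (GCC and Clang accept it).
Each textual use of the macro gets its own static key, while repeated
execution of one call site reuses the same key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock class key; only its address matters. */
struct lock_key {
	char dummy;
};

struct client {
	const struct lock_key *key;
};

/* Record which key (and therefore which "class") this client uses. */
int client_register_with_key(struct client *c, const struct lock_key *key)
{
	c->key = key;
	return 0;
}

/* One static key per call site, passed along to the real function. */
#define client_register(c)					\
	({							\
		static struct lock_key key_;			\
		client_register_with_key((c), &key_);		\
	})
```

Two different drivers calling client_register() therefore end up with
distinct keys, which is what lets a lock validator tell their locks apart.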

Signed-off-by: Mikko Perttunen 
---
 drivers/gpu/host1x/bus.c | 7 ---
 include/linux/host1x.h   | 9 -
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index e201f62d62c0..4101f64bd545 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -714,13 +714,14 @@ EXPORT_SYMBOL(host1x_driver_unregister);
  * device and call host1x_device_init(), which will in turn call each client's
  * &host1x_client_ops.init implementation.
  */
-int host1x_client_register(struct host1x_client *client)
+int __host1x_client_register(struct host1x_client *client,
+  struct lock_class_key *key)
 {
struct host1x *host1x;
int err;
 
INIT_LIST_HEAD(&client->list);
-   mutex_init(&client->lock);
+   __mutex_init(&client->lock, "host1x client lock", key);
client->usecount = 0;
 
mutex_lock(&devices_lock);
@@ -741,7 +742,7 @@ int host1x_client_register(struct host1x_client *client)
 
return 0;
 }
-EXPORT_SYMBOL(host1x_client_register);
+EXPORT_SYMBOL(__host1x_client_register);
 
 /**
  * host1x_client_unregister() - unregister a host1x client
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 20c885d0bddc..f711fc0154f4 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -320,7 +320,14 @@ static inline struct host1x_device *to_host1x_device(struct device *dev)
 int host1x_device_init(struct host1x_device *device);
 int host1x_device_exit(struct host1x_device *device);
 
-int host1x_client_register(struct host1x_client *client);
+int __host1x_client_register(struct host1x_client *client,
+struct lock_class_key *key);
+#define host1x_client_register(class) \
+   ({ \
+   static struct lock_class_key __key; \
+   __host1x_client_register(class, &__key); \
+   })
+
 int host1x_client_unregister(struct host1x_client *client);
 
 int host1x_client_suspend(struct host1x_client *client);
-- 
2.28.0



Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM

2020-10-07 Thread Daniel Vetter
On Wed, Oct 7, 2020 at 6:53 PM Jason Gunthorpe  wrote:
>
> On Wed, Oct 07, 2020 at 06:44:18PM +0200, Daniel Vetter wrote:
> >
> > - /*
> > -  * While get_vaddr_frames() could be used for transient (kernel
> > -  * controlled lifetime) pinning of memory pages all current
> > -  * users establish long term (userspace controlled lifetime)
> > -  * page pinning. Treat get_vaddr_frames() like
> > -  * get_user_pages_longterm() and disallow it for filesystem-dax
> > -  * mappings.
> > -  */
> > - if (vma_is_fsdax(vma)) {
> > - ret = -EOPNOTSUPP;
> > - goto out;
> > - }
> > -
> > - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> > - vec->got_ref = true;
> > - vec->is_pfns = false;
> > - ret = pin_user_pages_locked(start, nr_frames,
> > - gup_flags, (struct page **)(vec->ptrs), &locked);
> > - goto out;
> > - }
>
> The vm_flags still need to be checked before going into the while
> loop. If the break is taken then nothing would check vm_flags

Hm right, that's a bit inconsistent. follow_pfn also checks for this,
so I think we can just ditch this entirely both here and in the do {}
while () check, simplifying the latter to just while (vma). Well, just
make it a real loop with less confusing control flow probably.

Or would you prefer I keep this and touch the code less?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH v3 4/6] drm/dp: Add LTTPR helpers

2020-10-07 Thread Imre Deak
Add the helpers and register definitions needed to read out the common
and per-PHY LTTPR capabilities and perform link training in the LTTPR
non-transparent mode.

v2:
- Add drm_dp_dpcd_read_phy_link_status() and DP_PHY_LTTPR() here instead
  of adding these to i915. (Ville)
v3:
- Use memmove() to convert LTTPR to DPRX link status format. (Ville)

Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä 
Reviewed-by: Ville Syrjälä 
Signed-off-by: Imre Deak 
---
 drivers/gpu/drm/drm_dp_helper.c | 232 +++-
 include/drm/drm_dp_helper.h |  62 +
 2 files changed, 290 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c
index 478dd51f738d..79732402336d 100644
--- a/drivers/gpu/drm/drm_dp_helper.c
+++ b/drivers/gpu/drm/drm_dp_helper.c
@@ -150,11 +150,8 @@ void drm_dp_link_train_clock_recovery_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
 }
 EXPORT_SYMBOL(drm_dp_link_train_clock_recovery_delay);
 
-void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
+static void __drm_dp_link_train_channel_eq_delay(unsigned long rd_interval)
 {
-   unsigned long rd_interval = dpcd[DP_TRAINING_AUX_RD_INTERVAL] &
-DP_TRAINING_AUX_RD_MASK;
-
if (rd_interval > 4)
DRM_DEBUG_KMS("AUX interval %lu, out of range (max 4)\n",
  rd_interval);
@@ -166,8 +163,35 @@ void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
 
usleep_range(rd_interval, rd_interval * 2);
 }
+
+void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE])
+{
+   __drm_dp_link_train_channel_eq_delay(dpcd[DP_TRAINING_AUX_RD_INTERVAL] &
+DP_TRAINING_AUX_RD_MASK);
+}
 EXPORT_SYMBOL(drm_dp_link_train_channel_eq_delay);
 
+void drm_dp_lttpr_link_train_clock_recovery_delay(void)
+{
+   usleep_range(100, 200);
+}
+EXPORT_SYMBOL(drm_dp_lttpr_link_train_clock_recovery_delay);
+
+static u8 dp_lttpr_phy_cap(const u8 phy_cap[DP_LTTPR_PHY_CAP_SIZE], int r)
+{
+   return phy_cap[r - DP_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1];
+}
+
+void drm_dp_lttpr_link_train_channel_eq_delay(const u8 phy_cap[DP_LTTPR_PHY_CAP_SIZE])
+{
+   u8 interval = dp_lttpr_phy_cap(phy_cap,
+                                      DP_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1) &
+ DP_TRAINING_AUX_RD_MASK;
+
+   __drm_dp_link_train_channel_eq_delay(interval);
+}
+EXPORT_SYMBOL(drm_dp_lttpr_link_train_channel_eq_delay);
+
 u8 drm_dp_link_rate_to_bw_code(int link_rate)
 {
/* Spec says link_bw = link_rate / 0.27Gbps */
@@ -363,6 +387,59 @@ int drm_dp_dpcd_read_link_status(struct drm_dp_aux *aux,
 }
 EXPORT_SYMBOL(drm_dp_dpcd_read_link_status);
 
+/**
+ * drm_dp_dpcd_read_phy_link_status - get the link status information for a DP PHY
+ * @aux: DisplayPort AUX channel
+ * @dp_phy: the DP PHY to get the link status for
+ * @link_status: buffer to return the status in
+ *
+ * Fetch the AUX DPCD registers for the DPRX or an LTTPR PHY link status. The
+ * layout of the returned @link_status matches the DPCD register layout of the
+ * DPRX PHY link status.
+ *
+ * Returns 0 if the information was read successfully or a negative error code
+ * on failure.
+ */
+int drm_dp_dpcd_read_phy_link_status(struct drm_dp_aux *aux,
+enum drm_dp_phy dp_phy,
+u8 link_status[DP_LINK_STATUS_SIZE])
+{
+   int ret;
+
+   if (dp_phy == DP_PHY_DPRX) {
+   ret = drm_dp_dpcd_read(aux,
+  DP_LANE0_1_STATUS,
+  link_status,
+  DP_LINK_STATUS_SIZE);
+
+   if (ret < 0)
+   return ret;
+
+   WARN_ON(ret != DP_LINK_STATUS_SIZE);
+
+   return 0;
+   }
+
+   ret = drm_dp_dpcd_read(aux,
+  DP_LANE0_1_STATUS_PHY_REPEATER(dp_phy),
+  link_status,
+  DP_LINK_STATUS_SIZE - 1);
+
+   if (ret < 0)
+   return ret;
+
+   WARN_ON(ret != DP_LINK_STATUS_SIZE - 1);
+
+   /* Convert the LTTPR to the sink PHY link status layout */
+   memmove(&link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS + 1],
+   &link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS],
+   DP_LINK_STATUS_SIZE - (DP_SINK_STATUS - DP_LANE0_1_STATUS) - 1);
+   link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS] = 0;
+
+   return 0;
+}
+EXPORT_SYMBOL(drm_dp_dpcd_read_phy_link_status);
+
 static bool is_edid_digital_input_dp(const struct edid *edid)
 {
return edid && edid->revision >= 4 &&
@@ -2098,6 +2175,153 @@ int drm_dp_dsc_sink_supported_input_bpcs(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_S
 }
 EXPORT_SYMBOL(drm_dp_dsc_sink_supported_input_bpcs);
 
+/**
+ * 

Re: [PATCH 2/5] thermal: devfreq_cooling: get a copy of device status

2020-10-07 Thread Ionela Voinescu
On Monday 21 Sep 2020 at 13:20:04 (+0100), Lukasz Luba wrote:
> Devfreq cooling needs to know the correct status of the device in order
> to operate. Do not rely on Devfreq last_status, which might be stale data,
> and get more up-to-date values of the load.
> 
> Devfreq framework can change the device status in the background. To
> mitigate this situation make a copy of the status structure and use it
> for internal calculations.
> 
> In addition this patch adds a normalization function, which also makes sure
> that whatever data comes from the device, it is in a sane range.
> 
> Signed-off-by: Lukasz Luba 
> ---
>  drivers/thermal/devfreq_cooling.c | 52 +--
>  1 file changed, 43 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
> index 7063ccb7b86d..cf045bd4d16b 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -227,6 +227,24 @@ static inline unsigned long get_total_power(struct devfreq_cooling_device *dfc,
>  voltage);
>  }
>  
> +static void _normalize_load(struct devfreq_dev_status *status)

Is there a reason for the leading "_" ?
AFAIK, "__name()" is meant to suggest a "worker" function for another
"name()" function, but that would not apply here.

> +{
> + /* Make some space if needed */
> + if (status->busy_time > 0x) {
> + status->busy_time >>= 10;
> + status->total_time >>= 10;
> + }

How about removing the above code and adding here:

status->busy_time = status->busy_time ? : 1;

> +
> + if (status->busy_time > status->total_time)

This check would then cover the possibility that total_time is 0.

> + status->busy_time = status->total_time;

But a reversal is needed here:
status->total_time = status->busy_time;

> +
> + status->busy_time *= 100;
> + status->busy_time /= status->total_time ? : 1;
> +
> + /* Avoid division by 0 */
> + status->busy_time = status->busy_time ? : 1;
> + status->total_time = 100;

Then all of this code can be replaced by:

status->busy_time = (unsigned long)div64_u64((u64)status->busy_time << 10,
 status->total_time);
status->total_time = 1 << 10;

This way you gain some resolution to busy_time and the divisions in the
callers would just become shifts by 10.
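Putting the three suggestions together, a minimal userspace sketch could look like the following. The struct is a hypothetical stand-in that carries only the two fields discussed, and a plain 64-bit division stands in for div64_u64(); treat it as an illustration of the arithmetic, not as the final patch.

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_SHIFT 10

/* Hypothetical mirror of the two devfreq_dev_status fields used here. */
struct dev_status {
	unsigned long total_time;
	unsigned long busy_time;
};

static void normalize_load(struct dev_status *s)
{
	assert(s);

	/* Never let busy_time be 0 on input. */
	if (!s->busy_time)
		s->busy_time = 1;

	/* Also covers total_time == 0, so the division below is safe. */
	if (s->busy_time > s->total_time)
		s->total_time = s->busy_time;

	/* Scale to a fixed total of 1 << LOAD_SHIFT (1024). */
	s->busy_time = (unsigned long)(((uint64_t)s->busy_time << LOAD_SHIFT) /
				       s->total_time);
	s->total_time = 1UL << LOAD_SHIFT;
}
```

Note that a busy_time which is very small relative to total_time can still round down to 0 after the division, so callers that divide by busy_time would need their own guard.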

Hope it helps,
Ionela.


Re: [PATCH v2 0/3] drm: commit_work scheduling

2020-10-07 Thread Qais Yousef
On 10/07/20 08:57, Rob Clark wrote:
> Yeah, I think we will end up making some use of uclamp.. there is
> someone else working on that angle
> 
> But without it, this is a case that exposes legit prioritization
> problems with commit_work which we should fix ;-)

I wasn't suggesting this as an alternative to fixing the other problem. But it
seemed you had a different problem here that I thought I could help with :-)

I did give my opinion about how to handle that priority issue. If the 2 threads
are kernel threads and by design they need relative priorities, IMO the kernel
needs to be taught to set this relative priority. It seemed the vblank worker
could run as SCHED_DEADLINE. If this works, then the priority problem for
commit_work disappears as SCHED_DEADLINE will preempt RT. If commit_work uses
sched_set_fifo(), its priority will be 50, hence your SF threads can no longer
preempt it. And you can manage the SF threads to be any value you want relative
to 50 anyway without having to manage commit_work itself.

I'm not sure if you have problems with RT tasks preempting important CFS
tasks. My brain registered two conflicting statements.

Thanks

--
Qais Yousef


[PATCH 12/13] media/videbuf1|2: Mark follow_pfn usage as unsafe

2020-10-07 Thread Daniel Vetter
The media model assumes that buffers are all preallocated, so that
when a media pipeline is running we never miss a deadline because the
buffers aren't allocated or available.

This means we cannot fix the v4l follow_pfn usage through
mmu_notifier, without breaking how this all works. The only real fix
is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and
tell everyone to cut over to dma-buf memory sharing for zerocopy.

userptr for normal memory will keep working as-is.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Kees Cook 
Cc: Dan Williams 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Figa 
Cc: Laurent Dufour 
Cc: Vlastimil Babka 
Cc: Daniel Jordan 
Cc: Michel Lespinasse 
---
 drivers/media/common/videobuf2/frame_vector.c | 2 +-
 drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
index b95f4f371681..d56eb6258f09 100644
--- a/drivers/media/common/videobuf2/frame_vector.c
+++ b/drivers/media/common/videobuf2/frame_vector.c
@@ -71,7 +71,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
unsigned long *nums = frame_vector_pfns(vec);
 
while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
-   err = follow_pfn(vma, start, &nums[ret]);
+   err = unsafe_follow_pfn(vma, start, &nums[ret]);
if (err) {
if (ret == 0)
ret = err;
diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
index 52312ce2ba05..821c4a76ab96 100644
--- a/drivers/media/v4l2-core/videobuf-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
@@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
user_address = untagged_baddr;
 
while (pages_done < (mem->size >> PAGE_SHIFT)) {
-   ret = follow_pfn(vma, user_address, &this_pfn);
+   ret = unsafe_follow_pfn(vma, user_address, &this_pfn);
if (ret)
break;
 
-- 
2.28.0



[PATCH 08/13] s390/pci: Remove races against pte updates

2020-10-07 Thread Daniel Vetter
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
  ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
  cma regions. This means if we miss the unmap the pfn might contain
  pagecache or anon memory (well anything allocated with __GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
  iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
  ("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.

Since zpci_memcpy_from|toio seems to not do anything nefarious with
locks we just need to open code get_pfn and follow_pfn and make sure
we drop the locks only after we're done. The write function also needs
the copy_from_user move, since we can't take userspace faults while
holding the mmap sem.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Dan Williams 
Cc: Kees Cook 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Niklas Schnelle 
Cc: Gerald Schaefer 
Cc: linux-s...@vger.kernel.org
---
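The ordering described above (copy from the caller first, then take the lock, look up, use, and only then unlock) can be modelled in userspace roughly as follows. This is only an illustrative sketch: struct region, region_write() and the single mutex are hypothetical stand-ins for the mapping, the syscall body and the mmap/page table locks, and memcpy() stands in for copy_from_user() and zpci_memcpy_toio().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define REGION_SIZE 64

/* Hypothetical stand-in for a device mapping that can be revoked. */
struct region {
	pthread_mutex_t lock;	/* plays the role of mmap lock + pt lock */
	int writable;
	unsigned char *target;	/* NULL once the mapping is gone */
};

static long region_write(struct region *r, const void *user_buf, size_t len)
{
	unsigned char staged[REGION_SIZE];
	long ret;

	assert(r && user_buf);
	if (len > sizeof(staged))
		return -EINVAL;

	/* Stage the caller's data before any lock is taken. */
	memcpy(staged, user_buf, len);

	pthread_mutex_lock(&r->lock);
	ret = -EACCES;
	if (!r->writable)
		goto out_unlock;
	ret = -EINVAL;
	if (!r->target)
		goto out_unlock;

	/* The looked-up target is only used while the lock is held. */
	memcpy(r->target, staged, len);
	ret = (long)len;
out_unlock:
	pthread_mutex_unlock(&r->lock);
	return ret;
}
```

The point of the sketch is the ordering: nothing that might block on the caller's memory happens under the lock, and the target is never dereferenced after the lock is dropped.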
 arch/s390/pci/pci_mmio.c | 98 +++-
 1 file changed, 57 insertions(+), 41 deletions(-)

diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c
index 401cf670a243..4d194cb09372 100644
--- a/arch/s390/pci/pci_mmio.c
+++ b/arch/s390/pci/pci_mmio.c
@@ -119,33 +119,15 @@ static inline int __memcpy_toio_inuser(void __iomem *dst,
return rc;
 }
 
-static long get_pfn(unsigned long user_addr, unsigned long access,
-   unsigned long *pfn)
-{
-   struct vm_area_struct *vma;
-   long ret;
-
-   mmap_read_lock(current->mm);
-   ret = -EINVAL;
-   vma = find_vma(current->mm, user_addr);
-   if (!vma)
-   goto out;
-   ret = -EACCES;
-   if (!(vma->vm_flags & access))
-   goto out;
-   ret = follow_pfn(vma, user_addr, pfn);
-out:
-   mmap_read_unlock(current->mm);
-   return ret;
-}
-
 SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
const void __user *, user_buffer, size_t, length)
 {
u8 local_buf[64];
void __iomem *io_addr;
void *buf;
-   unsigned long pfn;
+   struct vm_area_struct *vma;
+   pte_t *ptep;
+   spinlock_t *ptl;
long ret;
 
if (!zpci_is_enabled())
@@ -158,7 +140,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
 * We only support write access to MIO capable devices if we are on
 * a MIO enabled system. Otherwise we would have to check for every
 * address if it is a special ZPCI_ADDR and would have to do
-* a get_pfn() which we don't need for MIO capable devices.  Currently
+* a pfn lookup which we don't need for MIO capable devices.  Currently
 * ISM devices are the only devices without MIO support and there is no
 * known need for accessing these from userspace.
 */
@@ -176,21 +158,37 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr,
} else
buf = local_buf;
 
-   ret = get_pfn(mmio_addr, VM_WRITE, &pfn);
+   ret = -EFAULT;
+   if (copy_from_user(buf, user_buffer, length))
+   goto out_free;
+
+   mmap_read_lock(current->mm);
+   ret = -EINVAL;
+   vma = find_vma(current->mm, mmio_addr);
+   if (!vma)
+   goto out_unlock_mmap;
+   ret = -EACCES;
+   if (!(vma->vm_flags & VM_WRITE))
+   goto out_unlock_mmap;
+   if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+   goto out_unlock_mmap;
+
+   ret = follow_pte_pmd(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl);
if (ret)
-   goto out;
-   io_addr = (void __iomem *)((pfn << PAGE_SHIFT) |
+   goto out_unlock_mmap;
+
+   io_addr = (void __iomem *)((pte_pfn(*ptep) << PAGE_SHIFT) |
(mmio_addr & ~PAGE_MASK));
 
-   ret = -EFAULT;
if ((unsigned long) io_addr < ZPCI_IOMAP_ADDR_BASE)
-   goto out;
-
-   if (copy_from_user(buf, user_buffer, length))
-   goto out;
+   goto out_unlock_pt;
 
ret = zpci_memcpy_toio(io_addr, buf, length);
-out:
+out_unlock_pt:
+   pte_unmap_unlock(ptep, ptl);
+out_unlock_mmap:
+   mmap_read_unlock(current->mm);
+out_free:
if (buf != local_buf)
kfree(buf);
return ret;
@@ -274,7 +272,9 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
u8 local_buf[64];
void __iomem *io_addr;
void *buf;
-  

[PATCH 09/13] PCI: obey iomem restrictions for procfs mmap

2020-10-07 Thread Daniel Vetter
There are three ways to access PCI BARs from userspace: /dev/mem, sysfs
files, and the old proc interface. The first two check against
iomem_is_exclusive, proc never did. And with CONFIG_IO_STRICT_DEVMEM,
this starts to matter, since we don't want random userspace having
access to PCI BARs while a driver is loaded and using them.

Fix this.

References: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Kees Cook 
Cc: Dan Williams 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Bjorn Helgaas 
Cc: linux-...@vger.kernel.org
---
 drivers/pci/proc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index d35186b01d98..3a2f90beb4cb 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -274,6 +274,11 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
else
return -EINVAL;
}
+
+   if (dev->resource[i].flags & IORESOURCE_MEM &&
+   iomem_is_exclusive(dev->resource[i].start))
+   return -EINVAL;
+
ret = pci_mmap_page_range(dev, i, vma,
  fpriv->mmap_state, write_combine);
if (ret < 0)
-- 
2.28.0



[PATCH 02/13] drm/exynos: Use FOLL_LONGTERM for g2d cmdlists

2020-10-07 Thread Daniel Vetter
The exynos g2d interface is very unusual, but it looks like the
userptr objects are persistent. Hence they need FOLL_LONGTERM.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Inki Dae 
Cc: Joonyoung Shim 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Kukjin Kim 
Cc: Krzysztof Kozlowski 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
---
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index c83f6faac9de..514fd000feb1 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -478,7 +478,8 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
goto err_free;
}
 
-   ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
+   ret = pin_user_pages_fast(start, npages,
+ FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
  g2d_userptr->pages);
if (ret != npages) {
DRM_DEV_ERROR(g2d->dev,
-- 
2.28.0



Re: [PATCH v2 0/3] drm: commit_work scheduling

2020-10-07 Thread Rob Clark
On Mon, Oct 5, 2020 at 5:15 AM Ville Syrjälä
 wrote:
>
> On Fri, Oct 02, 2020 at 10:55:52AM -0700, Rob Clark wrote:
> > On Fri, Oct 2, 2020 at 4:05 AM Ville Syrjälä
> >  wrote:
> > >
> > > On Fri, Oct 02, 2020 at 01:52:56PM +0300, Ville Syrjälä wrote:
> > > > On Thu, Oct 01, 2020 at 05:25:55PM +0200, Daniel Vetter wrote:
> > > > > On Thu, Oct 1, 2020 at 5:15 PM Rob Clark  wrote:
> > > > > >
> > > > > > I'm leaning towards converting the other drivers over to use the
> > > > > > per-crtc kwork, and then dropping the 'commit_work' from atomic state.
> > > > > > I can add a patch to that, but figured I could postpone that churn
> > > > > > until there is some buy-in on this whole idea.
> > > > >
> > > > > i915 has its own commit code, it's not even using the current commit
> > > > > helpers (nor the commit_work). Not sure how much other fun there is.
> > > >
> > > > I don't think we want per-crtc threads for this in i915. Seems
> > > > to me easier to guarantee atomicity across multiple crtcs if
> > > > we just commit them from the same thread.
> > >
> > > Oh, and we may have to commit things in a very specific order
> > > to guarantee the hw doesn't fall over, so yeah definitely per-crtc
> > > thread is a no go.
> >
> > If I'm understanding the i915 code, this is only the case for modeset
> > commits?  I suppose we could achieve the same result by just deciding
> > to pick the kthread of the first CRTC for modeset commits.  I'm not
> > really so much concerned about parallelism for modeset.
>
> I'm not entirely happy about the random differences between modesets
> and other commits. Ideally we wouldn't need any.
>
> Anyways, even if we ignore modesets we still have the issue with
> atomicity guarantees across multiple crtcs. So I think we still
> don't want per-crtc threads, rather it should be a thread for each
> commit.

I don't really see any other way to solve the priority inversion other
than per-CRTC kthreads.  I've been thinking about it a bit more, and
my conclusion is:

(1) There isn't really any use for the N+1'th commit to start running
before the kthread_work for the N'th commit completes, so I don't mind
losing the unbound aspect of the workqueue approach
(2) For cases where there does need to be serialization between
commits on different CRTCs, since there is a per-CRTC kthread, you
could achieve this with locking

Since i915 isn't using the atomic helpers here, I suppose it is an
option for i915 to just continue doing what it is doing.

And I could ofc just stop using the atomic commit helper and do the
kthreads thing in msm. But my first preference would be that the
commit helper does generally the right thing.

BR,
-R


[PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM

2020-10-07 Thread Daniel Vetter
This is used by media/videobuf2 for persistent dma mappings, not just
for a single dma operation and then freed again, so needs
FOLL_LONGTERM.

Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
locking issues. Rework the code to pull the pup path out from the
mmap_sem critical section as suggested by Jason.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Figa 
Cc: Mauro Carvalho Chehab 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
---
 mm/frame_vector.c | 36 +++-
 1 file changed, 11 insertions(+), 25 deletions(-)

diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index 10f82d5643b6..39db520a51dc 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
struct vm_area_struct *vma;
int ret = 0;
int err;
-   int locked;
 
if (nr_frames == 0)
return 0;
@@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
 
start = untagged_addr(start);
 
+   ret = pin_user_pages_fast(start, nr_frames,
+ FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
+ (struct page **)(vec->ptrs));
+   if (ret > 0) {
+   vec->got_ref = true;
+   vec->is_pfns = false;
+   goto out_unlocked;
+   }
+
mmap_read_lock(mm);
-   locked = 1;
vma = find_vma_intersection(mm, start, start + 1);
if (!vma) {
ret = -EFAULT;
goto out;
}
 
-   /*
-* While get_vaddr_frames() could be used for transient (kernel
-* controlled lifetime) pinning of memory pages all current
-* users establish long term (userspace controlled lifetime)
-* page pinning. Treat get_vaddr_frames() like
-* get_user_pages_longterm() and disallow it for filesystem-dax
-* mappings.
-*/
-   if (vma_is_fsdax(vma)) {
-   ret = -EOPNOTSUPP;
-   goto out;
-   }
-
-   if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
-   vec->got_ref = true;
-   vec->is_pfns = false;
-   ret = pin_user_pages_locked(start, nr_frames,
-   gup_flags, (struct page **)(vec->ptrs), &locked);
-   goto out;
-   }
-
vec->got_ref = false;
vec->is_pfns = true;
do {
@@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
vma = find_vma_intersection(mm, start, start + 1);
} while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
 out:
-   if (locked)
-   mmap_read_unlock(mm);
+   mmap_read_unlock(mm);
+out_unlocked:
if (!ret)
ret = -EFAULT;
if (ret > 0)
-- 
2.28.0



[PATCH 07/13] mm: close race in generic_access_phys

2020-10-07 Thread Daniel Vetter
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
  ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
  cma regions. This means if we miss the unmap the pfn might contain
  pagecache or anon memory (well anything allocated with __GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
  iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
  ("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.

Since ioremap might need to manipulate pagetables too we need to drop
the pt lock and have a retry loop if we raced.

While at it, also add kerneldoc and improve the comment for the
vma_ops->access function. It's for accessing, not for moving the
memory from iomem to system memory, as the old comment seemed to
suggest.

References: 28b2ee20c7cb ("access_process_vm device memory infrastructure")
Cc: Jason Gunthorpe 
Cc: Dan Williams 
Cc: Kees Cook 
Cc: Rik van Riel 
Cc: Benjamin Herrensmidt 
Cc: Dave Airlie 
Cc: Hugh Dickins 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Signed-off-by: Daniel Vetter 
---
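As a rough illustration of the retry loop described above, here is a userspace sketch of the snapshot, slow step, revalidate pattern. It is a model under stated assumptions, not the mm code: struct mapping, access_with_retry() and the two hook functions are hypothetical names, a mutex stands in for the page table lock, and the slow_step callback stands in for the ioremap call that must run without that lock held.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a pte guarded by its page table lock. */
struct mapping {
	pthread_mutex_t lock;
	unsigned long pfn;
};

typedef void (*slow_step_fn)(struct mapping *m);

/* Returns the number of retries; *used_pfn is the value validated under the lock. */
static int access_with_retry(struct mapping *m, slow_step_fn slow_step,
			     unsigned long *used_pfn)
{
	int retries = 0;

	assert(m && slow_step && used_pfn);

	for (;;) {
		unsigned long snapshot;

		/* Snapshot the entry, then drop the lock. */
		pthread_mutex_lock(&m->lock);
		snapshot = m->pfn;
		pthread_mutex_unlock(&m->lock);

		/* Slow step that must not run under the lock. */
		slow_step(m);

		/* Retake the lock and only proceed if nothing changed. */
		pthread_mutex_lock(&m->lock);
		if (snapshot == m->pfn) {
			*used_pfn = snapshot;	/* the access happens here, locked */
			pthread_mutex_unlock(&m->lock);
			return retries;
		}
		pthread_mutex_unlock(&m->lock);
		retries++;	/* raced with an update, start over */
	}
}

/* Demo hooks: one that never races, one that moves the mapping once. */
static int moved_once;

static void quiet_step(struct mapping *m)
{
	(void)m;
}

static void racing_step(struct mapping *m)
{
	if (moved_once)
		return;
	moved_once = 1;
	pthread_mutex_lock(&m->lock);
	m->pfn += 1;
	pthread_mutex_unlock(&m->lock);
}
```

The loop only goes around again when the revalidated entry differs from the snapshot; an unchanged entry means the slow step's result is still valid for the access.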
 include/linux/mm.h |  3 ++-
 mm/memory.c| 44 ++--
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index acd60fbf1a5a..2a16631c1fda 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -566,7 +566,8 @@ struct vm_operations_struct {
vm_fault_t (*pfn_mkwrite)(struct vm_fault *vmf);
 
/* called by access_process_vm when get_user_pages() fails, typically
-* for use by special VMAs that can switch between memory and hardware
+* for use by special VMAs. See also generic_access_phys() for a generic
+* implementation useful for any iomem mapping.
 */
int (*access)(struct vm_area_struct *vma, unsigned long addr,
  void *buf, int len, int write);
diff --git a/mm/memory.c b/mm/memory.c
index fcfc4ca36eba..8d467e23b44e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4873,28 +4873,68 @@ int follow_phys(struct vm_area_struct *vma,
return ret;
 }
 
+/**
+ * generic_access_phys - generic implementation for iomem mmap access
+ * @vma: the vma to access
+ * @addr: userspace address, not relative offset within @vma
+ * @buf: buffer to read/write
+ * @len: length of transfer
+ * @write: set to FOLL_WRITE when writing, otherwise reading
+ *
+ * This is a generic implementation for &vm_operations_struct.access for an
+ * iomem mapping. This callback is used by access_process_vm() when the @vma is
+ * not page based.
+ */
 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
void *buf, int len, int write)
 {
resource_size_t phys_addr;
unsigned long prot = 0;
void __iomem *maddr;
+   pte_t *ptep, pte;
+   spinlock_t *ptl;
int offset = addr & (PAGE_SIZE-1);
+   int ret = -EINVAL;
+
+   if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+   return -EINVAL;
+
+retry:
+   if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+   return -EINVAL;
+   pte = *ptep;
+   pte_unmap_unlock(ptep, ptl);
 
-   if (follow_phys(vma, addr, write, &prot, &phys_addr))
+   prot = pgprot_val(pte_pgprot(pte));
+   phys_addr = (resource_size_t)pte_pfn(pte) << PAGE_SHIFT;
+
+   if ((write & FOLL_WRITE) && !pte_write(pte))
return -EINVAL;
 
maddr = ioremap_prot(phys_addr, PAGE_ALIGN(len + offset), prot);
if (!maddr)
return -ENOMEM;
 
+   if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+   goto out_unmap;
+
+   if (!pte_same(pte, *ptep)) {
+   pte_unmap_unlock(ptep, ptl);
+   iounmap(maddr);
+
+   goto retry;
+   }
+
if (write)
memcpy_toio(maddr + offset, buf, len);
else
memcpy_fromio(buf, maddr + offset, len);
+   ret = len;
+   pte_unmap_unlock(ptep, ptl);
+out_unmap:
iounmap(maddr);
 
-   return len;
+   return ret;
 }
 EXPORT_SYMBOL_GPL(generic_access_phys);
 #endif
-- 
2.28.0



[PATCH 03/13] misc/habana: Stop using frame_vector helpers

2020-10-07 Thread Daniel Vetter
All we need is a pages array, which pin_user_pages_fast can give us
directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Oded Gabbay 
Cc: Omer Shpigelman 
Cc: Ofir Bitton 
Cc: Tomer Tayar 
Cc: Moti Haimovski 
Cc: Daniel Vetter 
Cc: Greg Kroah-Hartman 
Cc: Pawel Piskorski 
---
 drivers/misc/habanalabs/Kconfig |  1 -
 drivers/misc/habanalabs/common/habanalabs.h |  3 +-
 drivers/misc/habanalabs/common/memory.c | 51 +
 3 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig
index 8eb5d38c618e..2f04187f7167 100644
--- a/drivers/misc/habanalabs/Kconfig
+++ b/drivers/misc/habanalabs/Kconfig
@@ -6,7 +6,6 @@
 config HABANA_AI
tristate "HabanaAI accelerators (habanalabs)"
depends on PCI && HAS_IOMEM
-   select FRAME_VECTOR
select DMA_SHARED_BUFFER
select GENERIC_ALLOCATOR
select HWMON
diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h
index edbd627b29d2..c1b3ad613b15 100644
--- a/drivers/misc/habanalabs/common/habanalabs.h
+++ b/drivers/misc/habanalabs/common/habanalabs.h
@@ -881,7 +881,8 @@ struct hl_ctx_mgr {
 struct hl_userptr {
enum vm_type_t  vm_type; /* must be first */
struct list_headjob_node;
-   struct frame_vector *vec;
+   struct page **pages;
+   unsigned intnpages;
struct sg_table *sgt;
enum dma_data_direction dir;
struct list_headdebugfs_list;
diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c
index 5ff4688683fd..ef89cfa2f95a 100644
--- a/drivers/misc/habanalabs/common/memory.c
+++ b/drivers/misc/habanalabs/common/memory.c
@@ -1281,45 +1281,41 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size,
return -EFAULT;
}
 
-   userptr->vec = frame_vector_create(npages);
-   if (!userptr->vec) {
+   userptr->pages = kvmalloc_array(npages, sizeof(*userptr->pages),
+   GFP_KERNEL);
+   if (!userptr->pages) {
dev_err(hdev->dev, "Failed to create frame vector\n");
return -ENOMEM;
}
 
-   rc = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
-   userptr->vec);
+   rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
+userptr->pages);
 
if (rc != npages) {
dev_err(hdev->dev,
"Failed to map host memory, user ptr probably wrong\n");
if (rc < 0)
-   goto destroy_framevec;
+   goto destroy_pages;
+   npages = rc;
rc = -EFAULT;
-   goto put_framevec;
-   }
-
-   if (frame_vector_to_pages(userptr->vec) < 0) {
-   dev_err(hdev->dev,
-   "Failed to translate frame vector to pages\n");
-   rc = -EFAULT;
-   goto put_framevec;
+   goto put_pages;
}
+   userptr->npages = npages;
 
rc = sg_alloc_table_from_pages(userptr->sgt,
-   frame_vector_pages(userptr->vec),
-   npages, offset, size, GFP_ATOMIC);
+  userptr->pages,
+  npages, offset, size, GFP_ATOMIC);
if (rc < 0) {
dev_err(hdev->dev, "failed to create SG table from pages\n");
-   goto put_framevec;
+   goto put_pages;
}
 
return 0;
 
-put_framevec:
-   put_vaddr_frames(userptr->vec);
-destroy_framevec:
-   frame_vector_destroy(userptr->vec);
+put_pages:
+   unpin_user_pages(userptr->pages, npages);
+destroy_pages:
+   kvfree(userptr->pages);
return rc;
 }
 
@@ -1405,7 +1401,7 @@ int hl_pin_host_memory(struct hl_device *hdev, u64 addr, u64 size,
  */
 void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr)
 {
-   struct page **pages;
+   int i;
 
hl_debugfs_remove_userptr(hdev, userptr);
 
@@ -1414,15 +1410,10 @@ void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr)
userptr->sgt->nents,
userptr->dir);
 
-   pages = frame_vector_pages(userptr->vec);
-   if (!IS_ERR(pages)) {
-   int i;
-
-   for (i = 0; i < frame_vector_count(userptr->v

[PATCH 13/13] vfio/type1: Mark follow_pfn as unsafe

2020-10-07 Thread Daniel Vetter
The code seems to stuff these pfns into iommu pts (or something like
that, I didn't follow), but there's no mmu_notifier to ensure that
access is synchronized with pte updates.

Hence mark these as unsafe. This means that with
CONFIG_STRICT_FOLLOW_PFN, these will be rejected.

Real fix is to wire up an mmu_notifier ... somehow. Probably means any
invalidate is a fatal fault for this vfio device, but then this
shouldn't ever happen if userspace is reasonable.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Kees Cook 
Cc: Dan Williams 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Alex Williamson 
Cc: Cornelia Huck 
Cc: k...@vger.kernel.org
---
 drivers/vfio/vfio_iommu_type1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 5fbf0c1f7433..a4d53f3d0a35 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -421,7 +421,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm,
 {
int ret;
 
-   ret = follow_pfn(vma, vaddr, pfn);
+   ret = unsafe_follow_pfn(vma, vaddr, pfn);
if (ret) {
bool unlocked = false;
 
@@ -435,7 +435,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, 
struct mm_struct *mm,
if (ret)
return ret;
 
-   ret = follow_pfn(vma, vaddr, pfn);
+   ret = unsafe_follow_pfn(vma, vaddr, pfn);
}
 
return ret;
-- 
2.28.0



[PATCH 04/13] misc/habana: Use FOLL_LONGTERM for userptr

2020-10-07 Thread Daniel Vetter
These are persistent, not just for the duration of a dma operation.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Oded Gabbay 
Cc: Omer Shpigelman 
Cc: Ofir Bitton 
Cc: Tomer Tayar 
Cc: Moti Haimovski 
Cc: Daniel Vetter 
Cc: Greg Kroah-Hartman 
Cc: Pawel Piskorski 
---
 drivers/misc/habanalabs/common/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/habanalabs/common/memory.c 
b/drivers/misc/habanalabs/common/memory.c
index ef89cfa2f95a..94bef8faa82a 100644
--- a/drivers/misc/habanalabs/common/memory.c
+++ b/drivers/misc/habanalabs/common/memory.c
@@ -1288,7 +1288,8 @@ static int get_user_memory(struct hl_device *hdev, u64 
addr, u64 size,
return -ENOMEM;
}
 
-   rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
+   rc = pin_user_pages_fast(start, npages,
+FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
 userptr->pages);
 
if (rc != npages) {
-- 
2.28.0



[PATCH 11/13] mm: add unsafe_follow_pfn

2020-10-07 Thread Daniel Vetter
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
cma regions. This means if we miss the unmap the pfn might contain
pagecache or anon memory (well anything allocated with GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea.

Unfortunately there are some users where this is not fixable (like v4l
userptr of iomem mappings) or involves a pile of work (vfio type1
iommu). For now annotate these as unsafe and splat appropriately.

This patch adds an unsafe_follow_pfn, which later patches will then
roll out to all appropriate places.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Kees Cook 
Cc: Dan Williams 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: k...@vger.kernel.org
---
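Note for readers of the archive: the gating that unsafe_follow_pfn() adds
can be tried out in userspace with the rough sketch below. Everything
prefixed sketch_ is made up for illustration and is not kernel API. The
patch picks the branch at build time with CONFIG_STRICT_FOLLOW_PFN; the
sketch uses a runtime flag only so that both branches can be exercised in
one program, and it assumes an identity-style lookup in place of the real
page table walk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for follow_pfn(): pfn is simply address >> 12. */
static int sketch_follow_pfn(unsigned long address, unsigned long *pfn)
{
	*pfn = address >> SKETCH_PAGE_SHIFT;
	return 0;
}

/* Models the effect of add_taint(): once set, it stays set. */
static bool sketch_tainted;

/*
 * Models unsafe_follow_pfn(). In strict mode the lookup is refused and
 * *pfn is left untouched; otherwise it warns once, taints, and forwards
 * to the ordinary lookup.
 */
static int sketch_unsafe_follow_pfn(bool strict, unsigned long address,
				    unsigned long *pfn)
{
	static bool warned;

	assert(pfn != NULL);

	if (strict) {
		fprintf(stderr, "unsafe follow_pfn usage rejected\n");
		return -EINVAL;
	}

	if (!warned) {		/* WARN_ONCE(): report only the first hit */
		warned = true;
		fprintf(stderr, "unsafe follow_pfn usage\n");
	}
	sketch_tainted = true;

	return sketch_follow_pfn(address, pfn);
}
```

Calling sketch_unsafe_follow_pfn(true, ...) models a kernel built with the
strict option, and sketch_unsafe_follow_pfn(false, ...) models the
warn-and-taint fallback.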
 include/linux/mm.h |  2 ++
 mm/memory.c| 32 +++-
 mm/nommu.c | 17 +
 security/Kconfig   | 13 +
 4 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2a16631c1fda..ec8c90928fc9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1653,6 +1653,8 @@ int follow_pte_pmd(struct mm_struct *mm, unsigned long 
address,
   pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
unsigned long *pfn);
+int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
+ unsigned long *pfn);
 int follow_phys(struct vm_area_struct *vma, unsigned long address,
unsigned int flags, unsigned long *prot, resource_size_t *phys);
 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 8d467e23b44e..8db7ad1c261c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4821,7 +4821,12 @@ EXPORT_SYMBOL(follow_pte_pmd);
  * @address: user virtual address
  * @pfn: location to store found PFN
  *
- * Only IO mappings and raw PFN mappings are allowed.
+ * Only IO mappings and raw PFN mappings are allowed. Note that callers must
+ * ensure coherency with pte updates by using a &mmu_notifier to follow updates.
+ * If this is not feasible, or the access to the @pfn is only very short term,
+ * use follow_pte_pmd() instead and hold the pagetable lock for the duration of
+ * the access instead. Any caller not following these requirements must use
+ * unsafe_follow_pfn() instead.
  *
  * Return: zero and the pfn at @pfn on success, -ve otherwise.
  */
@@ -4844,6 +4849,31 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long 
address,
 }
 EXPORT_SYMBOL(follow_pfn);
 
+/**
+ * unsafe_follow_pfn - look up PFN at a user virtual address
+ * @vma: memory mapping
+ * @address: user virtual address
+ * @pfn: location to store found PFN
+ *
+ * Only IO mappings and raw PFN mappings are allowed.
+ *
+ * Returns zero and the pfn at @pfn on success, -ve otherwise.
+ */
+int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
+   unsigned long *pfn)
+{
+#ifdef CONFIG_STRICT_FOLLOW_PFN
+   pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n");
+   return -EINVAL;
+#else
+   WARN_ONCE(1, "unsafe follow_pfn usage\n");
+   add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+
+   return follow_pfn(vma, address, pfn);
+#endif
+}
+EXPORT_SYMBOL(unsafe_follow_pfn);
+
 #ifdef CONFIG_HAVE_IOREMAP_PROT
 int follow_phys(struct vm_area_struct *vma,
unsigned long address, unsigned int flags,
diff --git a/mm/nommu.c b/mm/nommu.c
index 75a327149af1..3db2910f0d64 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -132,6 +132,23 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long 
address,
 }
 EXPORT_SYMBOL(follow_pfn);
 
+/**
+ * unsafe_follow_pfn - look up PFN at a user virtual address
+ * @vma: memory mapping
+ * @address: user virtual address
+ * @pfn: location to store found PFN
+ *
+ * Only IO mappings and raw PFN mappings are allowed.
+ *
+ * Returns zero and the pfn at @pfn on success, -ve otherwise.
+ */
+int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address,
+   unsigned long *pfn)
+{
+   return follow_pfn(vma, address, pfn);
+}
+EXPORT_SYMBOL(unsafe_follow_pfn);
+
 LIST_HEAD(vmap_area_list);
 
 void vfree(const void *addr)
diff --git a/security/Kconfig b/security/Kconfig
inde

[PATCH 10/13] PCI: revoke mappings like devmem

2020-10-07 Thread Daniel Vetter
Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
the region") /dev/mem zaps ptes when the kernel requests exclusive
access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
the default for all driver uses.

Except there are two more ways to access pci bars: sysfs and proc mmap
support. Let's plug that hole.

For revoke_devmem() to work we need to link our vma into the same
address_space, with consistent vma->vm_pgoff. ->pgoff is already
adjusted, because that's how (io_)remap_pfn_range works, but for the
mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
at ->open time, but that's a bit tricky here with all the entry points
and arch code. So instead create a fake file and adjust vma->vm_file.

Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that
seems to be a subset of the architectures that support STRICT_DEVMEM, so
we should be good.

The only difference in access checks left is that sysfs pci mmap does
not check for CAP_RAWIO. But I think that makes some sense compared to
/dev/mem and proc, where one file gives you access to everything and
no ownership applies.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Kees Cook 
Cc: Dan Williams 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Bjorn Helgaas 
Cc: linux-...@vger.kernel.org
---
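Note for readers of the archive: the reason the commit message insists on a
consistent vma->vm_pgoff can be illustrated with the userspace sketch
below. The sketch_ names are hypothetical. It assumes, as the diff
suggests, that the revoke zaps file offsets starting at res->start for
resource_size(res) bytes, so a mapping is only reached if its own file
offset range equals the physical range it maps. It ignores the second
requirement, that the vma hangs off the same address_space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12

/* Hypothetical, stripped-down vma: file offset and length, both in pages. */
struct sketch_vma {
	uint64_t pgoff;
	uint64_t npages;
};

/* Mirrors "vma->vm_pgoff += pci_resource_start(pdev, bar) >> PAGE_SHIFT". */
static void sketch_adjust_pgoff(struct sketch_vma *vma, uint64_t bar_start)
{
	/* BAR start is assumed to be page aligned in this sketch. */
	assert((bar_start & ((1ULL << SKETCH_PAGE_SHIFT) - 1)) == 0);
	vma->pgoff += bar_start >> SKETCH_PAGE_SHIFT;
}

/*
 * True if zapping file offsets [res_start, res_start + res_size) would
 * reach any page of this mapping.
 */
static bool sketch_revoke_hits(const struct sketch_vma *vma,
			       uint64_t res_start, uint64_t res_size)
{
	uint64_t map_start = vma->pgoff << SKETCH_PAGE_SHIFT;
	uint64_t map_end = map_start + (vma->npages << SKETCH_PAGE_SHIFT);
	uint64_t res_end = res_start + res_size;

	return map_start < res_end && res_start < map_end;
}
```

With a BAR-relative pgoff of 0 the revoke range misses the mapping; after
sketch_adjust_pgoff() the two ranges line up and the mapping is found.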
 drivers/char/mem.c | 16 +++-
 drivers/pci/mmap.c |  3 +++
 include/linux/ioport.h |  2 ++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index abd4ffdc8cde..5e58a326d4ee 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t 
offset, int orig)
 }
 
 static struct inode *devmem_inode;
+static struct vfsmount *devmem_vfs_mount;
 
 #ifdef CONFIG_IO_STRICT_DEVMEM
 void revoke_devmem(struct resource *res)
@@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res)
 
unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 
1);
 }
+
+struct file *devmem_getfile(void)
+{
+   struct file *file;
+
+   file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem",
+O_RDWR, &kmem_fops);
+   if (IS_ERR(file))
+   return NULL;
+
+   file->f_mapping = devmem_inode->i_mapping;
+
+   return file;
+}
 #endif
 
 static int open_port(struct inode *inode, struct file *filp)
@@ -1010,7 +1025,6 @@ static struct file_system_type devmem_fs_type = {
 
 static int devmem_init_inode(void)
 {
-   static struct vfsmount *devmem_vfs_mount;
static int devmem_fs_cnt;
struct inode *inode;
int rc;
diff --git a/drivers/pci/mmap.c b/drivers/pci/mmap.c
index b8c9011987f4..63786cc9c746 100644
--- a/drivers/pci/mmap.c
+++ b/drivers/pci/mmap.c
@@ -7,6 +7,7 @@
  * Author: David Woodhouse 
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -64,6 +65,8 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
vma->vm_pgoff += (pci_resource_start(pdev, bar) >> PAGE_SHIFT);
 
vma->vm_ops = &pci_phys_vm_ops;
+   fput(vma->vm_file);
+   vma->vm_file = devmem_getfile();
 
return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
  vma->vm_end - vma->vm_start,
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 6c2b06fe8beb..83238cba19fe 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct resource 
*base,
 
 #ifdef CONFIG_IO_STRICT_DEVMEM
 void revoke_devmem(struct resource *res);
+struct file *devmem_getfile(void);
 #else
 static inline void revoke_devmem(struct resource *res) { };
+static inline struct file *devmem_getfile(void) { return NULL; };
 #endif
 
 #endif /* __ASSEMBLY__ */
-- 
2.28.0



[PATCH 06/13] media: videobuf2: Move frame_vector into media subsystem

2020-10-07 Thread Daniel Vetter
It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR
symbol from all over the tree (well just one place, somehow omap media
driver still had this in its Kconfig, despite not using it).

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Pawel Osciak 
Cc: Marek Szyprowski 
Cc: Kyungmin Park 
Cc: Tomasz Figa 
Cc: Mauro Carvalho Chehab 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Daniel Vetter 
---
 drivers/media/common/videobuf2/Kconfig|  1 -
 drivers/media/common/videobuf2/Makefile   |  1 +
 .../media/common/videobuf2}/frame_vector.c|  2 +
 drivers/media/platform/omap/Kconfig   |  1 -
 include/linux/mm.h| 42 ---
 include/media/videobuf2-core.h| 42 +++
 mm/Kconfig|  3 --
 mm/Makefile   |  1 -
 8 files changed, 45 insertions(+), 48 deletions(-)
 rename {mm => drivers/media/common/videobuf2}/frame_vector.c (99%)

diff --git a/drivers/media/common/videobuf2/Kconfig 
b/drivers/media/common/videobuf2/Kconfig
index edbc99ebba87..d2223a12c95f 100644
--- a/drivers/media/common/videobuf2/Kconfig
+++ b/drivers/media/common/videobuf2/Kconfig
@@ -9,7 +9,6 @@ config VIDEOBUF2_V4L2
 
 config VIDEOBUF2_MEMOPS
tristate
-   select FRAME_VECTOR
 
 config VIDEOBUF2_DMA_CONTIG
tristate
diff --git a/drivers/media/common/videobuf2/Makefile 
b/drivers/media/common/videobuf2/Makefile
index 77bebe8b202f..54306f8d096c 100644
--- a/drivers/media/common/videobuf2/Makefile
+++ b/drivers/media/common/videobuf2/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 videobuf2-common-objs := videobuf2-core.o
+videobuf2-common-objs += frame_vector.o
 
 ifeq ($(CONFIG_TRACEPOINTS),y)
   videobuf2-common-objs += vb2-trace.o
diff --git a/mm/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
similarity index 99%
rename from mm/frame_vector.c
rename to drivers/media/common/videobuf2/frame_vector.c
index 39db520a51dc..b95f4f371681 100644
--- a/mm/frame_vector.c
+++ b/drivers/media/common/videobuf2/frame_vector.c
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include 
+
 /**
  * get_vaddr_frames() - map virtual addresses to pfns
  * @start: starting user address
diff --git a/drivers/media/platform/omap/Kconfig 
b/drivers/media/platform/omap/Kconfig
index f73b5893220d..de16de46c0f4 100644
--- a/drivers/media/platform/omap/Kconfig
+++ b/drivers/media/platform/omap/Kconfig
@@ -12,6 +12,5 @@ config VIDEO_OMAP2_VOUT
depends on VIDEO_V4L2
select VIDEOBUF2_DMA_CONTIG
select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3
-   select FRAME_VECTOR
help
  V4L2 Display driver support for OMAP2/3 based boards.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 16b799a0522c..acd60fbf1a5a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1743,48 +1743,6 @@ int account_locked_vm(struct mm_struct *mm, unsigned 
long pages, bool inc);
 int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
struct task_struct *task, bool bypass_rlim);
 
-/* Container for pinned pfns / pages */
-struct frame_vector {
-   unsigned int nr_allocated;  /* Number of frames we have space for */
-   unsigned int nr_frames; /* Number of frames stored in ptrs array */
-   bool got_ref;   /* Did we pin pages by getting page ref? */
-   bool is_pfns;   /* Does array contain pages or pfns? */
-   void *ptrs[];   /* Array of pinned pfns / pages. Use
-* pfns_vector_pages() or pfns_vector_pfns()
-* for access */
-};
-
-struct frame_vector *frame_vector_create(unsigned int nr_frames);
-void frame_vector_destroy(struct frame_vector *vec);
-int get_vaddr_frames(unsigned long start, unsigned int nr_pfns,
-unsigned int gup_flags, struct frame_vector *vec);
-void put_vaddr_frames(struct frame_vector *vec);
-int frame_vector_to_pages(struct frame_vector *vec);
-void frame_vector_to_pfns(struct frame_vector *vec);
-
-static inline unsigned int frame_vector_count(struct frame_vector *vec)
-{
-   return vec->nr_frames;
-}
-
-static inline struct page **frame_vector_pages(struct frame_vector *vec)
-{
-   if (vec->is_pfns) {
-   int err = frame_vector_to_pages(vec);
-
-   if (err)
-   return ERR_PTR(err);
-   }
-   return (struct page **)(vec->ptrs);
-}
-
-static inline unsigned long *frame_vector_pfns(struct frame_vector *vec)
-{
-   if (!vec->is_pfns)
-   frame_vector_to_pfns(vec);
-   return (unsigned long *)(vec->ptrs);
-}
-
 struct kvec;
 int get_kernel_pages(const struct kvec *iov, int nr_pages, i

[PATCH 01/13] drm/exynos: Stop using frame_vector helpers

2020-10-07 Thread Daniel Vetter
All we need is a pages array, and pin_user_pages_fast can give us that
directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.

Signed-off-by: Daniel Vetter 
Cc: Jason Gunthorpe 
Cc: Inki Dae 
Cc: Joonyoung Shim 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Kukjin Kim 
Cc: Krzysztof Kozlowski 
Cc: Andrew Morton 
Cc: John Hubbard 
Cc: Jérôme Glisse 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
---
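Note for readers of the archive: the page-count arithmetic and the
partial-pin cleanup in the diff below can be modelled in userspace roughly
as follows. All sketch_ names are hypothetical, the page size is assumed to
be 4 KiB, and the "start = userptr & PAGE_MASK" step is an assumption since
that line is not part of the quoted hunk. sketch_pin() stands in for
pin_user_pages_fast() and is not its real signature.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

struct sketch_span {
	unsigned long start;	/* page-aligned start address */
	unsigned long offset;	/* offset of userptr within the first page */
	unsigned long npages;	/* pages touched by [userptr, userptr + size) */
};

static struct sketch_span sketch_span_of(unsigned long userptr,
					 unsigned long size)
{
	struct sketch_span s;
	unsigned long end;

	s.start = userptr & SKETCH_PAGE_MASK;
	s.offset = userptr & ~SKETCH_PAGE_MASK;
	/* Equivalent of PAGE_ALIGN(userptr + size): round up to a page. */
	end = (userptr + size + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
	s.npages = (end - s.start) >> SKETCH_PAGE_SHIFT;
	return s;
}

static long sketch_pinned;	/* pages currently pinned */

/* Pins at most `available` pages; a negative `available` means failure. */
static long sketch_pin(long npages, long available)
{
	long got;

	if (available < 0)
		return -EFAULT;
	got = npages < available ? npages : available;
	sketch_pinned += got;
	return got;
}

static void sketch_unpin(long npages)
{
	assert(npages <= sketch_pinned);
	sketch_pinned -= npages;
}

/* Returns 0 with all pages pinned, or a negative errno with none pinned. */
static int sketch_pin_all(long npages, long available)
{
	long ret = sketch_pin(npages, available);

	if (ret < 0)
		return (int)ret;	/* nothing was pinned */
	if (ret != npages) {
		sketch_unpin(ret);	/* drop only the pages we did get */
		return -EFAULT;
	}
	return 0;
}
```

The point of the npages = ret step in the patch is the same as in
sketch_pin_all(): on a short pin, only the pages actually returned may be
unpinned.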
 drivers/gpu/drm/exynos/Kconfig  |  1 -
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 -
 2 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
index 6417f374b923..43257ef3c09d 100644
--- a/drivers/gpu/drm/exynos/Kconfig
+++ b/drivers/gpu/drm/exynos/Kconfig
@@ -88,7 +88,6 @@ comment "Sub-drivers"
 config DRM_EXYNOS_G2D
bool "G2D"
depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST
-   select FRAME_VECTOR
help
  Choose this option if you want to use Exynos G2D for DRM.
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c 
b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index 967a5cdc120e..c83f6faac9de 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr {
dma_addr_t  dma_addr;
unsigned long   userptr;
unsigned long   size;
-   struct frame_vector *vec;
+   struct page **pages;
+   unsigned intnpages;
struct sg_table *sgt;
atomic_trefcount;
boolin_pool;
@@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
bool force)
 {
struct g2d_cmdlist_userptr *g2d_userptr = obj;
-   struct page **pages;
+   int i;
 
if (!obj)
return;
@@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
  DMA_BIDIRECTIONAL, 0);
 
-   pages = frame_vector_pages(g2d_userptr->vec);
-   if (!IS_ERR(pages)) {
-   int i;
+   for (i = 0; i < g2d_userptr->npages; i++)
+   set_page_dirty_lock(g2d_userptr->pages[i]);
 
-   for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
-   set_page_dirty_lock(pages[i]);
-   }
-   put_vaddr_frames(g2d_userptr->vec);
-   frame_vector_destroy(g2d_userptr->vec);
+   unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
+   kvfree(g2d_userptr->pages);
 
if (!g2d_userptr->out_of_list)
list_del_init(&g2d_userptr->list);
@@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
g2d_data *g2d,
offset = userptr & ~PAGE_MASK;
end = PAGE_ALIGN(userptr + size);
npages = (end - start) >> PAGE_SHIFT;
-   g2d_userptr->vec = frame_vector_create(npages);
-   if (!g2d_userptr->vec) {
+   g2d_userptr->pages = kvmalloc_array(npages, sizeof(*g2d_userptr->pages),
+   GFP_KERNEL);
+   if (!g2d_userptr->pages) {
ret = -ENOMEM;
goto err_free;
}
 
-   ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
-   g2d_userptr->vec);
+   ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
+ g2d_userptr->pages);
if (ret != npages) {
DRM_DEV_ERROR(g2d->dev,
  "failed to get user pages from userptr.\n");
if (ret < 0)
-   goto err_destroy_framevec;
-   ret = -EFAULT;
-   goto err_put_framevec;
-   }
-   if (frame_vector_to_pages(g2d_userptr->vec) < 0) {
+   goto err_destroy_pages;
+   npages = ret;
ret = -EFAULT;
-   goto err_put_framevec;
+   goto err_unpin_pages;
}
+   g2d_userptr->npages = npages;
 
sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
if (!sgt) {
ret = -ENOMEM;
-   goto err_put_framevec;
+   goto err_unpin_pages;
}
 
ret = sg_alloc_table_from_pages(sgt,
-   frame_vector_pages(g2d_userptr->vec),
+   g2d_userptr->pages,
npages, offset, size, GFP_KERNEL);
if (ret < 0) {
DRM_DEV_ERROR(g2d->dev, "failed to get sgt from pages.\n");
@@ -538,11 +534,11 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct 
g2d_data *g2d,
 err_free_sgt:
kfree(sgt);
 
-err_put_framevec:
-   put_

[PATCH 00/13] follow_pfn and other iomap races

2020-10-07 Thread Daniel Vetter
Hi all,

This developed from a discussion with Jason, starting with some patches
touching get_vaddr_frames that I typed up.

The problem is that way back VM_IO | VM_PFNMAP mappings were pretty
static, and so just following the ptes to derive a pfn and then using that
somewhere else was ok.

But we're no longer in such a world: there are tons of little races and
some fundamental problems.

This series is an attempt to at least scope the problem. It covers all the
issues I've found with quite some code reading all over the tree:
- the first part tries to move mm/frame_vector.c away, since it's
  fundamentally an unsafe thing
- two patches to close follow_pfn races by holding pt locks
- two pci patches where I spotted inconsistencies between the 3 different
  ways userspace can map pci bars
- and finally some patches to mark up the remaining issues

No testing beyond "it compiles", this is very much an rfc to figure out
whether this makes sense, whether it's a real thing, and how to fix this
up properly.

Cheers, Daniel

Daniel Vetter (13):
  drm/exynos: Stop using frame_vector helpers
  drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
  misc/habana: Stop using frame_vector helpers
  misc/habana: Use FOLL_LONGTERM for userptr
  mm/frame-vector: Use FOLL_LONGTERM
  media: videobuf2: Move frame_vector into media subsystem
  mm: close race in generic_access_phys
  s390/pci: Remove races against pte updates
  PCI: obey iomem restrictions for procfs mmap
  PCI: revoke mappings like devmem
  mm: add unsafe_follow_pfn
  media/videbuf1|2: Mark follow_pfn usage as unsafe
  vfio/type1: Mark follow_pfn as unsafe

 arch/s390/pci/pci_mmio.c  | 98 +++
 drivers/char/mem.c| 16 ++-
 drivers/gpu/drm/exynos/Kconfig|  1 -
 drivers/gpu/drm/exynos/exynos_drm_g2d.c   | 49 +-
 drivers/media/common/videobuf2/Kconfig|  1 -
 drivers/media/common/videobuf2/Makefile   |  1 +
 .../media/common/videobuf2}/frame_vector.c| 40 +++-
 drivers/media/platform/omap/Kconfig   |  1 -
 drivers/media/v4l2-core/videobuf-dma-contig.c |  2 +-
 drivers/misc/habanalabs/Kconfig   |  1 -
 drivers/misc/habanalabs/common/habanalabs.h   |  3 +-
 drivers/misc/habanalabs/common/memory.c   | 52 +-
 drivers/pci/mmap.c|  3 +
 drivers/pci/proc.c|  5 +
 drivers/vfio/vfio_iommu_type1.c   |  4 +-
 include/linux/ioport.h|  2 +
 include/linux/mm.h| 47 +
 include/media/videobuf2-core.h| 42 
 mm/Kconfig|  3 -
 mm/Makefile   |  1 -
 mm/memory.c   | 76 +-
 mm/nommu.c| 17 
 security/Kconfig  | 13 +++
 23 files changed, 296 insertions(+), 182 deletions(-)
 rename {mm => drivers/media/common/videobuf2}/frame_vector.c (90%)

-- 
2.28.0
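
Note for readers of the archive: the race described in the cover letter,
where a pfn is derived from the ptes and used after the mapping has moved,
can be pictured with the single-threaded sketch below. It is only an
illustration of the "look up, then revalidate" idea that an invalidation
callback makes possible. The sketch_ names are invented, this is not the
mmu_notifier API, and a real implementation would need locking or memory
barriers that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping whose backing pfn can be changed by its owner. */
struct sketch_mapping {
	unsigned long pfn;
	unsigned long seq;	/* bumped on every invalidation */
};

/* What a caller remembers after following the ptes once. */
struct sketch_snapshot {
	unsigned long pfn;
	unsigned long seq;
};

static struct sketch_snapshot sketch_lookup(const struct sketch_mapping *m)
{
	struct sketch_snapshot s = { m->pfn, m->seq };

	return s;
}

/* The owner moves the buffer: the old pfn may now hold unrelated data. */
static void sketch_invalidate(struct sketch_mapping *m, unsigned long new_pfn)
{
	m->pfn = new_pfn;
	m->seq++;
}

/* A cached pfn may only be used while no invalidation has happened. */
static bool sketch_still_valid(const struct sketch_mapping *m,
			       const struct sketch_snapshot *s)
{
	assert(m != NULL && s != NULL);
	return m->seq == s->seq;
}
```

A caller without any invalidation hook has no equivalent of
sketch_still_valid() and keeps using the stale snapshot, which is the
situation the series marks as unsafe.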



[PATCH 14/14] drm/amd/pm: Replace one-element array with flexible-array in struct ATOM_Vega10_GFXCLK_Dependency_Table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Use a flexible-array member in struct ATOM_Vega10_GFXCLK_Dependency_Table
instead of a one-element array.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7d61dd.o8jxxi5c6p9fob%2fd%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
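Note for readers of the archive: the practical difference between a
one-element array and a flexible-array member shows up in sizeof(). The
sketch below uses made-up record fields and sketch_ names, not the real
ATOM layout, and the exact byte counts depend on the ABI. It shows that
sizeof() of the one-element variant already includes one record, so sizing
code that adds n records on top of it counts one record too many.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; the field names are illustrative only. */
struct sketch_record {
	uint32_t clk;
	uint8_t vdd_index;
};

/* Old style: trailing one-element array. */
struct sketch_table_legacy {
	uint8_t rev_id;
	uint8_t num_entries;
	struct sketch_record entries[1];
};

/* New style: flexible-array member, contributes no elements to sizeof(). */
struct sketch_table_flex {
	uint8_t rev_id;
	uint8_t num_entries;
	struct sketch_record entries[];
};

/* Header plus n records, computed from each declaration. */
static size_t sketch_legacy_bytes(size_t n)
{
	return sizeof(struct sketch_table_legacy) +
	       n * sizeof(struct sketch_record);
}

static size_t sketch_flex_bytes(size_t n)
{
	return sizeof(struct sketch_table_flex) +
	       n * sizeof(struct sketch_record);
}

/* Bytes counted by the legacy sizing beyond what n records need. */
static size_t sketch_overcount(size_t n)
{
	size_t legacy = sketch_legacy_bytes(n);
	size_t flex = sketch_flex_bytes(n);

	assert(legacy >= flex);
	return legacy - flex;
}
```

The overcount is the same for every n, since it comes entirely from the
extra element baked into the legacy declaration.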
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
index c934e9612c1b..a6968009acc4 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
@@ -163,7 +163,7 @@ typedef struct _ATOM_Vega10_MCLK_Dependency_Record {
 typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table {
 UCHAR ucRevId;
 UCHAR ucNumEntries; /* Number of 
entries. */
-ATOM_Vega10_GFXCLK_Dependency_Record entries[1];/* Dynamically 
allocate entries. */
+ATOM_Vega10_GFXCLK_Dependency_Record entries[]; /* Dynamically 
allocate entries. */
 } ATOM_Vega10_GFXCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_MCLK_Dependency_Table {
-- 
2.27.0



[PATCH 13/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_pcie_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_ppt_v1_pcie_table, instead of a one-element array, and use
the struct_size() helper to calculate the size for the allocation.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7db0bc.7xivn4k83f7xw0ug%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
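Note for readers of the archive: a userspace approximation of what
struct_size() computes for a table like this one is sketched below. The
sketch_ names and record fields are invented for illustration, and calloc()
stands in for kzalloc(). The one behaviour taken from the kernel helper is
that the result saturates at SIZE_MAX on overflow, so that the following
allocation fails instead of being silently too small.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record; the field names are illustrative only. */
struct sketch_pcie_record {
	uint8_t gen_speed;
	uint8_t lane_width;
};

struct sketch_pcie_table {
	uint32_t count;				/* number of entries */
	struct sketch_pcie_record entries[];	/* count entries follow */
};

/* Header plus n records, saturating at SIZE_MAX if that would overflow. */
static size_t sketch_struct_size(size_t n)
{
	const size_t hdr = sizeof(struct sketch_pcie_table);
	const size_t rec = sizeof(struct sketch_pcie_record);

	if (n > (SIZE_MAX - hdr) / rec)
		return SIZE_MAX;
	return hdr + n * rec;
}

/* Zeroed table with room for n entries, or NULL on failure. */
static struct sketch_pcie_table *sketch_alloc_table(size_t n)
{
	size_t bytes = sketch_struct_size(n);
	struct sketch_pcie_table *t;

	if (bytes == SIZE_MAX || n > UINT32_MAX)
		return NULL;
	t = calloc(1, bytes);
	if (t)
		t->count = (uint32_t)n;
	return t;
}

/* Bounds-checked access to one entry. */
static struct sketch_pcie_record *
sketch_entry_at(struct sketch_pcie_table *t, size_t i)
{
	assert(i < t->count);
	return &t->entries[i];
}
```

Compared with the open-coded sizeof(uint32_t) + n * sizeof(record), this
form follows the structure declaration and cannot wrap around.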
 .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h|  2 +-
 .../powerplay/hwmgr/process_pptables_v1_0.c   | 22 ---
 .../powerplay/hwmgr/vega10_processpptables.c  | 10 +++--
 3 files changed, 13 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
index e11298cdeb30..729615aff126 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
@@ -103,7 +103,7 @@ typedef struct phm_ppt_v1_pcie_record 
phm_ppt_v1_pcie_record;
 
 struct phm_ppt_v1_pcie_table {
uint32_t count;/* Number of 
entries. */
-   phm_ppt_v1_pcie_record entries[1]; /* 
Dynamically allocate count entries. */
+   phm_ppt_v1_pcie_record entries[];  /* 
Dynamically allocate count entries. */
 };
 typedef struct phm_ppt_v1_pcie_table phm_ppt_v1_pcie_table;
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index 426655b9c678..4fa58614e26a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -478,7 +478,7 @@ static int get_pcie_table(
PPTable_Generic_SubTable_Header const *ptable
)
 {
-   uint32_t table_size, i, pcie_count;
+   uint32_t i, pcie_count;
phm_ppt_v1_pcie_table *pcie_table;
struct phm_ppt_v1_information *pp_table_information =
(struct phm_ppt_v1_information *)(hwmgr->pptable);
@@ -491,12 +491,10 @@ static int get_pcie_table(
PP_ASSERT_WITH_CODE((atom_pcie_table->ucNumEntries != 0),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_pcie_record) * 
atom_pcie_table->ucNumEntries;
-
-   pcie_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (pcie_table == NULL)
+   pcie_table = kzalloc(struct_size(pcie_table, entries,
+atom_pcie_table->ucNumEntries),
+GFP_KERNEL);
+   if (!pcie_table)
return -ENOMEM;
 
/*
@@ -530,12 +528,10 @@ static int get_pcie_table(
PP_ASSERT_WITH_CODE((atom_pcie_table->ucNumEntries != 0),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_pcie_record) * 
atom_pcie_table->ucNumEntries;
-
-   pcie_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (pcie_table == NULL)
+   pcie_table = kzalloc(struct_size(pcie_table, entries,
+atom_pcie_table->ucNumEntries),
+GFP_KERNEL);
+   if (!pcie_table)
return -ENOMEM;
 
/*
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
index 3d7f915381c8..535404de78a2 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
@@ -784,7 +784,7 @@ static int get_pcie_table(struct pp_hwmgr *hwmgr,
struct phm_ppt_v1_pcie_table **vega10_pcie_table,
const Vega10_PPTable_Generic_SubTable_Header *table)
 {
-   uint32_t table_size, i, pcie_count;
+   uint32_t i, pcie_count;
struct phm_ppt_v1_pcie_table *pcie_table;
struct phm_ppt_v2_information *table_info =
(struct phm_ppt_v2_information *)(hwmgr->pptable);
@@ -795,12 +795,8 @@ static int get_pcie_table(struct pp_hwmgr *hwmgr,
"Invalid PowerPlay Table!",
return 0);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(struc

[PATCH 11/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_mm_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_ppt_v1_mm_clock_voltage_dependency_table, instead of a
one-element array, and use the struct_size() helper to calculate the
size for the allocation.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7d61e2.qitvtyg2pvog8bb0%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
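Note for readers of the archive: one reason to derive the allocation size
from the structure itself rather than from sizeof(uint32_t) is padding.
The sketch below is a constructed case with hypothetical sketch_ names and
fields; whether the real records in this patch have such padding is not
something the patch states. Here the record contains a 64-bit field, so on
common 64-bit ABIs the entries start at offset 8 and the hand-written
formula comes up short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with a 64-bit member to force stricter alignment. */
struct sketch_mm_record {
	uint64_t dclk;
	uint32_t vclk;
	uint32_t vddc_ind;
};

struct sketch_mm_table {
	uint32_t count;				/* number of entries */
	struct sketch_mm_record entries[];	/* count entries follow */
};

/* Open-coded sizing that assumes the header is exactly one uint32_t. */
static size_t sketch_naive_bytes(size_t n)
{
	return sizeof(uint32_t) + n * sizeof(struct sketch_mm_record);
}

/* Sizing derived from the declaration, padding included. */
static size_t sketch_layout_bytes(size_t n)
{
	size_t bytes = sizeof(struct sketch_mm_table) +
		       n * sizeof(struct sketch_mm_record);

	/* Must cover everything up to the end of entry n - 1. */
	assert(bytes >= offsetof(struct sketch_mm_table, entries) +
			n * sizeof(struct sketch_mm_record));
	return bytes;
}
```

Where the compiler inserts padding before entries, sketch_naive_bytes()
returns fewer bytes than the table occupies; where it does not, the two
functions agree.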
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h|  2 +-
 .../amd/pm/powerplay/hwmgr/process_pptables_v1_0.c| 11 ---
 .../amd/pm/powerplay/hwmgr/vega10_processpptables.c   |  9 +++--
 3 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
index c167083b0872..923cc04e405a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
@@ -71,7 +71,7 @@ typedef struct phm_ppt_v1_mm_clock_voltage_dependency_record 
phm_ppt_v1_mm_clock
 
 struct phm_ppt_v1_mm_clock_voltage_dependency_table {
uint32_t count; 
/* Number of entries. */
-   phm_ppt_v1_mm_clock_voltage_dependency_record entries[1];   
/* Dynamically allocate count entries. */
+   phm_ppt_v1_mm_clock_voltage_dependency_record entries[];
/* Dynamically allocate count entries. */
 };
 typedef struct phm_ppt_v1_mm_clock_voltage_dependency_table 
phm_ppt_v1_mm_clock_voltage_dependency_table;
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index 0725531fbfff..5d8016cd1986 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -678,19 +678,16 @@ static int get_mm_clock_voltage_table(
const ATOM_Tonga_MM_Dependency_Table * mm_dependency_table
)
 {
-   uint32_t table_size, i;
+   uint32_t i;
const ATOM_Tonga_MM_Dependency_Record *mm_dependency_record;
phm_ppt_v1_mm_clock_voltage_dependency_table *mm_table;
phm_ppt_v1_mm_clock_voltage_dependency_record *mm_table_record;
 
PP_ASSERT_WITH_CODE((0 != mm_dependency_table->ucNumEntries),
"Invalid PowerPlay Table!", return -1);
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_mm_clock_voltage_dependency_record)
-   * mm_dependency_table->ucNumEntries;
-   mm_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == mm_table)
+   mm_table = kzalloc(struct_size(mm_table, entries, 
mm_dependency_table->ucNumEntries),
+  GFP_KERNEL);
+   if (!mm_table)
return -ENOMEM;
 
mm_table->count = mm_dependency_table->ucNumEntries;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c 
b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
index 787b23fa25e7..4f6a73a2cf28 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
@@ -344,18 +344,15 @@ static int get_mm_clock_voltage_table(
phm_ppt_v1_mm_clock_voltage_dependency_table **vega10_mm_table,
const ATOM_Vega10_MM_Dependency_Table *mm_dependency_table)
 {
-   uint32_t table_size, i;
+   uint32_t i;
const ATOM_Vega10_MM_Dependency_Record *mm_dependency_record;
phm_ppt_v1_mm_clock_voltage_dependency_table *mm_table;
 
PP_ASSERT_WITH_CODE((mm_dependency_table->ucNumEntries != 0),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_mm_clock_voltage_dependency_record) *
-   mm_dependency_table->ucNumEntries;
-   mm_table = kzalloc(table_size, GFP_KERNEL);
-
+   mm_table = kzalloc(struct_size(mm_table, entries, mm_dependency_table->ucNumEntries),
+  GFP_KERNEL);
if (!mm_table)
return -ENOMEM;
 
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v2 3/7] dt-bindings: display: mxsfb: Add a bus-width endpoint property

2020-10-07 Thread Rob Herring
On Wed, 07 Oct 2020 04:24:34 +0300, Laurent Pinchart wrote:
> When the PCB routes the display data signals in an unconventional way,
> the output bus width may differ from the bus width of the connected
> panel or encoder. For instance, when a 18-bit RGB panel has its R[5:0],
> G[5:0] and B[5:0] signals connected to LCD_DATA[7:2], LCD_DATA[15:10]
> and LCD_DATA[23:18], the output bus width is 24 instead of 18 when the
> signals are routed to LCD_DATA[5:0], LCD_DATA[11:6] and LCD_DATA[17:12].
> 
> Add a bus-width property to describe this data routing.
> 
> Signed-off-by: Laurent Pinchart 
> ---
> Changes since v1:
> 
> - Fix property name in binding
> ---
>  .../devicetree/bindings/display/fsl,lcdif.yaml   | 12 
>  1 file changed, 12 insertions(+)
> 

Reviewed-by: Rob Herring 


[PATCH 12/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_voltage_lookup_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_ppt_v1_voltage_lookup_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the allocation.
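
For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what the conversion amounts to. It is not the driver code: the record
fields, the lookup_table_* names and the use of calloc() in place of
kzalloc() are assumptions made for illustration, and unlike struct_size()
the size computation below has no overflow check.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record; the real phm_ppt_v1_voltage_lookup_record differs. */
struct lookup_record {
	uint16_t vdd;
	uint16_t cac_low;
};

/* Table whose last member is a C99 flexible array. */
struct lookup_table {
	uint32_t count;
	struct lookup_record entries[];
};

/*
 * Userspace stand-in for
 *	kzalloc(struct_size(table, entries, n), GFP_KERNEL);
 * sizeof(*table) covers the header only, because entries[] adds no storage.
 * Unlike struct_size(), this sketch does not guard against overflow.
 */
static struct lookup_table *lookup_table_alloc(uint32_t n)
{
	struct lookup_table *table;

	table = calloc(1, sizeof(*table) + (size_t)n * sizeof(table->entries[0]));
	if (!table)
		return NULL;

	table->count = n;
	return table;
}

/* Bounds-checked store into the trailing array. */
static void lookup_table_set_vdd(struct lookup_table *table, uint32_t i, uint16_t vdd)
{
	assert(i < table->count);
	table->entries[i].vdd = vdd;
}
```

A caller would allocate with lookup_table_alloc(n), fill entries[0..n-1],
and release the whole table with a single free().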

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7d61df.jwrffnjxgbjskpop%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h |  2 +-
 .../drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 10 +++---
 .../amd/pm/powerplay/hwmgr/vega10_processpptables.c| 10 +++---
 3 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
index 923cc04e405a..e11298cdeb30 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
@@ -86,7 +86,7 @@ typedef struct phm_ppt_v1_voltage_lookup_record phm_ppt_v1_voltage_lookup_record
 
 struct phm_ppt_v1_voltage_lookup_table {
uint32_t count;
-   phm_ppt_v1_voltage_lookup_record entries[1]; /* Dynamically allocate count entries. */
+   phm_ppt_v1_voltage_lookup_record entries[]; /* Dynamically allocate count entries. */
 };
 typedef struct phm_ppt_v1_voltage_lookup_table phm_ppt_v1_voltage_lookup_table;
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index 5d8016cd1986..426655b9c678 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -157,7 +157,7 @@ static int get_vddc_lookup_table(
uint32_t max_levels
)
 {
-   uint32_t table_size, i;
+   uint32_t i;
phm_ppt_v1_voltage_lookup_table *table;
phm_ppt_v1_voltage_lookup_record *record;
ATOM_Tonga_Voltage_Lookup_Record *atom_record;
@@ -165,12 +165,8 @@ static int get_vddc_lookup_table(
PP_ASSERT_WITH_CODE((0 != vddc_lookup_pp_tables->ucNumEntries),
"Invalid CAC Leakage PowerPlay Table!", return 1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_voltage_lookup_record) * max_levels;
-
-   table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == table)
+   table = kzalloc(struct_size(table, entries, max_levels), GFP_KERNEL);
+   if (!table)
return -ENOMEM;
 
table->count = vddc_lookup_pp_tables->ucNumEntries;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
index 4f6a73a2cf28..3d7f915381c8 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
@@ -1040,18 +1040,14 @@ static int get_vddc_lookup_table(
const ATOM_Vega10_Voltage_Lookup_Table *vddc_lookup_pp_tables,
uint32_t max_levels)
 {
-   uint32_t table_size, i;
+   uint32_t i;
phm_ppt_v1_voltage_lookup_table *table;
 
PP_ASSERT_WITH_CODE((vddc_lookup_pp_tables->ucNumEntries != 0),
"Invalid SOC_VDDD Lookup Table!", return 1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(phm_ppt_v1_voltage_lookup_record) * max_levels;
-
-   table = kzalloc(table_size, GFP_KERNEL);
-
-   if (table == NULL)
+   table = kzalloc(struct_size(table, entries, max_levels), GFP_KERNEL);
+   if (!table)
return -ENOMEM;
 
table->count = vddc_lookup_pp_tables->ucNumEntries;
-- 
2.27.0



Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML

2020-10-07 Thread Rob Herring
On Wed, Oct 07, 2020 at 11:00:20AM -0500, Rob Herring wrote:
> On Wed, Oct 07, 2020 at 04:24:32AM +0300, Laurent Pinchart wrote:
> > Convert the mxsfb binding to YAML. The deprecated binding is dropped, as
> > neither the DT sources nor the driver support it anymore. The converted
> > binding is named fsl,lcdif.yaml to match the usual bindings naming
> > scheme.
> > 
> > The compatible strings are messy, and DT sources use different kinds of
> > combination of documented and undocumented values. Keep it simple for
> > now, and update the example to make it valid. Aligning the binding with
> > the existing DT sources will be performed separately.
> > 
> > Signed-off-by: Laurent Pinchart 
> > Reviewed-by: Sam Ravnborg 
> > --
> > Changes since v1:
> > 
> > - Drop unneeded quotes in string
> > - Replace minItems with maxItems in conditional check
> > - Add blank line before ...
> > - Squash the rename in this commit
> > ---
> >  .../bindings/display/fsl,lcdif.yaml   | 116 ++
> >  .../devicetree/bindings/display/mxsfb.txt |  87 -
> >  MAINTAINERS   |   2 +-
> >  3 files changed, 117 insertions(+), 88 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/display/fsl,lcdif.yaml
> >  delete mode 100644 Documentation/devicetree/bindings/display/mxsfb.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/display/fsl,lcdif.yaml b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml
> > new file mode 100644
> > index ..063bb8c58114
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml
> > @@ -0,0 +1,116 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/display/fsl,lcdif.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Freescale/NXP i.MX LCD Interface (LCDIF)
> > +
> > +maintainers:
> > +  - Marek Vasut 
> > +  - Stefan Agner 
> > +
> > +description: |
> > +  (e)LCDIF display controller found in the Freescale/NXP i.MX SoCs.
> > +
> > +properties:
> > +  compatible:
> > +enum:
> > +  - fsl,imx23-lcdif
> > +  - fsl,imx28-lcdif
> > +  - fsl,imx6sx-lcdif
> > +  - fsl,imx8mq-lcdif
> > +
> > +  reg:
> > +maxItems: 1
> > +
> > +  clocks:
> > +items:
> > +  - description: Pixel clock
> > +  - description: Bus clock
> > +  - description: Display AXI clock
> > +minItems: 1
> > +
> > +  clock-names:
> > +items:
> > +  - const: pix
> > +  - const: axi
> > +  - const: disp_axi
> > +minItems: 1
> > +
> > +  interrupts:
> > +maxItems: 1
> > +
> > +  port:
> > +description: The LCDIF output port
> > +type: object
> > +
> > +properties:
> > +  endpoint:
> 
> What happened on the graph binding schema work? I started a meta-schema 
> for it BTW.
> 
> You can drop all the endpoint parts. With that,

NM, I see in patch 3 you need it.

> 
> Reviewed-by: Rob Herring 
> 
> > +type: object
> > +
> > +properties:
> > +  remote-endpoint:
> > +$ref: /schemas/types.yaml#/definitions/phandle
> > +
> > +required:
> > +  - remote-endpoint
> > +
> > +additionalProperties: false
> > +
> > +additionalProperties: false
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - clocks
> > +  - interrupts
> > +  - port
> > +
> > +additionalProperties: false
> > +
> > +allOf:
> > +  - if:
> > +  properties:
> > +compatible:
> > +  contains:
> > +const: fsl,imx6sx-lcdif
> > +then:
> > +  properties:
> > +clocks:
> > +  minItems: 2
> > +  maxItems: 3
> > +clock-names:
> > +  minItems: 2
> > +  maxItems: 3
> > +  required:
> > +- clock-names
> > +else:
> > +  properties:
> > +clocks:
> > +  maxItems: 1
> > +clock-names:
> > +  maxItems: 1
> > +
> > +examples:
> > +  - |
> > +#include 
> > +#include 
> > +
> > +display-controller@222 {
> > +compatible = "fsl,imx6sx-lcdif";
> > +reg = <0x0222 0x4000>;
> > +interrupts = ;
> > +clocks = <&clks IMX6SX_CLK_LCDIF1_PIX>,
> > + <&clks IMX6SX_CLK_LCDIF_APB>,
> > + <&clks IMX6SX_CLK_DISPLAY_AXI>;
> > +clock-names = "pix", "axi", "disp_axi";
> > +
> > +port {
> > +endpoint {
> > +remote-endpoint = <&panel_in>;
> > +};
> > +};
> > +};
> > +
> > +...
> > diff --git a/Documentation/devicetree/bindings/display/mxsfb.txt b/Documentation/devicetree/bindings/display/mxsfb.txt
> > deleted file mode 100644
> > index c985871c46b3..
> > --- a/Documentation/devicetree/bindings/display/mxsfb.txt
> > +++ /dev/null
> > @@ -1,87 +0,0 @@
> > -* Freescale MXS LCD Interface (LCDIF)
> > -
> > -New bindings:
> > -=
> > -

[PATCH 10/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_ppt_v1_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.
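
One thing struct_size() takes care of that a hand-rolled
"sizeof(uint32_t) + n * sizeof(record)" does not is padding between the
count field and the first entry. The sketch below uses a made-up record
with a 64-bit member to show the effect; whether any of the driver's own
records are laid out this way is not something it claims. On ABIs where
uint64_t is 8-byte aligned (x86-64, for example) entries starts at offset
8, so the hand-rolled size comes up short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with a 64-bit member; not one of the driver's types. */
struct wide_record {
	uint64_t clk;
	uint32_t vddc;
};

struct wide_table {
	uint32_t count;
	struct wide_record entries[];
};

/* Hand-rolled size, in the style of the removed table_size computation. */
static size_t handrolled_size(size_t n)
{
	return sizeof(uint32_t) + sizeof(struct wide_record) * n;
}

/* Bytes actually needed to reach the end of entries[n - 1]. */
static size_t required_size(size_t n)
{
	return offsetof(struct wide_table, entries) + sizeof(struct wide_record) * n;
}

/* What struct_size(table, entries, n) evaluates to when nothing overflows. */
static size_t struct_size_like(size_t n)
{
	return sizeof(struct wide_table) + sizeof(struct wide_record) * n;
}

/* Bytes by which the hand-rolled size falls short; 0 when there is no padding. */
static size_t shortfall(size_t n)
{
	assert(required_size(n) >= handrolled_size(n));
	return required_size(n) - handrolled_size(n);
}
```

Because sizeof(struct wide_table) already includes any padding in front of
entries, the struct_size()-style result is never smaller than what is
required.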

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: 
Signed-off-by: Gustavo A. R. Silva 
---
 .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h|  2 +-
 .../powerplay/hwmgr/process_pptables_v1_0.c   | 31 
 .../powerplay/hwmgr/vega10_processpptables.c  | 50 ++-
 3 files changed, 27 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
index c0193e09d58a..c167083b0872 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h
@@ -48,7 +48,7 @@ typedef struct phm_ppt_v1_clock_voltage_dependency_record phm_ppt_v1_clock_volta
 
 struct phm_ppt_v1_clock_voltage_dependency_table {
uint32_t count; /* Number of entries. */
-   phm_ppt_v1_clock_voltage_dependency_record entries[1]; /* Dynamically allocate count entries. */
+   phm_ppt_v1_clock_voltage_dependency_record entries[];  /* Dynamically allocate count entries. */
 };
 
 typedef struct phm_ppt_v1_clock_voltage_dependency_table phm_ppt_v1_clock_voltage_dependency_table;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index 52188f6cd150..0725531fbfff 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -367,7 +367,7 @@ static int get_mclk_voltage_dependency_table(
ATOM_Tonga_MCLK_Dependency_Table const *mclk_dep_table
)
 {
-   uint32_t table_size, i;
+   uint32_t i;
phm_ppt_v1_clock_voltage_dependency_table *mclk_table;
phm_ppt_v1_clock_voltage_dependency_record *mclk_table_record;
ATOM_Tonga_MCLK_Dependency_Record *mclk_dep_record;
@@ -375,12 +375,9 @@ static int get_mclk_voltage_dependency_table(
PP_ASSERT_WITH_CODE((0 != mclk_dep_table->ucNumEntries),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record)
-   * mclk_dep_table->ucNumEntries;
-
-   mclk_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == mclk_table)
+   mclk_table = kzalloc(struct_size(mclk_table, entries, mclk_dep_table->ucNumEntries),
+GFP_KERNEL);
+   if (!mclk_table)
return -ENOMEM;
 
mclk_table->count = (uint32_t)mclk_dep_table->ucNumEntries;
@@ -410,7 +407,7 @@ static int get_sclk_voltage_dependency_table(
PPTable_Generic_SubTable_Header const  *sclk_dep_table
)
 {
-   uint32_t table_size, i;
+   uint32_t i;
phm_ppt_v1_clock_voltage_dependency_table *sclk_table;
phm_ppt_v1_clock_voltage_dependency_record *sclk_table_record;
 
@@ -422,12 +419,9 @@ static int get_sclk_voltage_dependency_table(
PP_ASSERT_WITH_CODE((0 != tonga_table->ucNumEntries),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record)
-   * tonga_table->ucNumEntries;
-
-   sclk_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == sclk_table)
+   sclk_table = kzalloc(struct_size(sclk_table, entries, tonga_table->ucNumEntries),
+GFP_KERNEL);
+   if (!sclk_table)
return -ENOMEM;
 
sclk_table->count = (uint32_t)tonga_table->ucNumEntries;
@@ -454,12 +448,9 @@ static int get_sclk_voltage_dependency_table(
PP_ASSERT_WITH_CODE((0 != polaris_table->ucNumEntries),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record)
-   * polaris_table->ucNumEntries;
-
-   sclk_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == sclk_table)
+   sclk_table = kzalloc(struct_size(sclk_table, entries, polaris_table->ucNumEntries),
+GFP_KERNEL);
+ 

Re: [PATCH v2 2/7] dt-bindings: display: mxsfb: Add and fix compatible strings

2020-10-07 Thread Rob Herring
On Wed, 07 Oct 2020 04:24:33 +0300, Laurent Pinchart wrote:
> Additional compatible strings have been added in DT source for the
> i.MX6SL, i.MX6SLL, i.MX6UL and i.MX7D without updating the bindings.
> Most of the upstream DT sources use the fsl,imx28-lcdif compatible
> string, which mostly predates the realization that the LCDIF in the
> i.MX6 and newer SoCs have extra features compared to the i.MX28.
> 
> Update the bindings to add the missing compatible strings, with the
> correct fallback values. This fails to validate some of the upstream DT
> sources. Instead of adding the incorrect compatible fallback to the
> binding, the sources should be updated separately.
> 
> Signed-off-by: Laurent Pinchart 
> Reviewed-by: Sam Ravnborg 
> ---
> Changes since v1:
> 
> - Fix indentation under enum
> ---
>  .../devicetree/bindings/display/fsl,lcdif.yaml | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 

Reviewed-by: Rob Herring 


[PATCH 09/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_samu_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_samu_clock_voltage_dependency_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_samu_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_samu_clock_voltage_dependency_record) instead.
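
To make the saving concrete, the following standalone sketch compares the
two size computations. The record layout and the samu_* names are
placeholders rather than the definitions from hwmgr.h, so the absolute
numbers depend on the ABI; only the relationship between the old and new
sizes is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative record layout; not copied from hwmgr.h. */
struct samu_record {
	uint32_t samclk;
	uint32_t v;
};

/* Old declaration: one-element trailing array. */
struct samu_table_old {
	uint8_t count;
	struct samu_record entries[1];
};

/* New declaration: flexible array member. */
struct samu_table_new {
	uint8_t count;
	struct samu_record entries[];
};

/* Old computation: the whole table type is multiplied by the entry count. */
static size_t old_alloc_size(size_t n)
{
	return sizeof(unsigned long) + sizeof(struct samu_table_old) * n;
}

/* What struct_size(samu_table, entries, n) evaluates to when nothing overflows. */
static size_t new_alloc_size(size_t n)
{
	return sizeof(struct samu_table_new) + sizeof(struct samu_record) * n;
}

/* Heap bytes saved per allocation by the corrected computation. */
static size_t bytes_saved(size_t n)
{
	assert(old_alloc_size(n) >= new_alloc_size(n));
	return old_alloc_size(n) - new_alloc_size(n);
}
```

Each entry was previously charged sizeof(table) instead of sizeof(record),
so the waste grows linearly with the number of entries.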

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c5d3a.rym4gmzr3e0jezy+%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h|  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 7e0c948a7097..dad703ba0522 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -404,7 +404,7 @@ struct phm_samu_clock_voltage_dependency_record {
 
 struct phm_samu_clock_voltage_dependency_table {
uint8_t count;
-   struct phm_samu_clock_voltage_dependency_record entries[1];
+   struct phm_samu_clock_voltage_dependency_record entries[];
 };
 
 struct phm_cac_tdp_table {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index e059802d1e25..48d550d26c6a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1163,15 +1163,12 @@ static int get_samu_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
 struct phm_samu_clock_voltage_dependency_table **ptable,
 const ATOM_PPLIB_SAMClk_Voltage_Limit_Table *table)
 {
-   unsigned long table_size, i;
+   unsigned long i;
struct phm_samu_clock_voltage_dependency_table *samu_table;
 
-   table_size = sizeof(unsigned long) +
-   sizeof(struct phm_samu_clock_voltage_dependency_table) *
-   table->numEntries;
-
-   samu_table = kzalloc(table_size, GFP_KERNEL);
-   if (NULL == samu_table)
+   samu_table = kzalloc(struct_size(samu_table, entries, table->numEntries),
+GFP_KERNEL);
+   if (!samu_table)
return -ENOMEM;
 
samu_table->count = table->numEntries;
-- 
2.27.0



[PATCH 08/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_cac_leakage_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_cac_leakage_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the allocation.

Also, save some heap space as the original code is multiplying
table->ucNumEntries by sizeof(struct phm_cac_leakage_table) when it
should have multiplied it by sizeof(union phm_cac_leakage_record)
instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c5d38.it%2fqtjn+659xudo5%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h  |  2 +-
 .../drm/amd/pm/powerplay/hwmgr/processpptables.c| 13 +
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index b8e33325fac6..7e0c948a7097 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -393,7 +393,7 @@ union phm_cac_leakage_record {
 
 struct phm_cac_leakage_table {
uint32_t count;
-   union phm_cac_leakage_record entries[1];
+   union phm_cac_leakage_record entries[];
 };
 
 struct phm_samu_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index 7719f52e6d52..e059802d1e25 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1384,17 +1384,14 @@ static int get_cac_leakage_table(struct pp_hwmgr *hwmgr,
const ATOM_PPLIB_CAC_Leakage_Table *table)
 {
struct phm_cac_leakage_table  *cac_leakage_table;
-   unsigned longtable_size, i;
+   unsigned long i;
 
-   if (hwmgr == NULL || table == NULL || ptable == NULL)
+   if (!hwmgr || !table || !ptable)
return -EINVAL;
 
-   table_size = sizeof(ULONG) +
-   (sizeof(struct phm_cac_leakage_table) * table->ucNumEntries);
-
-   cac_leakage_table = kzalloc(table_size, GFP_KERNEL);
-
-   if (cac_leakage_table == NULL)
+   cac_leakage_table = kzalloc(struct_size(cac_leakage_table, entries, table->ucNumEntries),
+   GFP_KERNEL);
+   if (!cac_leakage_table)
return -ENOMEM;
 
cac_leakage_table->count = (ULONG)table->ucNumEntries;
-- 
2.27.0



Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML

2020-10-07 Thread Rob Herring
On Wed, Oct 07, 2020 at 04:24:32AM +0300, Laurent Pinchart wrote:
> Convert the mxsfb binding to YAML. The deprecated binding is dropped, as
> neither the DT sources nor the driver support it anymore. The converted
> binding is named fsl,lcdif.yaml to match the usual bindings naming
> scheme.
> 
> The compatible strings are messy, and DT sources use different kinds of
> combination of documented and undocumented values. Keep it simple for
> now, and update the example to make it valid. Aligning the binding with
> the existing DT sources will be performed separately.
> 
> Signed-off-by: Laurent Pinchart 
> Reviewed-by: Sam Ravnborg 
> --
> Changes since v1:
> 
> - Drop unneeded quotes in string
> - Replace minItems with maxItems in conditional check
> - Add blank line before ...
> - Squash the rename in this commit
> ---
>  .../bindings/display/fsl,lcdif.yaml   | 116 ++
>  .../devicetree/bindings/display/mxsfb.txt |  87 -
>  MAINTAINERS   |   2 +-
>  3 files changed, 117 insertions(+), 88 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/display/fsl,lcdif.yaml
>  delete mode 100644 Documentation/devicetree/bindings/display/mxsfb.txt
> 
> diff --git a/Documentation/devicetree/bindings/display/fsl,lcdif.yaml b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml
> new file mode 100644
> index ..063bb8c58114
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml
> @@ -0,0 +1,116 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/display/fsl,lcdif.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Freescale/NXP i.MX LCD Interface (LCDIF)
> +
> +maintainers:
> +  - Marek Vasut 
> +  - Stefan Agner 
> +
> +description: |
> +  (e)LCDIF display controller found in the Freescale/NXP i.MX SoCs.
> +
> +properties:
> +  compatible:
> +enum:
> +  - fsl,imx23-lcdif
> +  - fsl,imx28-lcdif
> +  - fsl,imx6sx-lcdif
> +  - fsl,imx8mq-lcdif
> +
> +  reg:
> +maxItems: 1
> +
> +  clocks:
> +items:
> +  - description: Pixel clock
> +  - description: Bus clock
> +  - description: Display AXI clock
> +minItems: 1
> +
> +  clock-names:
> +items:
> +  - const: pix
> +  - const: axi
> +  - const: disp_axi
> +minItems: 1
> +
> +  interrupts:
> +maxItems: 1
> +
> +  port:
> +description: The LCDIF output port
> +type: object
> +
> +properties:
> +  endpoint:

What happened on the graph binding schema work? I started a meta-schema 
for it BTW.

You can drop all the endpoint parts. With that,

Reviewed-by: Rob Herring 

> +type: object
> +
> +properties:
> +  remote-endpoint:
> +$ref: /schemas/types.yaml#/definitions/phandle
> +
> +required:
> +  - remote-endpoint
> +
> +additionalProperties: false
> +
> +additionalProperties: false
> +
> +required:
> +  - compatible
> +  - reg
> +  - clocks
> +  - interrupts
> +  - port
> +
> +additionalProperties: false
> +
> +allOf:
> +  - if:
> +  properties:
> +compatible:
> +  contains:
> +const: fsl,imx6sx-lcdif
> +then:
> +  properties:
> +clocks:
> +  minItems: 2
> +  maxItems: 3
> +clock-names:
> +  minItems: 2
> +  maxItems: 3
> +  required:
> +- clock-names
> +else:
> +  properties:
> +clocks:
> +  maxItems: 1
> +clock-names:
> +  maxItems: 1
> +
> +examples:
> +  - |
> +#include 
> +#include 
> +
> +display-controller@222 {
> +compatible = "fsl,imx6sx-lcdif";
> +reg = <0x0222 0x4000>;
> +interrupts = ;
> +clocks = <&clks IMX6SX_CLK_LCDIF1_PIX>,
> + <&clks IMX6SX_CLK_LCDIF_APB>,
> + <&clks IMX6SX_CLK_DISPLAY_AXI>;
> +clock-names = "pix", "axi", "disp_axi";
> +
> +port {
> +endpoint {
> +remote-endpoint = <&panel_in>;
> +};
> +};
> +};
> +
> +...
> diff --git a/Documentation/devicetree/bindings/display/mxsfb.txt b/Documentation/devicetree/bindings/display/mxsfb.txt
> deleted file mode 100644
> index c985871c46b3..
> --- a/Documentation/devicetree/bindings/display/mxsfb.txt
> +++ /dev/null
> @@ -1,87 +0,0 @@
> -* Freescale MXS LCD Interface (LCDIF)
> -
> -New bindings:
> -=
> -Required properties:
> -- compatible:Should be "fsl,imx23-lcdif" for i.MX23.
> - Should be "fsl,imx28-lcdif" for i.MX28.
> - Should be "fsl,imx6sx-lcdif" for i.MX6SX.
> - Should be "fsl,imx8mq-lcdif" for i.MX8MQ.
> -- reg:   Address and length of the register set for LCDIF
> -- interrupts:Should contain LCDIF interrupt
> -- clocks:A list of phandle + clock-specifier pa

[PATCH 07/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_vce_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_vce_clock_voltage_dependency_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_vce_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_vce_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c5d35.pjtogs3h9khzk6ws%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h|  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index ad614e32079e..b8e33325fac6 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -186,7 +186,7 @@ struct phm_acpclock_voltage_dependency_table {
 
 struct phm_vce_clock_voltage_dependency_table {
uint8_t count;
-   struct phm_vce_clock_voltage_dependency_record entries[1];
+   struct phm_vce_clock_voltage_dependency_record entries[];
 };
 
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index b2ef76580c6a..7719f52e6d52 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1135,15 +1135,12 @@ static int get_vce_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
const ATOM_PPLIB_VCE_Clock_Voltage_Limit_Table *table,
const VCEClockInfoArray*array)
 {
-   unsigned long table_size, i;
+   unsigned long i;
struct phm_vce_clock_voltage_dependency_table *vce_table = NULL;
 
-   table_size = sizeof(unsigned long) +
-   sizeof(struct phm_vce_clock_voltage_dependency_table)
-   * table->numEntries;
-
-   vce_table = kzalloc(table_size, GFP_KERNEL);
-   if (NULL == vce_table)
+   vce_table = kzalloc(struct_size(vce_table, entries, table->numEntries),
+   GFP_KERNEL);
+   if (!vce_table)
return -ENOMEM;
 
vce_table->count = table->numEntries;
-- 
2.27.0



[PATCH 06/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_phase_shedding_limits_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_phase_shedding_limits_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the allocation.

Also, save some heap space as the original code is multiplying
ptable->ucNumEntries by sizeof(struct phm_phase_shedding_limits_table)
when it should have multiplied it by
sizeof(struct phm_phase_shedding_limits_record) instead.
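
A further property of struct_size() is that it saturates to SIZE_MAX when
the multiplication or addition would overflow, so the allocation fails
rather than silently returning a short buffer. Going by the "uc" prefix,
ucNumEntries is an 8-bit count, so overflow is not a realistic concern for
this table; the userspace sketch below only illustrates the general
behaviour. flex_size() and flex_zalloc() are hypothetical names, not
kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Compute header + elem * n, returning SIZE_MAX if either step would
 * overflow size_t. This mirrors the saturating behaviour of struct_size().
 */
static size_t flex_size(size_t header, size_t elem, size_t n)
{
	size_t bytes;

	if (elem != 0 && n > SIZE_MAX / elem)
		return SIZE_MAX;
	bytes = elem * n;

	if (bytes > SIZE_MAX - header)
		return SIZE_MAX;
	return header + bytes;
}

/* Zeroed allocation that refuses a saturated size instead of truncating it. */
static void *flex_zalloc(size_t header, size_t elem, size_t n)
{
	size_t size = flex_size(header, elem, n);

	if (size == SIZE_MAX)
		return NULL;
	return calloc(1, size);
}
```

In the kernel no explicit SIZE_MAX check is needed at the call site: a
request of that size simply makes kzalloc() return NULL, which the
existing "if (!table) return -ENOMEM;" path already handles.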

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] 
https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c5d36.6pstuzp2hrxaz7im%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h   |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c | 12 
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 361cb1125351..ad614e32079e 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -161,7 +161,7 @@ struct phm_vce_clock_voltage_dependency_record {
 
 struct phm_phase_shedding_limits_table {
uint32_t   count;
-   struct phm_phase_shedding_limits_record  entries[1];
+   struct phm_phase_shedding_limits_record  entries[];
 };
 
 struct phm_vceclock_voltage_dependency_table {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index a1b198045978..b2ef76580c6a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1530,16 +1530,12 @@ static int init_phase_shedding_table(struct pp_hwmgr *hwmgr,
(((unsigned long)powerplay_table4) +
le16_to_cpu(powerplay_table4->usVddcPhaseShedLimitsTableOffset));
struct phm_phase_shedding_limits_table *table;
-   unsigned long size, i;
+   unsigned long i;
 
 
-   size = sizeof(unsigned long) +
-   (sizeof(struct phm_phase_shedding_limits_table) *
-   ptable->ucNumEntries);
-
-   table = kzalloc(size, GFP_KERNEL);
-
-   if (table == NULL)
+   table = kzalloc(struct_size(table, entries, ptable->ucNumEntries),
+   GFP_KERNEL);
+   if (!table)
return -ENOMEM;
 
table->count = (unsigned long)ptable->ucNumEntries;
-- 
2.27.0

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 05/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_acp_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_acp_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_acp_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_acp_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c5d3c.tyfohg%2fa6jycl6zn%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h|  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 2f1886bc5535..361cb1125351 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -150,7 +150,7 @@ struct phm_acp_clock_voltage_dependency_record {
 
 struct phm_acp_clock_voltage_dependency_table {
uint32_t count;
-   struct phm_acp_clock_voltage_dependency_record entries[1];
+   struct phm_acp_clock_voltage_dependency_record entries[];
 };
 
 struct phm_vce_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index 305d95c4162d..a1b198045978 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1194,15 +1194,12 @@ static int get_acp_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
struct phm_acp_clock_voltage_dependency_table **ptable,
const ATOM_PPLIB_ACPClk_Voltage_Limit_Table *table)
 {
-   unsigned table_size, i;
+   unsigned long i;
struct phm_acp_clock_voltage_dependency_table *acp_table;
 
-   table_size = sizeof(unsigned long) +
-   sizeof(struct phm_acp_clock_voltage_dependency_table) *
-   table->numEntries;
-
-   acp_table = kzalloc(table_size, GFP_KERNEL);
-   if (NULL == acp_table)
+   acp_table = kzalloc(struct_size(acp_table, entries, table->numEntries),
+   GFP_KERNEL);
+   if (!acp_table)
return -ENOMEM;
 
acp_table->count = (unsigned long)table->numEntries;
-- 
2.27.0



[PATCH 03/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_clock_array

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_clock_array, instead of a one-element array, and use the
struct_size() helper to calculate the size for the allocation.
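
As a rough userspace illustration of what a struct_size()-style computation
does for this layout, the sketch below sizes and allocates a count-plus-
flexible-array struct. The helper names clock_array_size() and
clock_array_alloc() are hypothetical, and calloc() stands in for kzalloc();
the saturate-to-SIZE_MAX behaviour on overflow follows what the kernel helper
documents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace mirror of the patched layout: count plus flexible array. */
struct clock_array {
    uint32_t count;
    uint32_t values[];
};

/*
 * Hypothetical helper: header size plus n elements, saturating to
 * SIZE_MAX if the arithmetic would overflow.
 */
static size_t clock_array_size(size_t n)
{
    const size_t hdr = sizeof(struct clock_array);
    const size_t elem = sizeof(uint32_t);

    if (n > (SIZE_MAX - hdr) / elem)
        return SIZE_MAX;
    return hdr + n * elem;
}

/* Allocate a zeroed table with room for exactly n values, or NULL. */
static struct clock_array *clock_array_alloc(size_t n)
{
    size_t bytes = clock_array_size(n);
    struct clock_array *t;

    if (bytes == SIZE_MAX || n > UINT32_MAX)
        return NULL;
    assert(bytes >= sizeof(*t)); /* never smaller than the header */

    t = calloc(1, bytes);
    if (!t)
        return NULL;
    t->count = (uint32_t)n;
    return t;
}
```

A caller would do `t = clock_array_alloc(n)`, fill `t->values[0..n-1]`, and
release it with free(); the element type, not a guessed word size, determines
how many bytes each entry costs.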

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c433f.zymd+yuivawihgve%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h|  2 +-
 .../amd/pm/powerplay/hwmgr/process_pptables_v1_0.c| 11 ---
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  |  7 +++
 .../amd/pm/powerplay/hwmgr/vega10_processpptables.c   |  9 +++--
 4 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index d68b547743e6..e84cff09af2d 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -91,7 +91,7 @@ struct phm_set_power_state_input {
 
 struct phm_clock_array {
uint32_t count;
-   uint32_t values[1];
+   uint32_t values[];
 };
 
 struct phm_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
index b760f95e7fa7..52188f6cd150 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c
@@ -318,19 +318,16 @@ static int get_valid_clk(
phm_ppt_v1_clock_voltage_dependency_table const *clk_volt_pp_table
)
 {
-   uint32_t table_size, i;
+   uint32_t i;
struct phm_clock_array *table;
phm_ppt_v1_clock_voltage_dependency_record *dep_record;
 
PP_ASSERT_WITH_CODE((0 != clk_volt_pp_table->count),
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(uint32_t) * clk_volt_pp_table->count;
-
-   table = kzalloc(table_size, GFP_KERNEL);
-
-   if (NULL == table)
+   table = kzalloc(struct_size(table, values, clk_volt_pp_table->count),
+   GFP_KERNEL);
+   if (!table)
return -ENOMEM;
 
table->count = (uint32_t)clk_volt_pp_table->count;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index d94a7d8e0587..d9bed4df6f65 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -404,12 +404,11 @@ static int get_valid_clk(struct pp_hwmgr *hwmgr,
struct phm_clock_array **ptable,
const struct phm_clock_voltage_dependency_table *table)
 {
-   unsigned long table_size, i;
+   unsigned long i;
struct phm_clock_array *clock_table;
 
-   table_size = sizeof(unsigned long) + sizeof(unsigned long) * table->count;
-   clock_table = kzalloc(table_size, GFP_KERNEL);
-   if (NULL == clock_table)
+   clock_table = kzalloc(struct_size(clock_table, values, table->count), GFP_KERNEL);
+   if (!clock_table)
return -ENOMEM;
 
clock_table->count = (unsigned long)table->count;
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
index f29af5ca0aa0..e655c04ccdfb 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c
@@ -875,17 +875,14 @@ static int get_valid_clk(
struct phm_clock_array **clk_table,
const phm_ppt_v1_clock_voltage_dependency_table *clk_volt_pp_table)
 {
-   uint32_t table_size, i;
+   uint32_t i;
struct phm_clock_array *table;
 
PP_ASSERT_WITH_CODE(clk_volt_pp_table->count,
"Invalid PowerPlay Table!", return -1);
 
-   table_size = sizeof(uint32_t) +
-   sizeof(uint32_t) * clk_volt_pp_table->count;
-
-   table = kzalloc(table_size, GFP_KERNEL);
-
+   table = kzalloc(struct_size(table, values, clk_volt_pp_table->count),
+   GFP_KERNEL);
if (!table)
return -ENOMEM;
 
-- 
2.27.0



[PATCH 04/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_uvd_clock_voltage_dependency_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_uvd_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_uvd_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_uvd_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c433e.pxkc6ksn6hn%2fldhj%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h|  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ---
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index e84cff09af2d..2f1886bc5535 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -140,7 +140,7 @@ struct phm_uvd_clock_voltage_dependency_record {
 
 struct phm_uvd_clock_voltage_dependency_table {
uint8_t count;
-   struct phm_uvd_clock_voltage_dependency_record entries[1];
+   struct phm_uvd_clock_voltage_dependency_record entries[];
 };
 
 struct phm_acp_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index d9bed4df6f65..305d95c4162d 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1105,15 +1105,12 @@ static int get_uvd_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
const ATOM_PPLIB_UVD_Clock_Voltage_Limit_Table *table,
const UVDClockInfoArray *array)
 {
-   unsigned long table_size, i;
+   unsigned long i;
struct phm_uvd_clock_voltage_dependency_table *uvd_table;
 
-   table_size = sizeof(unsigned long) +
-sizeof(struct phm_uvd_clock_voltage_dependency_table) *
-table->numEntries;
-
-   uvd_table = kzalloc(table_size, GFP_KERNEL);
-   if (NULL == uvd_table)
+   uvd_table = kzalloc(struct_size(uvd_table, entries, table->numEntries),
+   GFP_KERNEL);
+   if (!uvd_table)
return -ENOMEM;
 
uvd_table->count = table->numEntries;
-- 
2.27.0



Re: [PATCH v2 0/3] drm: commit_work scheduling

2020-10-07 Thread Rob Clark
On Wed, Oct 7, 2020 at 3:36 AM Qais Yousef  wrote:
>
> On 10/06/20 13:04, Rob Clark wrote:
> > On Tue, Oct 6, 2020 at 3:59 AM Qais Yousef  wrote:
> > >
> > > On 10/05/20 16:24, Rob Clark wrote:
> > >
> > > [...]
> > >
> > > > > > RT planning and partitioning is not an easy task for sure. You might want to
> > > > > > consider using affinities too to get stronger guarantees for some tasks and
> > > > > > prevent cross-talking.
> > > >
> > > > There is some cgroup stuff that is pinning SF and some other stuff to
> > > > the small cores, fwiw.. I think the reasoning is that they shouldn't
> > > > be doing anything heavy enough to need the big cores.
> > >
> > > > Ah, so you're on a big.LITTLE type of system. I have done some work which enables
> > > > biasing RT tasks towards big cores and controlling the default boost value if you
> > > > have util_clamp and schedutil enabled. You can use util_clamp in general to
> > > > help with DVFS related response time delays.
> > >
> > > > I haven't done any work to try our best to pick a small core first but fall back
> > > > to big if there's no other alternative.
> > >
> > > > It'd be interesting to know how often you end up on a big core if you remove
> > > > the affinity. The RT scheduler picks the first cpu in the lowest priority mask.
> > > So it should have this bias towards picking smaller cores first if they're
> > > in the lower priority mask (ie: not running higher priority RT tasks).
> >
> > fwiw, the issue I'm looking at is actually at the opposite end of the
> > spectrum, less demanding apps that let cpus throttle down to low
> > OPPs.. which stretches out the time taken at each step in the path
> > towards screen (which seems to improve the odds that we hit priority
> > inversion scenarios with SCHED_FIFO things stomping on important CFS
> > things)
>
> So you do have the problem of RT task preempting an important CFS task.
>
> >
> > There is a *big* difference in # of cpu cycles per frame between
> > highest and lowest OPP..
>
> To combat DVFS related delays, you can use util clamp.
>
> Hopefully this article helps explain it if you didn't come across it before
>
> https://lwn.net/Articles/762043/
>
> You can use sched_setattr() to set SCHED_FLAG_UTIL_CLAMP_MIN for a task. This
> will guarantee that every time this task is running it will appear to have at
> least this utilization value, so the schedutil governor (which must be used for
> this to work) will pick the right performance point (OPP).
>
> The scheduler will try its best to make sure that the task will run on a core
> that meets the minimum requested performance point (hinted by setting
> uclamp_min).
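
For reference, the uclamp_min hint described above could be requested from
userspace roughly as follows. This is only a sketch: the struct and macro
names prefixed demo_/DEMO_ are local, hypothetical copies (the layout and flag
values are taken from the kernel's uapi sched headers), the raw syscall is
used rather than assuming a libc wrapper, and a kernel without uclamp support
is expected to reject the call.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local copy of the kernel's sched_attr layout, renamed to avoid clashes. */
struct demo_sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
    uint32_t sched_util_min;
    uint32_t sched_util_max;
};

/* Same values as SCHED_FLAG_KEEP_POLICY, _KEEP_PARAMS, _UTIL_CLAMP_MIN. */
#define DEMO_FLAG_KEEP_POLICY    0x08
#define DEMO_FLAG_KEEP_PARAMS    0x10
#define DEMO_FLAG_UTIL_CLAMP_MIN 0x20

/* Build an attr that changes only the minimum utilization clamp. */
static struct demo_sched_attr make_uclamp_min_attr(uint32_t util_min)
{
    struct demo_sched_attr attr;

    assert(util_min <= 1024); /* clamp values are on a 0..1024 scale */
    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_flags = DEMO_FLAG_KEEP_POLICY | DEMO_FLAG_KEEP_PARAMS |
                       DEMO_FLAG_UTIL_CLAMP_MIN;
    attr.sched_util_min = util_min;
    return attr;
}

/* Apply the clamp to pid (0 means the caller); returns 0 or -errno. */
int set_uclamp_min(pid_t pid, uint32_t util_min)
{
#ifdef SYS_sched_setattr
    struct demo_sched_attr attr = make_uclamp_min_attr(util_min);

    if (syscall(SYS_sched_setattr, pid, &attr, 0) == -1)
        return -errno;
    return 0;
#else
    (void)pid;
    (void)util_min;
    return -ENOSYS;
#endif
}
```

Keeping policy and params untouched means the call only adjusts the clamp, so
it can be applied to a thread regardless of whether it runs as CFS or RT.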

Yeah, I think we will end up making some use of uclamp.. there is
someone else working on that angle

But without it, this is a case that exposes legit prioritization
problems with commit_work which we should fix ;-)

BR,
-R

>
> Thanks
>
> --
> Qais Yousef


[PATCH 02/14] drm/amd/pm: Replace one-element array with flexible-array member in struct vi_dpm_table

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Use a flexible-array member in struct vi_dpm_table instead of a
one-element array.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c433c.ttk9rna+f58kyduy%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index a1dbfd5636e6..d68b547743e6 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -60,7 +60,7 @@ struct vi_dpm_level {
 
 struct vi_dpm_table {
uint32_t count;
-   struct vi_dpm_level dpm_level[1];
+   struct vi_dpm_level dpm_level[];
 };
 
 #define PCIE_PERF_REQ_REMOVE_REGISTRY   0
-- 
2.27.0



[PATCH 01/14] drm/amd/pm: Replace one-element array with flexible-array member

2020-10-07 Thread Gustavo A. R. Silva
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.
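
One side effect of the layout change shows up in the smu8_hwmgr.c hunk of
this patch, where the count passed to struct_size() goes from 7 to 8. With a
one-element array, sizeof() of the struct already covers one entry, so asking
for 7 more yields room for 8; with a flexible array all 8 must be requested.
The userspace sketch below checks that arithmetic using hypothetical stand-in
types (the two-field record is an assumption).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the dependency record. */
struct demo_dep_record {
    uint32_t clk;
    uint32_t v;
};

/* Old layout: sizeof() already includes one entry. */
struct demo_dep_table_old {
    uint32_t count;
    struct demo_dep_record entries[1];
};

/* New layout: sizeof() covers the header only. */
struct demo_dep_table_new {
    uint32_t count;
    struct demo_dep_record entries[];
};

/* Entries that fit when allocating sizeof(old) + n records. */
static size_t capacity_old(size_t n)
{
    size_t bytes;

    assert(n < 1024); /* keep the demo arithmetic far from overflow */
    bytes = sizeof(struct demo_dep_table_old) +
            n * sizeof(struct demo_dep_record);
    return (bytes - offsetof(struct demo_dep_table_old, entries)) /
           sizeof(struct demo_dep_record);
}

/* Entries that fit when allocating sizeof(new) + n records. */
static size_t capacity_new(size_t n)
{
    size_t bytes;

    assert(n < 1024);
    bytes = sizeof(struct demo_dep_table_new) +
            n * sizeof(struct demo_dep_record);
    return (bytes - offsetof(struct demo_dep_table_new, entries)) /
           sizeof(struct demo_dep_record);
}
```

So capacity_old(7) and capacity_new(8) describe the same number of usable
entries, which is why the constant has to change together with the struct.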

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot 
Link: https://lore.kernel.org/lkml/5f7c295c.8iqp1ifc6oivdq%2f%2f%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h   | 4 ++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c | 9 +++--
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c  | 2 +-
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c  | 5 ++---
 4 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 3898a95ec28b..a1dbfd5636e6 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -122,8 +122,8 @@ struct phm_acpclock_voltage_dependency_record {
 };
 
 struct phm_clock_voltage_dependency_table {
-   uint32_t count; /* Number of entries. */
-   struct phm_clock_voltage_dependency_record entries[1];  /* Dynamically allocate count entries. */
+   uint32_t count; /* Number of entries. */
+   struct phm_clock_voltage_dependency_record entries[];   /* Dynamically allocate count entries. */
 };
 
 struct phm_phase_shedding_limits_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index 719597c5d27d..d94a7d8e0587 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -377,14 +377,11 @@ static int get_clock_voltage_dependency_table(struct pp_hwmgr *hwmgr,
const ATOM_PPLIB_Clock_Voltage_Dependency_Table *table)
 {
 
-   unsigned long table_size, i;
+   unsigned long i;
struct phm_clock_voltage_dependency_table *dep_table;
 
-   table_size = sizeof(unsigned long) +
-   sizeof(struct phm_clock_voltage_dependency_table)
-   * table->ucNumEntries;
-
-   dep_table = kzalloc(table_size, GFP_KERNEL);
+   dep_table = kzalloc(struct_size(dep_table, entries, table->ucNumEntries),
+   GFP_KERNEL);
if (NULL == dep_table)
return -ENOMEM;
 
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c
index 35ed47ebaf09..ed9b89980184 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c
@@ -276,7 +276,7 @@ static int smu8_init_dynamic_state_adjustment_rule_settings(
 {
struct phm_clock_voltage_dependency_table *table_clk_vlt;
 
-   table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 7),
+   table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 8),
GFP_KERNEL);
 
if (NULL == table_clk_vlt) {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
index 60b5ca974356..b485f8b1d6f2 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c
@@ -492,13 +492,12 @@ int phm_get_sclk_for_voltage_evv(struct pp_hwmgr *hwmgr,
  */
 int phm_initializa_dynamic_state_adjustment_rule_settings(struct pp_hwmgr *hwmgr)
 {
-   uint32_t table_size;
struct phm_clock_voltage_dependency_table *table_clk_vlt;
struct phm_ppt_v1_information *pptable_info = (struct phm_ppt_v1_information *)(hwmgr->pptable);
 
/* initialize vddc_dep_on_dal_pwrl table */
-   table_size = sizeof(uint32_t) + 4 * sizeof(struct phm_clock_voltage_dependency_record);
-   table_clk_vlt = kzalloc(table_size, GFP_KERNEL);
+   table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 4),
+   GFP_KERNEL);
 
if (NULL == table_clk_vlt) {
pr_err("Can not allocate space for vddc_dep_on_dal_pwrl! \n");
-- 
2.27.0



[PATCH 00/14] drm/amd/pm: Replace one-element arrays with flexible-array members

2020-10-07 Thread Gustavo A. R. Silva
Hi all,

This series aims to replace one-element arrays with flexible-array
members.

There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of flexible-array members, instead
of one-element arrays, and use the struct_size() helper to calculate the
size for the dynamic memory allocation.

Also, save some heap space in the process. More on this in each individual
patch.

This series also addresses multiple warnings of the following sort:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu8_hwmgr.c:1515:37:
warning: array subscript 1 is above array bounds of ‘const struct
phm_clock_voltage_dependency_record[1]’ [-Warray-bounds]

which, in this case, are false positives, but should nevertheless be
fixed in order to enable -Warray-bounds[3][4].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays
[3] https://git.kernel.org/linus/44720996e2d79e47d508b0abe99b931a726a3197
[4] https://github.com/KSPP/linux/issues/109

Gustavo A. R. Silva (14):
  drm/amd/pm: Replace one-element array with flexible-array member
  drm/amd/pm: Replace one-element array with flexible-array member
  drm/amd/pm: Replace one-element array with flexible-array member in
    struct vi_dpm_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_clock_array
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_uvd_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_acp_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_phase_shedding_limits_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_vce_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_cac_leakage_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_samu_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_ppt_v1_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_ppt_v1_mm_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_ppt_v1_voltage_lookup_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    phm_ppt_v1_pcie_table
  drm/amd/pm: Replace one-element array with flexible-array in struct
    ATOM_Vega10_GFXCLK_Dependency_Table

 drivers/gpu/drm/amd/pm/inc/hwmgr.h| 20 ++---
 .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h|  8 +-
 .../powerplay/hwmgr/process_pptables_v1_0.c   | 85 +++---
 .../amd/pm/powerplay/hwmgr/processpptables.c  | 85 +++---
 .../drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c   |  2 +-
 .../drm/amd/pm/powerplay/hwmgr/smu_helper.c   |  5 +-
 .../amd/pm/powerplay/hwmgr/vega10_pptable.h   |  2 +-
 .../powerplay/hwmgr/vega10_processpptables.c  | 88 ++-
 8 files changed, 107 insertions(+), 188 deletions(-)

-- 
2.27.0



Re: [PATCH RESEND v3 2/6] dt-bindings: display: sun4i: Add LVDS Dual-Link property

2020-10-07 Thread Rob Herring
On Mon, Oct 05, 2020 at 05:15:40PM +0200, Maxime Ripard wrote:
> The Allwinner SoCs with two TCONs and LVDS output can use both to drive an
> LVDS dual-link. Add a new property to express that link between these two
> TCONs.
> 
> Signed-off-by: Maxime Ripard 
> ---
>  Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> index e5344c4ae226..ce407f5466a5 100644
> --- a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> +++ b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> @@ -115,6 +115,12 @@ properties:
>  - const: edp
>  - const: lvds
>  
> +  allwinner,lvds-companion:

We already have 1 vendor property for this. How about 'link-companion' 
for something common.

Rob


Re: KASAN: vmalloc-out-of-bounds Write in sys_imageblit

2020-10-07 Thread syzbot
syzbot has found a reproducer for the following issue on:

HEAD commit:c85fb28b Merge tag 'arm64-fixes' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17406d7050
kernel config:  https://syzkaller.appspot.com/x/.config?x=140446ac2aa637e5
dashboard link: https://syzkaller.appspot.com/bug?extid=26dc38a00dc05118a4e6
compiler:   gcc (GCC) 10.1.0-syz 20200507
userspace arch: i386
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=14788d7050
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15158ee050

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+26dc38a00dc05118a...@syzkaller.appspotmail.com

==
BUG: KASAN: vmalloc-out-of-bounds in fast_imageblit drivers/video/fbdev/core/sysimgblt.c:229 [inline]
BUG: KASAN: vmalloc-out-of-bounds in sys_imageblit+0x117f/0x1290 drivers/video/fbdev/core/sysimgblt.c:275
Write of size 4 at addr c90009911000 by task syz-executor045/8761

CPU: 0 PID: 8761 Comm: syz-executor045 Not tainted 5.9.0-rc8-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x198/0x1fd lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 fast_imageblit drivers/video/fbdev/core/sysimgblt.c:229 [inline]
 sys_imageblit+0x117f/0x1290 drivers/video/fbdev/core/sysimgblt.c:275
 drm_fb_helper_sys_imageblit+0x1c/0x180 drivers/gpu/drm/drm_fb_helper.c:767
 bit_putcs_unaligned drivers/video/fbdev/core/bitblit.c:139 [inline]
 bit_putcs+0x6e1/0xd20 drivers/video/fbdev/core/bitblit.c:188
 fbcon_putcs+0x35a/0x450 drivers/video/fbdev/core/fbcon.c:1308
 do_update_region+0x399/0x630 drivers/tty/vt/vt.c:675
 redraw_screen+0x658/0x790 drivers/tty/vt/vt.c:1034
 fbcon_modechanged+0x593/0x6d0 drivers/video/fbdev/core/fbcon.c:2714
 fbcon_update_vcs+0x3a/0x50 drivers/video/fbdev/core/fbcon.c:2759
 do_fb_ioctl+0x62e/0x690 drivers/video/fbdev/core/fbmem.c:1106
 fb_compat_ioctl+0x17c/0xc30 drivers/video/fbdev/core/fbmem.c:1311
 __do_compat_sys_ioctl+0x1d3/0x230 fs/ioctl.c:842
 do_syscall_32_irqs_on arch/x86/entry/common.c:78 [inline]
 __do_fast_syscall_32+0x60/0x90 arch/x86/entry/common.c:137
 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:160
 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
RIP: 0023:0xf7f58549
Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 002b:f7f531dc EFLAGS: 0246 ORIG_RAX: 0036
RAX: ffda RBX: 0003 RCX: 4601
RDX: 2000 RSI:  RDI: 
RBP:  R08:  R09: 
R10:  R11:  R12: 
R13:  R14:  R15: 


Memory state around the buggy address:
 c90009910f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 c90009910f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>c90009911000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
   ^
 c90009911080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 c90009911100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==



Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Jason Gunthorpe
On Wed, Oct 07, 2020 at 03:34:01PM +0200, Tomasz Figa wrote:

> I think the userptr zero-copy hack should be able to go away indeed,
> given that we now have CMA that allows having carveouts backed by
> struct pages and having the memory represented as DMA-buf normally.

This also needs to figure out how to get references to CMA pages out
of a VMA. IIRC Daniel said these are not pinnable?

> How about the regular userptr use case, though?

Just call pin_user_pages(), that is the easy case.

> Is your intention to drop get_vaddr_frames() or we could still keep
> using it and if vec->is_pfns is true:

get_vaddr_frames() is dangerous, I would like it to go away.

> a) if CONFIG_VIDEO_LEGACY_PFN_USERPTR is set, taint the kernel
> b) otherwise just undo and fail?

For the CONFIG_VIDEO_LEGACY_PFN_USERPTR case all the follow_pfn
related code in get_vaddr_frames() should move back into media and be
hidden under this config.

Jason


Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Jason Gunthorpe
On Wed, Oct 07, 2020 at 04:11:59PM +0200, Tomasz Figa wrote:

> We also need to bring back the vma_open() that somehow disappeared
> around 4.2, as Marek found.

No

Jason


Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM

2020-10-07 Thread Jason Gunthorpe
On Wed, Oct 07, 2020 at 03:06:17PM +0200, Tomasz Figa wrote:

> Note that vb2_vmalloc is only used for in-kernel CPU usage, e.g. the
> contents being copied by the driver between vb2 buffers and some
> hardware FIFO or other dedicated buffers. The memory does not go to
> any hardware DMA.

That is even worse, the CPU can't just blindly touch VM_IO pages, that
isn't portable.

> Could you elaborate on what "the REQUIRED behavior is"? I can see that
> both follow the get_vaddr_frames() -> frame_vector_to_pages() flow, as
> you mentioned. Perhaps the only change needed is switching to
> pin_user_pages after all?

It is the comment right on top of get_vaddr_frames():

  if @start belongs to VM_IO | VM_PFNMAP vma, we don't
  touch page structures and the caller must make sure pfns aren't
  reused for anything else while he is using them.

Which means excluding every kind of VMA that is not something this
driver understands and then using special knowledge of the
driver-specific VMA to assure the above.

For instance if you could detect the VMA is from a carveout and do
something special like hold the fget() while knowing that the struct
file guarantees the carveout remains reserved - then you could use
follow_pfn.

But it would be faster and better to ask the carveout FD for the vaddr
range.

Jason


Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML

2020-10-07 Thread Marek Vasut
On 10/7/20 3:24 AM, Laurent Pinchart wrote:
[...]
> +properties:
> +  compatible:
> +enum:
> +  - fsl,imx23-lcdif
> +  - fsl,imx28-lcdif
> +  - fsl,imx6sx-lcdif
> +  - fsl,imx8mq-lcdif

There is no fsl,imx8mq-lcdif in drivers/gpu/drm/mxsfb/mxsfb_drv.c,
so the DT must specify compatible = "fsl,imx8mq-lcdif",
"fsl,imx28-lcdif" (since imx28 is the oldest SoC with LCDIF V4).

Should the compatible be added to drivers/gpu/drm/mxsfb/mxsfb_drv.c or
dropped from the YAML file, or neither?

[...]

