Re: [PATCH v3 7/7] dma-buf: system_heap: Add a system-uncached heap re-using the system heap
On Mon, Oct 5, 2020 at 6:45 AM Christoph Hellwig wrote:
>
> How is this going to deal with VIVT caches?

Hrm. That's a good question. I'm not sure I totally have my head around it, but I guess we could make sure to call invalidate_kernel_vmap_range() in begin_cpu_access() and flush_kernel_vmap_range() in end_cpu_access() rather than exiting out early as we do now? Unless you have better guidance?

Worst case, we could check CONFIG_CPU_CACHE_VIVT and not register the system-uncached heap.

thanks
-john
___
dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: linux-next: build failure after merge of the drm-misc tree
Hi all, On Thu, 8 Oct 2020 14:09:03 +1100 Stephen Rothwell wrote: > > After merging the drm-misc tree, today's linux-next build (x86_64 > allmodconfig) failed like this: In file included from include/linux/clk.h:13, from drivers/gpu/drm/ingenic/ingenic-drm-drv.c:10: drivers/gpu/drm/ingenic/ingenic-drm-drv.c: In function 'ingenic_drm_update_palette': drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'? 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~~ include/linux/kernel.h:47:33: note: in definition of macro 'ARRAY_SIZE' 47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) | ^~~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'? 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~~ include/linux/kernel.h:47:48: note: in definition of macro 'ARRAY_SIZE' 47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) |^~~ In file included from include/linux/bits.h:22, from include/linux/bitops.h:5, from drivers/gpu/drm/ingenic/ingenic-drm.h:10, from drivers/gpu/drm/ingenic/ingenic-drm-drv.c:7: drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'? 
448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~~ include/linux/build_bug.h:16:62: note: in definition of macro 'BUILD_BUG_ON_ZERO' 16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) | ^ include/linux/compiler.h:224:46: note: in expansion of macro '__same_type' 224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) | ^~~ include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array' 47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) | ^~~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE' 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:35: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'? 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~~ include/linux/build_bug.h:16:62: note: in definition of macro 'BUILD_BUG_ON_ZERO' 16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) | ^ include/linux/compiler.h:224:46: note: in expansion of macro '__same_type' 224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) | ^~~ include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array' 47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) | ^~~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE' 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~ include/linux/build_bug.h:16:51: error: bit-field '' width not an integer constant 16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); }))) | ^ include/linux/compiler.h:224:28: note: in expansion of macro 'BUILD_BUG_ON_ZERO' 224 | #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0])) |^ include/linux/kernel.h:47:59: note: in expansion of macro '__must_be_array' 
47 | #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr)) | ^~~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:448:18: note: in expansion of macro 'ARRAY_SIZE' 448 | for (i = 0; i < ARRAY_SIZE(priv->dma_hwdescs->palette); i++) { | ^~ drivers/gpu/drm/ingenic/ingenic-drm-drv.c:453:9: error: 'struct ingenic_drm' has no member named 'dma_hwdescs'; did you mean 'dma_hwdesc_f0'? 453 | priv->dma_hwdescs->palette[i] = color; |
Re: [PATCH 3/5] drm/vmwgfx: add a move callback.
On Thu, 8 Oct 2020 at 13:41, Zack Rusin wrote:
>
> > On Oct 5, 2020, at 20:06, Dave Airlie wrote:
> >
> > From: Dave Airlie
> >
> > This just copies the fallback to vmwgfx, I'm going to iterate on this
> > a bit until it's not the same as the fallback path.
> >
> > Signed-off-by: Dave Airlie
>
> What are your plans for it? i.e. how is it going to be different?

Initial plan is to put move_notify inside the move callback, then eventually get rid of the ttm bind/unbind callback and let the driver do that itself if needed.

I've got most of it in a branch (and I posted a 45 patch series a week or two ago), but I need to rebase and clean it up for reposting.

Dave.
Re: [PATCH v3 19/20] drm/tegra: Implement new UAPI
Hi Mikko, Thank you for the patch! Yet something to improve: [auto build test ERROR on tegra-drm/drm/tegra/for-next] [also build test ERROR on tegra/for-next linus/master v5.9-rc8 next-20201007] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403 base: git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next config: arm64-randconfig-r004-20201008 (attached as .config) compiler: aarch64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/6a3b3d79ce4488695cc0745edd19015fc2220d97 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403 git checkout 6a3b3d79ce4488695cc0745edd19015fc2220d97 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All error/warnings (new ones prefixed by >>): In file included from drivers/gpu/drm/tegra/uapi/uapi.c:12: >> drivers/gpu/drm/tegra/uapi/../drm.h:84:1: error: attempted to randomize >> userland API struct tegra_drm_client_ops 84 | }; | ^ >> drivers/gpu/drm/tegra/uapi/uapi.c:62:5: warning: no previous prototype for >> 'close_channel_ctx' [-Wmissing-prototypes] 62 | int close_channel_ctx(int id, void *p, void *data) | ^ -- In file included from drivers/gpu/drm/tegra/uapi/submit.c:18: >> drivers/gpu/drm/tegra/uapi/../drm.h:84:1: error: attempted to randomize >> userland API struct tegra_drm_client_ops 84 | }; | ^ vim +84 drivers/gpu/drm/tegra/uapi/../drm.h d43f81cbaf4353 
drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 74 53fa7f7204c97d drivers/gpu/host1x/drm/drm.h Thierry Reding 2013-09-24 75 struct tegra_drm_client_ops { 53fa7f7204c97d drivers/gpu/host1x/drm/drm.h Thierry Reding 2013-09-24 76 int (*open_channel)(struct tegra_drm_client *client, c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding 2013-09-26 77 struct tegra_drm_context *context); c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding 2013-09-26 78 void (*close_channel)(struct tegra_drm_context *context); c40f0f1afcb1dc drivers/gpu/drm/tegra/drm.h Thierry Reding 2013-10-10 79 int (*is_addr_reg)(struct device *dev, u32 class, u32 offset); 0f563a4bf66e51 drivers/gpu/drm/tegra/drm.h Dmitry Osipenko 2017-06-15 80 int (*is_valid_class)(u32 class); c88c363072c6dc drivers/gpu/host1x/drm/drm.h Thierry Reding 2013-09-26 81 int (*submit)(struct tegra_drm_context *context, d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 82 struct drm_tegra_submit *args, struct drm_device *drm, d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 83 struct drm_file *file); d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 @84 }; d43f81cbaf4353 drivers/gpu/host1x/drm/drm.h Terje Bergstrom 2013-03-22 85 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[git pull] drm nouveau fixes for 5.9 final
Hi Linus,

Karol found two last minute nouveau fixes, they both fix crashes, the TTM one follows what other drivers do already, and the other is for bailing on load on unrecognised chipsets.

Thanks,
Dave.

drm-fixes-2020-10-08:
drm nouveau fixes for 5.9 final

nouveau:
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets

The following changes since commit 86fdf61e71046618f6f499542cee12f2348c523c:

  Merge tag 'drm-misc-fixes-2020-10-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2020-10-06 12:38:28 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2020-10-08

for you to fetch changes up to d10285a25e29f13353bbf7760be8980048c1ef2f:

  drm/nouveau/mem: guard against NULL pointer access in mem_del (2020-10-07 15:33:09 +1000)

drm nouveau fixes for 5.9 final

nouveau:
- fix crash in TTM alloc fail path
- return error earlier for unknown chipsets

Karol Herbst (2):
      drm/nouveau/device: return error for unknown chipsets
      drm/nouveau/mem: guard against NULL pointer access in mem_del

 drivers/gpu/drm/nouveau/nouveau_mem.c             | 2 ++
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c | 1 +
 2 files changed, 3 insertions(+)
Re: [PATCH 3/5] drm/vmwgfx: add a move callback.
> On Oct 5, 2020, at 20:06, Dave Airlie wrote:
>
> From: Dave Airlie
>
> This just copies the fallback to vmwgfx, I'm going to iterate on this
> a bit until it's not the same as the fallback path.
>
> Signed-off-by: Dave Airlie

What are your plans for it? i.e. how is it going to be different?

Reviewed-by: Zack Rusin
Re: [PATCH 2/5] drm/vmwgfx: move null mem checks outside move notifies
> On Oct 5, 2020, at 20:06, Dave Airlie wrote:
>
> From: Dave Airlie
>
> Both fns checked mem == NULL, just move the check outside.
>
> Signed-off-by: Dave Airlie

That's a nice cleanup.

Reviewed-by: Zack Rusin
linux-next: build failure after merge of the drm-misc tree
Hi all,

After merging the drm-misc tree, today's linux-next build (x86_64 allmodconfig) failed like this:

I noticed that the ingenic driver revert I had been waiting for appeared in the drm-misc tree, so I removed the BROKEN dependency for it, but it produced the above errors, so I have marked it BROKEN again.

-- 
Cheers,
Stephen Rothwell
Re: [PATCH 07/13] mm: close race in generic_access_phys
On 10/7/20 9:44 AM, Daniel Vetter wrote: Way back it was a reasonable assumptions that iomem mappings never change the pfn range they point at. But this has changed: - gpu drivers dynamically manage their memory nowadays, invalidating ptes with unmap_mapping_range when buffers get moved - contiguous dma allocations have moved from dedicated carvetouts to s/carvetouts/carveouts/ cma regions. This means if we miss the unmap the pfn might contain pagecache or anon memory (well anything allocated with GFP_MOVEABLE) - even /dev/mem now invalidates mappings when the kernel requests that iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the region") Thanks for putting these references into the log, it's very helpful. ... diff --git a/mm/memory.c b/mm/memory.c index fcfc4ca36eba..8d467e23b44e 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4873,28 +4873,68 @@ int follow_phys(struct vm_area_struct *vma, return ret; } +/** + * generic_access_phys - generic implementation for iomem mmap access + * @vma: the vma to access + * @addr: userspace addres, not relative offset within @vma + * @buf: buffer to read/write + * @len: length of transfer + * @write: set to FOLL_WRITE when writing, otherwise reading + * + * This is a generic implementation for &vm_operations_struct.access for an + * iomem mapping. This callback is used by access_process_vm() when the @vma is + * not page based. 
+ */
 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
 			void *buf, int len, int write)
 {
 	resource_size_t phys_addr;
 	unsigned long prot = 0;
 	void __iomem *maddr;
+	pte_t *ptep, pte;
+	spinlock_t *ptl;
 	int offset = addr & (PAGE_SIZE-1);
+	int ret = -EINVAL;
+
+	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+		return -EINVAL;
+
+retry:
+	if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+		return -EINVAL;
+	pte = *ptep;
+	pte_unmap_unlock(ptep, ptl);
 
-	if (follow_phys(vma, addr, write, &prot, &phys_addr))
+	prot = pgprot_val(pte_pgprot(pte));
+	phys_addr = (resource_size_t)pte_pfn(pte) << PAGE_SHIFT;
+
+	if ((write & FOLL_WRITE) && !pte_write(pte))
 		return -EINVAL;
 
 	maddr = ioremap_prot(phys_addr, PAGE_ALIGN(len + offset), prot);
 	if (!maddr)
 		return -ENOMEM;
 
+	if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+		goto out_unmap;
+
+	if (pte_same(pte, *ptep)) {

The ioremap area is something I'm sorta new to, so a newbie question: is it possible for the same pte to already be there, ever? If so, we'd be stuck in an infinite loop here. I'm sure that's not the case, but it's not yet obvious to me why it's impossible. Resource reservations maybe?

thanks,
-- 
John Hubbard
NVIDIA
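The quoted hunk is cut off at the pte_same() check, so what follows is only a userspace sketch of the general "sample, set up, re-check, retry" pattern it appears to use. It assumes that a mismatch on the re-check is what sends control back to the retry: label. Every name here (demo_slot, demo_access, demo_move_once) is made up for illustration, and the plain field reads stand in for reads done under the pte lock in the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one page-table entry. */
struct demo_slot {
	uint64_t entry;
};

/* Hook that runs between the two reads, modelling a concurrent unmap/move. */
typedef void (*demo_hook_t)(struct demo_slot *s, int attempt);

/* Example hook: moves the entry during the first attempt only. */
static void demo_move_once(struct demo_slot *s, int attempt)
{
	if (attempt == 1)
		s->entry += 0x1000;
}

/*
 * Returns the number of attempts needed; *out receives the entry value the
 * access was validated against.
 */
static int demo_access(struct demo_slot *s, demo_hook_t between, uint64_t *out)
{
	int attempts = 0;

	for (;;) {
		/* First read: sample the entry (kernel: under the pte lock). */
		uint64_t snap = s->entry;

		attempts++;

		/* Slow setup happens here without the lock (kernel: ioremap). */
		if (between)
			between(s, attempts);

		/* Second read: only proceed if nothing changed meanwhile. */
		if (s->entry == snap) {
			*out = snap;
			return attempts;
		}
		/* Entry changed: drop the stale setup and start over. */
	}
}
```

Under that assumption an unchanged entry ends the loop on the first pass. The loop only repeats while something keeps changing the entry between the two reads.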
Re: [PATCH v3 09/20] gpu: host1x: DMA fences and userspace fence creation
Hi Mikko,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on tegra-drm/drm/tegra/for-next]
[also build test WARNING on tegra/for-next linus/master v5.9-rc8 next-20201007]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
base:   git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next
config: arm64-randconfig-r004-20201008 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/c4f5ec983027f2b19e6854a362e23a79e1630100
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403
        git checkout c4f5ec983027f2b19e6854a362e23a79e1630100
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All warnings (new ones prefixed by >>):

>> drivers/gpu/host1x/fence.c:105:6: warning: no previous prototype for 'host1x_fence_signal' [-Wmissing-prototypes]
     105 | void host1x_fence_signal(struct host1x_syncpt_fence *f)
         |      ^~~

vim +/host1x_fence_signal +105 drivers/gpu/host1x/fence.c

   104
 > 105	void host1x_fence_signal(struct host1x_syncpt_fence *f)
   106	{
   107		if (atomic_xchg(&f->signaling, 1))
   108			return;
   109
   110		/*
   111		 * Cancel pending timeout work - if it races, it will
   112		 * not get 'f->signaling' and return.
   113		 */
   114		cancel_delayed_work_sync(&f->timeout_work);
   115
   116		host1x_intr_put_ref(f->sp->host, f->sp->id, f->waiter_ref);
   117
   118		dma_fence_signal(&f->base);
   119		dma_fence_put(&f->base);
   120	}
   121

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
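For reference, the signal-once guard in the listing (atomic_xchg() on f->signaling) can be approximated in userspace with C11 atomics. This is a minimal sketch, not the driver code: demo_fence and demo_fence_signal are hypothetical names, and marking the function static is just one way to silence -Wmissing-prototypes. Whether the real host1x_fence_signal() should be static or get a prototype in a header depends on who else calls it, which the report does not show.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the fence object in the listing above. */
struct demo_fence {
	atomic_int signaling;	/* 0 until one caller claims the signal */
	int signal_count;	/* how many times the signal body actually ran */
};

/*
 * static: the function is only used in this file, so no external prototype
 * is needed and -Wmissing-prototypes has nothing to complain about.
 */
static bool demo_fence_signal(struct demo_fence *f)
{
	/*
	 * atomic_exchange() stores 1 and returns the previous value, so only
	 * the first caller observes 0 and goes on to do the work.
	 */
	if (atomic_exchange(&f->signaling, 1))
		return false;

	f->signal_count++;
	return true;
}
```

A second call on the same fence returns false and leaves signal_count at 1, which is the property the exchange is there to provide.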
Re: [PATCH] drm/fourcc: Add AXBXGXRX106106106106 format
On Wed, 2020-10-07 at 10:27 +0100, Matteo Franchin wrote:
> Add ABGR format with 10-bit components packed in 64-bit per pixel.
> This format can be used to handle
> VK_FORMAT_R10X6G10X6B10X6A10X6_UNORM_4PACK16 on little-endian
> architectures.

trivial note:

> diff --git a/drivers/gpu/drm/drm_fourcc.c b/drivers/gpu/drm/drm_fourcc.c
[]
> @@ -202,6 +202,7 @@ const struct drm_format_info *__drm_format_info(u32 format)
>  		{ .format = DRM_FORMAT_XBGR16161616F, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1 },
>  		{ .format = DRM_FORMAT_ARGB16161616F, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
>  		{ .format = DRM_FORMAT_ABGR16161616F, .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
> +		{ .format = DRM_FORMAT_AXBXGXRX106106106106,.depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },

My suggestion is to separate this into 2 lines so every column, including .depth, still visually aligns.

+		{ .format = DRM_FORMAT_AXBXGXRX106106106106,
+		  .depth = 0, .num_planes = 1, .cpp = { 8, 0, 0 }, .hsub = 1, .vsub = 1, .has_alpha = true },
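To make the "10-bit components packed in 64-bit per pixel" description concrete, here is a small packing sketch. The layout is an assumption read off the format name (A:x:B:x:G:x:R:x, 10:6:10:6:10:6:10:6, little endian): four 16-bit lanes with R in the lowest lane, each component in the upper 10 bits of its lane and 6 padding bits below it. The helper names are hypothetical and not part of the patch.

```c
#include <stdint.h>

#define DEMO_LANE_BITS	16	/* one 16-bit lane per component */
#define DEMO_PAD_BITS	6	/* low 6 bits of each lane are padding */
#define DEMO_COMP_MASK	0x3ffu	/* 10-bit component */

/* Place one 10-bit component into the upper bits of the given lane. */
static inline uint64_t demo_pack_lane(uint32_t comp, unsigned int lane)
{
	return (uint64_t)(comp & DEMO_COMP_MASK)
		<< (lane * DEMO_LANE_BITS + DEMO_PAD_BITS);
}

/* Assumed lane order, lowest first: R, G, B, A. */
static inline uint64_t demo_pack_axbxgxrx(uint32_t r, uint32_t g,
					  uint32_t b, uint32_t a)
{
	return demo_pack_lane(r, 0) | demo_pack_lane(g, 1) |
	       demo_pack_lane(b, 2) | demo_pack_lane(a, 3);
}
```

The resulting pixel is 8 bytes wide, which matches the .cpp = { 8, 0, 0 } entry in the quoted table.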
Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 3:23 PM Dan Williams wrote: > > On Wed, Oct 7, 2020 at 12:49 PM Daniel Vetter wrote: > > > > On Wed, Oct 7, 2020 at 9:33 PM Dan Williams > > wrote: > > > > > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter > > > wrote: > > > > > > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > > > > the region") /dev/kmem zaps ptes when the kernel requests exclusive > > > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > > > > the default for all driver uses. > > > > > > > > Except there's two more ways to access pci bars: sysfs and proc mmap > > > > support. Let's plug that hole. > > > > > > Ooh, yes, lets. > > > > > > > For revoke_devmem() to work we need to link our vma into the same > > > > address_space, with consistent vma->vm_pgoff. ->pgoff is already > > > > adjusted, because that's how (io_)remap_pfn_range works, but for the > > > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > > > > at ->open time, but that's a bit tricky here with all the entry points > > > > and arch code. So instead create a fake file and adjust vma->vm_file. > > > > > > I don't think you want to share the devmem inode for this, this should > > > be based off the sysfs inode which I believe there is already only one > > > instance per resource. In contrast /dev/mem can have multiple inodes > > > because anyone can just mknod a new character device file, the same > > > problem does not exist for sysfs. > > > > But then I need to find the right one, plus I also need to find the > > right one for the procfs side. That gets messy, and I already have no > > idea how to really test this. Shared address_space is the same trick > > we're using in drm (where we have multiple things all pointing to the > > same underlying resources, through different files), and it gets the > > job done. 
So that's why I figured the shared address_space is the > > cleaner solution since then unmap_mapping_range takes care of > > iterating over all vma for us. I guess I could reimplement that logic > > with our own locking and everything in revoke_devmem, but feels a bit > > silly. But it would also solve the problem of having mutliple > > different mknod of /dev/kmem with different address_space behind them. > > Also because of how remap_pfn_range works, all these vma do use the > > same pgoff already anyway. > > True, remap_pfn_range() makes sure that ->pgoff is an absolute > physical address offset for all use cases. So you might be able to > just point proc_bus_pci_open() at the shared devmem address space. For > sysfs it's messier. I think you would need to somehow get the inode > from kernfs_fop_open() to adjust its address space, but only if the > bin_file will ultimately be used for PCI memory. To me this seems like a new sysfs_create_bin_file() flavor that registers the file with the common devmem address_space. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v3 06/20] gpu: host1x: Cleanup and refcounting for syncpoints
Hi Mikko, Thank you for the patch! Yet something to improve: [auto build test ERROR on tegra-drm/drm/tegra/for-next] [also build test ERROR on tegra/for-next linus/master v5.9-rc8 next-20201007] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403 base: git://anongit.freedesktop.org/tegra/linux.git drm/tegra/for-next config: arm-allyesconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/3721bf9ddd2b05fe12b3512999f77351ae839d08 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Mikko-Perttunen/Host1x-TegraDRM-UAPI/20201008-034403 git checkout 3721bf9ddd2b05fe12b3512999f77351ae839d08 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): drivers/staging/media/tegra-video/vi.c: In function 'tegra_channel_cleanup': >> drivers/staging/media/tegra-video/vi.c:621:2: error: implicit declaration of >> function 'host1x_syncpt_free'; did you mean 'host1x_syncpt_read'? 
>> [-Werror=implicit-function-declaration] 621 | host1x_syncpt_free(chan->mw_ack_sp); | ^~ | host1x_syncpt_read cc1: some warnings being treated as errors vim +621 drivers/staging/media/tegra-video/vi.c 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 616 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 617 static void tegra_channel_cleanup(struct tegra_vi_channel *chan) 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 618 { 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 619 v4l2_ctrl_handler_free(&chan->ctrl_handler); 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 620 media_entity_cleanup(&chan->video.entity); 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 @621 host1x_syncpt_free(chan->mw_ack_sp); 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 622 host1x_syncpt_free(chan->frame_start_sp); 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 623 mutex_destroy(&chan->video_lock); 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 624 } 3d8a97eabef0883 Sowjanya Komatineni 2020-05-04 625 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 12:49 PM Daniel Vetter wrote: > > On Wed, Oct 7, 2020 at 9:33 PM Dan Williams wrote: > > > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter > > wrote: > > > > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > > > the region") /dev/kmem zaps ptes when the kernel requests exclusive > > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > > > the default for all driver uses. > > > > > > Except there's two more ways to access pci bars: sysfs and proc mmap > > > support. Let's plug that hole. > > > > Ooh, yes, lets. > > > > > For revoke_devmem() to work we need to link our vma into the same > > > address_space, with consistent vma->vm_pgoff. ->pgoff is already > > > adjusted, because that's how (io_)remap_pfn_range works, but for the > > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > > > at ->open time, but that's a bit tricky here with all the entry points > > > and arch code. So instead create a fake file and adjust vma->vm_file. > > > > I don't think you want to share the devmem inode for this, this should > > be based off the sysfs inode which I believe there is already only one > > instance per resource. In contrast /dev/mem can have multiple inodes > > because anyone can just mknod a new character device file, the same > > problem does not exist for sysfs. > > But then I need to find the right one, plus I also need to find the > right one for the procfs side. That gets messy, and I already have no > idea how to really test this. Shared address_space is the same trick > we're using in drm (where we have multiple things all pointing to the > same underlying resources, through different files), and it gets the > job done. So that's why I figured the shared address_space is the > cleaner solution since then unmap_mapping_range takes care of > iterating over all vma for us. 
I guess I could reimplement that logic > with our own locking and everything in revoke_devmem, but feels a bit > silly. But it would also solve the problem of having mutliple > different mknod of /dev/kmem with different address_space behind them. > Also because of how remap_pfn_range works, all these vma do use the > same pgoff already anyway. True, remap_pfn_range() makes sure that ->pgoff is an absolute physical address offset for all use cases. So you might be able to just point proc_bus_pci_open() at the shared devmem address space. For sysfs it's messier. I think you would need to somehow get the inode from kernfs_fop_open() to adjust its address space, but only if the bin_file will ultimately be used for PCI memory. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 06/13] media: videobuf2: Move frame_vector into media subsystem
On 10/7/20 9:44 AM, Daniel Vetter wrote: It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR symbol from all over the tree (well just one place, somehow omap media driver still had this in its Kconfig, despite not using it). Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Figa Cc: Mauro Carvalho Chehab Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Daniel Vetter --- Failed to spot any problems here. :) Reviewed-by: John Hubbard thanks, -- John Hubbard NVIDIA drivers/media/common/videobuf2/Kconfig| 1 - drivers/media/common/videobuf2/Makefile | 1 + .../media/common/videobuf2}/frame_vector.c| 2 + drivers/media/platform/omap/Kconfig | 1 - include/linux/mm.h| 42 --- include/media/videobuf2-core.h| 42 +++ mm/Kconfig| 3 -- mm/Makefile | 1 - 8 files changed, 45 insertions(+), 48 deletions(-) rename {mm => drivers/media/common/videobuf2}/frame_vector.c (99%) diff --git a/drivers/media/common/videobuf2/Kconfig b/drivers/media/common/videobuf2/Kconfig index edbc99ebba87..d2223a12c95f 100644 --- a/drivers/media/common/videobuf2/Kconfig +++ b/drivers/media/common/videobuf2/Kconfig @@ -9,7 +9,6 @@ config VIDEOBUF2_V4L2 config VIDEOBUF2_MEMOPS tristate - select FRAME_VECTOR config VIDEOBUF2_DMA_CONTIG tristate diff --git a/drivers/media/common/videobuf2/Makefile b/drivers/media/common/videobuf2/Makefile index 77bebe8b202f..54306f8d096c 100644 --- a/drivers/media/common/videobuf2/Makefile +++ b/drivers/media/common/videobuf2/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 videobuf2-common-objs := videobuf2-core.o +videobuf2-common-objs += frame_vector.o ifeq ($(CONFIG_TRACEPOINTS),y) videobuf2-common-objs += vb2-trace.o diff --git a/mm/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c similarity 
index 99% rename from mm/frame_vector.c rename to drivers/media/common/videobuf2/frame_vector.c index 39db520a51dc..b95f4f371681 100644 --- a/mm/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -8,6 +8,8 @@ #include #include +#include + /** * get_vaddr_frames() - map virtual addresses to pfns * @start:starting user address diff --git a/drivers/media/platform/omap/Kconfig b/drivers/media/platform/omap/Kconfig index f73b5893220d..de16de46c0f4 100644 --- a/drivers/media/platform/omap/Kconfig +++ b/drivers/media/platform/omap/Kconfig @@ -12,6 +12,5 @@ config VIDEO_OMAP2_VOUT depends on VIDEO_V4L2 select VIDEOBUF2_DMA_CONTIG select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3 - select FRAME_VECTOR help V4L2 Display driver support for OMAP2/3 based boards. diff --git a/include/linux/mm.h b/include/linux/mm.h index 16b799a0522c..acd60fbf1a5a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1743,48 +1743,6 @@ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc); int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc, struct task_struct *task, bool bypass_rlim); -/* Container for pinned pfns / pages */ -struct frame_vector { - unsigned int nr_allocated; /* Number of frames we have space for */ - unsigned int nr_frames; /* Number of frames stored in ptrs array */ - bool got_ref; /* Did we pin pages by getting page ref? */ - bool is_pfns; /* Does array contain pages or pfns? */ - void *ptrs[]; /* Array of pinned pfns / pages. 
Use -* pfns_vector_pages() or pfns_vector_pfns() -* for access */ -}; - -struct frame_vector *frame_vector_create(unsigned int nr_frames); -void frame_vector_destroy(struct frame_vector *vec); -int get_vaddr_frames(unsigned long start, unsigned int nr_pfns, -unsigned int gup_flags, struct frame_vector *vec); -void put_vaddr_frames(struct frame_vector *vec); -int frame_vector_to_pages(struct frame_vector *vec); -void frame_vector_to_pfns(struct frame_vector *vec); - -static inline unsigned int frame_vector_count(struct frame_vector *vec) -{ - return vec->nr_frames; -} - -static inline struct page **frame_vector_pages(struct frame_vector *vec) -{ - if (vec->is_pfns) { - int err = frame_vector_to_pages(vec); - - if (err) - return ERR_PTR(err); - } - return (struct page **)(vec->ptrs); -} - -static inline unsigned long *frame_vector_pfns(struct frame_vector *vec) -{ - if (!vec->is
Re: [PATCH 1/2] drm/i915/dpcd_bl: Skip testing control capability with force DPCD quirk
Hi! I thought this patch rang a bell, we actually already had some discussion about this since there's a couple of other systems this was causing issues for. Unfortunately it never seems like that patch got sent out. Satadru? (if I don't hear back from them soon, I'll just send out a patch for this myself) JFYI - the proper fix here is to just drop the DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP check from the code entirely. As long as the backlight supports AUX_SET_CAP, that should be enough for us to control it. On Wed, 2020-10-07 at 14:58 +0800, Kai-Heng Feng wrote: > HP DreamColor panel needs to be controlled via AUX interface. However, > it has both DP_EDP_BACKLIGHT_BRIGHTNESS_AUX_SET_CAP and > DP_EDP_BACKLIGHT_BRIGHTNESS_PWM_PIN_CAP set, so it fails to pass > intel_dp_aux_display_control_capable() test. > > Skip the test if the panel has force DPCD quirk. > > Signed-off-by: Kai-Heng Feng > --- > drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c | 10 ++ > 1 file changed, 6 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c > b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c > index acbd7eb66cbe..acf2e1c65290 100644 > --- a/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c > +++ b/drivers/gpu/drm/i915/display/intel_dp_aux_backlight.c > @@ -347,9 +347,13 @@ int intel_dp_aux_init_backlight_funcs(struct > intel_connector *intel_connector) > struct intel_panel *panel = &intel_connector->panel; > struct intel_dp *intel_dp = enc_to_intel_dp(intel_connector->encoder); > struct drm_i915_private *i915 = dp_to_i915(intel_dp); > + bool force_dpcd; > + > + force_dpcd = drm_dp_has_quirk(&intel_dp->desc, intel_dp->edid_quirks, > + DP_QUIRK_FORCE_DPCD_BACKLIGHT); > > if (i915->params.enable_dpcd_backlight == 0 || > - !intel_dp_aux_display_control_capable(intel_connector)) > + (!force_dpcd && > !intel_dp_aux_display_control_capable(intel_connector))) > return -ENODEV; > > /* > @@ -358,9 +362,7 @@ int 
intel_dp_aux_init_backlight_funcs(struct > intel_connector *intel_connector) >*/ > if (i915->vbt.backlight.type != > INTEL_BACKLIGHT_VESA_EDP_AUX_INTERFACE && > - i915->params.enable_dpcd_backlight != 1 && > - !drm_dp_has_quirk(&intel_dp->desc, intel_dp->edid_quirks, > - DP_QUIRK_FORCE_DPCD_BACKLIGHT)) { > + i915->params.enable_dpcd_backlight != 1 && !force_dpcd) { > drm_info(&i915->drm, >"Panel advertises DPCD backlight support, but " >"VBT disagrees. If your backlight controls " -- Sincerely, Lyude Paul (she/her) Software Engineer at Red Hat ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
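To make the capability discussion above concrete, here is a minimal userspace sketch of the two checks being compared: a strict one that rejects a panel advertising both AUX_SET_CAP and PWM_PIN_CAP, and a relaxed one that only requires AUX_SET_CAP. The SKETCH_* macros and sketch_* functions are illustrative stand-ins and are not the i915 code; the real intel_dp_aux_display_control_capable() looks at more than these two bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins for the two DPCD backlight capability bits
 * discussed above; the names and values here are assumptions. */
#define SKETCH_PWM_PIN_CAP (1u << 0)
#define SKETCH_AUX_SET_CAP (1u << 1)

/* Strict check: AUX control is rejected when the panel also advertises
 * PWM pin control, which is why a panel setting both bits fails it. */
static bool sketch_aux_capable_strict(uint8_t caps)
{
	return (caps & SKETCH_AUX_SET_CAP) && !(caps & SKETCH_PWM_PIN_CAP);
}

/* Relaxed check: AUX_SET_CAP alone is treated as sufficient. */
static bool sketch_aux_capable_relaxed(uint8_t caps)
{
	return (caps & SKETCH_AUX_SET_CAP) != 0;
}

/* Small self-check showing where the two checks disagree. */
void sketch_backlight_selfcheck(void)
{
	uint8_t both = SKETCH_AUX_SET_CAP | SKETCH_PWM_PIN_CAP;

	assert(!sketch_aux_capable_strict(both));
	assert(sketch_aux_capable_relaxed(both));
	assert(!sketch_aux_capable_relaxed(SKETCH_PWM_PIN_CAP));
}
```

With the relaxed check a quirk is no longer needed just to get past the capability test, which is the point of dropping the PWM_PIN_CAP condition.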
Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers
On Wed, Oct 7, 2020 at 11:37 PM John Hubbard wrote: > > On 10/7/20 2:32 PM, Daniel Vetter wrote: > > On Wed, Oct 7, 2020 at 10:33 PM John Hubbard wrote: > >> > >> On 10/7/20 9:44 AM, Daniel Vetter wrote: > ... > >>> @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct > >>> g2d_data *g2d, > >>>dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt, > >>> DMA_BIDIRECTIONAL, 0); > >>> > >>> - pages = frame_vector_pages(g2d_userptr->vec); > >>> - if (!IS_ERR(pages)) { > >>> - int i; > >>> + for (i = 0; i < g2d_userptr->npages; i++) > >>> + set_page_dirty_lock(g2d_userptr->pages[i]); > >>> > >>> - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++) > >>> - set_page_dirty_lock(pages[i]); > >>> - } > >>> - put_vaddr_frames(g2d_userptr->vec); > >>> - frame_vector_destroy(g2d_userptr->vec); > >>> + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages); > >>> + kvfree(g2d_userptr->pages); > >> > >> You can avoid writing your own loop, and just simplify the whole thing > >> down to > >> two lines: > >> > >> unpin_user_pages_dirty_lock(g2d_userptr->pages, > >> g2d_userptr->npages, > >> true); > >> kvfree(g2d_userptr->pages); > > > > Oh nice, this is neat. I'll also roll it out in the habanalabs patch, > > that has the same thing. Well almost, it only uses set_page_dirty, not > > the _lock variant. But I have no idea whether that matters or not? > > > It matters. And invariably, call sites that use set_page_dirty() instead > of set_page_dirty_lock() were already wrong. Which is why I never had to > provide anything like "unpin_user_pages_dirty (not locked)". > > Although in habanalabs case, I just reviewed patch 3 and I think they *were* > correctly using set_page_dirty_lock()... Yeah I mixed that up with some other code I read, habanalabs is using _lock. I have seen a pile of gup/pup code though that only uses set_page_dirty. And looking around I did not really parse the comment above set_page_dirty(). 
I guess just using the _lock variant shouldn't hurt too much. I've found a comment though from the infiniband umem notifier that it's sometimes called with the page locked, and sometimes not, so life is complicated there. But how it avoids races I didn't understand. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
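For readers unsure how the two dirtying helpers relate: as far as I understand it, set_page_dirty_lock() is set_page_dirty() wrapped in lock_page()/unlock_page(). The sketch below models only that relationship in userspace with a mock page. The sketch_* names are made up for illustration, and the mock does not capture the truncation race that the page lock protects against.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock page with only the state this sketch needs. */
struct sketch_dirty_page {
	bool locked;
	bool dirty;
};

static void sketch_lock_page(struct sketch_dirty_page *page)
{
	assert(!page->locked);   /* the mock has no waiters, so no nesting */
	page->locked = true;
}

static void sketch_unlock_page(struct sketch_dirty_page *page)
{
	assert(page->locked);
	page->locked = false;
}

/* Plain variant: marks the page dirty and leaves locking to the caller. */
static void sketch_set_page_dirty(struct sketch_dirty_page *page)
{
	page->dirty = true;
}

/* _lock variant: takes the page lock around the plain variant. */
static void sketch_set_page_dirty_lock(struct sketch_dirty_page *page)
{
	sketch_lock_page(page);
	sketch_set_page_dirty(page);
	sketch_unlock_page(page);
}
```

A caller that already holds the page lock must therefore use the plain variant, while a caller without the lock should use the _lock one.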
Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers
On 10/7/20 2:32 PM, Daniel Vetter wrote: > On Wed, Oct 7, 2020 at 10:33 PM John Hubbard wrote: >> >> On 10/7/20 9:44 AM, Daniel Vetter wrote: ... >>> @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d, >>> dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt, >>> DMA_BIDIRECTIONAL, 0); >>> >>> - pages = frame_vector_pages(g2d_userptr->vec); >>> - if (!IS_ERR(pages)) { >>> - int i; >>> + for (i = 0; i < g2d_userptr->npages; i++) >>> + set_page_dirty_lock(g2d_userptr->pages[i]); >>> >>> - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++) >>> - set_page_dirty_lock(pages[i]); >>> - } >>> - put_vaddr_frames(g2d_userptr->vec); >>> - frame_vector_destroy(g2d_userptr->vec); >>> + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages); >>> + kvfree(g2d_userptr->pages); >> >> You can avoid writing your own loop, and just simplify the whole thing down to >> two lines: >> >> unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages, >> true); >> kvfree(g2d_userptr->pages); > > Oh nice, this is neat. I'll also roll it out in the habanalabs patch, > that has the same thing. Well almost, it only uses set_page_dirty, not > the _lock variant. But I have no idea whether that matters or not? It matters. And invariably, call sites that use set_page_dirty() instead of set_page_dirty_lock() were already wrong. Which is why I never had to provide anything like "unpin_user_pages_dirty (not locked)". Although in habanalabs case, I just reviewed patch 3 and I think they *were* correctly using set_page_dirty_lock()... thanks, -- John Hubbard NVIDIA
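The simplification suggested in this thread can be illustrated with a small userspace mock: an open-coded "dirty every page, then unpin every page" release next to a combined helper shaped like unpin_user_pages_dirty_lock(pages, npages, true). Everything named sketch_* is hypothetical; the mock tracks only a dirty flag and a pin count and does no page locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock page: just enough state to observe dirtying and unpinning. */
struct sketch_pinned_page {
	bool dirty;
	int pin_count;
};

static void sketch_unpin(struct sketch_pinned_page *page)
{
	assert(page->pin_count > 0);   /* unpinning an unpinned page is a bug */
	page->pin_count--;
}

/* Open-coded release, shaped like the hunk under review. */
static void sketch_release_open_coded(struct sketch_pinned_page **pages,
				      size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		pages[i]->dirty = true;    /* stands in for set_page_dirty_lock() */
	for (i = 0; i < npages; i++)
		sketch_unpin(pages[i]);    /* stands in for unpin_user_pages() */
}

/* Combined release, shaped like unpin_user_pages_dirty_lock(). */
static void sketch_release_combined(struct sketch_pinned_page **pages,
				    size_t npages, bool make_dirty)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		if (make_dirty)
			pages[i]->dirty = true;
		sketch_unpin(pages[i]);
	}
}
```

Both variants leave every page dirty and unpinned; the combined form just removes the caller's loop and loop variable.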
Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers
On Wed, Oct 7, 2020 at 10:33 PM John Hubbard wrote: > > On 10/7/20 9:44 AM, Daniel Vetter wrote: > > All we need are a pages array, pin_user_pages_fast can give us that > > directly. Plus this avoids the entire raw pfn side of get_vaddr_frames. > > > > Signed-off-by: Daniel Vetter > > Cc: Jason Gunthorpe > > Cc: Inki Dae > > Cc: Joonyoung Shim > > Cc: Seung-Woo Kim > > Cc: Kyungmin Park > > Cc: Kukjin Kim > > Cc: Krzysztof Kozlowski > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > --- > > drivers/gpu/drm/exynos/Kconfig | 1 - > > drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 - > > 2 files changed, 22 insertions(+), 27 deletions(-) > > > > diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig > > index 6417f374b923..43257ef3c09d 100644 > > --- a/drivers/gpu/drm/exynos/Kconfig > > +++ b/drivers/gpu/drm/exynos/Kconfig > > @@ -88,7 +88,6 @@ comment "Sub-drivers" > > config DRM_EXYNOS_G2D > > bool "G2D" > > depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST > > - select FRAME_VECTOR > > help > > Choose this option if you want to use Exynos G2D for DRM. 
> > > > diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c > > b/drivers/gpu/drm/exynos/exynos_drm_g2d.c > > index 967a5cdc120e..c83f6faac9de 100644 > > --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c > > +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c > > @@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr { > > dma_addr_t dma_addr; > > unsigned long userptr; > > unsigned long size; > > - struct frame_vector *vec; > > + struct page **pages; > > + unsigned intnpages; > > struct sg_table *sgt; > > atomic_trefcount; > > boolin_pool; > > @@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data > > *g2d, > > bool force) > > { > > struct g2d_cmdlist_userptr *g2d_userptr = obj; > > - struct page **pages; > > + int i; > > The above line can also be deleted, see below. > > > > > if (!obj) > > return; > > @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data > > *g2d, > > dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt, > > DMA_BIDIRECTIONAL, 0); > > > > - pages = frame_vector_pages(g2d_userptr->vec); > > - if (!IS_ERR(pages)) { > > - int i; > > + for (i = 0; i < g2d_userptr->npages; i++) > > + set_page_dirty_lock(g2d_userptr->pages[i]); > > > > - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++) > > - set_page_dirty_lock(pages[i]); > > - } > > - put_vaddr_frames(g2d_userptr->vec); > > - frame_vector_destroy(g2d_userptr->vec); > > + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages); > > + kvfree(g2d_userptr->pages); > > You can avoid writing your own loop, and just simplify the whole thing down to > two lines: > > unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages, > true); > kvfree(g2d_userptr->pages); Oh nice, this is neat. I'll also roll it out in the habanalabs patch, that has the same thing. Well almost, it only uses set_page_dirty, not the _lock variant. But I have no idea whether that matters or not? 
-Daniel > > > > > > if (!g2d_userptr->out_of_list) > > list_del_init(&g2d_userptr->list); > > @@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct > > g2d_data *g2d, > > offset = userptr & ~PAGE_MASK; > > end = PAGE_ALIGN(userptr + size); > > npages = (end - start) >> PAGE_SHIFT; > > - g2d_userptr->vec = frame_vector_create(npages); > > - if (!g2d_userptr->vec) { > > + g2d_userptr->pages = kvmalloc_array(npages, > > sizeof(*g2d_userptr->pages), > > + GFP_KERNEL); > > + if (!g2d_userptr->pages) { > > ret = -ENOMEM; > > goto err_free; > > } > > > > - ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE, > > - g2d_userptr->vec); > > + ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, > > + g2d_userptr->pages); > > if (ret != npages) { > > DRM_DEV_ERROR(g2d->dev, > > "failed to get user pages from userptr.\n"); > > if (ret < 0) > > - goto err_destroy_framevec; > > - ret = -EFAULT; > > - goto err_put_framevec; > > - } > > - if (frame_vector_to_pages(g2d_userptr->
Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM
On Wed, Oct 7, 2020 at 11:13 PM John Hubbard wrote: > > On 10/7/20 9:44 AM, Daniel Vetter wrote: > > This is used by media/videbuf2 for persistent dma mappings, not just > > for a single dma operation and then freed again, so needs > > FOLL_LONGTERM. > > > > Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to > > locking issues. Rework the code to pull the pup path out from the > > mmap_sem critical section as suggested by Jason. > > > > Signed-off-by: Daniel Vetter > > Cc: Jason Gunthorpe > > Cc: Pawel Osciak > > Cc: Marek Szyprowski > > Cc: Kyungmin Park > > Cc: Tomasz Figa > > Cc: Mauro Carvalho Chehab > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > --- > > mm/frame_vector.c | 36 +++- > > 1 file changed, 11 insertions(+), 25 deletions(-) > > > > diff --git a/mm/frame_vector.c b/mm/frame_vector.c > > index 10f82d5643b6..39db520a51dc 100644 > > --- a/mm/frame_vector.c > > +++ b/mm/frame_vector.c > > @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int > > nr_frames, > > struct vm_area_struct *vma; > > int ret = 0; > > int err; > > - int locked; > > > > if (nr_frames == 0) > > return 0; > > @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int > > nr_frames, > > > > start = untagged_addr(start); > > > > + ret = pin_user_pages_fast(start, nr_frames, > > + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, > > + (struct page **)(vec->ptrs)); > > + if (ret > 0) { > > + vec->got_ref = true; > > + vec->is_pfns = false; > > + goto out_unlocked; > > + } > > This part looks good, and changing to _fast is a potential performance > improvement, > too. 
> > > + > > mmap_read_lock(mm); > > - locked = 1; > > vma = find_vma_intersection(mm, start, start + 1); > > if (!vma) { > > ret = -EFAULT; > > goto out; > > } > > > > - /* > > - * While get_vaddr_frames() could be used for transient (kernel > > - * controlled lifetime) pinning of memory pages all current > > - * users establish long term (userspace controlled lifetime) > > - * page pinning. Treat get_vaddr_frames() like > > - * get_user_pages_longterm() and disallow it for filesystem-dax > > - * mappings. > > - */ > > - if (vma_is_fsdax(vma)) { > > - ret = -EOPNOTSUPP; > > - goto out; > > - } > > Are you sure we don't need to check vma_is_fsdax() anymore? Since FOLL_LONGTERM checks for this and can only return struct page backed memory, and explicitly excludes VM_IO | VM_PFNMAP, was assuming this is not needed for follow_pfn. And the get_user_pages_locked this used back then didn't have the same check, hence why it was added (and FOLL_LONGTERM still doesn't work for the _locked versions, as you pointed out on the last round of this discussion). But now that you're asking, I have no idea whether fsdax vma can also be of VM_IO | VM_PFNMAP type. I'm not seeing that set anywhere in fs/dax.c, but that says nothing :-) Dan, you added this check originally, do we need it for VM_SPECIAL vmas too? 
Thanks, Daniel > > > - > > - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) { > > - vec->got_ref = true; > > - vec->is_pfns = false; > > - ret = pin_user_pages_locked(start, nr_frames, > > - gup_flags, (struct page **)(vec->ptrs), &locked); > > - goto out; > > - } > > - > > vec->got_ref = false; > > vec->is_pfns = true; > > do { > > @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int > > nr_frames, > > vma = find_vma_intersection(mm, start, start + 1); > > } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); > > out: > > - if (locked) > > - mmap_read_unlock(mm); > > + mmap_read_unlock(mm); > > +out_unlocked: > > if (!ret) > > ret = -EFAULT; > > if (ret > 0) > > > > All of the error handling still looks accurate there. > > thanks, > -- > John Hubbard > NVIDIA -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
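The reworked control flow in get_vaddr_frames() reads roughly as "try the fast pin without the mm lock, and only fall back to the locked pfn walk if that pins nothing". Below is a userspace sketch of that shape only. The sketch_* types and the two mock helpers are assumptions made for illustration; the real function deals with vmas, gup flags and partial results that are not modeled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* What kind of memory the mock user address range is backed by. */
enum sketch_backing {
	SKETCH_STRUCT_PAGES,   /* ordinary memory with struct page */
	SKETCH_PFNMAP,         /* VM_IO | VM_PFNMAP style mapping */
	SKETCH_UNMAPPED,       /* no vma at all */
};

struct sketch_frame_vec {
	bool got_ref;
	bool is_pfns;
	int nr_frames;
};

/* Mock fast pin: succeeds only on struct page backed memory. */
static int sketch_pin_fast(enum sketch_backing backing, int nr)
{
	return backing == SKETCH_STRUCT_PAGES ? nr : -EFAULT;
}

/* Mock pfn walk: finds frames only in pfnmap style mappings. */
static int sketch_walk_pfns(enum sketch_backing backing, int nr)
{
	return backing == SKETCH_PFNMAP ? nr : 0;
}

int sketch_get_frames(enum sketch_backing backing, int nr,
		      struct sketch_frame_vec *vec)
{
	int ret;

	assert(vec);
	if (nr == 0)
		return 0;

	/* Fast path first, without holding the mm lock. */
	ret = sketch_pin_fast(backing, nr);
	if (ret > 0) {
		vec->got_ref = true;
		vec->is_pfns = false;
		goto out;
	}

	/* Fallback: the real code takes mmap_read_lock() around this walk. */
	vec->got_ref = false;
	vec->is_pfns = true;
	ret = sketch_walk_pfns(backing, nr);
out:
	if (!ret)
		ret = -EFAULT;
	if (ret > 0)
		vec->nr_frames = ret;
	return ret;
}
```

The open question in the thread, whether an fsdax vma can reach the fallback path, corresponds to asking whether any backing kind other than SKETCH_PFNMAP can get past the first branch here.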
Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM
On 10/7/20 9:44 AM, Daniel Vetter wrote: This is used by media/videbuf2 for persistent dma mappings, not just for a single dma operation and then freed again, so needs FOLL_LONGTERM. Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to locking issues. Rework the code to pull the pup path out from the mmap_sem critical section as suggested by Jason. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Figa Cc: Mauro Carvalho Chehab Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org --- mm/frame_vector.c | 36 +++- 1 file changed, 11 insertions(+), 25 deletions(-) diff --git a/mm/frame_vector.c b/mm/frame_vector.c index 10f82d5643b6..39db520a51dc 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, struct vm_area_struct *vma; int ret = 0; int err; - int locked; if (nr_frames == 0) return 0; @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, start = untagged_addr(start); + ret = pin_user_pages_fast(start, nr_frames, + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, + (struct page **)(vec->ptrs)); + if (ret > 0) { + vec->got_ref = true; + vec->is_pfns = false; + goto out_unlocked; + } This part looks good, and changing to _fast is a potential performance improvement, too. + mmap_read_lock(mm); - locked = 1; vma = find_vma_intersection(mm, start, start + 1); if (!vma) { ret = -EFAULT; goto out; } - /* -* While get_vaddr_frames() could be used for transient (kernel -* controlled lifetime) pinning of memory pages all current -* users establish long term (userspace controlled lifetime) -* page pinning. Treat get_vaddr_frames() like -* get_user_pages_longterm() and disallow it for filesystem-dax -* mappings. 
-*/ - if (vma_is_fsdax(vma)) { - ret = -EOPNOTSUPP; - goto out; - } Are you sure we don't need to check vma_is_fsdax() anymore? - - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) { - vec->got_ref = true; - vec->is_pfns = false; - ret = pin_user_pages_locked(start, nr_frames, - gup_flags, (struct page **)(vec->ptrs), &locked); - goto out; - } - vec->got_ref = false; vec->is_pfns = true; do { @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, vma = find_vma_intersection(mm, start, start + 1); } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); out: - if (locked) - mmap_read_unlock(mm); + mmap_read_unlock(mm); +out_unlocked: if (!ret) ret = -EFAULT; if (ret > 0) All of the error handling still looks accurate there. thanks, -- John Hubbard NVIDIA ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 04/13] misc/habana: Use FOLL_LONGTERM for userptr
On 10/7/20 9:44 AM, Daniel Vetter wrote: These are persistent, not just for the duration of a dma operation. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Oded Gabbay Cc: Omer Shpigelman Cc: Ofir Bitton Cc: Tomer Tayar Cc: Moti Haimovski Cc: Daniel Vetter Cc: Greg Kroah-Hartman Cc: Pawel Piskorski --- drivers/misc/habanalabs/common/memory.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c index ef89cfa2f95a..94bef8faa82a 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -1288,7 +1288,8 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size, return -ENOMEM; } - rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, + rc = pin_user_pages_fast(start, npages, +FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, userptr->pages); if (rc != npages) { Again, from a pin_user_pages_fast() point of view, and not being at all familiar with the habana driver (but their use of this really does seem clearly _LONGTERM!): Reviewed-by: John Hubbard thanks, -- John Hubbard NVIDIA ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 02/13] drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
On 10/7/20 9:44 AM, Daniel Vetter wrote: The exynos g2d interface is very unusual, but it looks like the userptr objects are persistent. Hence they need FOLL_LONGTERM. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Inki Dae Cc: Joonyoung Shim Cc: Seung-Woo Kim Cc: Kyungmin Park Cc: Kukjin Kim Cc: Krzysztof Kozlowski Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org --- drivers/gpu/drm/exynos/exynos_drm_g2d.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c index c83f6faac9de..514fd000feb1 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c @@ -478,7 +478,8 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d, goto err_free; } - ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, + ret = pin_user_pages_fast(start, npages, + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, g2d_userptr->pages); if (ret != npages) { DRM_DEV_ERROR(g2d->dev, Looks good from a pin_user_pages_fast() point of view. I'm of course not a exynos developer, so we still need a look from one of those, ideally, but: Reviewed-by: John Hubbard thanks, -- John Hubbard NVIDIA ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 03/13] misc/habana: Stop using frame_vector helpers
On 10/7/20 9:44 AM, Daniel Vetter wrote: ... > @@ -1414,15 +1410,10 @@ void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr) > userptr->sgt->nents, userptr->dir); > > - pages = frame_vector_pages(userptr->vec); > - if (!IS_ERR(pages)) { > - int i; > - > - for (i = 0; i < frame_vector_count(userptr->vec); i++) > - set_page_dirty_lock(pages[i]); > - } > - put_vaddr_frames(userptr->vec); > - frame_vector_destroy(userptr->vec); > + for (i = 0; i < userptr->npages; i++) > + set_page_dirty_lock(userptr->pages[i]); > + unpin_user_pages(userptr->pages, userptr->npages); > + kvfree(userptr->pages); Same thing here as in patch 1: you can further simplify by using unpin_user_pages_dirty_lock(). > list_del(&userptr->job_node); thanks, -- John Hubbard NVIDIA
Re: [PATCH 01/13] drm/exynos: Stop using frame_vector helpers
On 10/7/20 9:44 AM, Daniel Vetter wrote: All we need are a pages array, pin_user_pages_fast can give us that directly. Plus this avoids the entire raw pfn side of get_vaddr_frames. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Inki Dae Cc: Joonyoung Shim Cc: Seung-Woo Kim Cc: Kyungmin Park Cc: Kukjin Kim Cc: Krzysztof Kozlowski Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org --- drivers/gpu/drm/exynos/Kconfig | 1 - drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 - 2 files changed, 22 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig index 6417f374b923..43257ef3c09d 100644 --- a/drivers/gpu/drm/exynos/Kconfig +++ b/drivers/gpu/drm/exynos/Kconfig @@ -88,7 +88,6 @@ comment "Sub-drivers" config DRM_EXYNOS_G2D bool "G2D" depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST - select FRAME_VECTOR help Choose this option if you want to use Exynos G2D for DRM. diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c index 967a5cdc120e..c83f6faac9de 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c @@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr { dma_addr_t dma_addr; unsigned long userptr; unsigned long size; - struct frame_vector *vec; + struct page **pages; + unsigned intnpages; struct sg_table *sgt; atomic_trefcount; boolin_pool; @@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d, bool force) { struct g2d_cmdlist_userptr *g2d_userptr = obj; - struct page **pages; + int i; The above line can also be deleted, see below. 
if (!obj) return; @@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d, dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt, DMA_BIDIRECTIONAL, 0); - pages = frame_vector_pages(g2d_userptr->vec); - if (!IS_ERR(pages)) { - int i; + for (i = 0; i < g2d_userptr->npages; i++) + set_page_dirty_lock(g2d_userptr->pages[i]); - for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++) - set_page_dirty_lock(pages[i]); - } - put_vaddr_frames(g2d_userptr->vec); - frame_vector_destroy(g2d_userptr->vec); + unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages); + kvfree(g2d_userptr->pages); You can avoid writing your own loop, and just simplify the whole thing down to two lines: unpin_user_pages_dirty_lock(g2d_userptr->pages, g2d_userptr->npages, true); kvfree(g2d_userptr->pages); if (!g2d_userptr->out_of_list) list_del_init(&g2d_userptr->list); @@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d, offset = userptr & ~PAGE_MASK; end = PAGE_ALIGN(userptr + size); npages = (end - start) >> PAGE_SHIFT; - g2d_userptr->vec = frame_vector_create(npages); - if (!g2d_userptr->vec) { + g2d_userptr->pages = kvmalloc_array(npages, sizeof(*g2d_userptr->pages), + GFP_KERNEL); + if (!g2d_userptr->pages) { ret = -ENOMEM; goto err_free; } - ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE, - g2d_userptr->vec); + ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, + g2d_userptr->pages); if (ret != npages) { DRM_DEV_ERROR(g2d->dev, "failed to get user pages from userptr.\n"); if (ret < 0) - goto err_destroy_framevec; - ret = -EFAULT; - goto err_put_framevec; - } - if (frame_vector_to_pages(g2d_userptr->vec) < 0) { + goto err_destroy_pages; + npages = ret; ret = -EFAULT; - goto err_put_framevec; + goto err_unpin_pages; } + g2d_userptr->npages = npages; sgt = kzalloc(sizeof(*sgt), GFP_KERNEL); if (!sgt) { ret = -ENOMEM; - goto err_put_framevec; + goto err_unpin_pages; } ret = 
sg_alloc_table_from_pages(sgt, - frame_vector_pages(g2d_userptr->vec), + g2d_userptr->pages,
Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 9:33 PM Dan Williams wrote: > > On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter wrote: > > > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > > the region") /dev/kmem zaps ptes when the kernel requests exclusive > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > > the default for all driver uses. > > > > Except there's two more ways to access pci bars: sysfs and proc mmap > > support. Let's plug that hole. > > Ooh, yes, lets. > > > For revoke_devmem() to work we need to link our vma into the same > > address_space, with consistent vma->vm_pgoff. ->pgoff is already > > adjusted, because that's how (io_)remap_pfn_range works, but for the > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > > at ->open time, but that's a bit tricky here with all the entry points > > and arch code. So instead create a fake file and adjust vma->vm_file. > > I don't think you want to share the devmem inode for this, this should > be based off the sysfs inode which I believe there is already only one > instance per resource. In contrast /dev/mem can have multiple inodes > because anyone can just mknod a new character device file, the same > problem does not exist for sysfs. But then I need to find the right one, plus I also need to find the right one for the procfs side. That gets messy, and I already have no idea how to really test this. Shared address_space is the same trick we're using in drm (where we have multiple things all pointing to the same underlying resources, through different files), and it gets the job done. So that's why I figured the shared address_space is the cleaner solution since then unmap_mapping_range takes care of iterating over all vma for us. I guess I could reimplement that logic with our own locking and everything in revoke_devmem, but feels a bit silly. 
But it would also solve the problem of having multiple different mknod of /dev/mem with different address_space behind them. Also because of how remap_pfn_range works, all these vma do use the same pgoff already anyway. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
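The shared address_space argument can be pictured with a userspace mock: several files point at one mapping, every mmap links its vma into that mapping, and a single revoke walk then reaches all of them regardless of which file created them. All sketch_* names are hypothetical, and the mock zaps whole vmas where the real unmap_mapping_range() zaps page table entries within a range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_vma {
	unsigned long start;      /* first offset covered, inclusive */
	unsigned long end;        /* end of the covered range, exclusive */
	bool mapped;
	struct sketch_vma *next;
};

/* Mock address_space: one list of every vma mapped through it. */
struct sketch_mapping {
	struct sketch_vma *vmas;
};

/* Mock file: several files may point at the same mapping. */
struct sketch_file {
	struct sketch_mapping *f_mapping;
};

static void sketch_mmap(struct sketch_file *file, struct sketch_vma *vma)
{
	assert(file->f_mapping);
	vma->mapped = true;
	vma->next = file->f_mapping->vmas;
	file->f_mapping->vmas = vma;
}

/* Mock revoke: one walk over the shared mapping reaches every vma, no
 * matter which file it was created through. Returns the number zapped. */
static int sketch_revoke(struct sketch_mapping *mapping,
			 unsigned long start, unsigned long end)
{
	struct sketch_vma *vma;
	int zapped = 0;

	for (vma = mapping->vmas; vma; vma = vma->next) {
		if (vma->mapped && vma->start < end && start < vma->end) {
			vma->mapped = false;
			zapped++;
		}
	}
	return zapped;
}
```

With per-file mappings instead, the revoke side would have to find and walk each mapping separately, which is the extra bookkeeping the shared approach avoids.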
Re: [PATCH 11/13] mm: add unsafe_follow_pfn
On Wed, Oct 7, 2020 at 9:00 PM Jason Gunthorpe wrote: > > On Wed, Oct 07, 2020 at 08:10:34PM +0200, Daniel Vetter wrote: > > On Wed, Oct 7, 2020 at 7:36 PM Jason Gunthorpe wrote: > > > > > > On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote: > > > > Way back it was a reasonable assumptions that iomem mappings never > > > > change the pfn range they point at. But this has changed: > > > > > > > > - gpu drivers dynamically manage their memory nowadays, invalidating > > > > ptes with unmap_mapping_range when buffers get moved > > > > > > > > - contiguous dma allocations have moved from dedicated carvetouts to > > > > cma regions. This means if we miss the unmap the pfn might contain > > > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE) > > > > > > > > - even /dev/mem now invalidates mappings when the kernel requests that > > > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87 > > > > ("/dev/mem: Revoke mappings when a driver claims the region") > > > > > > > > Accessing pfns obtained from ptes without holding all the locks is > > > > therefore no longer a good idea. > > > > > > > > Unfortunately there's some users where this is not fixable (like v4l > > > > userptr of iomem mappings) or involves a pile of work (vfio type1 > > > > iommu). For now annotate these as unsafe and splat appropriately. > > > > > > > > This patch adds an unsafe_follow_pfn, which later patches will then > > > > roll out to all appropriate places. 
> > > > > > > > Signed-off-by: Daniel Vetter > > > > Cc: Jason Gunthorpe > > > > Cc: Kees Cook > > > > Cc: Dan Williams > > > > Cc: Andrew Morton > > > > Cc: John Hubbard > > > > Cc: Jérôme Glisse > > > > Cc: Jan Kara > > > > Cc: Dan Williams > > > > Cc: linux...@kvack.org > > > > Cc: linux-arm-ker...@lists.infradead.org > > > > Cc: linux-samsung-...@vger.kernel.org > > > > Cc: linux-me...@vger.kernel.org > > > > Cc: k...@vger.kernel.org > > > > include/linux/mm.h | 2 ++ > > > > mm/memory.c| 32 +++- > > > > mm/nommu.c | 17 + > > > > security/Kconfig | 13 + > > > > 4 files changed, 63 insertions(+), 1 deletion(-) > > > > > > Makes sense to me. > > > > > > I wonder if we could change the original follow_pfn to require the > > > ptep and then lockdep_assert_held() it against the page table lock? > > > > The safe variant with the pagetable lock is follow_pte_pmd. The only > > way to make follow_pfn safe is if you have an mmu notifier and > > corresponding retry logic. That is not covered by lockdep (it would > > splat if we annotate the retry side), so I'm not sure how you'd check > > for that? > > Right OK. > > > Checking for ptep lock doesn't work here, since the one leftover safe > > user of this (kvm) doesn't need that at all, because it has the mmu > > notifier. > > Ah, so a better name and/or function kdoc for follow_pfn is probably a > good iead in this patch as well. I did change that already to mention that you need an mmu notifier, and that follow_pte_pmd respectively unsafe_follow_pfn are the alternatives. Do you want more or something else here? Note that I left the kerneldoc for the nommu.c case unchanged, since without an mmu all bets are off anyway. > > So I think we're as good as it gets, since I really have no idea how > > to make sure follow_pfn callers do have an mmu notifier registered. > > Yah, can't be done. Most mmu notifier users should be using > hmm_range_fault anyhow, kvm is really very special here. 
We could pass an mmu notifier to follow_pfn and check that it has a registration for vma->vm_mm, but that feels like overkill when kvm is the only legit user for this. > > I've followed the few other CONFIG_STRICT_FOO I've seen, which are all > > explicit enables and default to "do not break uapi, damn the > > (security) bugs". Which is I think how this should be done. It is in > > the security section though, so hopefully competent distros will > > enable this all. > > I thought the strict ones were more general and less clear security > worries, not bugs like this. > > This is "allow a user triggerable use after free bug to exist in the > kernel" Since at most you get at GFP_MOVEABLE stuff I'm not sure you can use this to pull the kernel over the table. Maybe best way is if you get a gpu pagetable somehow into your pfn and then use that to access arbitrary stuff, but there's still an iommu. I think leveraging this is going to be very tricky, and pretty much has to be device or driver specific somehow. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
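The "mmu notifier plus retry logic" requirement mentioned in this exchange follows a common pattern: sample an invalidation counter, look up the pfn, and only trust the result if the counter has not moved. The sketch below shows that pattern in single-threaded userspace C with an injected "racing" invalidation. It is not the kvm implementation; the sketch_* names and the pending_races knob are invented for the example, and real code also needs locking and memory ordering that are omitted here.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_mm {
	unsigned long pfn;             /* what the mock pte currently points at */
	unsigned long invalidate_seq;  /* bumped on every invalidation */
	int pending_races;             /* test knob: invalidations to inject */
};

/* Stands in for the notifier callback that runs when a mapping changes. */
static void sketch_invalidate(struct sketch_mm *mm, unsigned long new_pfn)
{
	mm->pfn = new_pfn;
	mm->invalidate_seq++;
}

/* Look up the pfn and retry until no invalidation raced with the lookup.
 * The number of retries taken is reported through *retries. */
static unsigned long sketch_lookup_pfn(struct sketch_mm *mm, int *retries)
{
	unsigned long seq, pfn;

	assert(retries);
	*retries = 0;
	for (;;) {
		seq = mm->invalidate_seq;
		pfn = mm->pfn;                 /* stands in for follow_pfn() */

		if (mm->pending_races > 0) {   /* simulate a racing invalidation */
			mm->pending_races--;
			sketch_invalidate(mm, pfn + 1);
		}

		if (seq == mm->invalidate_seq)
			return pfn;            /* nothing changed under us */
		(*retries)++;
	}
}
```

A caller without such a counter has no way to notice that the pfn it read went stale, which is the situation the unsafe_ prefix is meant to flag.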
Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 11:11 AM Daniel Vetter wrote: > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > the region") /dev/kmem zaps ptes when the kernel requests exclusive > access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > the default for all driver uses. > > Except there's two more ways to access pci bars: sysfs and proc mmap > support. Let's plug that hole. Ooh, yes, let's. > > For revoke_devmem() to work we need to link our vma into the same > address_space, with consistent vma->vm_pgoff. ->pgoff is already > adjusted, because that's how (io_)remap_pfn_range works, but for the > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > at ->open time, but that's a bit tricky here with all the entry points > and arch code. So instead create a fake file and adjust vma->vm_file. I don't think you want to share the devmem inode for this. This should be based off the sysfs inode, of which I believe there is already only one instance per resource. In contrast, /dev/mem can have multiple inodes because anyone can just mknod a new character device file; the same problem does not exist for sysfs.
Re: [PATCH 10/13] PCI: revoke mappings like devmem
On Wed, Oct 7, 2020 at 8:41 PM Bjorn Helgaas wrote: > > Capitalize subject, like other patches in this series and previous > drivers/pci history. > > On Wed, Oct 07, 2020 at 06:44:23PM +0200, Daniel Vetter wrote: > > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > > the region") /dev/kmem zaps ptes when the kernel requests exclusive > > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > > the default for all driver uses. > > > > Except there's two more ways to access pci bars: sysfs and proc mmap > > support. Let's plug that hole. > > s/pci/PCI/ in commit logs and comments. > > > For revoke_devmem() to work we need to link our vma into the same > > address_space, with consistent vma->vm_pgoff. ->pgoff is already > > adjusted, because that's how (io_)remap_pfn_range works, but for the > > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > > at ->open time, but that's a bit tricky here with all the entry points > > and arch code. So instead create a fake file and adjust vma->vm_file. > > > > Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that > > seems to be a subset of architectures support STRICT_DEVMEM, so we > > should be good. > > > > The only difference in access checks left is that sysfs pci mmap does > > not check for CAP_RAWIO. But I think that makes some sense compared to > > /dev/mem and proc, where one file gives you access to everything and > > no ownership applies. 
> >
> > --- a/drivers/char/mem.c
> > +++ b/drivers/char/mem.c
> > @@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t
> > offset, int orig)
> >  }
> >
> >  static struct inode *devmem_inode;
> > +static struct vfsmount *devmem_vfs_mount;
> >
> >  #ifdef CONFIG_IO_STRICT_DEVMEM
> >  void revoke_devmem(struct resource *res)
> > @@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res)
> >
> >  	unmap_mapping_range(inode->i_mapping, res->start, resource_size(res),
> > 1);
> >  }
> > +
> > +struct file *devmem_getfile(void)
> > +{
> > +	struct file *file;
> > +
> > +	file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem",
> > +				 O_RDWR, &kmem_fops);
> > +	if (IS_ERR(file))
> > +		return NULL;
> > +
> > +	file->f_mapping = devmem_indoe->i_mapping;
>
> "devmem_indoe"? Obviously not compiled, I guess?

Yeah, apologies, I forgot to compile this with CONFIG_IO_STRICT_DEVMEM
set. The entire series is more of an RFC about the overall problem
really, and I also need to figure out how to even test this somehow. I
guess there's nothing really ready-made here?
-Daniel

> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct
> > resource *base,
> >
> >  #ifdef CONFIG_IO_STRICT_DEVMEM
> >  void revoke_devmem(struct resource *res);
> > +struct file *devm_getfile(void);
> >  #else
> >  static inline void revoke_devmem(struct resource *res) { };
> > +static inline struct file *devmem_getfile(void) { return NULL; };
>
> I guess these names are supposed to match?
>
> >  #endif
> >
> >  #endif /* __ASSEMBLY__ */
> > --
> > 2.28.0
> >

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH 09/13] PCI: obey iomem restrictions for procfs mmap
On Wed, Oct 07, 2020 at 06:44:22PM +0200, Daniel Vetter wrote: > There's three ways to access pci bars from userspace: /dev/mem, sysfs > files, and the old proc interface. Two check against > iomem_is_exclusive, proc never did. And with CONFIG_IO_STRICT_DEVMEM, > this starts to matter, since we don't want random userspace having > access to pci bars while a driver is loaded and using it. > > Fix this. Please mention *how* you're fixing this. I know you can sort of deduce it from the first paragraph, but it's easy to save readers the trouble. s/pci/PCI/ s/bars/BARs/ Capitalize subject to match other patches. > References: 90a545e98126 ("restrict /dev/mem to idle io memory ranges") > Signed-off-by: Daniel Vetter > Cc: Jason Gunthorpe > Cc: Kees Cook > Cc: Dan Williams > Cc: Andrew Morton > Cc: John Hubbard > Cc: Jérôme Glisse > Cc: Jan Kara > Cc: Dan Williams > Cc: linux...@kvack.org > Cc: linux-arm-ker...@lists.infradead.org > Cc: linux-samsung-...@vger.kernel.org > Cc: linux-me...@vger.kernel.org > Cc: Bjorn Helgaas > Cc: linux-...@vger.kernel.org > --- > drivers/pci/proc.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c > index d35186b01d98..3a2f90beb4cb 100644 > --- a/drivers/pci/proc.c > +++ b/drivers/pci/proc.c > @@ -274,6 +274,11 @@ static int proc_bus_pci_mmap(struct file *file, struct > vm_area_struct *vma) > else > return -EINVAL; > } > + > + if (dev->resource[i].flags & IORESOURCE_MEM && > + iomem_is_exclusive(dev->resource[i].start)) > + return -EINVAL; > + > ret = pci_mmap_page_range(dev, i, vma, > fpriv->mmap_state, write_combine); > if (ret < 0) > -- > 2.28.0 > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 10/13] PCI: revoke mappings like devmem
Capitalize subject, like other patches in this series and previous drivers/pci history. On Wed, Oct 07, 2020 at 06:44:23PM +0200, Daniel Vetter wrote: > Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims > the region") /dev/kmem zaps ptes when the kernel requests exclusive > acccess to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is > the default for all driver uses. > > Except there's two more ways to access pci bars: sysfs and proc mmap > support. Let's plug that hole. s/pci/PCI/ in commit logs and comments. > For revoke_devmem() to work we need to link our vma into the same > address_space, with consistent vma->vm_pgoff. ->pgoff is already > adjusted, because that's how (io_)remap_pfn_range works, but for the > mapping we need to adjust vma->vm_file->f_mapping. Usually that's done > at ->open time, but that's a bit tricky here with all the entry points > and arch code. So instead create a fake file and adjust vma->vm_file. > > Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that > seems to be a subset of architectures support STRICT_DEVMEM, so we > should be good. > > The only difference in access checks left is that sysfs pci mmap does > not check for CAP_RAWIO. But I think that makes some sense compared to > /dev/mem and proc, where one file gives you access to everything and > no ownership applies. 
> --- a/drivers/char/mem.c > +++ b/drivers/char/mem.c > @@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t > offset, int orig) > } > > static struct inode *devmem_inode; > +static struct vfsmount *devmem_vfs_mount; > > #ifdef CONFIG_IO_STRICT_DEVMEM > void revoke_devmem(struct resource *res) > @@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res) > > unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), > 1); > } > + > +struct file *devmem_getfile(void) > +{ > + struct file *file; > + > + file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem", > + O_RDWR, &kmem_fops); > + if (IS_ERR(file)) > + return NULL; > + > + file->f_mapping = devmem_indoe->i_mapping; "devmem_indoe"? Obviously not compiled, I guess? > --- a/include/linux/ioport.h > +++ b/include/linux/ioport.h > @@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct resource > *base, > > #ifdef CONFIG_IO_STRICT_DEVMEM > void revoke_devmem(struct resource *res); > +struct file *devm_getfile(void); > #else > static inline void revoke_devmem(struct resource *res) { }; > +static inline struct file *devmem_getfile(void) { return NULL; }; I guess these names are supposed to match? > #endif > > #endif /* __ASSEMBLY__ */ > -- > 2.28.0 > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 13/13] vfio/type1: Mark follow_pfn as unsafe
On Wed, Oct 7, 2020 at 7:39 PM Jason Gunthorpe wrote: > > On Wed, Oct 07, 2020 at 06:44:26PM +0200, Daniel Vetter wrote: > > The code seems to stuff these pfns into iommu pts (or something like > > that, I didn't follow), but there's no mmu_notifier to ensure that > > access is synchronized with pte updates. > > > > Hence mark these as unsafe. This means that with > > CONFIG_STRICT_FOLLOW_PFN, these will be rejected. > > > > Real fix is to wire up an mmu_notifier ... somehow. Probably means any > > invalidate is a fatal fault for this vfio device, but then this > > shouldn't ever happen if userspace is reasonable. > > > > Signed-off-by: Daniel Vetter > > Cc: Jason Gunthorpe > > Cc: Kees Cook > > Cc: Dan Williams > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > Cc: Alex Williamson > > Cc: Cornelia Huck > > Cc: k...@vger.kernel.org > > --- > > drivers/vfio/vfio_iommu_type1.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c > > b/drivers/vfio/vfio_iommu_type1.c > > index 5fbf0c1f7433..a4d53f3d0a35 100644 > > --- a/drivers/vfio/vfio_iommu_type1.c > > +++ b/drivers/vfio/vfio_iommu_type1.c > > @@ -421,7 +421,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, > > struct mm_struct *mm, > > { > > int ret; > > > > - ret = follow_pfn(vma, vaddr, pfn); > > + ret = unsafe_follow_pfn(vma, vaddr, pfn); > > if (ret) { > > bool unlocked = false; > > > > @@ -435,7 +435,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, > > struct mm_struct *mm, > > if (ret) > > return ret; > > > > - ret = follow_pfn(vma, vaddr, pfn); > > + ret = unsafe_follow_pfn(vma, vaddr, pfn); > > } > > This is actually being commonly used, so it needs fixing. 
>
> When I talked to Alex about this last we had worked out a patch series
> that adds a test on vm_ops that the vma came from vfio in the first
> place. The VMA's created by VFIO are 'safe' as the PTEs are never changed.

Hm, but wouldn't that need the semi-nasty vma_open trick to make sure
that the vma doesn't disappear prematurely? Or is the idea to look up
the underlying vfio object, and refcount that directly?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH 11/13] mm: add unsafe_follow_pfn
On Wed, Oct 7, 2020 at 7:36 PM Jason Gunthorpe wrote: > > On Wed, Oct 07, 2020 at 06:44:24PM +0200, Daniel Vetter wrote: > > Way back it was a reasonable assumptions that iomem mappings never > > change the pfn range they point at. But this has changed: > > > > - gpu drivers dynamically manage their memory nowadays, invalidating > > ptes with unmap_mapping_range when buffers get moved > > > > - contiguous dma allocations have moved from dedicated carvetouts to > > cma regions. This means if we miss the unmap the pfn might contain > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE) > > > > - even /dev/mem now invalidates mappings when the kernel requests that > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87 > > ("/dev/mem: Revoke mappings when a driver claims the region") > > > > Accessing pfns obtained from ptes without holding all the locks is > > therefore no longer a good idea. > > > > Unfortunately there's some users where this is not fixable (like v4l > > userptr of iomem mappings) or involves a pile of work (vfio type1 > > iommu). For now annotate these as unsafe and splat appropriately. > > > > This patch adds an unsafe_follow_pfn, which later patches will then > > roll out to all appropriate places. > > > > Signed-off-by: Daniel Vetter > > Cc: Jason Gunthorpe > > Cc: Kees Cook > > Cc: Dan Williams > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > Cc: k...@vger.kernel.org > > --- > > include/linux/mm.h | 2 ++ > > mm/memory.c| 32 +++- > > mm/nommu.c | 17 + > > security/Kconfig | 13 + > > 4 files changed, 63 insertions(+), 1 deletion(-) > > Makes sense to me. > > I wonder if we could change the original follow_pfn to require the > ptep and then lockdep_assert_held() it against the page table lock? 
The safe variant with the pagetable lock is follow_pte_pmd. The only way to make follow_pfn safe is if you have an mmu notifier and corresponding retry logic. That is not covered by lockdep (it would splat if we annotate the retry side), so I'm not sure how you'd check for that? Checking for ptep lock doesn't work here, since the one leftover safe user of this (kvm) doesn't need that at all, because it has the mmu notifier. Also follow_pte_pmd will splat with lockdep if you get it wrong, since the function leaves you with the right ptlock lock when it returns. If you forget to unlock that, lockdep will complain. So I think we're as good as it gets, since I really have no idea how to make sure follow_pfn callers do have an mmu notifier registered. > > +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, > > + unsigned long *pfn) > > +{ > > +#ifdef CONFIG_STRICT_FOLLOW_PFN > > + pr_info("unsafe follow_pfn usage rejected, see > > CONFIG_STRICT_FOLLOW_PFN\n"); > > Wonder if we can print something useful here, like the current > PID/process name? Yeah adding comm/pid here makes sense. > > diff --git a/security/Kconfig b/security/Kconfig > > index 7561f6f99f1d..48945402e103 100644 > > --- a/security/Kconfig > > +++ b/security/Kconfig > > @@ -230,6 +230,19 @@ config STATIC_USERMODEHELPER_PATH > > If you wish for all usermode helper programs to be disabled, > > specify an empty string here (i.e. ""). > > > > +config STRICT_FOLLOW_PFN > > + bool "Disable unsafe use of follow_pfn" > > + depends on MMU > > I would probably invert this CONFIG_ALLOW_UNSAFE_FOLLOW_PFN > default n I've followed the few other CONFIG_STRICT_FOO I've seen, which are all explicit enables and default to "do not break uapi, damn the (security) bugs". Which is I think how this should be done. It is in the security section though, so hopefully competent distros will enable this all. 
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [PATCH 07/13] mm: close race in generic_access_phys
On Wed, Oct 7, 2020 at 7:27 PM Jason Gunthorpe wrote: > > On Wed, Oct 07, 2020 at 06:44:20PM +0200, Daniel Vetter wrote: > > Way back it was a reasonable assumptions that iomem mappings never > > change the pfn range they point at. But this has changed: > > > > - gpu drivers dynamically manage their memory nowadays, invalidating > > ptes with unmap_mapping_range when buffers get moved > > > > - contiguous dma allocations have moved from dedicated carvetouts to > > cma regions. This means if we miss the unmap the pfn might contain > > pagecache or anon memory (well anything allocated with GFP_MOVEABLE) > > > > - even /dev/mem now invalidates mappings when the kernel requests that > > iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87 > > ("/dev/mem: Revoke mappings when a driver claims the region") > > > > Accessing pfns obtained from ptes without holding all the locks is > > therefore no longer a good idea. Fix this. > > > > Since ioremap might need to manipulate pagetables too we need to drop > > the pt lock and have a retry loop if we raced. > > > > While at it, also add kerneldoc and improve the comment for the > > vma_ops->access function. It's for accessing, not for moving the > > memory from iomem to system memory, as the old comment seemed to > > suggest. 
> > > > References: 28b2ee20c7cb ("access_process_vm device memory infrastructure") > > Cc: Jason Gunthorpe > > Cc: Dan Williams > > Cc: Kees Cook > > Cc: Rik van Riel > > Cc: Benjamin Herrensmidt > > Cc: Dave Airlie > > Cc: Hugh Dickins > > Cc: Andrew Morton > > Cc: John Hubbard > > Cc: Jérôme Glisse > > Cc: Jan Kara > > Cc: Dan Williams > > Cc: linux...@kvack.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-samsung-...@vger.kernel.org > > Cc: linux-me...@vger.kernel.org > > Signed-off-by: Daniel Vetter > > --- > > include/linux/mm.h | 3 ++- > > mm/memory.c| 44 ++-- > > 2 files changed, 44 insertions(+), 3 deletions(-) > > This does seem to solve the race with revoke_devmem(), but it is really ugly. > > It would be much nicer to wrap a rwsem around this access and the unmap. > > Any place using it has a nice linear translation from vm_off to pfn, > so I don't think there is a such a good reason to use follow_pte in > the first place. > > ie why not the helper be this: > > int generic_access_phys(unsigned long pfn, unsigned long pgprot, > void *buf, size_t len, bool write) > > Then something like dev/mem would compute pfn and obtain the lock: > > dev_access(struct vm_area_struct *vma, unsigned long addr, void *buf, int > len, int write) > { > cpu_addr = vma->vm_pgoff*PAGE_SIZE + (addr - vma->vm_start)); > > /* FIXME: Has to be over each page of len */ > if (!devmem_is_allowed_access(PHYS_PFN(cpu_addr/4096))) >return -EPERM; > > down_read(&mem_sem); > generic_access_phys(cpu_addr/4096, pgprot_val(vma->vm_page_prot), > buf, len, write); > up_read(&mem_sem); > } > > The other cases looked simpler because they don't revoke, here the > mmap_sem alone should be enough protection, they would just need to > provide the linear translation to pfn. > > What do you think? I think it'd fix the bug, until someone wires ->access up for drivers/gpu, or the next subsystem. 
This is also just for ptrace, so we really don't care if we stall the
vm badly or do other silly things. So I figured the somewhat ugly, but
fully generic solution is the better one, so that people who want to be
able to ptrace read/write their iomem mmaps can just sprinkle this
wherever they feel like.

But yeah, if we go with the most minimal fix, i.e. only trying to fix
the current users, then your thing should work and is simpler. But it
leaves the door open for future problems.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
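For readers following the thread, the "linear translation" that the
quoted dev_access() proposal relies on can be written out as a small
userspace sketch. This is only an illustration under stated assumptions:
sketch_vma, sketch_linear_pfn and SKETCH_PAGE_SHIFT are hypothetical
names, not kernel API, and 4 KiB pages are assumed to match the
cpu_addr/4096 in the quoted pseudocode.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the few vm_area_struct fields that the
 * translation needs. This is NOT the kernel structure.
 */
struct sketch_vma {
	uint64_t vm_start;	/* first user virtual address of the mapping */
	uint64_t vm_end;	/* one past the last user virtual address */
	uint64_t vm_pgoff;	/* offset of the mapping, counted in pages */
};

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Translate a user address inside the mapping to a page frame number,
 * following cpu_addr = vm_pgoff * PAGE_SIZE + (addr - vm_start).
 * Returns 0 on success, -1 if addr lies outside the mapping.
 */
int sketch_linear_pfn(const struct sketch_vma *vma, uint64_t addr,
		      uint64_t *pfn)
{
	uint64_t cpu_addr;

	if (addr < vma->vm_start || addr >= vma->vm_end)
		return -1;

	cpu_addr = (vma->vm_pgoff << SKETCH_PAGE_SHIFT) +
		   (addr - vma->vm_start);
	*pfn = cpu_addr >> SKETCH_PAGE_SHIFT;
	return 0;
}
```

The sketch only covers the address arithmetic. The locking question
debated above (rwsem versus pte lookup with a retry loop) is not
modelled here.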
[PATCH v3 09/20] gpu: host1x: DMA fences and userspace fence creation
Add an implementation of dma_fences based on syncpoints. Syncpoint interrupts are used to signal fences. Additionally, after software signaling has been enabled, a 30 second timeout is started. If the syncpoint threshold is not reached within this period, the fence is signalled with an -ETIMEDOUT error code. This is to allow fences that would never reach their syncpoint threshold to be cleaned up. Additionally, add a new /dev/host1x IOCTL for creating sync_file file descriptors backed by syncpoint fences. Signed-off-by: Mikko Perttunen --- v3: * Move declaration of host1x_fence_extract to public header --- drivers/gpu/host1x/Makefile | 1 + drivers/gpu/host1x/fence.c | 207 drivers/gpu/host1x/fence.h | 13 +++ drivers/gpu/host1x/intr.c | 9 ++ drivers/gpu/host1x/intr.h | 2 + drivers/gpu/host1x/uapi.c | 106 ++ include/linux/host1x.h | 4 + 7 files changed, 342 insertions(+) create mode 100644 drivers/gpu/host1x/fence.c create mode 100644 drivers/gpu/host1x/fence.h diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile index 882f928d75e1..a48af2cefae1 100644 --- a/drivers/gpu/host1x/Makefile +++ b/drivers/gpu/host1x/Makefile @@ -10,6 +10,7 @@ host1x-y = \ debug.o \ mipi.o \ uapi.o \ + fence.o \ hw/host1x01.o \ hw/host1x02.o \ hw/host1x04.o \ diff --git a/drivers/gpu/host1x/fence.c b/drivers/gpu/host1x/fence.c new file mode 100644 index ..400da6c1ab48 --- /dev/null +++ b/drivers/gpu/host1x/fence.c @@ -0,0 +1,207 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Syncpoint dma_fence implementation + * + * Copyright (c) 2020, NVIDIA Corporation. 
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include "intr.h"
+#include "syncpt.h"
+
+DEFINE_SPINLOCK(lock);
+
+struct host1x_syncpt_fence {
+	struct dma_fence base;
+
+	atomic_t signaling;
+
+	struct host1x_syncpt *sp;
+	u32 threshold;
+
+	struct host1x_waitlist *waiter;
+	void *waiter_ref;
+
+	struct delayed_work timeout_work;
+};
+
+static const char *syncpt_fence_get_driver_name(struct dma_fence *f)
+{
+	return "host1x";
+}
+
+static const char *syncpt_fence_get_timeline_name(struct dma_fence *f)
+{
+	return "syncpoint";
+}
+
+static bool syncpt_fence_enable_signaling(struct dma_fence *f)
+{
+	struct host1x_syncpt_fence *sf =
+		container_of(f, struct host1x_syncpt_fence, base);
+	int err;
+
+	if (host1x_syncpt_is_expired(sf->sp, sf->threshold))
+		return false;
+
+	dma_fence_get(f);
+
+	/*
+	 * The dma_fence framework requires the fence driver to keep a
+	 * reference to any fences for which 'enable_signaling' has been
+	 * called (and that have not been signalled).
+	 *
+	 * We provide a userspace API to create arbitrary syncpoint fences,
+	 * so we cannot normally guarantee that all fences get signalled.
+	 * As such, setup a timeout, so that long-lasting fences will get
+	 * reaped eventually.
+	 */
+	schedule_delayed_work(&sf->timeout_work, msecs_to_jiffies(30000));
+
+	err = host1x_intr_add_action(sf->sp->host, sf->sp, sf->threshold,
+				     HOST1X_INTR_ACTION_SIGNAL_FENCE, f,
+				     sf->waiter, &sf->waiter_ref);
+	if (err) {
+		cancel_delayed_work_sync(&sf->timeout_work);
+		dma_fence_put(f);
+		return false;
+	}
+
+	/* intr framework takes ownership of waiter */
+	sf->waiter = NULL;
+
+	/*
+	 * The fence may get signalled at any time after the above call,
+	 * so we need to initialize all state used by signalling
+	 * before it.
+*/ + + return true; +} + +static void syncpt_fence_release(struct dma_fence *f) +{ + struct host1x_syncpt_fence *sf = + container_of(f, struct host1x_syncpt_fence, base); + + if (sf->waiter) + kfree(sf->waiter); + + dma_fence_free(f); +} + +const struct dma_fence_ops syncpt_fence_ops = { + .get_driver_name = syncpt_fence_get_driver_name, + .get_timeline_name = syncpt_fence_get_timeline_name, + .enable_signaling = syncpt_fence_enable_signaling, + .release = syncpt_fence_release, +}; + +void host1x_fence_signal(struct host1x_syncpt_fence *f) +{ + if (atomic_xchg(&f->signaling, 1)) + return; + + /* +* Cancel pending timeout work - if it races, it will +* not get 'f->signaling' and return. +*/ + cancel_delayed_work_sync(&f->timeout_work); + + host1x_intr_put_ref(f->sp->host, f->sp->id, f->waiter_ref); + + dma_fence_signal(&f->base); + dma_fence_put(&f->base); +} + +static void do_fence_timeout(struct work_struct *work) +{ + struct delayed_work *dwork = (struct delayed_work
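The signal-versus-timeout race that the comment in host1x_fence_signal()
describes comes down to "whoever wins the atomic exchange on 'signaling'
completes the fence". A minimal userspace sketch of that pattern follows,
using C11 atomics in place of the kernel's atomic_xchg(). All names here
(sketch_fence, sketch_fence_signal, sketch_fence_timeout) are
hypothetical, the timeout path is an assumption because the patch text is
cut off above, and the boolean return values exist only so the behaviour
can be observed.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_fence {
	atomic_int signaling;	/* 0 until one completion path claims the fence */
	int status;		/* 0 = signalled normally, negative = error code */
	bool done;		/* set once the fence has been completed */
};

/* Returns true only for the first caller; every later caller loses. */
static bool sketch_fence_claim(struct sketch_fence *f)
{
	return atomic_exchange(&f->signaling, 1) == 0;
}

/* Normal completion path (the syncpoint reached its threshold). */
bool sketch_fence_signal(struct sketch_fence *f)
{
	if (!sketch_fence_claim(f))
		return false;	/* the timeout path got there first */

	f->status = 0;
	f->done = true;
	return true;
}

/* Timeout path: complete the fence with an error instead. */
bool sketch_fence_timeout(struct sketch_fence *f, int err)
{
	if (!sketch_fence_claim(f))
		return false;	/* the fence was already signalled */

	f->status = err;
	f->done = true;
	return true;
}
```

Because the exchange is atomic, exactly one of the two paths sets the
status, regardless of which thread runs first.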
[PATCH v3 07/20] gpu: host1x: Introduce UAPI header
Add the userspace interface header, specifying interfaces for allocating
and accessing syncpoints from userspace, and for creating sync_file
fences based on syncpoint thresholds.

Signed-off-by: Mikko Perttunen
---
 include/uapi/linux/host1x.h | 134
 1 file changed, 134 insertions(+)
 create mode 100644 include/uapi/linux/host1x.h

diff --git a/include/uapi/linux/host1x.h b/include/uapi/linux/host1x.h
new file mode 100644
index 000000000000..9c8fb9425cb2
--- /dev/null
+++ b/include/uapi/linux/host1x.h
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright (c) 2020 NVIDIA Corporation */
+
+#ifndef _UAPI__LINUX_HOST1X_H
+#define _UAPI__LINUX_HOST1X_H
+
+#include
+#include
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+struct host1x_allocate_syncpoint {
+	/**
+	 * @fd: [out]
+	 *
+	 * New file descriptor representing the allocated syncpoint.
+	 */
+	__s32 fd;
+
+	__u32 reserved[3];
+};
+
+struct host1x_syncpoint_info {
+	/**
+	 * @id: [out]
+	 *
+	 * System-global ID of the syncpoint.
+	 */
+	__u32 id;
+
+	__u32 reserved[3];
+};
+
+struct host1x_syncpoint_increment {
+	/**
+	 * @count: [in]
+	 *
+	 * Number of times to increment the syncpoint. The syncpoint can
+	 * be observed at in-between values, but each increment is atomic.
+	 */
+	__u32 count;
+};
+
+struct host1x_read_syncpoint {
+	/**
+	 * @id: [in]
+	 *
+	 * ID of the syncpoint to read.
+	 */
+	__u32 id;
+
+	/**
+	 * @value: [out]
+	 *
+	 * Current value of the syncpoint.
+	 */
+	__u32 value;
+};
+
+struct host1x_create_fence {
+	/**
+	 * @id: [in]
+	 *
+	 * ID of the syncpoint to create a fence for.
+	 */
+	__u32 id;
+
+	/**
+	 * @threshold: [in]
+	 *
+	 * When the syncpoint reaches this value, the fence will be signaled.
+	 * The syncpoint is considered to have reached the threshold when the
+	 * following condition is true:
+	 *
+	 *	((value - threshold) & 0x80000000U) == 0U
+	 *
+	 */
+	__u32 threshold;
+
+	/**
+	 * @fence_fd: [out]
+	 *
+	 * New sync_file file descriptor containing the created fence.
+	 */
+	__s32 fence_fd;
+
+	__u32 reserved[1];
+};
+
+struct host1x_fence_extract_fence {
+	__u32 id;
+	__u32 threshold;
+};
+
+struct host1x_fence_extract {
+	/**
+	 * @fence_fd: [in]
+	 *
+	 * sync_file file descriptor
+	 */
+	__s32 fence_fd;
+
+	/**
+	 * @num_fences: [in,out]
+	 *
+	 * In: size of the `fences_ptr` array counted in elements.
+	 * Out: required size of the `fences_ptr` array counted in elements.
+	 */
+	__u32 num_fences;
+
+	/**
+	 * @fences_ptr: [in]
+	 *
+	 * Pointer to array of `struct host1x_fence_extract_fence`.
+	 */
+	__u64 fences_ptr;
+
+	__u32 reserved[2];
+};
+
+#define HOST1X_IOCTL_ALLOCATE_SYNCPOINT  _IOWR('X', 0x00, struct host1x_allocate_syncpoint)
+#define HOST1X_IOCTL_READ_SYNCPOINT      _IOR ('X', 0x01, struct host1x_read_syncpoint)
+#define HOST1X_IOCTL_CREATE_FENCE        _IOWR('X', 0x02, struct host1x_create_fence)
+#define HOST1X_IOCTL_SYNCPOINT_INFO      _IOWR('X', 0x03, struct host1x_syncpoint_info)
+#define HOST1X_IOCTL_SYNCPOINT_INCREMENT _IOWR('X', 0x04, struct host1x_syncpoint_increment)
+#define HOST1X_IOCTL_FENCE_EXTRACT       _IOWR('X', 0x05, struct host1x_fence_extract)
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif
-- 
2.28.0
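The threshold condition documented for struct host1x_create_fence is a
wraparound-aware comparison on 32-bit counters. A minimal sketch of how
it behaves is below. It assumes the mask is meant to be the top bit of
the 32-bit difference (0x80000000U), and sketch_syncpt_reached is a
hypothetical helper name, not part of the proposed UAPI.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-aware "has the syncpoint reached the threshold" check.
 * The unsigned subtraction wraps modulo 2^32, so the top bit of the
 * difference is clear exactly when value is between 0 and 2^31 - 1
 * increments past threshold, even if the counter overflowed in between.
 */
bool sketch_syncpt_reached(uint32_t value, uint32_t threshold)
{
	return ((uint32_t)(value - threshold) & 0x80000000U) == 0U;
}
```

For example, a syncpoint value of 5 counts as having reached a threshold
of 0xFFFFFFF0, because the counter is taken to have wrapped past zero,
whereas a value of 9 has not reached a threshold of 10.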
[PATCH v3 11/20] gpu: host1x: Add job release callback
Add a callback field to the job structure, to be called just before
the job is freed. This allows the job's submitter to clean up any of
its own state, such as decrementing runtime PM refcounts.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/host1x/job.c | 3 +++
 include/linux/host1x.h   | 4 ++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index e4f16fc899b0..acf322beb56c 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -79,6 +79,9 @@ static void job_free(struct kref *ref)
 {
 	struct host1x_job *job = container_of(ref, struct host1x_job, ref);
 
+	if (job->release)
+		job->release(job);
+
 	if (job->waiter)
 		host1x_intr_put_ref(job->syncpt->host, job->syncpt->id,
 				    job->waiter);
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index fb62cc8b77dd..d7070fd65833 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -265,6 +265,10 @@ struct host1x_job {
 
 	/* Fast-forward syncpoint increments on job timeout */
 	bool syncpt_recovery;
+
+	/* Callback called when job is freed */
+	void (*release)(struct host1x_job *job);
+	void *user_data;
 };
 
 struct host1x_job *host1x_job_alloc(struct host1x_channel *ch,
-- 
2.28.0
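To illustrate how a submitter might use such a release hook, here is a
small userspace sketch of the same pattern: an optional callback invoked
just before the object is freed, with a user_data pointer for the
submitter's own state. Everything in it is hypothetical (sketch_job,
sketch_job_put, sketch_drop_pm_ref), and a plain int stands in for the
kernel's kref, so it is not safe for concurrent use.

```c
#include <stdlib.h>

struct sketch_job {
	int refcount;				/* plain counter, stand-in for kref */
	void (*release)(struct sketch_job *job);	/* optional, may be NULL */
	void *user_data;			/* owned by the submitter */
};

struct sketch_job *sketch_job_alloc(void (*release)(struct sketch_job *),
				    void *user_data)
{
	struct sketch_job *job = calloc(1, sizeof(*job));

	if (!job)
		return NULL;

	job->refcount = 1;
	job->release = release;
	job->user_data = user_data;
	return job;
}

/* Drop one reference; on the last one, run the callback, then free. */
void sketch_job_put(struct sketch_job *job)
{
	if (--job->refcount > 0)
		return;

	if (job->release)
		job->release(job);

	free(job);
}

/* Example submitter callback: drop a (hypothetical) runtime PM count. */
void sketch_drop_pm_ref(struct sketch_job *job)
{
	int *pm_refs = job->user_data;

	(*pm_refs)--;
}
```

Calling the hook before the free means the callback can still read the
job's fields, which is presumably why the patch invokes job->release at
the top of job_free().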
[PATCH v3 15/20] drm/tegra: Add new UAPI to header
Update the tegra_drm.h UAPI header, adding the new proposed UAPI. The old staging UAPI is left in for now, with minor modification to avoid name collisions. Signed-off-by: Mikko Perttunen --- v3: * Remove timeout field * Inline the syncpt_incrs array to the submit structure * Remove WRITE_RELOC (it is now implicit) --- include/uapi/drm/tegra_drm.h | 420 --- 1 file changed, 393 insertions(+), 27 deletions(-) diff --git a/include/uapi/drm/tegra_drm.h b/include/uapi/drm/tegra_drm.h index c4df3c3668b3..9588d5e3308f 100644 --- a/include/uapi/drm/tegra_drm.h +++ b/include/uapi/drm/tegra_drm.h @@ -1,24 +1,5 @@ -/* - * Copyright (c) 2012-2013, NVIDIA CORPORATION. All rights reserved. - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * the rights to use, copy, modify, merge, publish, distribute, sublicense, - * and/or sell copies of the Software, and to permit persons to whom the - * Software is furnished to do so, subject to the following conditions: - * - * The above copyright notice and this permission notice shall be included in - * all copies or substantial portions of the Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL - * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR - * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, - * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR - * OTHER DEALINGS IN THE SOFTWARE. - */ +/* SPDX-License-Identifier: MIT */ +/* Copyright (c) 2012-2020 NVIDIA Corporation */ #ifndef _UAPI_TEGRA_DRM_H_ #define _UAPI_TEGRA_DRM_H_ @@ -29,6 +10,8 @@ extern "C" { #endif +/* TegraDRM legacy UAPI. 
Only enabled with STAGING */ + #define DRM_TEGRA_GEM_CREATE_TILED (1 << 0) #define DRM_TEGRA_GEM_CREATE_BOTTOM_UP (1 << 1) @@ -644,13 +627,13 @@ struct drm_tegra_gem_get_flags { __u32 flags; }; -#define DRM_TEGRA_GEM_CREATE 0x00 -#define DRM_TEGRA_GEM_MMAP 0x01 +#define DRM_TEGRA_GEM_CREATE_LEGACY0x00 +#define DRM_TEGRA_GEM_MMAP_LEGACY 0x01 #define DRM_TEGRA_SYNCPT_READ 0x02 #define DRM_TEGRA_SYNCPT_INCR 0x03 #define DRM_TEGRA_SYNCPT_WAIT 0x04 -#define DRM_TEGRA_OPEN_CHANNEL 0x05 -#define DRM_TEGRA_CLOSE_CHANNEL0x06 +#define DRM_TEGRA_OPEN_CHANNEL 0x05 +#define DRM_TEGRA_CLOSE_CHANNEL0x06 #define DRM_TEGRA_GET_SYNCPT 0x07 #define DRM_TEGRA_SUBMIT 0x08 #define DRM_TEGRA_GET_SYNCPT_BASE 0x09 @@ -659,8 +642,8 @@ struct drm_tegra_gem_get_flags { #define DRM_TEGRA_GEM_SET_FLAGS0x0c #define DRM_TEGRA_GEM_GET_FLAGS0x0d -#define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create) -#define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap) +#define DRM_IOCTL_TEGRA_GEM_CREATE_LEGACY DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE_LEGACY, struct drm_tegra_gem_create) +#define DRM_IOCTL_TEGRA_GEM_MMAP_LEGACY DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP_LEGACY, struct drm_tegra_gem_mmap) #define DRM_IOCTL_TEGRA_SYNCPT_READ DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_READ, struct drm_tegra_syncpt_read) #define DRM_IOCTL_TEGRA_SYNCPT_INCR DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_INCR, struct drm_tegra_syncpt_incr) #define DRM_IOCTL_TEGRA_SYNCPT_WAIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SYNCPT_WAIT, struct drm_tegra_syncpt_wait) @@ -674,6 +657,389 @@ struct drm_tegra_gem_get_flags { #define DRM_IOCTL_TEGRA_GEM_SET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_SET_FLAGS, struct drm_tegra_gem_set_flags) #define DRM_IOCTL_TEGRA_GEM_GET_FLAGS DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_GET_FLAGS, struct drm_tegra_gem_get_flags) +/* New TegraDRM UAPI */ + +struct 
drm_tegra_channel_open { + /** +* @host1x_class: [in] +* +* Host1x class of the engine that will be programmed using this +* channel. +*/ + __u32 host1x_class; + + /** +* @flags: [in] +* +* Flags. +*/ + __u32 flags; + + /** +* @channel_ctx: [out] +* +* Opaque identifier corresponding to the opened channel. +*/ + __u32 channel_ctx; + + /** +* @hardware_version: [out] +* +* Version of the engine hardware. This can be used by userspace +* to determine how the engine needs to be programmed. +*/ + __u32 hardware_version; + + __u32 reserved[2]; +}; + +struct drm_t
[PATCH v3 18/20] drm/tegra: Allocate per-engine channel in core code
To avoid duplication, allocate the per-engine shared channel in the
core code instead. Once MLOCKs are implemented on Host1x side, we
can also update this to avoid allocating a shared channel when
MLOCKs are enabled.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/drm/tegra/drm.c | 11 +++++++++++
 drivers/gpu/drm/tegra/drm.h |  4 ++++
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 7437c67924aa..7124b0b0154b 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -887,6 +887,14 @@ static struct drm_driver tegra_drm_driver = {
 int tegra_drm_register_client(struct tegra_drm *tegra,
 			      struct tegra_drm_client *client)
 {
+	/*
+	 * When MLOCKs are implemented, change to allocate a shared channel
+	 * only when MLOCKs are disabled.
+	 */
+	client->shared_channel = host1x_channel_request(&client->base);
+	if (!client->shared_channel)
+		return -EBUSY;
+
 	mutex_lock(&tegra->clients_lock);
 	list_add_tail(&client->list, &tegra->clients);
 	client->drm = tegra;
@@ -903,6 +911,9 @@ int tegra_drm_unregister_client(struct tegra_drm *tegra,
 	client->drm = NULL;
 	mutex_unlock(&tegra->clients_lock);
 
+	if (client->shared_channel)
+		host1x_channel_put(client->shared_channel);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index b25443255be6..3fc42fd97911 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -86,8 +86,12 @@ struct tegra_drm_client {
 	struct list_head list;
 	struct tegra_drm *drm;
 
+	/* Set by driver */
 	unsigned int version;
 	const struct tegra_drm_client_ops *ops;
+
+	/* Set by TegraDRM core */
+	struct host1x_channel *shared_channel;
 };
 
 static inline struct tegra_drm_client *
--
2.28.0
[PATCH v3 19/20] drm/tegra: Implement new UAPI
Implement the new UAPI, and bump the TegraDRM major version. Signed-off-by: Mikko Perttunen --- v3: * Remove WRITE_RELOC. Relocations are now patched implicitly when patching is needed. * Directly call PM runtime APIs on devices instead of using power_on/power_off callbacks. * Remove incorrect mutex unlock in tegra_drm_ioctl_channel_open * Use XA_FLAGS_ALLOC1 instead of XA_FLAGS_ALLOC * Accommodate for removal of timeout field and inlining of syncpt_incrs array. * Copy entire user arrays at a time instead of going through elements one-by-one. * Implement waiting of DMA reservations. * Split out gather_bo implementation into a separate file. * Fix length parameter passed to sg_init_one in gather_bo * Cosmetic cleanup. --- drivers/gpu/drm/tegra/Makefile | 3 + drivers/gpu/drm/tegra/drm.c| 46 +- drivers/gpu/drm/tegra/drm.h| 5 + drivers/gpu/drm/tegra/uapi.h | 63 +++ drivers/gpu/drm/tegra/uapi/gather_bo.c | 86 drivers/gpu/drm/tegra/uapi/gather_bo.h | 22 + drivers/gpu/drm/tegra/uapi/submit.c| 675 + drivers/gpu/drm/tegra/uapi/submit.h| 17 + drivers/gpu/drm/tegra/uapi/uapi.c | 326 9 files changed, 1225 insertions(+), 18 deletions(-) create mode 100644 drivers/gpu/drm/tegra/uapi.h create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.c create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.h create mode 100644 drivers/gpu/drm/tegra/uapi/submit.c create mode 100644 drivers/gpu/drm/tegra/uapi/submit.h create mode 100644 drivers/gpu/drm/tegra/uapi/uapi.c diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile index d6cf202414f0..059322e88943 100644 --- a/drivers/gpu/drm/tegra/Makefile +++ b/drivers/gpu/drm/tegra/Makefile @@ -3,6 +3,9 @@ ccflags-$(CONFIG_DRM_TEGRA_DEBUG) += -DDEBUG tegra-drm-y := \ drm.o \ + uapi/uapi.o \ + uapi/submit.o \ + uapi/gather_bo.o \ gem.o \ fb.o \ dp.o \ diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index 7124b0b0154b..88226dd0fd88 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ 
b/drivers/gpu/drm/tegra/drm.c @@ -20,24 +20,20 @@ #include #include +#include "uapi.h" #include "drm.h" #include "gem.h" #define DRIVER_NAME "tegra" #define DRIVER_DESC "NVIDIA Tegra graphics" #define DRIVER_DATE "20120330" -#define DRIVER_MAJOR 0 +#define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 #define DRIVER_PATCHLEVEL 0 #define CARVEOUT_SZ SZ_64M #define CDMA_GATHER_FETCHES_MAX_NB 16383 -struct tegra_drm_file { - struct idr contexts; - struct mutex lock; -}; - static int tegra_atomic_check(struct drm_device *drm, struct drm_atomic_state *state) { @@ -90,7 +86,8 @@ static int tegra_drm_open(struct drm_device *drm, struct drm_file *filp) if (!fpriv) return -ENOMEM; - idr_init(&fpriv->contexts); + idr_init(&fpriv->legacy_contexts); + xa_init_flags(&fpriv->contexts, XA_FLAGS_ALLOC1); mutex_init(&fpriv->lock); filp->driver_priv = fpriv; @@ -432,7 +429,7 @@ static int tegra_client_open(struct tegra_drm_file *fpriv, if (err < 0) return err; - err = idr_alloc(&fpriv->contexts, context, 1, 0, GFP_KERNEL); + err = idr_alloc(&fpriv->legacy_contexts, context, 1, 0, GFP_KERNEL); if (err < 0) { client->ops->close_channel(context); return err; @@ -487,13 +484,13 @@ static int tegra_close_channel(struct drm_device *drm, void *data, mutex_lock(&fpriv->lock); - context = idr_find(&fpriv->contexts, args->context); + context = idr_find(&fpriv->legacy_contexts, args->context); if (!context) { err = -EINVAL; goto unlock; } - idr_remove(&fpriv->contexts, context->id); + idr_remove(&fpriv->legacy_contexts, context->id); tegra_drm_context_free(context); unlock: @@ -512,7 +509,7 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data, mutex_lock(&fpriv->lock); - context = idr_find(&fpriv->contexts, args->context); + context = idr_find(&fpriv->legacy_contexts, args->context); if (!context) { err = -ENODEV; goto unlock; @@ -541,7 +538,7 @@ static int tegra_submit(struct drm_device *drm, void *data, mutex_lock(&fpriv->lock); - context = idr_find(&fpriv->contexts, args->context); + 
context = idr_find(&fpriv->legacy_contexts, args->context); if (!context) { err = -ENODEV; goto unlock; @@ -566,7 +563,7 @@ static int tegra_get_syncpt_base(struct drm_device *drm, void *data, mutex_lock(&fpriv->lock); - context = idr_find(&fpriv->contexts, args->context); + context = idr_find(&fpriv->legacy_contexts, args->context); if (!context) { err = -EN
[PATCH v3 14/20] gpu: host1x: Reserve VBLANK syncpoints at initialization
On T20-T148 chips, the bootloader can set up a boot splash screen with DC configured to increment syncpoint 26/27 at VBLANK. Because of this we shouldn't allow these syncpoints to be allocated until DC has been reset and will no longer increment them in the background. As such, on these chips, reserve those two syncpoints at initialization, and only mark them free once the DC driver has indicated it's safe to do so. Signed-off-by: Mikko Perttunen --- v3: * New patch --- drivers/gpu/drm/tegra/dc.c | 6 ++ drivers/gpu/host1x/dev.c| 6 ++ drivers/gpu/host1x/dev.h| 6 ++ drivers/gpu/host1x/syncpt.c | 34 +- include/linux/host1x.h | 3 +++ 5 files changed, 54 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index efb41c10dad4..0b23e0922c25 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -2031,6 +2031,12 @@ static int tegra_dc_init(struct host1x_client *client) struct drm_plane *cursor = NULL; int err; + /* +* DC has been reset by now, so VBLANK syncpoint can be released +* for general use. 
+*/ + host1x_syncpt_release_vblank_reservation(client, 26 + dc->pipe); + /* * XXX do not register DCs with no window groups because we cannot * assign a primary plane to them, which in turn will cause KMS to diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 641317d23828..8b50fbb22846 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -77,6 +77,7 @@ static const struct host1x_info host1x01_info = { .has_hypervisor = false, .num_sid_entries = 0, .sid_table = NULL, + .reserve_vblank_syncpts = true, }; static const struct host1x_info host1x02_info = { @@ -91,6 +92,7 @@ static const struct host1x_info host1x02_info = { .has_hypervisor = false, .num_sid_entries = 0, .sid_table = NULL, + .reserve_vblank_syncpts = true, }; static const struct host1x_info host1x04_info = { @@ -105,6 +107,7 @@ static const struct host1x_info host1x04_info = { .has_hypervisor = false, .num_sid_entries = 0, .sid_table = NULL, + .reserve_vblank_syncpts = false, }; static const struct host1x_info host1x05_info = { @@ -119,6 +122,7 @@ static const struct host1x_info host1x05_info = { .has_hypervisor = false, .num_sid_entries = 0, .sid_table = NULL, + .reserve_vblank_syncpts = false, }; static const struct host1x_sid_entry tegra186_sid_table[] = { @@ -142,6 +146,7 @@ static const struct host1x_info host1x06_info = { .has_hypervisor = true, .num_sid_entries = ARRAY_SIZE(tegra186_sid_table), .sid_table = tegra186_sid_table, + .reserve_vblank_syncpts = false, }; static const struct host1x_sid_entry tegra194_sid_table[] = { @@ -165,6 +170,7 @@ static const struct host1x_info host1x07_info = { .has_hypervisor = true, .num_sid_entries = ARRAY_SIZE(tegra194_sid_table), .sid_table = tegra194_sid_table, + .reserve_vblank_syncpts = false, }; static const struct of_device_id host1x_of_match[] = { diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index 7b8b7e20e32b..e360bc4a25f6 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ 
-102,6 +102,12 @@ struct host1x_info { bool has_hypervisor; /* has hypervisor registers */ unsigned int num_sid_entries; const struct host1x_sid_entry *sid_table; + /* +* On T20-T148, the boot chain may setup DC to increment syncpoints +* 26/27 on VBLANK. As such we cannot use these syncpoints until +* the display driver disables VBLANK increments. +*/ + bool reserve_vblank_syncpts; }; struct host1x { diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 99d31932eb34..d0be7bdbc6c9 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -52,7 +52,7 @@ struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host, mutex_lock(&host->syncpt_mutex); - for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++) + for (i = 0; i < host->info->nb_pts && kref_read(&sp->ref); i++, sp++) ; if (i >= host->info->nb_pts) @@ -359,6 +359,11 @@ int host1x_syncpt_init(struct host1x *host) if (!host->nop_sp) return -ENOMEM; + if (host->info->reserve_vblank_syncpts) { + kref_init(&host->syncpt[26].ref); + kref_init(&host->syncpt[27].ref); + } + return 0; } @@ -545,3 +550,30 @@ u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base) return base->id; } EXPORT_SYMBOL(host1x_syncpt_base_id); + +static void do_nothing(struct kref *ref) +{ +} + +/** + * host1x_syncpt_release_vblank_reservation() - Make VBLANK syncpoint + *
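The reservation scheme described in this patch can be sketched in plain C as follows. This is only a userspace model under stated assumptions: the table size `NB_PTS`, the `struct syncpt` layout and all three helper names are hypothetical, and a plain counter stands in for the kernel's `kref`. It shows the idea that the allocator skips any entry with a non-zero reference count, so pre-referencing syncpoints 26 and 27 keeps them out of circulation until the display driver releases them.

```c
#include <stddef.h>

#define NB_PTS 32	/* hypothetical table size, not taken from the patch */

/* Simplified stand-in for struct host1x_syncpt; a counter replaces kref. */
struct syncpt {
	unsigned int id;
	unsigned int refcount;
};

/* Initialise the table and, if requested, reserve the VBLANK syncpoints. */
static inline void syncpt_table_init(struct syncpt *tbl, int reserve_vblank)
{
	unsigned int i;

	for (i = 0; i < NB_PTS; i++) {
		tbl[i].id = i;
		tbl[i].refcount = 0;
	}

	if (reserve_vblank) {
		tbl[26].refcount = 1;
		tbl[27].refcount = 1;
	}
}

/* Return the first syncpoint nobody holds a reference to, or NULL. */
static inline struct syncpt *syncpt_alloc(struct syncpt *tbl)
{
	unsigned int i;

	for (i = 0; i < NB_PTS; i++) {
		if (tbl[i].refcount == 0) {
			tbl[i].refcount = 1;
			return &tbl[i];
		}
	}

	return NULL;
}

/* Drop the boot-time reservation so the syncpoint can be allocated. */
static inline void syncpt_release_vblank_reservation(struct syncpt *tbl,
						     unsigned int id)
{
	if (id < NB_PTS && tbl[id].refcount > 0)
		tbl[id].refcount--;
}
```

In this model the first 26 allocations return IDs 0 to 25, the next one skips to 28, and ID 26 only becomes available after the release call.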
[PATCH v3 04/20] gpu: host1x: Remove cancelled waiters immediately
Before this patch, cancelled waiters would only be cleaned up
once their threshold value was reached. Make host1x_intr_put_ref
process the cancellation immediately to fix this.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/host1x/intr.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/host1x/intr.c b/drivers/gpu/host1x/intr.c
index 9245add23b5d..5d328d20ce6d 100644
--- a/drivers/gpu/host1x/intr.c
+++ b/drivers/gpu/host1x/intr.c
@@ -247,13 +247,17 @@ void host1x_intr_put_ref(struct host1x *host, unsigned int id, void *ref)
 	struct host1x_waitlist *waiter = ref;
 	struct host1x_syncpt *syncpt;
 
-	while (atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED) ==
-	       WLS_REMOVED)
-		schedule();
+	atomic_cmpxchg(&waiter->state, WLS_PENDING, WLS_CANCELLED);
 
 	syncpt = host->syncpt + id;
-	(void)process_wait_list(host, syncpt,
-				host1x_syncpt_load(host->syncpt + id));
+
+	spin_lock(&syncpt->intr.lock);
+	if (atomic_cmpxchg(&waiter->state, WLS_CANCELLED, WLS_HANDLED) ==
+	    WLS_CANCELLED) {
+		list_del(&waiter->list);
+		kref_put(&waiter->refcount, waiter_release);
+	}
+	spin_unlock(&syncpt->intr.lock);
 
 	kref_put(&waiter->refcount, waiter_release);
 }
--
2.28.0
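As a rough illustration of the state handling in this patch, the sketch below models the two compare-and-swap steps with C11 atomics in userspace. The enum ordering, the `waiter_cancel` helper and the use of `<stdatomic.h>` are assumptions made for illustration. The kernel code uses `atomic_cmpxchg()`, performs the second step under `syncpt->intr.lock`, and also unlinks the waiter from its list, all of which is left out here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* State names as used in the patch; the numeric order is an assumption. */
enum waiter_state {
	WLS_PENDING,
	WLS_REMOVED,
	WLS_CANCELLED,
	WLS_HANDLED,
};

/*
 * Hypothetical model of the cancellation path. Returns true when the
 * caller "won" the cancellation and would have to unlink the waiter and
 * drop the list's reference.
 */
static inline bool waiter_cancel(atomic_int *state)
{
	int expected = WLS_PENDING;

	/* Step 1: PENDING -> CANCELLED; a no-op for any other state. */
	atomic_compare_exchange_strong(state, &expected, WLS_CANCELLED);

	/* Step 2: CANCELLED -> HANDLED; succeeds only if still CANCELLED. */
	expected = WLS_CANCELLED;
	return atomic_compare_exchange_strong(state, &expected, WLS_HANDLED);
}
```

A waiter that has already moved to `WLS_REMOVED` is left untouched by both steps, so the interrupt path keeps ownership of it.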
[PATCH v3 03/20] gpu: host1x: Show number of pending waiters in debugfs
Show the number of pending waiters in the debugfs status file.
This is useful for testing to verify that waiters do not leak
or accumulate incorrectly.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/host1x/debug.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/host1x/debug.c b/drivers/gpu/host1x/debug.c
index 3eee4318b158..2d06a7406b3b 100644
--- a/drivers/gpu/host1x/debug.c
+++ b/drivers/gpu/host1x/debug.c
@@ -69,6 +69,7 @@ static int show_channel(struct host1x_channel *ch, void *data, bool show_fifo)
 
 static void show_syncpts(struct host1x *m, struct output *o)
 {
+	struct list_head *pos;
 	unsigned int i;
 
 	host1x_debug_output(o, " syncpts \n");
@@ -76,12 +77,19 @@ static void show_syncpts(struct host1x *m, struct output *o)
 	for (i = 0; i < host1x_syncpt_nb_pts(m); i++) {
 		u32 max = host1x_syncpt_read_max(m->syncpt + i);
 		u32 min = host1x_syncpt_load(m->syncpt + i);
+		unsigned int waiters = 0;
 
-		if (!min && !max)
+		spin_lock(&m->syncpt[i].intr.lock);
+		list_for_each(pos, &m->syncpt[i].intr.wait_head)
+			waiters++;
+		spin_unlock(&m->syncpt[i].intr.lock);
+
+		if (!min && !max && !waiters)
 			continue;
 
-		host1x_debug_output(o, "id %u (%s) min %d max %d\n",
-				    i, m->syncpt[i].name, min, max);
+		host1x_debug_output(o,
+				    "id %u (%s) min %d max %d (%d waiters)\n",
+				    i, m->syncpt[i].name, min, max, waiters);
 	}
 
 	for (i = 0; i < host1x_syncpt_nb_bases(m); i++) {
--
2.28.0
[PATCH v3 05/20] gpu: host1x: Use HW-equivalent syncpoint expiration check
Make syncpoint expiration checks always use the same logic used by
the hardware. This ensures that there are no race conditions that
could occur because of the hardware triggering a syncpoint interrupt
and then the driver disagreeing.

One situation where this could occur is if a job incremented a
syncpoint too many times -- then the hardware would trigger an
interrupt, but the driver would assume that a syncpoint value
greater than the syncpoint's max value is in the future, and not
clean up the job.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/host1x/syncpt.c | 51 ++-----------------------------------
 1 file changed, 2 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 5982fdf64e1c..9ca0d852e32f 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -306,59 +306,12 @@ EXPORT_SYMBOL(host1x_syncpt_wait);
 bool host1x_syncpt_is_expired(struct host1x_syncpt *sp, u32 thresh)
 {
 	u32 current_val;
-	u32 future_val;
 
 	smp_rmb();
 
 	current_val = (u32)atomic_read(&sp->min_val);
-	future_val = (u32)atomic_read(&sp->max_val);
-
-	/* Note the use of unsigned arithmetic here (mod 1<<32).
-	 *
-	 * c = current_val = min_val	= the current value of the syncpoint.
-	 * t = thresh			= the value we are checking
-	 * f = future_val  = max_val	= the value c will reach when all
-	 *				  outstanding increments have completed.
-	 *
-	 * Note that c always chases f until it reaches f.
-	 *
-	 * Dtf = (f - t)
-	 * Dtc = (c - t)
-	 *
-	 *  Consider all cases:
-	 *
-	 *	A) .c..t..f.	Dtf < Dtc	need to wait
-	 *	B) .c.f..t..	Dtf > Dtc	expired
-	 *	C) ..t..c.f.	Dtf > Dtc	expired	   (Dct very large)
-	 *
-	 *  Any case where f==c: always expired (for any t).	Dtf == Dcf
-	 *  Any case where t==c: always expired (for any f).	Dtf >= Dtc (because Dtc==0)
-	 *  Any case where t==f!=c: always wait.		Dtf <  Dtc (because Dtf==0,
-	 *							Dtc!=0)
-	 *
-	 *  Other cases:
-	 *
-	 *	A) .t..f..c.	Dtf < Dtc	need to wait
-	 *	A) .f..c..t.	Dtf < Dtc	need to wait
-	 *	A) .f..t..c.	Dtf > Dtc	expired
-	 *
-	 *   So:
-	 *	   Dtf >= Dtc implies EXPIRED	(return true)
-	 *	   Dtf <  Dtc implies WAIT	(return false)
-	 *
-	 * Note: If t is expired then we *cannot* wait on it. We would wait
-	 * forever (hang the system).
-	 *
-	 * Note: do NOT get clever and remove the -thresh from both sides. It
-	 * is NOT the same.
-	 *
-	 * If future valueis zero, we have a client managed sync point. In that
-	 * case we do a direct comparison.
-	 */
-	if (!host1x_syncpt_client_managed(sp))
-		return future_val - thresh >= current_val - thresh;
-	else
-		return (s32)(current_val - thresh) >= 0;
+
+	return ((current_val - thresh) & 0x80000000U) == 0U;
 }
 
 int host1x_syncpt_init(struct host1x *host)
--
2.28.0
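To see why a single subtraction is enough even when the counter wraps, here is a minimal stand-alone sketch of a check of this shape. The helper name `syncpt_is_expired` and the `<stdint.h>` types are assumptions for illustration. It treats the 32-bit difference as a signed quantity by testing bit 31, which is equivalent to the `(s32)(current_val - thresh) >= 0` form that the patch removes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace version of a wraparound-safe threshold check.
 * The unsigned subtraction is taken modulo 2^32; if bit 31 of the result
 * is clear, current_val is at or "ahead of" thresh by less than 2^31.
 */
static inline bool syncpt_is_expired(uint32_t current_val, uint32_t thresh)
{
	return ((current_val - thresh) & 0x80000000U) == 0U;
}
```

For example, with `thresh = 0xFFFFFFFE` and `current_val = 2` the difference is 4, so the threshold counts as reached even though the raw value is numerically smaller. With the operands swapped the difference is `0xFFFFFFFC`, bit 31 is set, and the caller would have to wait.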
[PATCH v3 06/20] gpu: host1x: Cleanup and refcounting for syncpoints
Add reference counting for allocated syncpoints to allow keeping them allocated while jobs are referencing them. Additionally, clean up various places using syncpoint IDs to use host1x_syncpt pointers instead. Signed-off-by: Mikko Perttunen --- drivers/gpu/drm/tegra/dc.c | 4 +- drivers/gpu/drm/tegra/drm.c| 17 --- drivers/gpu/drm/tegra/gr2d.c | 4 +- drivers/gpu/drm/tegra/gr3d.c | 4 +- drivers/gpu/drm/tegra/vic.c| 4 +- drivers/gpu/host1x/cdma.c | 11 ++--- drivers/gpu/host1x/dev.h | 7 ++- drivers/gpu/host1x/hw/cdma_hw.c| 2 +- drivers/gpu/host1x/hw/channel_hw.c | 10 ++-- drivers/gpu/host1x/hw/debug_hw.c | 2 +- drivers/gpu/host1x/job.c | 5 +- drivers/gpu/host1x/syncpt.c| 75 +++--- drivers/gpu/host1x/syncpt.h| 3 ++ include/linux/host1x.h | 8 ++-- 14 files changed, 99 insertions(+), 57 deletions(-) diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c index 9a0b3240bc58..efb41c10dad4 100644 --- a/drivers/gpu/drm/tegra/dc.c +++ b/drivers/gpu/drm/tegra/dc.c @@ -2127,7 +2127,7 @@ static int tegra_dc_init(struct host1x_client *client) drm_plane_cleanup(primary); host1x_client_iommu_detach(client); - host1x_syncpt_free(dc->syncpt); + host1x_syncpt_put(dc->syncpt); return err; } @@ -2152,7 +2152,7 @@ static int tegra_dc_exit(struct host1x_client *client) } host1x_client_iommu_detach(client); - host1x_syncpt_free(dc->syncpt); + host1x_syncpt_put(dc->syncpt); return 0; } diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index ba9d1c3e7cac..ceea9db341f0 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -171,7 +171,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, struct drm_tegra_syncpt syncpt; struct host1x *host1x = dev_get_drvdata(drm->dev->parent); struct drm_gem_object **refs; - struct host1x_syncpt *sp; + struct host1x_syncpt *sp = NULL; struct host1x_job *job; unsigned int num_refs; int err; @@ -298,8 +298,8 @@ int tegra_drm_submit(struct tegra_drm_context *context, goto fail; } - /* check whether 
syncpoint ID is valid */ - sp = host1x_syncpt_get(host1x, syncpt.id); + /* Syncpoint ref will be dropped on job release. */ + sp = host1x_syncpt_get_by_id(host1x, syncpt.id); if (!sp) { err = -ENOENT; goto fail; @@ -308,7 +308,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, job->is_addr_reg = context->client->ops->is_addr_reg; job->is_valid_class = context->client->ops->is_valid_class; job->syncpt_incrs = syncpt.incrs; - job->syncpt_id = syncpt.id; + job->syncpt = sp; job->timeout = 1; if (args->timeout && args->timeout < 1) @@ -327,6 +327,9 @@ int tegra_drm_submit(struct tegra_drm_context *context, args->fence = job->syncpt_end; fail: + if (sp) + host1x_syncpt_put(sp); + while (num_refs--) drm_gem_object_put(refs[num_refs]); @@ -380,7 +383,7 @@ static int tegra_syncpt_read(struct drm_device *drm, void *data, struct drm_tegra_syncpt_read *args = data; struct host1x_syncpt *sp; - sp = host1x_syncpt_get(host, args->id); + sp = host1x_syncpt_get_by_id_noref(host, args->id); if (!sp) return -EINVAL; @@ -395,7 +398,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data, struct drm_tegra_syncpt_incr *args = data; struct host1x_syncpt *sp; - sp = host1x_syncpt_get(host1x, args->id); + sp = host1x_syncpt_get_by_id_noref(host1x, args->id); if (!sp) return -EINVAL; @@ -409,7 +412,7 @@ static int tegra_syncpt_wait(struct drm_device *drm, void *data, struct drm_tegra_syncpt_wait *args = data; struct host1x_syncpt *sp; - sp = host1x_syncpt_get(host1x, args->id); + sp = host1x_syncpt_get_by_id_noref(host1x, args->id); if (!sp) return -EINVAL; diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 1a0d3ba6e525..d857a99b21a7 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -67,7 +67,7 @@ static int gr2d_init(struct host1x_client *client) detach: host1x_client_iommu_detach(client); free: - host1x_syncpt_free(client->syncpts[0]); + host1x_syncpt_put(client->syncpts[0]); put: 
host1x_channel_put(gr2d->channel); return err; @@ -86,7 +86,7 @@ static int gr2d_exit(struct host1x_client *client) return err; host1x_client_iommu_detach(client); - host1x_syncpt_free(client->syncpts[0]); + host1x_syncpt_put(client->syncpts[0]); host1x_channe
[PATCH v3 02/20] gpu: host1x: Allow syncpoints without associated client
Syncpoints don't need to be associated with any client,
so remove the property, and expose host1x_syncpt_alloc.
This will allow allocating syncpoints without prior knowledge
of the engine that it will be used with.

Signed-off-by: Mikko Perttunen
---
v3:
* Clean up host1x_syncpt_alloc signature to allow specifying
  a name for the syncpoint.
* Export the function.
---
 drivers/gpu/host1x/syncpt.c | 22 ++++++++++------------
 drivers/gpu/host1x/syncpt.h |  1 -
 include/linux/host1x.h      |  3 +++
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index fce7892d5137..5982fdf64e1c 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -42,13 +42,13 @@ static void host1x_syncpt_base_free(struct host1x_syncpt_base *base)
 		base->requested = false;
 }
 
-static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
-						 struct host1x_client *client,
-						 unsigned long flags)
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name)
 {
 	struct host1x_syncpt *sp = host->syncpt;
+	char *full_name;
 	unsigned int i;
-	char *name;
 
 	mutex_lock(&host->syncpt_mutex);
 
@@ -64,13 +64,11 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 			goto unlock;
 	}
 
-	name = kasprintf(GFP_KERNEL, "%02u-%s", sp->id,
-			 client ? dev_name(client->dev) : NULL);
-	if (!name)
+	full_name = kasprintf(GFP_KERNEL, "%u-%s", sp->id, name);
+	if (!full_name)
 		goto free_base;
 
-	sp->client = client;
-	sp->name = name;
+	sp->name = full_name;
 
 	if (flags & HOST1X_SYNCPT_CLIENT_MANAGED)
 		sp->client_managed = true;
@@ -87,6 +85,7 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
 	mutex_unlock(&host->syncpt_mutex);
 	return NULL;
 }
+EXPORT_SYMBOL(host1x_syncpt_alloc);
 
 /**
  * host1x_syncpt_id() - retrieve syncpoint ID
@@ -401,7 +400,7 @@ int host1x_syncpt_init(struct host1x *host)
 	host1x_hw_syncpt_enable_protection(host);
 
 	/* Allocate sync point to use for clearing waits for expired fences */
-	host->nop_sp = host1x_syncpt_alloc(host, NULL, 0);
+	host->nop_sp = host1x_syncpt_alloc(host, 0, "reserved-nop");
 	if (!host->nop_sp)
 		return -ENOMEM;
 
@@ -423,7 +422,7 @@ struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 {
 	struct host1x *host = dev_get_drvdata(client->host->parent);
 
-	return host1x_syncpt_alloc(host, client, flags);
+	return host1x_syncpt_alloc(host, flags, dev_name(client->dev));
 }
 EXPORT_SYMBOL(host1x_syncpt_request);
 
@@ -447,7 +446,6 @@ void host1x_syncpt_free(struct host1x_syncpt *sp)
 	host1x_syncpt_base_free(sp->base);
 	kfree(sp->name);
 	sp->base = NULL;
-	sp->client = NULL;
 	sp->name = NULL;
 	sp->client_managed = false;
 
diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h
index 8e1d04dacaa0..3aa6b25b1b9c 100644
--- a/drivers/gpu/host1x/syncpt.h
+++ b/drivers/gpu/host1x/syncpt.h
@@ -33,7 +33,6 @@ struct host1x_syncpt {
 	const char *name;
 	bool client_managed;
 	struct host1x *host;
-	struct host1x_client *client;
 	struct host1x_syncpt_base *base;
 
 	/* interrupt data */
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index f711fc0154f4..099eff8a06d2 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -154,6 +154,9 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout,
 struct host1x_syncpt *host1x_syncpt_request(struct host1x_client *client,
 					    unsigned long flags);
 void host1x_syncpt_free(struct host1x_syncpt *sp);
+struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host,
+					  unsigned long flags,
+					  const char *name);
 
 struct host1x_syncpt_base *host1x_syncpt_get_base(struct host1x_syncpt *sp);
 u32 host1x_syncpt_base_id(struct host1x_syncpt_base *base);
--
2.28.0
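For readers who want to see what names the new `"%u-%s"` format produces, the following is a small userspace sketch. The helper `syncpt_format_name` is hypothetical and uses `snprintf()` into a caller-provided buffer as a stand-in for the kernel's `kasprintf(GFP_KERNEL, ...)`. The format string and the `"reserved-nop"` name are taken from the patch.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the kasprintf() call in host1x_syncpt_alloc():
 * builds "<id>-<name>" into buf and returns the length snprintf() reports.
 */
static inline int syncpt_format_name(char *buf, size_t len, unsigned int id,
				     const char *name)
{
	return snprintf(buf, len, "%u-%s", id, name);
}
```

Calling it with ID 3 and the name `"reserved-nop"` yields the string `3-reserved-nop`. The previous `"%02u-%s"` format would have zero-padded the ID to two digits.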
[PATCH v3 10/20] gpu: host1x: Add no-recovery mode
Add a new property for jobs to enable or disable recovery i.e. CPU increments of syncpoints to max value on job timeout. This allows for a more solid model for hanged jobs, where userspace doesn't need to guess if a syncpoint increment happened because the job completed, or because job timeout was triggered. On job timeout, we stop the channel, NOP all future jobs on the channel using the same syncpoint, mark the syncpoint as locked and resume the channel from the next job, if any. The future jobs are NOPed, since because we don't do the CPU increments, the value of the syncpoint is no longer synchronized, and any waiters would become confused if a future job incremented the syncpoint. The syncpoint is marked locked to ensure that any future jobs cannot increment the syncpoint either, until the application has recognized the situation and reallocated the syncpoint. Signed-off-by: Mikko Perttunen --- v3: * Move 'locked' check inside CDMA lock to prevent race * Add clarifying comment to NOP-patching code --- drivers/gpu/drm/tegra/drm.c| 1 + drivers/gpu/host1x/cdma.c | 58 ++ drivers/gpu/host1x/hw/channel_hw.c | 2 +- drivers/gpu/host1x/job.c | 4 +++ drivers/gpu/host1x/syncpt.c| 2 ++ drivers/gpu/host1x/syncpt.h| 12 +++ include/linux/host1x.h | 9 + 7 files changed, 81 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index ceea9db341f0..7437c67924aa 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -197,6 +197,7 @@ int tegra_drm_submit(struct tegra_drm_context *context, job->client = client; job->class = client->class; job->serialize = true; + job->syncpt_recovery = true; /* * Track referenced BOs so that they can be unreferenced after the diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c index 6e6ca774f68d..bd151c3a2a5f 100644 --- a/drivers/gpu/host1x/cdma.c +++ b/drivers/gpu/host1x/cdma.c @@ -312,10 +312,6 @@ static void update_cdma_locked(struct host1x_cdma *cdma) bool 
signal = false; struct host1x_job *job, *n; - /* If CDMA is stopped, queue is cleared and we can return */ - if (!cdma->running) - return; - /* * Walk the sync queue, reading the sync point registers as necessary, * to consume as many sync queue entries as possible without blocking @@ -324,7 +320,8 @@ static void update_cdma_locked(struct host1x_cdma *cdma) struct host1x_syncpt *sp = job->syncpt; /* Check whether this syncpt has completed, and bail if not */ - if (!host1x_syncpt_is_expired(sp, job->syncpt_end)) { + if (!host1x_syncpt_is_expired(sp, job->syncpt_end) && + !job->cancelled) { /* Start timer on next pending syncpt */ if (job->timeout) cdma_start_timer_locked(cdma, job); @@ -413,8 +410,11 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma, else restart_addr = cdma->last_pos; + if (!job) + goto resume; + /* do CPU increments for the remaining syncpts */ - if (job) { + if (job->syncpt_recovery) { dev_dbg(dev, "%s: perform CPU incr on pending buffers\n", __func__); @@ -433,8 +433,44 @@ void host1x_cdma_update_sync_queue(struct host1x_cdma *cdma, dev_dbg(dev, "%s: finished sync_queue modification\n", __func__); + } else { + struct host1x_job *failed_job = job; + + host1x_job_dump(dev, job); + + host1x_syncpt_set_locked(job->syncpt); + failed_job->cancelled = true; + + list_for_each_entry_continue(job, &cdma->sync_queue, list) { + unsigned int i; + + if (job->syncpt != failed_job->syncpt) + continue; + + for (i = 0; i < job->num_slots; i++) { + unsigned int slot = (job->first_get/8 + i) % + HOST1X_PUSHBUFFER_SLOTS; + u32 *mapped = cdma->push_buffer.mapped; + + /* +* Overwrite opcodes with 0 word writes to +* to offset 0xbad. This does nothing but +* has a easily detected signature in debug +* traces. +*/ + mapped[2*slot+0] = 0x1bad; + mapped[2*slot+1] = 0x1bad; + } + + job->cancelled = true; + } + + wmb
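The NOP-patching loop in this hunk can be modelled in userspace roughly as below. Several things are assumptions: the slot count `PUSHBUFFER_SLOTS` is a made-up small value standing in for `HOST1X_PUSHBUFFER_SLOTS`, `nop_job_slots` is a hypothetical helper, and `opcode_incr` assumes an encoding with the opcode in bits 31:28, the register offset in bits 27:16 and the word count in bits 15:0. The NOP word is built from the comment's description (a zero-word write to offset 0xbad) rather than copied from the literal in the archived patch text.

```c
#include <stdint.h>

#define PUSHBUFFER_SLOTS 8	/* hypothetical; the driver uses HOST1X_PUSHBUFFER_SLOTS */

/* Assumed INCR opcode encoding: opcode 1, register offset, word count. */
static inline uint32_t opcode_incr(uint32_t offset, uint32_t count)
{
	return (1u << 28) | (offset << 16) | count;
}

/*
 * Overwrite every pushbuffer slot belonging to a job with a pair of
 * harmless opcodes. Each slot is two 32-bit words (8 bytes), hence the
 * division of the byte offset first_get by 8 and the 2 * slot indexing.
 */
static inline void nop_job_slots(uint32_t *mapped, uint32_t first_get,
				 uint32_t num_slots)
{
	uint32_t nop = opcode_incr(0xbad, 0);
	uint32_t i;

	for (i = 0; i < num_slots; i++) {
		uint32_t slot = (first_get / 8 + i) % PUSHBUFFER_SLOTS;

		mapped[2 * slot + 0] = nop;
		mapped[2 * slot + 1] = nop;
	}
}
```

The modulo keeps the walk inside the ring: a two-slot job starting in the last slot wraps around and also patches slot 0.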
[PATCH v3 20/20] drm/tegra: Add job firewall
Add a firewall that validates jobs before submission to ensure they don't do anything they aren't allowed to do, like accessing memory they should not access. The firewall is functionality-wise a copy of the firewall already implemented in gpu/host1x. It is copied here as it makes more sense for it to live on the DRM side, as it is only needed for userspace job submissions, and generally the data it needs to do its job is easier to access here. In the future, the other implementation will be removed. Signed-off-by: Mikko Perttunen --- v3: * New patch --- drivers/gpu/drm/tegra/Makefile| 1 + drivers/gpu/drm/tegra/uapi/firewall.c | 197 ++ drivers/gpu/drm/tegra/uapi/submit.c | 4 + drivers/gpu/drm/tegra/uapi/submit.h | 3 + 4 files changed, 205 insertions(+) create mode 100644 drivers/gpu/drm/tegra/uapi/firewall.c diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile index 059322e88943..4e3295f436f1 100644 --- a/drivers/gpu/drm/tegra/Makefile +++ b/drivers/gpu/drm/tegra/Makefile @@ -5,6 +5,7 @@ tegra-drm-y := \ drm.o \ uapi/uapi.o \ uapi/submit.o \ + uapi/firewall.o \ uapi/gather_bo.o \ gem.o \ fb.o \ diff --git a/drivers/gpu/drm/tegra/uapi/firewall.c b/drivers/gpu/drm/tegra/uapi/firewall.c new file mode 100644 index ..a9c5b71bc235 --- /dev/null +++ b/drivers/gpu/drm/tegra/uapi/firewall.c @@ -0,0 +1,197 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2010-2020 NVIDIA Corporation */ + +#include "../drm.h" +#include "../uapi.h" + +#include "submit.h" + +struct tegra_drm_firewall { + struct tegra_drm_submit_data *submit; + struct tegra_drm_client *client; + u32 *data; + u32 pos; + u32 end; +}; + +static int fw_next(struct tegra_drm_firewall *fw, u32 *word) +{ + if (fw->pos == fw->end) + return -EINVAL; + + *word = fw->data[fw->pos++]; + + return 0; +} + +static bool fw_check_addr_valid(struct tegra_drm_firewall *fw, u32 offset) +{ + u32 i; + + for (i = 0; i < fw->submit->num_used_mappings; i++) { + struct tegra_drm_mapping *m = 
fw->submit->used_mappings[i].mapping; + + if (offset >= m->iova && offset <= m->iova_end) + return true; + } + + return false; +} + +static int fw_check_reg(struct tegra_drm_firewall *fw, u32 offset) +{ + bool is_addr; + u32 word; + int err; + + err = fw_next(fw, &word); + if (err) + return err; + + if (!fw->client->ops->is_addr_reg) + return 0; + + is_addr = fw->client->ops->is_addr_reg( + fw->client->base.dev, fw->client->base.class, offset); + + if (!is_addr) + return 0; + + if (!fw_check_addr_valid(fw, word)) + return -EINVAL; + + return 0; +} + +static int fw_check_regs_seq(struct tegra_drm_firewall *fw, u32 offset, +u32 count, bool incr) +{ + u32 i; + + for (i = 0; i < count; i++) { + if (fw_check_reg(fw, offset)) + return -EINVAL; + + if (incr) + offset++; + } + + return 0; +} + +static int fw_check_regs_mask(struct tegra_drm_firewall *fw, u32 offset, + u16 mask) +{ + unsigned long bmask = mask; + unsigned int bit; + + for_each_set_bit(bit, &bmask, 16) { + if (fw_check_reg(fw, offset+bit)) + return -EINVAL; + } + + return 0; +} + +static int fw_check_regs_imm(struct tegra_drm_firewall *fw, u32 offset) +{ + bool is_addr; + + is_addr = fw->client->ops->is_addr_reg(fw->client->base.dev, + fw->client->base.class, offset); + if (is_addr) + return -EINVAL; + + return 0; +} + +enum { +HOST1X_OPCODE_SETCLASS = 0x00, +HOST1X_OPCODE_INCR = 0x01, +HOST1X_OPCODE_NONINCR = 0x02, +HOST1X_OPCODE_MASK = 0x03, +HOST1X_OPCODE_IMM = 0x04, +HOST1X_OPCODE_RESTART = 0x05, +HOST1X_OPCODE_GATHER= 0x06, +HOST1X_OPCODE_SETSTRMID = 0x07, +HOST1X_OPCODE_SETAPPID = 0x08, +HOST1X_OPCODE_SETPYLD = 0x09, +HOST1X_OPCODE_INCR_W= 0x0a, +HOST1X_OPCODE_NONINCR_W = 0x0b, +HOST1X_OPCODE_GATHER_W = 0x0c, +HOST1X_OPCODE_RESTART_W = 0x0d, +HOST1X_OPCODE_EXTEND= 0x0e, +}; + +int tegra_drm_fw_validate(struct tegra_drm_client *client, u32 *data, u32 start, + u32 words, struct tegra_drm_submit_data *submit) +{ + struct tegra_drm_firewall fw = { + .submit = submit, + .client = client, + .data = data, + 
.pos = start, + .end = start+words, + }; + bool payl
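The two basic building blocks of the firewall above (bounded word fetch and address validation against the job's mappings) can be sketched in userspace as follows. This is only an illustration: `struct mapping`, `struct firewall` and `fw_check_all_addrs()` are hypothetical, trimmed-down stand-ins, and the stream is assumed to consist of address words only, which the real firewall does not assume.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct tegra_drm_mapping: one IOVA range. */
struct mapping {
	uint32_t iova;
	uint32_t iova_end;	/* treated as inclusive, mirroring the <= in the patch */
};

/* Hypothetical, trimmed-down version of struct tegra_drm_firewall. */
struct firewall {
	const struct mapping *mappings;
	size_t num_mappings;
	const uint32_t *data;
	uint32_t pos;
	uint32_t end;
};

/* Fetch the next command word, refusing to read past the end of the stream. */
static int fw_next(struct firewall *fw, uint32_t *word)
{
	if (fw->pos == fw->end)
		return -EINVAL;

	*word = fw->data[fw->pos++];

	return 0;
}

/* An address is accepted only if it falls inside one of the job's mappings. */
static bool fw_addr_valid(const struct firewall *fw, uint32_t addr)
{
	size_t i;

	for (i = 0; i < fw->num_mappings; i++) {
		const struct mapping *m = &fw->mappings[i];

		if (addr >= m->iova && addr <= m->iova_end)
			return true;
	}

	return false;
}

/* Simplification: treat every word of the stream as an address to validate. */
static int fw_check_all_addrs(struct firewall *fw)
{
	uint32_t word;

	while (fw_next(fw, &word) == 0) {
		if (!fw_addr_valid(fw, word))
			return -EINVAL;
	}

	return 0;
}
```

Whether `iova_end` is meant to be inclusive is not visible in the quoted hunk; the sketch simply copies the comparison used there.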
[PATCH v3 12/20] gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
Add support for inserting syncpoint waits in the CDMA pushbuffer. These waits need to be done in HOST1X class, while gather submitted by the application execute in engine class. Support is added by converting the gather list of job into a command list that can include both gathers and waits. When the job is submitted, these commands are pushed as the appropriate opcodes on the CDMA pushbuffer. Signed-off-by: Mikko Perttunen --- drivers/gpu/host1x/hw/channel_hw.c | 51 +++ drivers/gpu/host1x/hw/debug_hw.c | 9 +++- drivers/gpu/host1x/job.c | 67 +- drivers/gpu/host1x/job.h | 14 +++ include/linux/host1x.h | 5 ++- 5 files changed, 105 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index bf21512e5078..d88a32f73f5e 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -55,31 +55,46 @@ static void submit_gathers(struct host1x_job *job) #endif unsigned int i; - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; - dma_addr_t addr = g->base + g->offset; - u32 op2, op3; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_cmd *cmd = &job->cmds[i]; - op2 = lower_32_bits(addr); - op3 = upper_32_bits(addr); + if (cmd->is_wait) { + /* TODO use modern wait */ + host1x_cdma_push(cdma, +host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + host1x_uclass_wait_syncpt_r(), 1), +host1x_class_host_wait_syncpt(cmd->wait.id, + cmd->wait.threshold)); + host1x_cdma_push( + cdma, host1x_opcode_setclass(job->class, 0, 0), + HOST1X_OPCODE_NOP); + } else { + struct host1x_job_gather *g = &cmd->gather; - trace_write_gather(cdma, g->bo, g->offset, g->words); + dma_addr_t addr = g->base + g->offset; + u32 op2, op3; - if (op3 != 0) { + op2 = lower_32_bits(addr); + op3 = upper_32_bits(addr); + + trace_write_gather(cdma, g->bo, g->offset, g->words); + + if (op3 != 0) { #if HOST1X_HW >= 6 - u32 op1 = host1x_opcode_gather_wide(g->words); - u32 op4 = 
HOST1X_OPCODE_NOP; + u32 op1 = host1x_opcode_gather_wide(g->words); + u32 op4 = HOST1X_OPCODE_NOP; - host1x_cdma_push_wide(cdma, op1, op2, op3, op4); + host1x_cdma_push_wide(cdma, op1, op2, op3, op4); #else - dev_err(dev, "invalid gather for push buffer %pad\n", - &addr); - continue; + dev_err(dev, "invalid gather for push buffer %pad\n", + &addr); + continue; #endif - } else { - u32 op1 = host1x_opcode_gather(g->words); + } else { + u32 op1 = host1x_opcode_gather(g->words); - host1x_cdma_push(cdma, op1, op2); + host1x_cdma_push(cdma, op1, op2); + } } } } @@ -126,7 +141,7 @@ static int channel_submit(struct host1x_job *job) struct host1x *host = dev_get_drvdata(ch->dev->parent); trace_host1x_channel_submit(dev_name(ch->dev), - job->num_gathers, job->num_relocs, + job->num_cmds, job->num_relocs, job->syncpt->id, job->syncpt_incrs); /* before error checks, return current max */ diff --git a/drivers/gpu/host1x/hw/debug_hw.c b/drivers/gpu/host1x/hw/debug_hw.c index ceb48229d14b..35952fd5597e 100644 --- a/drivers/gpu/host1x/hw/debug_hw.c +++ b/drivers/gpu/host1x/hw/debug_hw.c @@ -208,10 +208,15 @@ static void show_channel_gathers(struct output *o, struct host1x_cdma *cdma) job->first_get, job->timeout, job->num_slots, job->num_unpins); - for (i = 0; i < job->num_gathers; i++) { - struct host1x_job_gather *g = &job->gathers[i]; + for (i = 0; i < job->num_cmds; i++) { + struct host1x_job_gather *g; u32 *mapped; + if (job->cmds[i].is_wait) +
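The idea of replacing the gather list with a command list that mixes gathers and waits can be modelled roughly as below. The struct layouts and `count_pushes()` are hypothetical (the job.h hunk is not shown above); the only thing taken from the patch is that a wait results in two host1x_cdma_push() calls (switch to HOST1X class and wait, then switch back to the engine class), while a gather results in at most one push.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified payloads for the two command kinds. */
struct job_gather {
	uint32_t words;
	uint64_t addr;
};

struct job_wait {
	uint32_t id;
	uint32_t threshold;
};

/* One entry of the command list: either a gather or a syncpoint wait. */
struct job_cmd {
	bool is_wait;
	union {
		struct job_gather gather;
		struct job_wait wait;
	};
};

/*
 * Count the pushbuffer pushes a command list would generate, assuming
 * every gather is pushable (no rejected wide address on old hardware).
 */
static size_t count_pushes(const struct job_cmd *cmds, size_t num_cmds)
{
	size_t i, pushes = 0;

	for (i = 0; i < num_cmds; i++)
		pushes += cmds[i].is_wait ? 2 : 1;

	return pushes;
}
```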
[PATCH v3 16/20] drm/tegra: Boot VIC during runtime PM resume
With the new UAPI implementation, engines are powered on and off when there are active jobs, and the core code handles channel allocation. To accommodate that, boot the engine as part of runtime PM instead of using the open_channel callback, which is not used by the new submit path. Signed-off-by: Mikko Perttunen --- v3: * runtime_get/put is now done directly from submit path, so no callbacks are added * Reworded. --- drivers/gpu/drm/tegra/vic.c | 114 +--- 1 file changed, 53 insertions(+), 61 deletions(-) diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c index cb476da59adc..5d2ad125dca3 100644 --- a/drivers/gpu/drm/tegra/vic.c +++ b/drivers/gpu/drm/tegra/vic.c @@ -29,7 +29,6 @@ struct vic_config { struct vic { struct falcon falcon; - bool booted; void __iomem *regs; struct tegra_drm_client client; @@ -52,48 +51,6 @@ static void vic_writel(struct vic *vic, u32 value, unsigned int offset) writel(value, vic->regs + offset); } -static int vic_runtime_resume(struct device *dev) -{ - struct vic *vic = dev_get_drvdata(dev); - int err; - - err = clk_prepare_enable(vic->clk); - if (err < 0) - return err; - - usleep_range(10, 20); - - err = reset_control_deassert(vic->rst); - if (err < 0) - goto disable; - - usleep_range(10, 20); - - return 0; - -disable: - clk_disable_unprepare(vic->clk); - return err; -} - -static int vic_runtime_suspend(struct device *dev) -{ - struct vic *vic = dev_get_drvdata(dev); - int err; - - err = reset_control_assert(vic->rst); - if (err < 0) - return err; - - usleep_range(2000, 4000); - - clk_disable_unprepare(vic->clk); - - vic->booted = false; - - return 0; -} - static int vic_boot(struct vic *vic) { #ifdef CONFIG_IOMMU_API @@ -103,9 +60,6 @@ static int vic_boot(struct vic *vic) void *hdr; int err = 0; - if (vic->booted) - return 0; - #ifdef CONFIG_IOMMU_API if (vic->config->supports_sid && spec) { u32 value; @@ -153,8 +107,6 @@ static int vic_boot(struct vic *vic) return err; } - vic->booted = true; - return 0; } @@ 
-308,35 +260,76 @@ static int vic_load_firmware(struct vic *vic) return err; } -static int vic_open_channel(struct tegra_drm_client *client, - struct tegra_drm_context *context) + +static int vic_runtime_resume(struct device *dev) { - struct vic *vic = to_vic(client); + struct vic *vic = dev_get_drvdata(dev); int err; - err = pm_runtime_get_sync(vic->dev); + err = clk_prepare_enable(vic->clk); if (err < 0) return err; + usleep_range(10, 20); + + err = reset_control_deassert(vic->rst); + if (err < 0) + goto disable; + + usleep_range(10, 20); + err = vic_load_firmware(vic); if (err < 0) - goto rpm_put; + goto assert; err = vic_boot(vic); if (err < 0) - goto rpm_put; + goto assert; + + return 0; + +assert: + reset_control_assert(vic->rst); +disable: + clk_disable_unprepare(vic->clk); + return err; +} + +static int vic_runtime_suspend(struct device *dev) +{ + struct vic *vic = dev_get_drvdata(dev); + int err; + + err = reset_control_assert(vic->rst); + if (err < 0) + return err; + + usleep_range(2000, 4000); + + clk_disable_unprepare(vic->clk); + + return 0; +} + +static int vic_open_channel(struct tegra_drm_client *client, + struct tegra_drm_context *context) +{ + struct vic *vic = to_vic(client); + int err; + + err = pm_runtime_get_sync(vic->dev); + if (err < 0) { + pm_runtime_put(vic->dev); + return err; + } context->channel = host1x_channel_get(vic->channel); if (!context->channel) { - err = -ENOMEM; - goto rpm_put; + pm_runtime_put(vic->dev); + return -ENOMEM; } return 0; - -rpm_put: - pm_runtime_put(vic->dev); - return err; } static void vic_close_channel(struct tegra_drm_context *context) @@ -344,7 +337,6 @@ static void vic_close_channel(struct tegra_drm_context *context) struct vic *vic = to_vic(context->client); host1x_channel_put(context->channel); - pm_runtime_put(vic->dev); } -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
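The ordering in the new vic_runtime_resume() (clock, reset, firmware, boot, with the error labels undoing the steps in reverse) can be tried out with a small userspace model. Everything here is hypothetical scaffolding: `struct fake_engine` and its helpers only record state and are not the clk/reset/falcon APIs.

```c
#include <stdbool.h>

/* Hypothetical device model: flags stand in for the clock and reset line. */
struct fake_engine {
	bool clk_on;
	bool in_reset;
	bool fail_boot;		/* inject a boot failure for demonstration */
	int boot_count;
};

static int fake_clk_enable(struct fake_engine *e) { e->clk_on = true; return 0; }
static void fake_clk_disable(struct fake_engine *e) { e->clk_on = false; }
static int fake_reset_deassert(struct fake_engine *e) { e->in_reset = false; return 0; }
static void fake_reset_assert(struct fake_engine *e) { e->in_reset = true; }
static int fake_load_firmware(struct fake_engine *e) { (void)e; return 0; }

static int fake_boot(struct fake_engine *e)
{
	if (e->fail_boot)
		return -1;

	e->boot_count++;

	return 0;
}

/* Power up and boot; on failure, undo the completed steps in reverse order. */
static int engine_runtime_resume(struct fake_engine *e)
{
	int err;

	err = fake_clk_enable(e);
	if (err < 0)
		return err;

	err = fake_reset_deassert(e);
	if (err < 0)
		goto disable;

	err = fake_load_firmware(e);
	if (err < 0)
		goto assert_reset;

	err = fake_boot(e);
	if (err < 0)
		goto assert_reset;

	return 0;

assert_reset:
	fake_reset_assert(e);
disable:
	fake_clk_disable(e);
	return err;
}

/* Power down; no "booted" flag is needed because resume always boots. */
static void engine_runtime_suspend(struct fake_engine *e)
{
	fake_reset_assert(e);
	fake_clk_disable(e);
}
```

Because booting is tied to every resume, the separate `booted` flag that the patch removes has no counterpart in the model either.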
[PATCH v3 00/20] Host1x/TegraDRM UAPI
Hi all,

here's the third revision of the Host1x/TegraDRM UAPI proposal. The open issues from RFCv2 should be resolved now, so I'm dropping the RFC tag.

The series is still only tested with Tegra186 so I'm hoping for people with devices with other chips to test this out. The test suite[1] has been updated for the changes in this revision, and also includes tests for the newly added DMA reservation support.

If there are no further issues with the UAPI definition, I'll look at porting other userspace next - hoping for some help with that as well since most of it is for chips I don't have easy access to.

The series can also be found at
https://github.com/cyndis/linux/commits/work/host1x-uapi-v3.

Older versions:
v1: https://www.spinics.net/lists/linux-tegra/msg51000.html
v2: https://www.spinics.net/lists/linux-tegra/msg53061.html

Thank you,
Mikko

[1] https://github.com/cyndis/uapi-test

Mikko Perttunen (20):
  gpu: host1x: Use different lock classes for each client
  gpu: host1x: Allow syncpoints without associated client
  gpu: host1x: Show number of pending waiters in debugfs
  gpu: host1x: Remove cancelled waiters immediately
  gpu: host1x: Use HW-equivalent syncpoint expiration check
  gpu: host1x: Cleanup and refcounting for syncpoints
  gpu: host1x: Introduce UAPI header
  gpu: host1x: Implement /dev/host1x device node
  gpu: host1x: DMA fences and userspace fence creation
  gpu: host1x: Add no-recovery mode
  gpu: host1x: Add job release callback
  gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
  gpu: host1x: Reset max value when freeing a syncpoint
  gpu: host1x: Reserve VBLANK syncpoints at initialization
  drm/tegra: Add new UAPI to header
  drm/tegra: Boot VIC during runtime PM resume
  drm/tegra: Set resv fields when importing/exporting GEMs
  drm/tegra: Allocate per-engine channel in core code
  drm/tegra: Implement new UAPI
  drm/tegra: Add job firewall

 drivers/gpu/drm/tegra/Makefile         |   4 +
 drivers/gpu/drm/tegra/dc.c             |  10 +-
 drivers/gpu/drm/tegra/drm.c            |  75 ++-
 drivers/gpu/drm/tegra/drm.h            |   9 +
 drivers/gpu/drm/tegra/gem.c            |   2 +
 drivers/gpu/drm/tegra/gr2d.c           |   4 +-
 drivers/gpu/drm/tegra/gr3d.c           |   4 +-
 drivers/gpu/drm/tegra/uapi.h           |  63 +++
 drivers/gpu/drm/tegra/uapi/firewall.c  | 197 +++
 drivers/gpu/drm/tegra/uapi/gather_bo.c |  86
 drivers/gpu/drm/tegra/uapi/gather_bo.h |  22 +
 drivers/gpu/drm/tegra/uapi/submit.c    | 679 +
 drivers/gpu/drm/tegra/uapi/submit.h    |  20 +
 drivers/gpu/drm/tegra/uapi/uapi.c      | 326
 drivers/gpu/drm/tegra/vic.c            | 118 ++---
 drivers/gpu/host1x/Makefile            |   2 +
 drivers/gpu/host1x/bus.c               |   7 +-
 drivers/gpu/host1x/cdma.c              |  69 ++-
 drivers/gpu/host1x/debug.c             |  14 +-
 drivers/gpu/host1x/dev.c               |  15 +
 drivers/gpu/host1x/dev.h               |  16 +-
 drivers/gpu/host1x/fence.c             | 207
 drivers/gpu/host1x/fence.h             |  13 +
 drivers/gpu/host1x/hw/cdma_hw.c        |   2 +-
 drivers/gpu/host1x/hw/channel_hw.c     |  63 ++-
 drivers/gpu/host1x/hw/debug_hw.c       |  11 +-
 drivers/gpu/host1x/intr.c              |  23 +-
 drivers/gpu/host1x/intr.h              |   2 +
 drivers/gpu/host1x/job.c               |  79 ++-
 drivers/gpu/host1x/job.h               |  14 +
 drivers/gpu/host1x/syncpt.c            | 185 ---
 drivers/gpu/host1x/syncpt.h            |  16 +-
 drivers/gpu/host1x/uapi.c              | 382 ++
 drivers/gpu/host1x/uapi.h              |  22 +
 include/linux/host1x.h                 |  47 +-
 include/uapi/drm/tegra_drm.h           | 420 ++-
 include/uapi/linux/host1x.h            | 134 +
 37 files changed, 3076 insertions(+), 286 deletions(-)
 create mode 100644 drivers/gpu/drm/tegra/uapi.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/firewall.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/gather_bo.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.c
 create mode 100644 drivers/gpu/drm/tegra/uapi/submit.h
 create mode 100644 drivers/gpu/drm/tegra/uapi/uapi.c
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers/gpu/host1x/fence.h
 create mode 100644 drivers/gpu/host1x/uapi.c
 create mode 100644 drivers/gpu/host1x/uapi.h
 create mode 100644 include/uapi/linux/host1x.h

-- 
2.28.0
[PATCH v3 08/20] gpu: host1x: Implement /dev/host1x device node
Add the /dev/host1x device node, implementing the following functionality: - Reading syncpoint values - Allocating syncpoints (providing syncpoint FDs) - Incrementing syncpoints (based on syncpoint FD) Signed-off-by: Mikko Perttunen --- v3: * Pass process name as syncpoint name when allocating syncpoint. --- drivers/gpu/host1x/Makefile | 1 + drivers/gpu/host1x/dev.c| 9 ++ drivers/gpu/host1x/dev.h| 3 + drivers/gpu/host1x/uapi.c | 276 drivers/gpu/host1x/uapi.h | 22 +++ include/linux/host1x.h | 2 + 6 files changed, 313 insertions(+) create mode 100644 drivers/gpu/host1x/uapi.c create mode 100644 drivers/gpu/host1x/uapi.h diff --git a/drivers/gpu/host1x/Makefile b/drivers/gpu/host1x/Makefile index 096017b8789d..882f928d75e1 100644 --- a/drivers/gpu/host1x/Makefile +++ b/drivers/gpu/host1x/Makefile @@ -9,6 +9,7 @@ host1x-y = \ job.o \ debug.o \ mipi.o \ + uapi.o \ hw/host1x01.o \ hw/host1x02.o \ hw/host1x04.o \ diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index d0ebb70e2fdd..641317d23828 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -461,6 +461,12 @@ static int host1x_probe(struct platform_device *pdev) goto deinit_syncpt; } + err = host1x_uapi_init(&host->uapi, host); + if (err) { + dev_err(&pdev->dev, "failed to initialize uapi\n"); + goto deinit_intr; + } + host1x_debug_init(host); if (host->info->has_hypervisor) @@ -480,6 +486,8 @@ static int host1x_probe(struct platform_device *pdev) host1x_unregister(host); deinit_debugfs: host1x_debug_deinit(host); + host1x_uapi_deinit(&host->uapi); +deinit_intr: host1x_intr_deinit(host); deinit_syncpt: host1x_syncpt_deinit(host); @@ -501,6 +509,7 @@ static int host1x_remove(struct platform_device *pdev) host1x_unregister(host); host1x_debug_deinit(host); + host1x_uapi_deinit(&host->uapi); host1x_intr_deinit(host); host1x_syncpt_deinit(host); reset_control_assert(host->rst); diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index 63010ae37a97..7b8b7e20e32b 100644 --- 
a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -17,6 +17,7 @@ #include "intr.h" #include "job.h" #include "syncpt.h" +#include "uapi.h" struct host1x_syncpt; struct host1x_syncpt_base; @@ -143,6 +144,8 @@ struct host1x { struct list_head list; struct device_dma_parameters dma_parms; + + struct host1x_uapi uapi; }; void host1x_hypervisor_writel(struct host1x *host1x, u32 r, u32 v); diff --git a/drivers/gpu/host1x/uapi.c b/drivers/gpu/host1x/uapi.c new file mode 100644 index ..4747d8de132e --- /dev/null +++ b/drivers/gpu/host1x/uapi.c @@ -0,0 +1,276 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * /dev/host1x syncpoint interface + * + * Copyright (c) 2020, NVIDIA Corporation. + */ + +#include +#include +#include +#include +#include +#include + +#include "dev.h" +#include "syncpt.h" +#include "uapi.h" + +#include + +static int syncpt_file_release(struct inode *inode, struct file *file) +{ + struct host1x_syncpt *sp = file->private_data; + + host1x_syncpt_put(sp); + + return 0; +} + +static int syncpt_file_ioctl_info(struct host1x_syncpt *sp, void __user *data) +{ + struct host1x_syncpoint_info args; + unsigned long copy_err; + + copy_err = copy_from_user(&args, data, sizeof(args)); + if (copy_err) + return -EFAULT; + + if (args.reserved[0] || args.reserved[1] || args.reserved[2]) + return -EINVAL; + + args.id = sp->id; + + copy_err = copy_to_user(data, &args, sizeof(args)); + if (copy_err) + return -EFAULT; + + return 0; +} + +static int syncpt_file_ioctl_incr(struct host1x_syncpt *sp, void __user *data) +{ + struct host1x_syncpoint_increment args; + unsigned long copy_err; + u32 i; + + copy_err = copy_from_user(&args, data, sizeof(args)); + if (copy_err) + return -EFAULT; + + for (i = 0; i < args.count; i++) { + host1x_syncpt_incr(sp); + if (signal_pending(current)) + return -EINTR; + } + + return 0; +} + +static long syncpt_file_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + void __user *data = (void __user *)arg; + long err; 
+ + switch (cmd) { + case HOST1X_IOCTL_SYNCPOINT_INFO: + err = syncpt_file_ioctl_info(file->private_data, data); + break; + + case HOST1X_IOCTL_SYNCPOINT_INCREMENT: + err = syncpt_file_ioctl_incr(file->private_data, data); + break; + + default: + err = -ENOTTY; + } + +
[PATCH v3 13/20] gpu: host1x: Reset max value when freeing a syncpoint
With job recovery becoming optional, syncpoints may have a mismatch between their value and max value when freed. As such, when freeing, set the max value to the current value of the syncpoint so that it is in a sane state for the next user.

Signed-off-by: Mikko Perttunen
---
v3:
* Use host1x_syncpt_read instead of read_min to ensure syncpoint value is current.
---
 drivers/gpu/host1x/syncpt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c
index 8d658e5f7db2..99d31932eb34 100644
--- a/drivers/gpu/host1x/syncpt.c
+++ b/drivers/gpu/host1x/syncpt.c
@@ -385,6 +385,7 @@ static void syncpt_release(struct kref *ref)
 {
 	struct host1x_syncpt *sp = container_of(ref, struct host1x_syncpt, ref);
 
+	atomic_set(&sp->max_val, host1x_syncpt_read(sp));
 	sp->locked = false;
 
 	mutex_lock(&sp->host->syncpt_mutex);
-- 
2.28.0
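To illustrate the mismatch this patch guards against, here is a toy userspace model. `struct fake_syncpt` and its helpers are hypothetical; plain integers stand in for the hardware counter and the kernel's atomic max value.

```c
#include <stdint.h>

/* Hypothetical, minimal model of a syncpoint. */
struct fake_syncpt {
	uint32_t hw_value;	/* what the hardware counter currently reads */
	uint32_t max_val;	/* value the counter reaches once all submitted jobs finish */
};

/* A job submission reserves 'incrs' future increments. */
static uint32_t fake_syncpt_incr_max(struct fake_syncpt *sp, uint32_t incrs)
{
	sp->max_val += incrs;

	return sp->max_val;
}

static uint32_t fake_syncpt_read(const struct fake_syncpt *sp)
{
	return sp->hw_value;
}

/* On release, drop any reserved-but-never-performed increments. */
static void fake_syncpt_release(struct fake_syncpt *sp)
{
	sp->max_val = fake_syncpt_read(sp);
}
```

If a job reserves three increments and then times out without recovery, `hw_value` stays behind `max_val`; after release the two agree again, so the next owner starts from a consistent state.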
[PATCH v3 17/20] drm/tegra: Set resv fields when importing/exporting GEMs
To allow sharing of implicit fences when exporting/importing dma_buf objects, set the 'resv' fields when importing or exporting GEM objects.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/drm/tegra/gem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 723df142a981..4a8acd4724bd 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -423,6 +423,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
 	}
 
 	bo->gem.import_attach = attach;
+	bo->gem.resv = buf->resv;
 
 	return bo;
 
@@ -675,6 +676,7 @@ struct dma_buf *tegra_gem_prime_export(struct drm_gem_object *gem,
 	exp_info.size = gem->size;
 	exp_info.flags = flags;
 	exp_info.priv = gem;
+	exp_info.resv = gem->resv;
 
 	return drm_gem_dmabuf_export(gem->dev, &exp_info);
 }
-- 
2.28.0
[PATCH v3 01/20] gpu: host1x: Use different lock classes for each client
To avoid false lockdep warnings, give each client lock a different lock class, passed from the initialization site by macro.

Signed-off-by: Mikko Perttunen
---
 drivers/gpu/host1x/bus.c | 7 ---
 include/linux/host1x.h   | 9 -
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index e201f62d62c0..4101f64bd545 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -714,13 +714,14 @@ EXPORT_SYMBOL(host1x_driver_unregister);
  * device and call host1x_device_init(), which will in turn call each client's
  * &host1x_client_ops.init implementation.
  */
-int host1x_client_register(struct host1x_client *client)
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key)
 {
 	struct host1x *host1x;
 	int err;
 
 	INIT_LIST_HEAD(&client->list);
-	mutex_init(&client->lock);
+	__mutex_init(&client->lock, "host1x client lock", key);
 	client->usecount = 0;
 
 	mutex_lock(&devices_lock);
@@ -741,7 +742,7 @@ int host1x_client_register(struct host1x_client *client)
 
 	return 0;
 }
-EXPORT_SYMBOL(host1x_client_register);
+EXPORT_SYMBOL(__host1x_client_register);
 
 /**
  * host1x_client_unregister() - unregister a host1x client
diff --git a/include/linux/host1x.h b/include/linux/host1x.h
index 20c885d0bddc..f711fc0154f4 100644
--- a/include/linux/host1x.h
+++ b/include/linux/host1x.h
@@ -320,7 +320,14 @@ static inline struct host1x_device *to_host1x_device(struct device *dev)
 int host1x_device_init(struct host1x_device *device);
 int host1x_device_exit(struct host1x_device *device);
 
-int host1x_client_register(struct host1x_client *client);
+int __host1x_client_register(struct host1x_client *client,
+			     struct lock_class_key *key);
+#define host1x_client_register(class) \
+	({ \
+		static struct lock_class_key __key; \
+		__host1x_client_register(class, &__key); \
+	})
+
 int host1x_client_unregister(struct host1x_client *client);
 
 int host1x_client_suspend(struct host1x_client *client);
-- 
2.28.0
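The trick in the host1x_client_register() macro above is that a static variable inside a GNU C statement expression is instantiated once per expansion, so every call site gets its own key address. A minimal userspace sketch of that mechanism, with hypothetical names (`struct class_key`, `client_register_key()`, the two `register_from_site_*()` helpers) and no actual lockdep involved:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct lock_class_key: only its address matters. */
struct class_key {
	char dummy;
};

/* Hypothetical stand-in for struct host1x_client. */
struct client {
	const struct class_key *key;
};

/* Worker function: records which key the caller passed in. */
static int client_register_key(struct client *c, const struct class_key *key)
{
	c->key = key;

	return 0;
}

/*
 * GNU C statement expression (GCC/Clang extension): each expansion of the
 * macro defines a distinct static key, so each call site has its own class.
 */
#define client_register(c) \
	({ \
		static struct class_key site_key; \
		client_register_key(c, &site_key); \
	})

/* Two separate call sites, e.g. two different engine drivers. */
static int register_from_site_a(struct client *c)
{
	return client_register(c);
}

static int register_from_site_b(struct client *c)
{
	return client_register(c);
}
```

Clients registered through the same call site share a key, while clients registered from different sites do not, which is the granularity the commit message asks for.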
Re: [PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM
On Wed, Oct 7, 2020 at 6:53 PM Jason Gunthorpe wrote:
>
> On Wed, Oct 07, 2020 at 06:44:18PM +0200, Daniel Vetter wrote:
> >
> > -	/*
> > -	 * While get_vaddr_frames() could be used for transient (kernel
> > -	 * controlled lifetime) pinning of memory pages all current
> > -	 * users establish long term (userspace controlled lifetime)
> > -	 * page pinning. Treat get_vaddr_frames() like
> > -	 * get_user_pages_longterm() and disallow it for filesystem-dax
> > -	 * mappings.
> > -	 */
> > -	if (vma_is_fsdax(vma)) {
> > -		ret = -EOPNOTSUPP;
> > -		goto out;
> > -	}
> > -
> > -	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> > -		vec->got_ref = true;
> > -		vec->is_pfns = false;
> > -		ret = pin_user_pages_locked(start, nr_frames,
> > -			gup_flags, (struct page **)(vec->ptrs), &locked);
> > -		goto out;
> > -	}
>
> The vm_flags still need to be checked before going into the while
> loop. If the break is taken then nothing would check vm_flags

Hm right, that's a bit inconsistent. follow_pfn also checks for this, so I think we can just ditch this entirely both here and in the do {} while () check, simplifying the latter to just while (vma). Well, probably just make it a real loop with less confusing control flow.

Or would you prefer I keep this and touch the code less?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
[PATCH v3 4/6] drm/dp: Add LTTPR helpers
Add the helpers and register definitions needed to read out the common and per-PHY LTTPR capabilities and perform link training in the LTTPR non-transparent mode. v2: - Add drm_dp_dpcd_read_phy_link_status() and DP_PHY_LTTPR() here instead of adding these to i915. (Ville) v3: - Use memmove() to convert LTTPR to DPRX link status format. (Ville) Cc: dri-devel@lists.freedesktop.org Cc: Ville Syrjälä Reviewed-by: Ville Syrjälä Signed-off-by: Imre Deak --- drivers/gpu/drm/drm_dp_helper.c | 232 +++- include/drm/drm_dp_helper.h | 62 + 2 files changed, 290 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c index 478dd51f738d..79732402336d 100644 --- a/drivers/gpu/drm/drm_dp_helper.c +++ b/drivers/gpu/drm/drm_dp_helper.c @@ -150,11 +150,8 @@ void drm_dp_link_train_clock_recovery_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE]) } EXPORT_SYMBOL(drm_dp_link_train_clock_recovery_delay); -void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE]) +static void __drm_dp_link_train_channel_eq_delay(unsigned long rd_interval) { - unsigned long rd_interval = dpcd[DP_TRAINING_AUX_RD_INTERVAL] & -DP_TRAINING_AUX_RD_MASK; - if (rd_interval > 4) DRM_DEBUG_KMS("AUX interval %lu, out of range (max 4)\n", rd_interval); @@ -166,8 +163,35 @@ void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE]) usleep_range(rd_interval, rd_interval * 2); } + +void drm_dp_link_train_channel_eq_delay(const u8 dpcd[DP_RECEIVER_CAP_SIZE]) +{ + __drm_dp_link_train_channel_eq_delay(dpcd[DP_TRAINING_AUX_RD_INTERVAL] & +DP_TRAINING_AUX_RD_MASK); +} EXPORT_SYMBOL(drm_dp_link_train_channel_eq_delay); +void drm_dp_lttpr_link_train_clock_recovery_delay(void) +{ + usleep_range(100, 200); +} +EXPORT_SYMBOL(drm_dp_lttpr_link_train_clock_recovery_delay); + +static u8 dp_lttpr_phy_cap(const u8 phy_cap[DP_LTTPR_PHY_CAP_SIZE], int r) +{ + return phy_cap[r - DP_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1]; +} + +void 
drm_dp_lttpr_link_train_channel_eq_delay(const u8 phy_cap[DP_LTTPR_PHY_CAP_SIZE]) +{ + u8 interval = dp_lttpr_phy_cap(phy_cap, + DP_TRAINING_AUX_RD_INTERVAL_PHY_REPEATER1) & + DP_TRAINING_AUX_RD_MASK; + + __drm_dp_link_train_channel_eq_delay(interval); +} +EXPORT_SYMBOL(drm_dp_lttpr_link_train_channel_eq_delay); + u8 drm_dp_link_rate_to_bw_code(int link_rate) { /* Spec says link_bw = link_rate / 0.27Gbps */ @@ -363,6 +387,59 @@ int drm_dp_dpcd_read_link_status(struct drm_dp_aux *aux, } EXPORT_SYMBOL(drm_dp_dpcd_read_link_status); +/** + * drm_dp_dpcd_read_phy_link_status - get the link status information for a DP PHY + * @aux: DisplayPort AUX channel + * @dp_phy: the DP PHY to get the link status for + * @link_status: buffer to return the status in + * + * Fetch the AUX DPCD registers for the DPRX or an LTTPR PHY link status. The + * layout of the returned @link_status matches the DPCD register layout of the + * DPRX PHY link status. + * + * Returns 0 if the information was read successfully or a negative error code + * on failure. 
+ */ +int drm_dp_dpcd_read_phy_link_status(struct drm_dp_aux *aux, +enum drm_dp_phy dp_phy, +u8 link_status[DP_LINK_STATUS_SIZE]) +{ + int ret; + + if (dp_phy == DP_PHY_DPRX) { + ret = drm_dp_dpcd_read(aux, + DP_LANE0_1_STATUS, + link_status, + DP_LINK_STATUS_SIZE); + + if (ret < 0) + return ret; + + WARN_ON(ret != DP_LINK_STATUS_SIZE); + + return 0; + } + + ret = drm_dp_dpcd_read(aux, + DP_LANE0_1_STATUS_PHY_REPEATER(dp_phy), + link_status, + DP_LINK_STATUS_SIZE - 1); + + if (ret < 0) + return ret; + + WARN_ON(ret != DP_LINK_STATUS_SIZE - 1); + + /* Convert the LTTPR to the sink PHY link status layout */ + memmove(&link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS + 1], + &link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS], + DP_LINK_STATUS_SIZE - (DP_SINK_STATUS - DP_LANE0_1_STATUS) - 1); + link_status[DP_SINK_STATUS - DP_LANE0_1_STATUS] = 0; + + return 0; +} +EXPORT_SYMBOL(drm_dp_dpcd_read_phy_link_status); + static bool is_edid_digital_input_dp(const struct edid *edid) { return edid && edid->revision >= 4 && @@ -2098,6 +2175,153 @@ int drm_dp_dsc_sink_supported_input_bpcs(const u8 dsc_dpcd[DP_DSC_RECEIVER_CAP_S } EXPORT_SYMBOL(drm_dp_dsc_sink_supported_input_bpcs); +/** + *
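The memmove() in drm_dp_dpcd_read_phy_link_status() opens a one-byte hole where the DPRX layout has its sink status byte, which an LTTPR does not report. A standalone sketch of just that conversion follows; the three constants are the DPCD values as I understand them (0x202, 0x205 and a 6-byte status block) and should be treated as assumptions, and `lttpr_to_dprx_link_status()` is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Assumed DPCD values; check them against include/drm/drm_dp_helper.h. */
#define DP_LANE0_1_STATUS	0x202
#define DP_SINK_STATUS		0x205
#define DP_LINK_STATUS_SIZE	6

/*
 * Convert 5 bytes read from an LTTPR (no sink status byte) into the 6-byte
 * DPRX layout by shifting the trailing bytes up and zeroing the gap.
 */
static void lttpr_to_dprx_link_status(uint8_t link_status[DP_LINK_STATUS_SIZE])
{
	const int sink = DP_SINK_STATUS - DP_LANE0_1_STATUS;

	/* Source and destination overlap, hence memmove() and not memcpy(). */
	memmove(&link_status[sink + 1], &link_status[sink],
		DP_LINK_STATUS_SIZE - sink - 1);
	link_status[sink] = 0;
}
```

With these constants the two adjust-request bytes move from indices 3 and 4 to indices 4 and 5, and index 3 reads as zero.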
Re: [PATCH 2/5] thermal: devfreq_cooling: get a copy of device status
On Monday 21 Sep 2020 at 13:20:04 (+0100), Lukasz Luba wrote:
> Devfreq cooling needs to now the correct status of the device in order
> to operate. Do not rely on Devfreq last_status which might be a stale data
> and get more up-to-date values of the load.
>
> Devfreq framework can change the device status in the background. To
> mitigate this situation make a copy of the status structure and use it
> for internal calculations.
>
> In addition this patch adds normalization function, which also makes sure
> that whatever data comes from the device, it is in a sane range.
>
> Signed-off-by: Lukasz Luba
> ---
> drivers/thermal/devfreq_cooling.c | 52 +--
> 1 file changed, 43 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/thermal/devfreq_cooling.c
> b/drivers/thermal/devfreq_cooling.c
> index 7063ccb7b86d..cf045bd4d16b 100644
> --- a/drivers/thermal/devfreq_cooling.c
> +++ b/drivers/thermal/devfreq_cooling.c
> @@ -227,6 +227,24 @@ static inline unsigned long get_total_power(struct
> devfreq_cooling_device *dfc,
> voltage);
> }
>
> +static void _normalize_load(struct devfreq_dev_status *status)

Is there a reason for the leading "_" ? AFAIK, "__name()" is meant to suggest a "worker" function for another "name()" function, but that would not apply here.

> +{
> + /* Make some space if needed */
> + if (status->busy_time > 0x) {
> + status->busy_time >>= 10;
> + status->total_time >>= 10;
> + }

How about removing the above code and adding here:

status->busy_time = status->busy_time ? : 1;

> +
> + if (status->busy_time > status->total_time)

This check would then cover the possibility that total_time is 0.

> + status->busy_time = status->total_time;

But a reversal is needed here:

status->total_time = status->busy_time;

> +
> + status->busy_time *= 100;
> + status->busy_time /= status->total_time ? : 1;
> +
> + /* Avoid division by 0 */
> + status->busy_time = status->busy_time ? : 1;
> + status->total_time = 100;

Then all of this code can be replaced by:

status->busy_time = (unsigned long)div64_u64((u64)status->busy_time << 10, status->total_time);
status->total_time = 1 << 10;

This way you gain some resolution to busy_time and the divisions in the callers would just become shifts by 10.

Hope it helps,
Ionela.
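Putting the review suggestions together, the normalization would look roughly like the userspace sketch below. `struct load_status`, `LOAD_SHIFT` and `normalize_load()` are hypothetical names, and a plain 64-bit division stands in for the kernel's div64_u64().

```c
#include <stdint.h>

/* Hypothetical stand-in for the two struct devfreq_dev_status fields used here. */
struct load_status {
	unsigned long busy_time;
	unsigned long total_time;
};

#define LOAD_SHIFT 10	/* fixed-point scale: total_time becomes 1 << 10 */

static void normalize_load(struct load_status *status)
{
	/* Mirrors the suggested "busy_time ? : 1": never report a zero load. */
	if (status->busy_time == 0)
		status->busy_time = 1;

	/* Raising total_time also covers the total_time == 0 case. */
	if (status->busy_time > status->total_time)
		status->total_time = status->busy_time;

	/*
	 * Widen to 64 bits before shifting so the scaling does not overflow
	 * a 32-bit unsigned long; busy_time is assumed to fit in 54 bits.
	 */
	status->busy_time = (unsigned long)
		(((uint64_t)status->busy_time << LOAD_SHIFT) / status->total_time);
	status->total_time = 1UL << LOAD_SHIFT;
}
```

For example, a device that was busy for 50 out of 100 time units ends up with busy_time 512 and total_time 1024, so a caller can compute `power * busy_time >> 10` instead of dividing.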
Re: [PATCH v2 0/3] drm: commit_work scheduling
On 10/07/20 08:57, Rob Clark wrote:
> Yeah, I think we will end up making some use of uclamp.. there is
> someone else working on that angle
>
> But without it, this is a case that exposes legit prioritization
> problems with commit_work which we should fix ;-)

I wasn't suggesting this as an alternative to fixing the other problem. But it seemed you had a different problem here that I thought I could help with :-)

I did give my opinion about how to handle that priority issue. If the 2 threads are kernel threads and by design they need relative priorities, IMO the kernel needs to be taught to set this relative priority. It seemed the vblank worker could run as SCHED_DEADLINE. If this works, then the priority problem for commit_work disappears, as SCHED_DEADLINE will preempt RT. If commit_work uses sched_set_fifo(), its priority will be 50, hence your SF threads can no longer preempt it. And you can manage the SF threads to be any value you want relative to 50 anyway without having to manage commit_work itself.

I'm not sure if you have problems with RT tasks preempting important CFS tasks. My brain registered two conflicting statements.

Thanks

--
Qais Yousef
[PATCH 12/13] media/videbuf1|2: Mark follow_pfn usage as unsafe
The media model assumes that buffers are all preallocated, so that when a media pipeline is running we never miss a deadline because the buffers aren't allocated or available. This means we cannot fix the v4l follow_pfn usage through mmu_notifier, without breaking how this all works. The only real fix is to deprecate userptr support for VM_IO | VM_PFNMAP mappings and tell everyone to cut over to dma-buf memory sharing for zerocopy. userptr for normal memory will keep working as-is. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Kees Cook Cc: Dan Williams Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Figa Cc: Laurent Dufour Cc: Vlastimil Babka Cc: Daniel Jordan Cc: Michel Lespinasse --- drivers/media/common/videobuf2/frame_vector.c | 2 +- drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c index b95f4f371681..d56eb6258f09 100644 --- a/drivers/media/common/videobuf2/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -71,7 +71,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, unsigned long *nums = frame_vector_pfns(vec); while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) { - err = follow_pfn(vma, start, &nums[ret]); + err = unsafe_follow_pfn(vma, start, &nums[ret]); if (err) { if (ret == 0) ret = err; diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c index 52312ce2ba05..821c4a76ab96 100644 --- a/drivers/media/v4l2-core/videobuf-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c @@ -183,7 +183,7 @@ static int videobuf_dma_contig_user_get(struct 
videobuf_dma_contig_memory *mem, user_address = untagged_baddr; while (pages_done < (mem->size >> PAGE_SHIFT)) { - ret = follow_pfn(vma, user_address, &this_pfn); + ret = unsafe_follow_pfn(vma, user_address, &this_pfn); if (ret) break; -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 08/13] s390/pci: Remove races against pte updates
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
  ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
  cma regions. This means if we miss the unmap the pfn might contain
  pagecache or anon memory (well anything allocated with __GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
  iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
  ("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.

Since zpci_memcpy_from|toio seems to not do anything nefarious with
locks we just need to open code get_pfn and follow_pfn and make sure
we drop the locks only after we're done. The write function also needs
the copy_from_user move, since we can't take userspace faults while
holding the mmap sem.

Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Dan Williams Cc: Kees Cook Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Niklas Schnelle Cc: Gerald Schaefer Cc: linux-s...@vger.kernel.org --- arch/s390/pci/pci_mmio.c | 98 +++- 1 file changed, 57 insertions(+), 41 deletions(-) diff --git a/arch/s390/pci/pci_mmio.c b/arch/s390/pci/pci_mmio.c index 401cf670a243..4d194cb09372 100644 --- a/arch/s390/pci/pci_mmio.c +++ b/arch/s390/pci/pci_mmio.c @@ -119,33 +119,15 @@ static inline int __memcpy_toio_inuser(void __iomem *dst, return rc; } -static long get_pfn(unsigned long user_addr, unsigned long access, - unsigned long *pfn) -{ - struct vm_area_struct *vma; - long ret; - - mmap_read_lock(current->mm); - ret = -EINVAL; - vma = find_vma(current->mm, user_addr); - if (!vma) - goto out; - ret = -EACCES; - if (!(vma->vm_flags & access)) - goto out; - ret = follow_pfn(vma, user_addr, pfn); -out: - mmap_read_unlock(current->mm); - return ret; -} - SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr, const void __user *, user_buffer, size_t, length) { u8 local_buf[64]; void __iomem *io_addr; void *buf; - unsigned long pfn; + struct vm_area_struct *vma; + pte_t *ptep; + spinlock_t *ptl; long ret; if (!zpci_is_enabled()) @@ -158,7 +140,7 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr, * We only support write access to MIO capable devices if we are on * a MIO enabled system. Otherwise we would have to check for every * address if it is a special ZPCI_ADDR and would have to do -* a get_pfn() which we don't need for MIO capable devices. Currently +* a pfn lookup which we don't need for MIO capable devices. Currently * ISM devices are the only devices without MIO support and there is no * known need for accessing these from userspace. 
*/ @@ -176,21 +158,37 @@ SYSCALL_DEFINE3(s390_pci_mmio_write, unsigned long, mmio_addr, } else buf = local_buf; - ret = get_pfn(mmio_addr, VM_WRITE, &pfn); + ret = -EFAULT; + if (copy_from_user(buf, user_buffer, length)) + goto out_free; + + mmap_read_lock(current->mm); + ret = -EINVAL; + vma = find_vma(current->mm, mmio_addr); + if (!vma) + goto out_unlock_mmap; + ret = -EACCES; + if (!(vma->vm_flags & VM_WRITE)) + goto out_unlock_mmap; + if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) + goto out_unlock_mmap; + + ret = follow_pte_pmd(vma->vm_mm, mmio_addr, NULL, &ptep, NULL, &ptl); if (ret) - goto out; - io_addr = (void __iomem *)((pfn << PAGE_SHIFT) | + goto out_unlock_mmap; + + io_addr = (void __iomem *)((pte_pfn(*ptep) << PAGE_SHIFT) | (mmio_addr & ~PAGE_MASK)); - ret = -EFAULT; if ((unsigned long) io_addr < ZPCI_IOMAP_ADDR_BASE) - goto out; - - if (copy_from_user(buf, user_buffer, length)) - goto out; + goto out_unlock_pt; ret = zpci_memcpy_toio(io_addr, buf, length); -out: +out_unlock_pt: + pte_unmap_unlock(ptep, ptl); +out_unlock_mmap: + mmap_read_unlock(current->mm); +out_free: if (buf != local_buf) kfree(buf); return ret; @@ -274,7 +272,9 @@ SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr, u8 local_buf[64]; void __iomem *io_addr; void *buf; -
[PATCH 09/13] PCI: obey iomem restrictions for procfs mmap
There are three ways to access pci bars from userspace: /dev/mem,
sysfs files, and the old proc interface. Two check against
iomem_is_exclusive, proc never did. And with CONFIG_IO_STRICT_DEVMEM,
this starts to matter, since we don't want random userspace having
access to pci bars while a driver is loaded and using it.

Fix this.

References: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
Signed-off-by: Daniel Vetter
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Dan Williams
Cc: Andrew Morton
Cc: John Hubbard
Cc: Jérôme Glisse
Cc: Jan Kara
Cc: Dan Williams
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Bjorn Helgaas
Cc: linux-...@vger.kernel.org
---
 drivers/pci/proc.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index d35186b01d98..3a2f90beb4cb 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -274,6 +274,11 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 		else
 			return -EINVAL;
 	}
+
+	if (dev->resource[i].flags & IORESOURCE_MEM &&
+	    iomem_is_exclusive(dev->resource[i].start))
+		return -EINVAL;
+
 	ret = pci_mmap_page_range(dev, i, vma,
 				  fpriv->mmap_state, write_combine);
 	if (ret < 0)
-- 
2.28.0
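For readers who want to poke at the check outside the kernel, here is a
rough userspace model of the condition the hunk adds to
proc_bus_pci_mmap(). It is only a sketch: struct model_resource,
MODEL_IORESOURCE_MEM and the claimed_exclusive field are made-up
stand-ins (the real code asks iomem_is_exclusive() about the resource
start address), and the flag value is not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag value, chosen for the model only. */
#define MODEL_IORESOURCE_MEM 0x1u

/* Hypothetical, simplified stand-in for a PCI BAR resource. */
struct model_resource {
	unsigned long start;
	unsigned int flags;
	bool claimed_exclusive;	/* what iomem_is_exclusive() would report */
};

/*
 * Mirrors the shape of the added check: only memory BARs that a driver
 * has claimed exclusively are refused. Note that in C the bitwise '&'
 * binds tighter than '&&', so the patch's unparenthesised condition
 * parses the same way as the explicit form used here.
 */
static int model_proc_mmap(const struct model_resource *res)
{
	if ((res->flags & MODEL_IORESOURCE_MEM) && res->claimed_exclusive)
		return -EINVAL;

	return 0;	/* the real code would go on to map the range */
}
```

An idle memory BAR and a non-memory resource both pass in this model;
only the claimed memory BAR is rejected.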
[PATCH 02/13] drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
The exynos g2d interface is very unusual, but it looks like the userptr objects are persistent. Hence they need FOLL_LONGTERM. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Inki Dae Cc: Joonyoung Shim Cc: Seung-Woo Kim Cc: Kyungmin Park Cc: Kukjin Kim Cc: Krzysztof Kozlowski Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org --- drivers/gpu/drm/exynos/exynos_drm_g2d.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c index c83f6faac9de..514fd000feb1 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c @@ -478,7 +478,8 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d, goto err_free; } - ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, + ret = pin_user_pages_fast(start, npages, + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, g2d_userptr->pages); if (ret != npages) { DRM_DEV_ERROR(g2d->dev, -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 0/3] drm: commit_work scheduling
On Mon, Oct 5, 2020 at 5:15 AM Ville Syrjälä wrote: > > On Fri, Oct 02, 2020 at 10:55:52AM -0700, Rob Clark wrote: > > On Fri, Oct 2, 2020 at 4:05 AM Ville Syrjälä > > wrote: > > > > > > On Fri, Oct 02, 2020 at 01:52:56PM +0300, Ville Syrjälä wrote: > > > > On Thu, Oct 01, 2020 at 05:25:55PM +0200, Daniel Vetter wrote: > > > > > On Thu, Oct 1, 2020 at 5:15 PM Rob Clark wrote: > > > > > > > > > > > > I'm leaning towards converting the other drivers over to use the > > > > > > per-crtc kwork, and then dropping the 'commit_work` from atomic > > > > > > state. > > > > > > I can add a patch to that, but figured I could postpone that churn > > > > > > until there is some by-in on this whole idea. > > > > > > > > > > i915 has its own commit code, it's not even using the current commit > > > > > helpers (nor the commit_work). Not sure how much other fun there is. > > > > > > > > I don't think we want per-crtc threads for this in i915. Seems > > > > to me easier to guarantee atomicity across multiple crtcs if > > > > we just commit them from the same thread. > > > > > > Oh, and we may have to commit things in a very specific order > > > to guarantee the hw doesn't fall over, so yeah definitely per-crtc > > > thread is a no go. > > > > If I'm understanding the i915 code, this is only the case for modeset > > commits? I suppose we could achieve the same result by just deciding > > to pick the kthread of the first CRTC for modeset commits. I'm not > > really so much concerned about parallelism for modeset. > > I'm not entirely happy about the random differences between modesets > and other commits. Ideally we wouldn't need any. > > Anyways, even if we ignore modesets we still have the issue with > atomicity guarantees across multiple crtcs. So I think we still > don't want per-crtc threads, rather it should be thread for each > commit. I don't really see any other way to solve the priority inversion other than per-CRTC kthreads. 
I've been thinking about it a bit more, and my conclusion is:

(1) There isn't really any use for the N+1'th commit to start running
    before the kthread_work for the N'th commit completes, so I don't
    mind losing the unbound aspect of the workqueue approach

(2) For cases where there does need to be serialization between
    commits on different CRTCs, since there is a per-CRTC kthread, you
    could achieve this with locking

Since i915 isn't using the atomic helpers here, I suppose it is an
option for i915 to just continue doing what it is doing.

And I could ofc just stop using the atomic commit helper and do the
kthreads thing in msm. But my first preference would be that the
commit helper does generally the right thing.

BR,
-R
[PATCH 05/13] mm/frame-vector: Use FOLL_LONGTERM
This is used by media/videbuf2 for persistent dma mappings, not just for a single dma operation and then freed again, so needs FOLL_LONGTERM. Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to locking issues. Rework the code to pull the pup path out from the mmap_sem critical section as suggested by Jason. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Figa Cc: Mauro Carvalho Chehab Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org --- mm/frame_vector.c | 36 +++- 1 file changed, 11 insertions(+), 25 deletions(-) diff --git a/mm/frame_vector.c b/mm/frame_vector.c index 10f82d5643b6..39db520a51dc 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, struct vm_area_struct *vma; int ret = 0; int err; - int locked; if (nr_frames == 0) return 0; @@ -48,35 +47,22 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, start = untagged_addr(start); + ret = pin_user_pages_fast(start, nr_frames, + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, + (struct page **)(vec->ptrs)); + if (ret > 0) { + vec->got_ref = true; + vec->is_pfns = false; + goto out_unlocked; + } + mmap_read_lock(mm); - locked = 1; vma = find_vma_intersection(mm, start, start + 1); if (!vma) { ret = -EFAULT; goto out; } - /* -* While get_vaddr_frames() could be used for transient (kernel -* controlled lifetime) pinning of memory pages all current -* users establish long term (userspace controlled lifetime) -* page pinning. Treat get_vaddr_frames() like -* get_user_pages_longterm() and disallow it for filesystem-dax -* mappings. 
-*/ - if (vma_is_fsdax(vma)) { - ret = -EOPNOTSUPP; - goto out; - } - - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) { - vec->got_ref = true; - vec->is_pfns = false; - ret = pin_user_pages_locked(start, nr_frames, - gup_flags, (struct page **)(vec->ptrs), &locked); - goto out; - } - vec->got_ref = false; vec->is_pfns = true; do { @@ -101,8 +87,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, vma = find_vma_intersection(mm, start, start + 1); } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP)); out: - if (locked) - mmap_read_unlock(mm); + mmap_read_unlock(mm); +out_unlocked: if (!ret) ret = -EFAULT; if (ret > 0) -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 07/13] mm: close race in generic_access_phys
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
  ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
  cma regions. This means if we miss the unmap the pfn might contain
  pagecache or anon memory (well anything allocated with __GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
  iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
  ("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea. Fix this.

Since ioremap might need to manipulate pagetables too we need to drop
the pt lock and have a retry loop if we raced.

While at it, also add kerneldoc and improve the comment for the
vma_ops->access function. It's for accessing, not for moving the
memory from iomem to system memory, as the old comment seemed to
suggest.

References: 28b2ee20c7cb ("access_process_vm device memory infrastructure")
Cc: Jason Gunthorpe
Cc: Dan Williams
Cc: Kees Cook
Cc: Rik van Riel
Cc: Benjamin Herrenschmidt
Cc: Dave Airlie
Cc: Hugh Dickins
Cc: Andrew Morton
Cc: John Hubbard
Cc: Jérôme Glisse
Cc: Jan Kara
Cc: Dan Williams
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Signed-off-by: Daniel Vetter
---
 include/linux/mm.h |  3 ++-
 mm/memory.c        | 44 ++--
 2 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index acd60fbf1a5a..2a16631c1fda 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -566,7 +566,8 @@ struct vm_operations_struct {
 	vm_fault_t (*pfn_mkwrite)(struct vm_fault *vmf);
 
 	/* called by access_process_vm when get_user_pages() fails, typically
-	 * for use by special VMAs that can switch between memory and hardware
+	 * for use by special VMAs. See also generic_access_phys() for a generic
+	 * implementation useful for any iomem mapping.
 	 */
 	int (*access)(struct vm_area_struct *vma, unsigned long addr,
 		      void *buf, int len, int write);
diff --git a/mm/memory.c b/mm/memory.c
index fcfc4ca36eba..8d467e23b44e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4873,28 +4873,68 @@ int follow_phys(struct vm_area_struct *vma,
 	return ret;
 }
 
+/**
+ * generic_access_phys - generic implementation for iomem mmap access
+ * @vma: the vma to access
+ * @addr: userspace address, not relative offset within @vma
+ * @buf: buffer to read/write
+ * @len: length of transfer
+ * @write: set to FOLL_WRITE when writing, otherwise reading
+ *
+ * This is a generic implementation for &vm_operations_struct.access for an
+ * iomem mapping. This callback is used by access_process_vm() when the @vma is
+ * not page based.
+ */
 int generic_access_phys(struct vm_area_struct *vma, unsigned long addr,
 			void *buf, int len, int write)
 {
 	resource_size_t phys_addr;
 	unsigned long prot = 0;
 	void __iomem *maddr;
+	pte_t *ptep, pte;
+	spinlock_t *ptl;
 	int offset = addr & (PAGE_SIZE-1);
+	int ret = -EINVAL;
+
+	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
+		return -EINVAL;
+
+retry:
+	if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+		return -EINVAL;
+	pte = *ptep;
+	pte_unmap_unlock(ptep, ptl);
 
-	if (follow_phys(vma, addr, write, &prot, &phys_addr))
+	prot = pgprot_val(pte_pgprot(pte));
+	phys_addr = (resource_size_t)pte_pfn(pte) << PAGE_SHIFT;
+
+	if ((write & FOLL_WRITE) && !pte_write(pte))
 		return -EINVAL;
 
 	maddr = ioremap_prot(phys_addr, PAGE_ALIGN(len + offset), prot);
 	if (!maddr)
 		return -ENOMEM;
 
+	if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
+		goto out_unmap;
+
+	if (!pte_same(pte, *ptep)) {
+		pte_unmap_unlock(ptep, ptl);
+		iounmap(maddr);
+
+		goto retry;
+	}
+
 	if (write)
 		memcpy_toio(maddr + offset, buf, len);
 	else
 		memcpy_fromio(buf, maddr + offset, len);
+	ret = len;
+	pte_unmap_unlock(ptep, ptl);
+out_unmap:
 	iounmap(maddr);
 
-	return len;
+	return ret;
 }
 EXPORT_SYMBOL_GPL(generic_access_phys);
 #endif
-- 
2.28.0
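The retry loop described in the commit message (snapshot the pte, drop
the pt lock for ioremap, re-take it and check whether we raced) can be
modelled outside the kernel. What follows is only a sketch under loud
assumptions: struct slot, slot_lock()/slot_unlock() and slow_setup()
are made-up stand-ins for the pte, the page table lock and
ioremap_prot(), the lock is a single-threaded flag, and the "race" is
injected by a counter rather than by a second thread. What it tries to
show is the direction of the comparison: go around again when the entry
read under the lock differs from the snapshot, proceed when it is
unchanged.

```c
#include <assert.h>

/* Hypothetical stand-in for a pte protected by the page table lock. */
struct slot {
	int locked;		/* single-threaded model of the lock */
	unsigned long entry;	/* plays the role of the pte value */
};

/* Test hook: how many times the unlocked step should race with an update. */
static int races_to_inject;

static void slot_lock(struct slot *s)
{
	assert(!s->locked);
	s->locked = 1;
}

static void slot_unlock(struct slot *s)
{
	assert(s->locked);
	s->locked = 0;
}

/* Stand-in for ioremap_prot(): must run unlocked, so the entry may change. */
static void slow_setup(struct slot *s)
{
	assert(!s->locked);
	if (races_to_inject > 0) {
		races_to_inject--;
		s->entry++;	/* simulate a concurrent pte update */
	}
}

/*
 * Snapshot the entry under the lock, drop the lock for the slow step,
 * then re-take the lock and compare. Returns the number of attempts
 * and stores the entry value the access was finally performed with.
 */
static int access_with_retry(struct slot *s, unsigned long *used)
{
	int attempts = 0;
	unsigned long snap;

	for (;;) {
		attempts++;

		slot_lock(s);
		snap = s->entry;
		slot_unlock(s);

		slow_setup(s);

		slot_lock(s);
		if (snap == s->entry) {
			/* Entry is stable: do the access while still locked. */
			*used = snap;
			slot_unlock(s);
			return attempts;
		}
		/* Raced with an update: drop the lock and start over. */
		slot_unlock(s);
	}
}
```

With two injected races the model needs three attempts and ends up
using the entry value it saw on the last one; with no races it succeeds
on the first attempt.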
[PATCH 03/13] misc/habana: Stop using frame_vector helpers
All we need are a pages array, pin_user_pages_fast can give us that directly. Plus this avoids the entire raw pfn side of get_vaddr_frames. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Oded Gabbay Cc: Omer Shpigelman Cc: Ofir Bitton Cc: Tomer Tayar Cc: Moti Haimovski Cc: Daniel Vetter Cc: Greg Kroah-Hartman Cc: Pawel Piskorski --- drivers/misc/habanalabs/Kconfig | 1 - drivers/misc/habanalabs/common/habanalabs.h | 3 +- drivers/misc/habanalabs/common/memory.c | 51 + 3 files changed, 23 insertions(+), 32 deletions(-) diff --git a/drivers/misc/habanalabs/Kconfig b/drivers/misc/habanalabs/Kconfig index 8eb5d38c618e..2f04187f7167 100644 --- a/drivers/misc/habanalabs/Kconfig +++ b/drivers/misc/habanalabs/Kconfig @@ -6,7 +6,6 @@ config HABANA_AI tristate "HabanaAI accelerators (habanalabs)" depends on PCI && HAS_IOMEM - select FRAME_VECTOR select DMA_SHARED_BUFFER select GENERIC_ALLOCATOR select HWMON diff --git a/drivers/misc/habanalabs/common/habanalabs.h b/drivers/misc/habanalabs/common/habanalabs.h index edbd627b29d2..c1b3ad613b15 100644 --- a/drivers/misc/habanalabs/common/habanalabs.h +++ b/drivers/misc/habanalabs/common/habanalabs.h @@ -881,7 +881,8 @@ struct hl_ctx_mgr { struct hl_userptr { enum vm_type_t vm_type; /* must be first */ struct list_headjob_node; - struct frame_vector *vec; + struct page **pages; + unsigned intnpages; struct sg_table *sgt; enum dma_data_direction dir; struct list_headdebugfs_list; diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c index 5ff4688683fd..ef89cfa2f95a 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -1281,45 +1281,41 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size, return -EFAULT; } 
- userptr->vec = frame_vector_create(npages); - if (!userptr->vec) { + userptr->pages = kvmalloc_array(npages, sizeof(*userptr->pages), + GFP_KERNEL); + if (!userptr->pages) { dev_err(hdev->dev, "Failed to create frame vector\n"); return -ENOMEM; } - rc = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE, - userptr->vec); + rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, +userptr->pages); if (rc != npages) { dev_err(hdev->dev, "Failed to map host memory, user ptr probably wrong\n"); if (rc < 0) - goto destroy_framevec; + goto destroy_pages; + npages = rc; rc = -EFAULT; - goto put_framevec; - } - - if (frame_vector_to_pages(userptr->vec) < 0) { - dev_err(hdev->dev, - "Failed to translate frame vector to pages\n"); - rc = -EFAULT; - goto put_framevec; + goto put_pages; } + userptr->npages = npages; rc = sg_alloc_table_from_pages(userptr->sgt, - frame_vector_pages(userptr->vec), - npages, offset, size, GFP_ATOMIC); + userptr->pages, + npages, offset, size, GFP_ATOMIC); if (rc < 0) { dev_err(hdev->dev, "failed to create SG table from pages\n"); - goto put_framevec; + goto put_pages; } return 0; -put_framevec: - put_vaddr_frames(userptr->vec); -destroy_framevec: - frame_vector_destroy(userptr->vec); +put_pages: + unpin_user_pages(userptr->pages, npages); +destroy_pages: + kvfree(userptr->pages); return rc; } @@ -1405,7 +1401,7 @@ int hl_pin_host_memory(struct hl_device *hdev, u64 addr, u64 size, */ void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr) { - struct page **pages; + int i; hl_debugfs_remove_userptr(hdev, userptr); @@ -1414,15 +1410,10 @@ void hl_unpin_host_memory(struct hl_device *hdev, struct hl_userptr *userptr) userptr->sgt->nents, userptr->dir); - pages = frame_vector_pages(userptr->vec); - if (!IS_ERR(pages)) { - int i; - - for (i = 0; i < frame_vector_count(userptr->v
[PATCH 13/13] vfio/type1: Mark follow_pfn as unsafe
The code seems to stuff these pfns into iommu pts (or something like that, I didn't follow), but there's no mmu_notifier to ensure that access is synchronized with pte updates. Hence mark these as unsafe. This means that with CONFIG_STRICT_FOLLOW_PFN, these will be rejected. Real fix is to wire up an mmu_notifier ... somehow. Probably means any invalidate is a fatal fault for this vfio device, but then this shouldn't ever happen if userspace is reasonable. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Kees Cook Cc: Dan Williams Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Alex Williamson Cc: Cornelia Huck Cc: k...@vger.kernel.org --- drivers/vfio/vfio_iommu_type1.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 5fbf0c1f7433..a4d53f3d0a35 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -421,7 +421,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm, { int ret; - ret = follow_pfn(vma, vaddr, pfn); + ret = unsafe_follow_pfn(vma, vaddr, pfn); if (ret) { bool unlocked = false; @@ -435,7 +435,7 @@ static int follow_fault_pfn(struct vm_area_struct *vma, struct mm_struct *mm, if (ret) return ret; - ret = follow_pfn(vma, vaddr, pfn); + ret = unsafe_follow_pfn(vma, vaddr, pfn); } return ret; -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
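As a rough illustration of what "marked as unsafe" means for a caller
like this one, the behaviour can be modelled in plain userspace C. This
is a sketch, not the kernel code: the model_* names are made up, the
runtime flag strict_follow_pfn stands in for what is really a
build-time CONFIG_STRICT_FOLLOW_PFN option, and the warning and taint
are reduced to counters.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical runtime stand-in for the CONFIG_STRICT_FOLLOW_PFN option. */
static bool strict_follow_pfn;

static bool warned;	/* models WARN_ONCE() having fired */
static int warnings;	/* how many warnings were actually emitted */
static bool tainted;	/* models the kernel taint being set */

/* Stand-in for follow_pfn(): pretend every address maps to addr >> 12. */
static int model_follow_pfn(unsigned long addr, unsigned long *pfn)
{
	*pfn = addr >> 12;
	return 0;
}

/*
 * Strict mode refuses the lookup outright. Otherwise warn once, taint,
 * and fall through to the normal lookup.
 */
static int model_unsafe_follow_pfn(unsigned long addr, unsigned long *pfn)
{
	if (strict_follow_pfn)
		return -EINVAL;

	if (!warned) {
		warned = true;
		warnings++;
	}
	tainted = true;

	return model_follow_pfn(addr, pfn);
}
```

In this model a caller keeps working in the default configuration, at
the price of one warning and a taint, and starts failing with -EINVAL
as soon as strict mode is switched on.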
[PATCH 04/13] misc/habana: Use FOLL_LONGTERM for userptr
These are persistent, not just for the duration of a dma operation. Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Oded Gabbay Cc: Omer Shpigelman Cc: Ofir Bitton Cc: Tomer Tayar Cc: Moti Haimovski Cc: Daniel Vetter Cc: Greg Kroah-Hartman Cc: Pawel Piskorski --- drivers/misc/habanalabs/common/memory.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanalabs/common/memory.c index ef89cfa2f95a..94bef8faa82a 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -1288,7 +1288,8 @@ static int get_user_memory(struct hl_device *hdev, u64 addr, u64 size, return -ENOMEM; } - rc = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE, + rc = pin_user_pages_fast(start, npages, +FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, userptr->pages); if (rc != npages) { -- 2.28.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 11/13] mm: add unsafe_follow_pfn
Way back it was a reasonable assumption that iomem mappings never
change the pfn range they point at. But this has changed:

- gpu drivers dynamically manage their memory nowadays, invalidating
  ptes with unmap_mapping_range when buffers get moved

- contiguous dma allocations have moved from dedicated carveouts to
  cma regions. This means if we miss the unmap the pfn might contain
  pagecache or anon memory (well anything allocated with __GFP_MOVABLE)

- even /dev/mem now invalidates mappings when the kernel requests that
  iomem region when CONFIG_IO_STRICT_DEVMEM is set, see 3234ac664a87
  ("/dev/mem: Revoke mappings when a driver claims the region")

Accessing pfns obtained from ptes without holding all the locks is
therefore no longer a good idea.

Unfortunately there are some users where this is not fixable (like v4l
userptr of iomem mappings) or involves a pile of work (vfio type1
iommu). For now annotate these as unsafe and splat appropriately.

This patch adds an unsafe_follow_pfn, which later patches will then
roll out to all appropriate places.

Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Kees Cook Cc: Dan Williams Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: k...@vger.kernel.org --- include/linux/mm.h | 2 ++ mm/memory.c| 32 +++- mm/nommu.c | 17 + security/Kconfig | 13 + 4 files changed, 63 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 2a16631c1fda..ec8c90928fc9 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1653,6 +1653,8 @@ int follow_pte_pmd(struct mm_struct *mm, unsigned long address, pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp); int follow_pfn(struct vm_area_struct *vma, unsigned long address, unsigned long *pfn); +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, + unsigned long *pfn); int follow_phys(struct vm_area_struct *vma, unsigned long address, unsigned int flags, unsigned long *prot, resource_size_t *phys); int generic_access_phys(struct vm_area_struct *vma, unsigned long addr, diff --git a/mm/memory.c b/mm/memory.c index 8d467e23b44e..8db7ad1c261c 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4821,7 +4821,12 @@ EXPORT_SYMBOL(follow_pte_pmd); * @address: user virtual address * @pfn: location to store found PFN * - * Only IO mappings and raw PFN mappings are allowed. + * Only IO mappings and raw PFN mappings are allowed. Note that callers must + * ensure coherency with pte updates by using a &mmu_notifier to follow updates. + * If this is not feasible, or the access to the @pfn is only very short term, + * use follow_pte_pmd() instead and hold the pagetable lock for the duration of + * the access instead. Any caller not following these requirements must use + * unsafe_follow_pfn() instead. * * Return: zero and the pfn at @pfn on success, -ve otherwise. 
*/ @@ -4844,6 +4849,31 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address, } EXPORT_SYMBOL(follow_pfn); +/** + * unsafe_follow_pfn - look up PFN at a user virtual address + * @vma: memory mapping + * @address: user virtual address + * @pfn: location to store found PFN + * + * Only IO mappings and raw PFN mappings are allowed. + * + * Returns zero and the pfn at @pfn on success, -ve otherwise. + */ +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, + unsigned long *pfn) +{ +#ifdef CONFIG_STRICT_FOLLOW_PFN + pr_info("unsafe follow_pfn usage rejected, see CONFIG_STRICT_FOLLOW_PFN\n"); + return -EINVAL; +#else + WARN_ONCE(1, "unsafe follow_pfn usage\n"); + add_taint(TAINT_USER, LOCKDEP_STILL_OK); + + return follow_pfn(vma, address, pfn); +#endif +} +EXPORT_SYMBOL(unsafe_follow_pfn); + #ifdef CONFIG_HAVE_IOREMAP_PROT int follow_phys(struct vm_area_struct *vma, unsigned long address, unsigned int flags, diff --git a/mm/nommu.c b/mm/nommu.c index 75a327149af1..3db2910f0d64 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -132,6 +132,23 @@ int follow_pfn(struct vm_area_struct *vma, unsigned long address, } EXPORT_SYMBOL(follow_pfn); +/** + * unsafe_follow_pfn - look up PFN at a user virtual address + * @vma: memory mapping + * @address: user virtual address + * @pfn: location to store found PFN + * + * Only IO mappings and raw PFN mappings are allowed. + * + * Returns zero and the pfn at @pfn on success, -ve otherwise. + */ +int unsafe_follow_pfn(struct vm_area_struct *vma, unsigned long address, + unsigned long *pfn) +{ + return follow_pfn(vma, address, pfn); +} +EXPORT_SYMBOL(unsafe_follow_pfn); + LIST_HEAD(vmap_area_list); void vfree(const void *addr) diff --git a/security/Kconfig b/security/Kconfig inde
[PATCH 10/13] PCI: revoke mappings like devmem
Since 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims
the region") /dev/kmem zaps ptes when the kernel requests exclusive
access to an iomem region. And with CONFIG_IO_STRICT_DEVMEM, this is
the default for all driver uses.

Except there are two more ways to access pci bars: sysfs and proc mmap
support. Let's plug that hole.

For revoke_devmem() to work we need to link our vma into the same
address_space, with consistent vma->vm_pgoff. ->pgoff is already
adjusted, because that's how (io_)remap_pfn_range works, but for the
mapping we need to adjust vma->vm_file->f_mapping. Usually that's done
at ->open time, but that's a bit tricky here with all the entry points
and arch code. So instead create a fake file and adjust vma->vm_file.

Note this only works for ARCH_GENERIC_PCI_MMAP_RESOURCE. But that
seems to be a subset of the architectures that support STRICT_DEVMEM,
so we should be good.

The only difference in access checks left is that sysfs pci mmap does
not check for CAP_SYS_RAWIO. But I think that makes some sense
compared to /dev/mem and proc, where one file gives you access to
everything and no ownership applies.

Signed-off-by: Daniel Vetter
Cc: Jason Gunthorpe
Cc: Kees Cook
Cc: Dan Williams
Cc: Andrew Morton
Cc: John Hubbard
Cc: Jérôme Glisse
Cc: Jan Kara
Cc: Dan Williams
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: Bjorn Helgaas
Cc: linux-...@vger.kernel.org
---
 drivers/char/mem.c     | 16 +++-
 drivers/pci/mmap.c     |  3 +++
 include/linux/ioport.h |  2 ++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index abd4ffdc8cde..5e58a326d4ee 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -810,6 +810,7 @@ static loff_t memory_lseek(struct file *file, loff_t offset, int orig)
 }
 
 static struct inode *devmem_inode;
+static struct vfsmount *devmem_vfs_mount;
 
 #ifdef CONFIG_IO_STRICT_DEVMEM
 void revoke_devmem(struct resource *res)
@@ -843,6 +844,20 @@ void revoke_devmem(struct resource *res)
 
 	unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 1);
 }
+
+struct file *devmem_getfile(void)
+{
+	struct file *file;
+
+	file = alloc_file_pseudo(devmem_inode, devmem_vfs_mount, "devmem",
+				 O_RDWR, &kmem_fops);
+	if (IS_ERR(file))
+		return NULL;
+
+	file->f_mapping = devmem_inode->i_mapping;
+
+	return file;
+}
 #endif
 
 static int open_port(struct inode *inode, struct file *filp)
@@ -1010,7 +1025,6 @@ static struct file_system_type devmem_fs_type = {
 
 static int devmem_init_inode(void)
 {
-	static struct vfsmount *devmem_vfs_mount;
 	static int devmem_fs_cnt;
 	struct inode *inode;
 	int rc;
diff --git a/drivers/pci/mmap.c b/drivers/pci/mmap.c
index b8c9011987f4..63786cc9c746 100644
--- a/drivers/pci/mmap.c
+++ b/drivers/pci/mmap.c
@@ -7,6 +7,7 @@
  * Author: David Woodhouse
  */
 
+#include
 #include
 #include
 #include
@@ -64,6 +65,8 @@ int pci_mmap_resource_range(struct pci_dev *pdev, int bar,
 
 	vma->vm_pgoff += (pci_resource_start(pdev, bar) >> PAGE_SHIFT);
 	vma->vm_ops = &pci_phys_vm_ops;
+	fput(vma->vm_file);
+	vma->vm_file = devmem_getfile();
 
 	return io_remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff,
 				  vma->vm_end - vma->vm_start,
diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 6c2b06fe8beb..83238cba19fe 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -304,8 +304,10 @@ struct resource *request_free_mem_region(struct resource *base,
 
 #ifdef CONFIG_IO_STRICT_DEVMEM
 void revoke_devmem(struct resource *res);
+struct file *devmem_getfile(void);
 #else
 static inline void revoke_devmem(struct resource *res) { };
+static inline struct file *devmem_getfile(void) { return NULL; };
 #endif
 
 #endif /* __ASSEMBLY__ */
-- 
2.28.0
[PATCH 06/13] media: videobuf2: Move frame_vector into media subsystem
It's the only user. This also garbage collects the CONFIG_FRAME_VECTOR symbol from all over the tree (well just one place, somehow omap media driver still had this in its Kconfig, despite not using it). Signed-off-by: Daniel Vetter Cc: Jason Gunthorpe Cc: Pawel Osciak Cc: Marek Szyprowski Cc: Kyungmin Park Cc: Tomasz Figa Cc: Mauro Carvalho Chehab Cc: Andrew Morton Cc: John Hubbard Cc: Jérôme Glisse Cc: Jan Kara Cc: Dan Williams Cc: linux...@kvack.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-samsung-...@vger.kernel.org Cc: linux-me...@vger.kernel.org Cc: Daniel Vetter --- drivers/media/common/videobuf2/Kconfig| 1 - drivers/media/common/videobuf2/Makefile | 1 + .../media/common/videobuf2}/frame_vector.c| 2 + drivers/media/platform/omap/Kconfig | 1 - include/linux/mm.h| 42 --- include/media/videobuf2-core.h| 42 +++ mm/Kconfig| 3 -- mm/Makefile | 1 - 8 files changed, 45 insertions(+), 48 deletions(-) rename {mm => drivers/media/common/videobuf2}/frame_vector.c (99%) diff --git a/drivers/media/common/videobuf2/Kconfig b/drivers/media/common/videobuf2/Kconfig index edbc99ebba87..d2223a12c95f 100644 --- a/drivers/media/common/videobuf2/Kconfig +++ b/drivers/media/common/videobuf2/Kconfig @@ -9,7 +9,6 @@ config VIDEOBUF2_V4L2 config VIDEOBUF2_MEMOPS tristate - select FRAME_VECTOR config VIDEOBUF2_DMA_CONTIG tristate diff --git a/drivers/media/common/videobuf2/Makefile b/drivers/media/common/videobuf2/Makefile index 77bebe8b202f..54306f8d096c 100644 --- a/drivers/media/common/videobuf2/Makefile +++ b/drivers/media/common/videobuf2/Makefile @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 videobuf2-common-objs := videobuf2-core.o +videobuf2-common-objs += frame_vector.o ifeq ($(CONFIG_TRACEPOINTS),y) videobuf2-common-objs += vb2-trace.o diff --git a/mm/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c similarity index 99% rename from mm/frame_vector.c rename to drivers/media/common/videobuf2/frame_vector.c index 39db520a51dc..b95f4f371681 100644 --- 
a/mm/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -8,6 +8,8 @@ #include #include +#include + /** * get_vaddr_frames() - map virtual addresses to pfns * @start: starting user address diff --git a/drivers/media/platform/omap/Kconfig b/drivers/media/platform/omap/Kconfig index f73b5893220d..de16de46c0f4 100644 --- a/drivers/media/platform/omap/Kconfig +++ b/drivers/media/platform/omap/Kconfig @@ -12,6 +12,5 @@ config VIDEO_OMAP2_VOUT depends on VIDEO_V4L2 select VIDEOBUF2_DMA_CONTIG select OMAP2_VRFB if ARCH_OMAP2 || ARCH_OMAP3 - select FRAME_VECTOR help V4L2 Display driver support for OMAP2/3 based boards. diff --git a/include/linux/mm.h b/include/linux/mm.h index 16b799a0522c..acd60fbf1a5a 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1743,48 +1743,6 @@ int account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc); int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc, struct task_struct *task, bool bypass_rlim); -/* Container for pinned pfns / pages */ -struct frame_vector { - unsigned int nr_allocated; /* Number of frames we have space for */ - unsigned int nr_frames; /* Number of frames stored in ptrs array */ - bool got_ref; /* Did we pin pages by getting page ref? */ - bool is_pfns; /* Does array contain pages or pfns? */ - void *ptrs[]; /* Array of pinned pfns / pages. 
Use -* pfns_vector_pages() or pfns_vector_pfns() -* for access */ -}; - -struct frame_vector *frame_vector_create(unsigned int nr_frames); -void frame_vector_destroy(struct frame_vector *vec); -int get_vaddr_frames(unsigned long start, unsigned int nr_pfns, -unsigned int gup_flags, struct frame_vector *vec); -void put_vaddr_frames(struct frame_vector *vec); -int frame_vector_to_pages(struct frame_vector *vec); -void frame_vector_to_pfns(struct frame_vector *vec); - -static inline unsigned int frame_vector_count(struct frame_vector *vec) -{ - return vec->nr_frames; -} - -static inline struct page **frame_vector_pages(struct frame_vector *vec) -{ - if (vec->is_pfns) { - int err = frame_vector_to_pages(vec); - - if (err) - return ERR_PTR(err); - } - return (struct page **)(vec->ptrs); -} - -static inline unsigned long *frame_vector_pfns(struct frame_vector *vec) -{ - if (!vec->is_pfns) - frame_vector_to_pfns(vec); - return (unsigned long *)(vec->ptrs); -} - struct kvec; int get_kernel_pages(const struct kvec *iov, int nr_pages, i
[PATCH 01/13] drm/exynos: Stop using frame_vector helpers
All we need is a pages array, which pin_user_pages_fast can give us
directly. Plus this avoids the entire raw pfn side of get_vaddr_frames.

Signed-off-by: Daniel Vetter
Cc: Jason Gunthorpe
Cc: Inki Dae
Cc: Joonyoung Shim
Cc: Seung-Woo Kim
Cc: Kyungmin Park
Cc: Kukjin Kim
Cc: Krzysztof Kozlowski
Cc: Andrew Morton
Cc: John Hubbard
Cc: Jérôme Glisse
Cc: Jan Kara
Cc: Dan Williams
Cc: linux...@kvack.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
---
 drivers/gpu/drm/exynos/Kconfig          |  1 -
 drivers/gpu/drm/exynos/exynos_drm_g2d.c | 48 -
 2 files changed, 22 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
index 6417f374b923..43257ef3c09d 100644
--- a/drivers/gpu/drm/exynos/Kconfig
+++ b/drivers/gpu/drm/exynos/Kconfig
@@ -88,7 +88,6 @@ comment "Sub-drivers"
 config DRM_EXYNOS_G2D
 	bool "G2D"
 	depends on VIDEO_SAMSUNG_S5P_G2D=n || COMPILE_TEST
-	select FRAME_VECTOR
 	help
 	  Choose this option if you want to use Exynos G2D for DRM.
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index 967a5cdc120e..c83f6faac9de 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -205,7 +205,8 @@ struct g2d_cmdlist_userptr {
 	dma_addr_t		dma_addr;
 	unsigned long		userptr;
 	unsigned long		size;
-	struct frame_vector	*vec;
+	struct page		**pages;
+	unsigned int		npages;
 	struct sg_table		*sgt;
 	atomic_t		refcount;
 	bool			in_pool;
@@ -378,7 +379,7 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
 					bool force)
 {
 	struct g2d_cmdlist_userptr *g2d_userptr = obj;
-	struct page **pages;
+	int i;
 
 	if (!obj)
 		return;
@@ -398,15 +399,11 @@ static void g2d_userptr_put_dma_addr(struct g2d_data *g2d,
 	dma_unmap_sgtable(to_dma_dev(g2d->drm_dev), g2d_userptr->sgt,
 			  DMA_BIDIRECTIONAL, 0);
 
-	pages = frame_vector_pages(g2d_userptr->vec);
-	if (!IS_ERR(pages)) {
-		int i;
+	for (i = 0; i < g2d_userptr->npages; i++)
+		set_page_dirty_lock(g2d_userptr->pages[i]);
 
-		for (i = 0; i < frame_vector_count(g2d_userptr->vec); i++)
-			set_page_dirty_lock(pages[i]);
-	}
-	put_vaddr_frames(g2d_userptr->vec);
-	frame_vector_destroy(g2d_userptr->vec);
+	unpin_user_pages(g2d_userptr->pages, g2d_userptr->npages);
+	kvfree(g2d_userptr->pages);
 
 	if (!g2d_userptr->out_of_list)
 		list_del_init(&g2d_userptr->list);
@@ -474,35 +471,34 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
 	offset = userptr & ~PAGE_MASK;
 	end = PAGE_ALIGN(userptr + size);
 	npages = (end - start) >> PAGE_SHIFT;
-	g2d_userptr->vec = frame_vector_create(npages);
-	if (!g2d_userptr->vec) {
+	g2d_userptr->pages = kvmalloc_array(npages, sizeof(*g2d_userptr->pages),
+					    GFP_KERNEL);
+	if (!g2d_userptr->pages) {
 		ret = -ENOMEM;
 		goto err_free;
 	}
 
-	ret = get_vaddr_frames(start, npages, FOLL_FORCE | FOLL_WRITE,
-		g2d_userptr->vec);
+	ret = pin_user_pages_fast(start, npages, FOLL_FORCE | FOLL_WRITE,
+				  g2d_userptr->pages);
 	if (ret != npages) {
 		DRM_DEV_ERROR(g2d->dev,
 			      "failed to get user pages from userptr.\n");
 		if (ret < 0)
-			goto err_destroy_framevec;
-		ret = -EFAULT;
-		goto err_put_framevec;
-	}
-	if (frame_vector_to_pages(g2d_userptr->vec) < 0) {
+			goto err_destroy_pages;
+		npages = ret;
 		ret = -EFAULT;
-		goto err_put_framevec;
+		goto err_unpin_pages;
 	}
+	g2d_userptr->npages = npages;
 
 	sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
 	if (!sgt) {
 		ret = -ENOMEM;
-		goto err_put_framevec;
+		goto err_unpin_pages;
 	}
 
 	ret = sg_alloc_table_from_pages(sgt,
-					frame_vector_pages(g2d_userptr->vec),
+					g2d_userptr->pages,
 					npages, offset, size, GFP_KERNEL);
 	if (ret < 0) {
 		DRM_DEV_ERROR(g2d->dev, "failed to get sgt from pages.\n");
@@ -538,11 +534,11 @@ static dma_addr_t *g2d_userptr_get_dma_addr(struct g2d_data *g2d,
 err_free_sgt:
 	kfree(sgt);
 
-err_put_framevec:
-	put_
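For readers following the page-count arithmetic in the g2d hunk above (offset, end, npages), here is a minimal userspace sketch of the same computation. It assumes 4 KiB pages and that `start` is the user pointer rounded down to a page boundary, which the hunk does not show; the `SKETCH_*` macros, `struct page_span` and `page_span_of()` are hypothetical names for illustration, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes these from the architecture. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical helper type, only for this sketch. */
struct page_span {
	uintptr_t start;   /* userptr rounded down to a page boundary */
	uintptr_t offset;  /* byte offset of userptr inside the first page */
	size_t npages;     /* pages needed to cover [userptr, userptr + size) */
};

static struct page_span page_span_of(uintptr_t userptr, size_t size)
{
	struct page_span s;
	uintptr_t end;

	assert(size > 0);

	s.start = userptr & SKETCH_PAGE_MASK;
	s.offset = userptr & ~SKETCH_PAGE_MASK;
	/* Same rounding as PAGE_ALIGN(userptr + size) in the patch. */
	end = (userptr + size + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
	s.npages = (end - s.start) >> SKETCH_PAGE_SHIFT;

	return s;
}
```

A 32-byte buffer that starts 16 bytes before a page boundary straddles two pages, so the pages array passed to pin_user_pages_fast() would need two slots even though the buffer is far smaller than one page.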
[PATCH 00/13] follow_pfn and other iomap races
Hi all,

This developed from a discussion with Jason, starting with some patches
touching get_vaddr_frames that I typed up.

The problem is that way back, VM_IO | VM_PFNMAP mappings were pretty
static, so just following the ptes to derive a pfn and then using that
somewhere else was OK. But we're no longer in such a world: there are tons
of little races and some fundamental problems.

This series is an attempt to at least scope the problem. It covers all the
issues I've found with quite some code reading all over the tree:

- the first part tries to move mm/frame_vector.c away, since it's
  fundamentally an unsafe thing

- two patches close follow_pfn races by holding pt locks

- two pci patches where I spotted inconsistencies between the 3 different
  ways userspace can map pci bars

- and finally some patches to mark up the remaining issues

No testing beyond "it compiles". This is very much an RFC to figure out
whether this makes sense, whether it's a real thing, and how to fix this
up properly.

Cheers, Daniel

Daniel Vetter (13):
  drm/exynos: Stop using frame_vector helpers
  drm/exynos: Use FOLL_LONGTERM for g2d cmdlists
  misc/habana: Stop using frame_vector helpers
  misc/habana: Use FOLL_LONGTERM for userptr
  mm/frame-vector: Use FOLL_LONGTERM
  media: videobuf2: Move frame_vector into media subsystem
  mm: close race in generic_access_phys
  s390/pci: Remove races against pte updates
  PCI: obey iomem restrictions for procfs mmap
  PCI: revoke mappings like devmem
  mm: add unsafe_follow_pfn
  media/videbuf1|2: Mark follow_pfn usage as unsafe
  vfio/type1: Mark follow_pfn as unsafe

 arch/s390/pci/pci_mmio.c                      | 98 +++
 drivers/char/mem.c                            | 16 ++-
 drivers/gpu/drm/exynos/Kconfig                |  1 -
 drivers/gpu/drm/exynos/exynos_drm_g2d.c       | 49 +-
 drivers/media/common/videobuf2/Kconfig        |  1 -
 drivers/media/common/videobuf2/Makefile       |  1 +
 .../media/common/videobuf2}/frame_vector.c    | 40 +++-
 drivers/media/platform/omap/Kconfig           |  1 -
 drivers/media/v4l2-core/videobuf-dma-contig.c |  2 +-
 drivers/misc/habanalabs/Kconfig               |  1 -
 drivers/misc/habanalabs/common/habanalabs.h   |  3 +-
 drivers/misc/habanalabs/common/memory.c       | 52 +-
 drivers/pci/mmap.c                            |  3 +
 drivers/pci/proc.c                            |  5 +
 drivers/vfio/vfio_iommu_type1.c               |  4 +-
 include/linux/ioport.h                        |  2 +
 include/linux/mm.h                            | 47 +
 include/media/videobuf2-core.h                | 42
 mm/Kconfig                                    |  3 -
 mm/Makefile                                   |  1 -
 mm/memory.c                                   | 76 +-
 mm/nommu.c                                    | 17
 security/Kconfig                              | 13 +++
 23 files changed, 296 insertions(+), 182 deletions(-)
 rename {mm => drivers/media/common/videobuf2}/frame_vector.c (90%)

-- 
2.28.0
[PATCH 14/14] drm/amd/pm: Replace one-element array with flexible-array in struct ATOM_Vega10_GFXCLK_Dependency_Table
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Use a flexible-array member in struct ATOM_Vega10_GFXCLK_Dependency_Table
instead of a one-element array.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7d61dd.o8jxxi5c6p9fob%2fd%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
index c934e9612c1b..a6968009acc4 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_pptable.h
@@ -163,7 +163,7 @@ typedef struct _ATOM_Vega10_MCLK_Dependency_Record {
 typedef struct _ATOM_Vega10_GFXCLK_Dependency_Table {
     UCHAR ucRevId;
     UCHAR ucNumEntries;                                 /* Number of entries. */
-    ATOM_Vega10_GFXCLK_Dependency_Record entries[1];    /* Dynamically allocate entries. */
+    ATOM_Vega10_GFXCLK_Dependency_Record entries[];     /* Dynamically allocate entries. */
 } ATOM_Vega10_GFXCLK_Dependency_Table;
 
 typedef struct _ATOM_Vega10_MCLK_Dependency_Table {
-- 
2.27.0
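As a rough illustration of what the one-element to flexible-array conversion changes, the following userspace sketch declares both styles side by side. The `dep_*` types and `dep_table_alloc()` are hypothetical stand-ins, not the AMD structures, and `calloc()` stands in for `kzalloc()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type, only for this sketch. */
struct dep_record {
	unsigned int clk;
	unsigned char vdd_ind;
};

/* Old style: sizeof() already counts one record, whether or not it is used. */
struct dep_table_old {
	unsigned char rev_id;
	unsigned char num_entries;
	struct dep_record entries[1];
};

/* New style: sizeof() covers only the header part of the table. */
struct dep_table {
	unsigned char rev_id;
	unsigned char num_entries;
	struct dep_record entries[];  /* flexible array member */
};

/* Allocate a zeroed table with room for n records. */
static struct dep_table *dep_table_alloc(unsigned char n)
{
	struct dep_table *t;

	t = calloc(1, sizeof(*t) + (size_t)n * sizeof(t->entries[0]));
	if (!t)
		return NULL;

	t->num_entries = n;
	return t;
}
```

With the flexible-array form, the allocation size is simply header plus n records, so there is no stray "minus one" adjustment to remember when sizing the table.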
[PATCH 13/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_pcie_table
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_ppt_v1_pcie_table, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: https://lore.kernel.org/lkml/5f7db0bc.7xivn4k83f7xw0ug%25...@intel.com/ Signed-off-by: Gustavo A. R. Silva --- .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h| 2 +- .../powerplay/hwmgr/process_pptables_v1_0.c | 22 --- .../powerplay/hwmgr/vega10_processpptables.c | 10 +++-- 3 files changed, 13 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h index e11298cdeb30..729615aff126 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h @@ -103,7 +103,7 @@ typedef struct phm_ppt_v1_pcie_record phm_ppt_v1_pcie_record; struct phm_ppt_v1_pcie_table { uint32_t count;/* Number of entries. */ - phm_ppt_v1_pcie_record entries[1]; /* Dynamically allocate count entries. */ + phm_ppt_v1_pcie_record entries[]; /* Dynamically allocate count entries. 
*/ }; typedef struct phm_ppt_v1_pcie_table phm_ppt_v1_pcie_table; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index 426655b9c678..4fa58614e26a 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -478,7 +478,7 @@ static int get_pcie_table( PPTable_Generic_SubTable_Header const *ptable ) { - uint32_t table_size, i, pcie_count; + uint32_t i, pcie_count; phm_ppt_v1_pcie_table *pcie_table; struct phm_ppt_v1_information *pp_table_information = (struct phm_ppt_v1_information *)(hwmgr->pptable); @@ -491,12 +491,10 @@ static int get_pcie_table( PP_ASSERT_WITH_CODE((atom_pcie_table->ucNumEntries != 0), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_pcie_record) * atom_pcie_table->ucNumEntries; - - pcie_table = kzalloc(table_size, GFP_KERNEL); - - if (pcie_table == NULL) + pcie_table = kzalloc(struct_size(pcie_table, entries, +atom_pcie_table->ucNumEntries), +GFP_KERNEL); + if (!pcie_table) return -ENOMEM; /* @@ -530,12 +528,10 @@ static int get_pcie_table( PP_ASSERT_WITH_CODE((atom_pcie_table->ucNumEntries != 0), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_pcie_record) * atom_pcie_table->ucNumEntries; - - pcie_table = kzalloc(table_size, GFP_KERNEL); - - if (pcie_table == NULL) + pcie_table = kzalloc(struct_size(pcie_table, entries, +atom_pcie_table->ucNumEntries), +GFP_KERNEL); + if (!pcie_table) return -ENOMEM; /* diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c index 3d7f915381c8..535404de78a2 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c @@ -784,7 +784,7 @@ static int get_pcie_table(struct 
pp_hwmgr *hwmgr, struct phm_ppt_v1_pcie_table **vega10_pcie_table, const Vega10_PPTable_Generic_SubTable_Header *table) { - uint32_t table_size, i, pcie_count; + uint32_t i, pcie_count; struct phm_ppt_v1_pcie_table *pcie_table; struct phm_ppt_v2_information *table_info = (struct phm_ppt_v2_information *)(hwmgr->pptable); @@ -795,12 +795,8 @@ static int get_pcie_table(struct pp_hwmgr *hwmgr, "Invalid PowerPlay Table!", return 0); - table_size = sizeof(uint32_t) + - sizeof(struc
[PATCH 11/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_mm_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_ppt_v1_mm_clock_voltage_dependency_table, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: https://lore.kernel.org/lkml/5f7d61e2.qitvtyg2pvog8bb0%25...@intel.com/ Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h| 2 +- .../amd/pm/powerplay/hwmgr/process_pptables_v1_0.c| 11 --- .../amd/pm/powerplay/hwmgr/vega10_processpptables.c | 9 +++-- 3 files changed, 8 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h index c167083b0872..923cc04e405a 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h @@ -71,7 +71,7 @@ typedef struct phm_ppt_v1_mm_clock_voltage_dependency_record phm_ppt_v1_mm_clock struct phm_ppt_v1_mm_clock_voltage_dependency_table { uint32_t count; /* Number of entries. */ - phm_ppt_v1_mm_clock_voltage_dependency_record entries[1]; /* Dynamically allocate count entries. */ + phm_ppt_v1_mm_clock_voltage_dependency_record entries[]; /* Dynamically allocate count entries. 
*/ }; typedef struct phm_ppt_v1_mm_clock_voltage_dependency_table phm_ppt_v1_mm_clock_voltage_dependency_table; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index 0725531fbfff..5d8016cd1986 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -678,19 +678,16 @@ static int get_mm_clock_voltage_table( const ATOM_Tonga_MM_Dependency_Table * mm_dependency_table ) { - uint32_t table_size, i; + uint32_t i; const ATOM_Tonga_MM_Dependency_Record *mm_dependency_record; phm_ppt_v1_mm_clock_voltage_dependency_table *mm_table; phm_ppt_v1_mm_clock_voltage_dependency_record *mm_table_record; PP_ASSERT_WITH_CODE((0 != mm_dependency_table->ucNumEntries), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_mm_clock_voltage_dependency_record) - * mm_dependency_table->ucNumEntries; - mm_table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == mm_table) + mm_table = kzalloc(struct_size(mm_table, entries, mm_dependency_table->ucNumEntries), + GFP_KERNEL); + if (!mm_table) return -ENOMEM; mm_table->count = mm_dependency_table->ucNumEntries; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c index 787b23fa25e7..4f6a73a2cf28 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c @@ -344,18 +344,15 @@ static int get_mm_clock_voltage_table( phm_ppt_v1_mm_clock_voltage_dependency_table **vega10_mm_table, const ATOM_Vega10_MM_Dependency_Table *mm_dependency_table) { - uint32_t table_size, i; + uint32_t i; const ATOM_Vega10_MM_Dependency_Record *mm_dependency_record; phm_ppt_v1_mm_clock_voltage_dependency_table *mm_table; PP_ASSERT_WITH_CODE((mm_dependency_table->ucNumEntries != 0), 
"Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_mm_clock_voltage_dependency_record) * - mm_dependency_table->ucNumEntries; - mm_table = kzalloc(table_size, GFP_KERNEL); - + mm_table = kzalloc(struct_size(mm_table, entries, mm_dependency_table->ucNumEntries), + GFP_KERNEL); if (!mm_table) return -ENOMEM; -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 3/7] dt-bindings: display: mxsfb: Add a bus-width endpoint property
On Wed, 07 Oct 2020 04:24:34 +0300, Laurent Pinchart wrote:
> When the PCB routes the display data signals in an unconventional way,
> the output bus width may differ from the bus width of the connected
> panel or encoder. For instance, when a 18-bit RGB panel has its R[5:0],
> G[5:0] and B[5:0] signals connected to LCD_DATA[7:2], LCD_DATA[15:10]
> and LCD_DATA[23:18], the output bus width is 24 instead of 18 when the
> signals are routed to LCD_DATA[5:0], LCD_DATA[11:6] and LCD_DATA[17:12].
>
> Add a bus-width property to describe this data routing.
>
> Signed-off-by: Laurent Pinchart
> ---
> Changes since v1:
>
> - Fix property name in binding
> ---
>  .../devicetree/bindings/display/fsl,lcdif.yaml | 12 
>  1 file changed, 12 insertions(+)
>

Reviewed-by: Rob Herring
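The two routings described in the quoted commit message can be pictured as two ways of placing the same 6-bit colour components on the LCD_DATA lines. The sketch below is only an illustration of that bit placement, assuming R, G and B map to the lane groups in the order the message lists them; the function names are hypothetical and nothing here is taken from the mxsfb driver.

```c
#include <assert.h>
#include <stdint.h>

/* 18-bit panel wired to LCD_DATA[7:2], [15:10], [23:18]: bus width 24. */
static uint32_t rgb666_on_24bit_bus(uint8_t r, uint8_t g, uint8_t b)
{
	assert(r < 64 && g < 64 && b < 64);
	return ((uint32_t)r << 2) | ((uint32_t)g << 10) | ((uint32_t)b << 18);
}

/* Same panel wired to LCD_DATA[5:0], [11:6], [17:12]: bus width 18. */
static uint32_t rgb666_on_18bit_bus(uint8_t r, uint8_t g, uint8_t b)
{
	assert(r < 64 && g < 64 && b < 64);
	return (uint32_t)r | ((uint32_t)g << 6) | ((uint32_t)b << 12);
}
```

In the first wiring the controller has to drive a 24-bit word with the two low bits of each byte unused, which is why the panel's own width of 18 is not enough to describe the link.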
[PATCH 12/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_voltage_lookup_table
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_ppt_v1_voltage_lookup_table, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: https://lore.kernel.org/lkml/5f7d61df.jwrffnjxgbjskpop%25...@intel.com/ Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h | 2 +- .../drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c | 10 +++--- .../amd/pm/powerplay/hwmgr/vega10_processpptables.c| 10 +++--- 3 files changed, 7 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h index 923cc04e405a..e11298cdeb30 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h @@ -86,7 +86,7 @@ typedef struct phm_ppt_v1_voltage_lookup_record phm_ppt_v1_voltage_lookup_record struct phm_ppt_v1_voltage_lookup_table { uint32_t count; - phm_ppt_v1_voltage_lookup_record entries[1];/* Dynamically allocate count entries. */ + phm_ppt_v1_voltage_lookup_record entries[];/* Dynamically allocate count entries. 
*/ }; typedef struct phm_ppt_v1_voltage_lookup_table phm_ppt_v1_voltage_lookup_table; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index 5d8016cd1986..426655b9c678 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -157,7 +157,7 @@ static int get_vddc_lookup_table( uint32_t max_levels ) { - uint32_t table_size, i; + uint32_t i; phm_ppt_v1_voltage_lookup_table *table; phm_ppt_v1_voltage_lookup_record *record; ATOM_Tonga_Voltage_Lookup_Record *atom_record; @@ -165,12 +165,8 @@ static int get_vddc_lookup_table( PP_ASSERT_WITH_CODE((0 != vddc_lookup_pp_tables->ucNumEntries), "Invalid CAC Leakage PowerPlay Table!", return 1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_voltage_lookup_record) * max_levels; - - table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == table) + table = kzalloc(struct_size(table, entries, max_levels), GFP_KERNEL); + if (!table) return -ENOMEM; table->count = vddc_lookup_pp_tables->ucNumEntries; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c index 4f6a73a2cf28..3d7f915381c8 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c @@ -1040,18 +1040,14 @@ static int get_vddc_lookup_table( const ATOM_Vega10_Voltage_Lookup_Table *vddc_lookup_pp_tables, uint32_t max_levels) { - uint32_t table_size, i; + uint32_t i; phm_ppt_v1_voltage_lookup_table *table; PP_ASSERT_WITH_CODE((vddc_lookup_pp_tables->ucNumEntries != 0), "Invalid SOC_VDDD Lookup Table!", return 1); - table_size = sizeof(uint32_t) + - sizeof(phm_ppt_v1_voltage_lookup_record) * max_levels; - - table = kzalloc(table_size, GFP_KERNEL); - - if (table == NULL) + table = kzalloc(struct_size(table, entries, 
max_levels), GFP_KERNEL); + if (!table) return -ENOMEM; table->count = vddc_lookup_pp_tables->ucNumEntries; -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
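To make the difference between the removed `table_size` arithmetic and the `struct_size()` call more concrete, here is a userspace sketch of both computations. The `sketch_*` names and types are hypothetical; the saturation to `SIZE_MAX` on overflow mirrors what the kernel helper does, but this is not the kernel macro itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record and table types, not taken from the driver. */
struct sketch_record {
	uint64_t clk;
	uint8_t vdd_ind;
};

struct sketch_table {
	uint32_t count;
	struct sketch_record entries[];
};

/* Old pattern: hard-codes the header as exactly one uint32_t. */
static size_t sketch_size_old(size_t n)
{
	return sizeof(uint32_t) + sizeof(struct sketch_record) * n;
}

/*
 * New pattern: the header size comes from the struct itself, so any padding
 * in front of entries[] is included, and the result saturates to SIZE_MAX
 * instead of wrapping when n is too large.
 */
static size_t sketch_size_new(size_t n)
{
	const size_t header = sizeof(struct sketch_table);
	const size_t elem = sizeof(struct sketch_record);

	assert(elem != 0);
	if (n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;

	return header + n * elem;
}
```

The two results agree only while the header really is a single uint32_t with no padding before the array; deriving the size from the type keeps the allocation correct if the struct layout changes later.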
Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML
On Wed, Oct 07, 2020 at 11:00:20AM -0500, Rob Herring wrote: > On Wed, Oct 07, 2020 at 04:24:32AM +0300, Laurent Pinchart wrote: > > Convert the mxsfb binding to YAML. The deprecated binding is dropped, as > > neither the DT sources nor the driver support it anymore. The converted > > binding is named fsl,lcdif.yaml to match the usual bindings naming > > scheme. > > > > The compatible strings are messy, and DT sources use different kinds of > > combination of documented and undocumented values. Keep it simple for > > now, and update the example to make it valid. Aligning the binding with > > the existing DT sources will be performed separately. > > > > Signed-off-by: Laurent Pinchart > > Reviewed-by: Sam Ravnborg > > -- > > Changes since v1: > > > > - Drop unneeded quotes in string > > - Replace minItems with maxItems in conditional check > > - Add blank line before ... > > - Squash the rename in this commit > > --- > > .../bindings/display/fsl,lcdif.yaml | 116 ++ > > .../devicetree/bindings/display/mxsfb.txt | 87 - > > MAINTAINERS | 2 +- > > 3 files changed, 117 insertions(+), 88 deletions(-) > > create mode 100644 Documentation/devicetree/bindings/display/fsl,lcdif.yaml > > delete mode 100644 Documentation/devicetree/bindings/display/mxsfb.txt > > > > diff --git a/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > > b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > > new file mode 100644 > > index ..063bb8c58114 > > --- /dev/null > > +++ b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > > @@ -0,0 +1,116 @@ > > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > > +%YAML 1.2 > > +--- > > +$id: http://devicetree.org/schemas/display/fsl,lcdif.yaml# > > +$schema: http://devicetree.org/meta-schemas/core.yaml# > > + > > +title: Freescale/NXP i.MX LCD Interface (LCDIF) > > + > > +maintainers: > > + - Marek Vasut > > + - Stefan Agner > > + > > +description: | > > + (e)LCDIF display controller found in the Freescale/NXP i.MX SoCs. 
> > + > > +properties: > > + compatible: > > +enum: > > + - fsl,imx23-lcdif > > + - fsl,imx28-lcdif > > + - fsl,imx6sx-lcdif > > + - fsl,imx8mq-lcdif > > + > > + reg: > > +maxItems: 1 > > + > > + clocks: > > +items: > > + - description: Pixel clock > > + - description: Bus clock > > + - description: Display AXI clock > > +minItems: 1 > > + > > + clock-names: > > +items: > > + - const: pix > > + - const: axi > > + - const: disp_axi > > +minItems: 1 > > + > > + interrupts: > > +maxItems: 1 > > + > > + port: > > +description: The LCDIF output port > > +type: object > > + > > +properties: > > + endpoint: > > What happened on the graph binding schema work? I started a meta-schema > for it BTW. > > You can drop all the endpoint parts. With that, NM, I see in patch 3 you need it. > > Reviewed-by: Rob Herring > > > +type: object > > + > > +properties: > > + remote-endpoint: > > +$ref: /schemas/types.yaml#/definitions/phandle > > + > > +required: > > + - remote-endpoint > > + > > +additionalProperties: false > > + > > +additionalProperties: false > > + > > +required: > > + - compatible > > + - reg > > + - clocks > > + - interrupts > > + - port > > + > > +additionalProperties: false > > + > > +allOf: > > + - if: > > + properties: > > +compatible: > > + contains: > > +const: fsl,imx6sx-lcdif > > +then: > > + properties: > > +clocks: > > + minItems: 2 > > + maxItems: 3 > > +clock-names: > > + minItems: 2 > > + maxItems: 3 > > + required: > > +- clock-names > > +else: > > + properties: > > +clocks: > > + maxItems: 1 > > +clock-names: > > + maxItems: 1 > > + > > +examples: > > + - | > > +#include > > +#include > > + > > +display-controller@222 { > > +compatible = "fsl,imx6sx-lcdif"; > > +reg = <0x0222 0x4000>; > > +interrupts = ; > > +clocks = <&clks IMX6SX_CLK_LCDIF1_PIX>, > > + <&clks IMX6SX_CLK_LCDIF_APB>, > > + <&clks IMX6SX_CLK_DISPLAY_AXI>; > > +clock-names = "pix", "axi", "disp_axi"; > > + > > +port { > > +endpoint { > > +remote-endpoint = <&panel_in>; > > +}; > > +}; > > 
+}; > > + > > +... > > diff --git a/Documentation/devicetree/bindings/display/mxsfb.txt > > b/Documentation/devicetree/bindings/display/mxsfb.txt > > deleted file mode 100644 > > index c985871c46b3.. > > --- a/Documentation/devicetree/bindings/display/mxsfb.txt > > +++ /dev/null > > @@ -1,87 +0,0 @@ > > -* Freescale MXS LCD Interface (LCDIF) > > - > > -New bindings: > > -= > > -
[PATCH 10/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_ppt_v1_clock_voltage_dependency_table, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: Signed-off-by: Gustavo A. R. Silva --- .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h| 2 +- .../powerplay/hwmgr/process_pptables_v1_0.c | 31 .../powerplay/hwmgr/vega10_processpptables.c | 50 ++- 3 files changed, 27 insertions(+), 56 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h index c0193e09d58a..c167083b0872 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h @@ -48,7 +48,7 @@ typedef struct phm_ppt_v1_clock_voltage_dependency_record phm_ppt_v1_clock_volta struct phm_ppt_v1_clock_voltage_dependency_table { uint32_t count;/* Number of entries. */ - phm_ppt_v1_clock_voltage_dependency_record entries[1]; /* Dynamically allocate count entries. */ + phm_ppt_v1_clock_voltage_dependency_record entries[]; /* Dynamically allocate count entries. 
*/ }; typedef struct phm_ppt_v1_clock_voltage_dependency_table phm_ppt_v1_clock_voltage_dependency_table; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index 52188f6cd150..0725531fbfff 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -367,7 +367,7 @@ static int get_mclk_voltage_dependency_table( ATOM_Tonga_MCLK_Dependency_Table const *mclk_dep_table ) { - uint32_t table_size, i; + uint32_t i; phm_ppt_v1_clock_voltage_dependency_table *mclk_table; phm_ppt_v1_clock_voltage_dependency_record *mclk_table_record; ATOM_Tonga_MCLK_Dependency_Record *mclk_dep_record; @@ -375,12 +375,9 @@ static int get_mclk_voltage_dependency_table( PP_ASSERT_WITH_CODE((0 != mclk_dep_table->ucNumEntries), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record) - * mclk_dep_table->ucNumEntries; - - mclk_table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == mclk_table) + mclk_table = kzalloc(struct_size(mclk_table, entries, mclk_dep_table->ucNumEntries), +GFP_KERNEL); + if (!mclk_table) return -ENOMEM; mclk_table->count = (uint32_t)mclk_dep_table->ucNumEntries; @@ -410,7 +407,7 @@ static int get_sclk_voltage_dependency_table( PPTable_Generic_SubTable_Header const *sclk_dep_table ) { - uint32_t table_size, i; + uint32_t i; phm_ppt_v1_clock_voltage_dependency_table *sclk_table; phm_ppt_v1_clock_voltage_dependency_record *sclk_table_record; @@ -422,12 +419,9 @@ static int get_sclk_voltage_dependency_table( PP_ASSERT_WITH_CODE((0 != tonga_table->ucNumEntries), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record) - * tonga_table->ucNumEntries; - - sclk_table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == sclk_table) + sclk_table = 
kzalloc(struct_size(sclk_table, entries, tonga_table->ucNumEntries), +GFP_KERNEL); + if (!sclk_table) return -ENOMEM; sclk_table->count = (uint32_t)tonga_table->ucNumEntries; @@ -454,12 +448,9 @@ static int get_sclk_voltage_dependency_table( PP_ASSERT_WITH_CODE((0 != polaris_table->ucNumEntries), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + sizeof(phm_ppt_v1_clock_voltage_dependency_record) - * polaris_table->ucNumEntries; - - sclk_table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == sclk_table) + sclk_table = kzalloc(struct_size(sclk_table, entries, polaris_table->ucNumEntries), +GFP_KERNEL); +
Re: [PATCH v2 2/7] dt-bindings: display: mxsfb: Add and fix compatible strings
On Wed, 07 Oct 2020 04:24:33 +0300, Laurent Pinchart wrote:
> Additional compatible strings have been added in DT source for the
> i.MX6SL, i.MX6SLL, i.MX6UL and i.MX7D without updating the bindings.
> Most of the upstream DT sources use the fsl,imx28-lcdif compatible
> string, which mostly predates the realization that the LCDIF in the
> i.MX6 and newer SoCs have extra features compared to the i.MX28.
>
> Update the bindings to add the missing compatible strings, with the
> correct fallback values. This fails to validate some of the upstream DT
> sources. Instead of adding the incorrect compatible fallback to the
> binding, the sources should be updated separately.
>
> Signed-off-by: Laurent Pinchart
> Reviewed-by: Sam Ravnborg
> ---
> Changes since v1:
>
> - Fix indentation under enum
> ---
>  .../devicetree/bindings/display/fsl,lcdif.yaml | 18 +-
>  1 file changed, 13 insertions(+), 5 deletions(-)
>

Reviewed-by: Rob Herring
[PATCH 09/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_samu_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_samu_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_samu_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_samu_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c5d3a.rym4gmzr3e0jezy+%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                    |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 7e0c948a7097..dad703ba0522 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -404,7 +404,7 @@ struct phm_samu_clock_voltage_dependency_record {

 struct phm_samu_clock_voltage_dependency_table {
 	uint8_t count;
-	struct phm_samu_clock_voltage_dependency_record entries[1];
+	struct phm_samu_clock_voltage_dependency_record entries[];
 };

 struct phm_cac_tdp_table {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index e059802d1e25..48d550d26c6a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1163,15 +1163,12 @@ static int get_samu_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
 		struct phm_samu_clock_voltage_dependency_table **ptable,
 		const ATOM_PPLIB_SAMClk_Voltage_Limit_Table *table)
 {
-	unsigned long table_size, i;
+	unsigned long i;
 	struct phm_samu_clock_voltage_dependency_table *samu_table;

-	table_size = sizeof(unsigned long) +
-		sizeof(struct phm_samu_clock_voltage_dependency_table) *
-		table->numEntries;
-
-	samu_table = kzalloc(table_size, GFP_KERNEL);
-	if (NULL == samu_table)
+	samu_table = kzalloc(struct_size(samu_table, entries, table->numEntries),
+			     GFP_KERNEL);
+	if (!samu_table)
 		return -ENOMEM;

 	samu_table->count = table->numEntries;
-- 
2.27.0
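For a sense of scale, the over-allocation mentioned in the commit message above can be reproduced in userspace. This is only a sketch: the record layout (demo_record), both table types and the helper names are made up for illustration, and only the shape of the two size formulas is taken from the diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's record type; the fields are assumed. */
struct demo_record {
	uint32_t clk;
	uint32_t v;
};

/* Table as declared before the patch: one-element trailing array. */
struct demo_table_old {
	uint8_t count;
	struct demo_record entries[1];
};

/* Table as declared after the patch: flexible-array member. */
struct demo_table {
	uint8_t count;
	struct demo_record entries[];
};

/*
 * Old formula: multiplies the entry count by the size of the whole
 * table struct, so every entry also pays for a header and padding.
 */
static size_t old_table_size(size_t n)
{
	return sizeof(unsigned long) + sizeof(struct demo_table_old) * n;
}

/*
 * New formula: header once, plus n records. This is the value
 * struct_size() computes, leaving out its overflow checks.
 */
static size_t new_table_size(size_t n)
{
	return sizeof(struct demo_table) + sizeof(struct demo_record) * n;
}
```

The gap between the two grows linearly with the entry count, which is the heap space the patch claims back. The exact byte counts depend on the real record layout and the ABI, so none are quoted here.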
[PATCH 08/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_cac_leakage_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_cac_leakage_table, instead of a one-element array, and use
the struct_size() helper to calculate the size for the allocation.

Also, save some heap space as the original code is multiplying
table->ucNumEntries by sizeof(struct phm_cac_leakage_table) when it
should have multiplied it by sizeof(struct phm_cac_leakage_record)
instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c5d38.it%2fqtjn+659xudo5%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                  |  2 +-
 .../drm/amd/pm/powerplay/hwmgr/processpptables.c    | 13 +++++--------
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index b8e33325fac6..7e0c948a7097 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -393,7 +393,7 @@ union phm_cac_leakage_record {

 struct phm_cac_leakage_table {
 	uint32_t count;
-	union phm_cac_leakage_record entries[1];
+	union phm_cac_leakage_record entries[];
 };

 struct phm_samu_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index 7719f52e6d52..e059802d1e25 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1384,17 +1384,14 @@ static int get_cac_leakage_table(struct pp_hwmgr *hwmgr,
 				const ATOM_PPLIB_CAC_Leakage_Table *table)
 {
 	struct phm_cac_leakage_table *cac_leakage_table;
-	unsigned long table_size, i;
+	unsigned long i;

-	if (hwmgr == NULL || table == NULL || ptable == NULL)
+	if (!hwmgr || !table || !ptable)
 		return -EINVAL;

-	table_size = sizeof(ULONG) +
-		(sizeof(struct phm_cac_leakage_table) * table->ucNumEntries);
-
-	cac_leakage_table = kzalloc(table_size, GFP_KERNEL);
-
-	if (cac_leakage_table == NULL)
+	cac_leakage_table = kzalloc(struct_size(cac_leakage_table, entries, table->ucNumEntries),
+				    GFP_KERNEL);
+	if (!cac_leakage_table)
 		return -ENOMEM;

 	cac_leakage_table->count = (ULONG)table->ucNumEntries;
-- 
2.27.0
Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML
On Wed, Oct 07, 2020 at 04:24:32AM +0300, Laurent Pinchart wrote: > Convert the mxsfb binding to YAML. The deprecated binding is dropped, as > neither the DT sources nor the driver support it anymore. The converted > binding is named fsl,lcdif.yaml to match the usual bindings naming > scheme. > > The compatible strings are messy, and DT sources use different kinds of > combination of documented and undocumented values. Keep it simple for > now, and update the example to make it valid. Aligning the binding with > the existing DT sources will be performed separately. > > Signed-off-by: Laurent Pinchart > Reviewed-by: Sam Ravnborg > -- > Changes since v1: > > - Drop unneeded quotes in string > - Replace minItems with maxItems in conditional check > - Add blank line before ... > - Squash the rename in this commit > --- > .../bindings/display/fsl,lcdif.yaml | 116 ++ > .../devicetree/bindings/display/mxsfb.txt | 87 - > MAINTAINERS | 2 +- > 3 files changed, 117 insertions(+), 88 deletions(-) > create mode 100644 Documentation/devicetree/bindings/display/fsl,lcdif.yaml > delete mode 100644 Documentation/devicetree/bindings/display/mxsfb.txt > > diff --git a/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > new file mode 100644 > index ..063bb8c58114 > --- /dev/null > +++ b/Documentation/devicetree/bindings/display/fsl,lcdif.yaml > @@ -0,0 +1,116 @@ > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) > +%YAML 1.2 > +--- > +$id: http://devicetree.org/schemas/display/fsl,lcdif.yaml# > +$schema: http://devicetree.org/meta-schemas/core.yaml# > + > +title: Freescale/NXP i.MX LCD Interface (LCDIF) > + > +maintainers: > + - Marek Vasut > + - Stefan Agner > + > +description: | > + (e)LCDIF display controller found in the Freescale/NXP i.MX SoCs. 
> + > +properties: > + compatible: > +enum: > + - fsl,imx23-lcdif > + - fsl,imx28-lcdif > + - fsl,imx6sx-lcdif > + - fsl,imx8mq-lcdif > + > + reg: > +maxItems: 1 > + > + clocks: > +items: > + - description: Pixel clock > + - description: Bus clock > + - description: Display AXI clock > +minItems: 1 > + > + clock-names: > +items: > + - const: pix > + - const: axi > + - const: disp_axi > +minItems: 1 > + > + interrupts: > +maxItems: 1 > + > + port: > +description: The LCDIF output port > +type: object > + > +properties: > + endpoint: What happened on the graph binding schema work? I started a meta-schema for it BTW. You can drop all the endpoint parts. With that, Reviewed-by: Rob Herring > +type: object > + > +properties: > + remote-endpoint: > +$ref: /schemas/types.yaml#/definitions/phandle > + > +required: > + - remote-endpoint > + > +additionalProperties: false > + > +additionalProperties: false > + > +required: > + - compatible > + - reg > + - clocks > + - interrupts > + - port > + > +additionalProperties: false > + > +allOf: > + - if: > + properties: > +compatible: > + contains: > +const: fsl,imx6sx-lcdif > +then: > + properties: > +clocks: > + minItems: 2 > + maxItems: 3 > +clock-names: > + minItems: 2 > + maxItems: 3 > + required: > +- clock-names > +else: > + properties: > +clocks: > + maxItems: 1 > +clock-names: > + maxItems: 1 > + > +examples: > + - | > +#include > +#include > + > +display-controller@222 { > +compatible = "fsl,imx6sx-lcdif"; > +reg = <0x0222 0x4000>; > +interrupts = ; > +clocks = <&clks IMX6SX_CLK_LCDIF1_PIX>, > + <&clks IMX6SX_CLK_LCDIF_APB>, > + <&clks IMX6SX_CLK_DISPLAY_AXI>; > +clock-names = "pix", "axi", "disp_axi"; > + > +port { > +endpoint { > +remote-endpoint = <&panel_in>; > +}; > +}; > +}; > + > +... > diff --git a/Documentation/devicetree/bindings/display/mxsfb.txt > b/Documentation/devicetree/bindings/display/mxsfb.txt > deleted file mode 100644 > index c985871c46b3.. 
> --- a/Documentation/devicetree/bindings/display/mxsfb.txt > +++ /dev/null > @@ -1,87 +0,0 @@ > -* Freescale MXS LCD Interface (LCDIF) > - > -New bindings: > -= > -Required properties: > -- compatible:Should be "fsl,imx23-lcdif" for i.MX23. > - Should be "fsl,imx28-lcdif" for i.MX28. > - Should be "fsl,imx6sx-lcdif" for i.MX6SX. > - Should be "fsl,imx8mq-lcdif" for i.MX8MQ. > -- reg: Address and length of the register set for LCDIF > -- interrupts:Should contain LCDIF interrupt > -- clocks:A list of phandle + clock-specifier pa
[PATCH 07/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_vce_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_vce_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_vce_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(struct phm_vce_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c5d35.pjtogs3h9khzk6ws%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                    |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index ad614e32079e..b8e33325fac6 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -186,7 +186,7 @@ struct phm_acpclock_voltage_dependency_table {

 struct phm_vce_clock_voltage_dependency_table {
 	uint8_t count;
-	struct phm_vce_clock_voltage_dependency_record entries[1];
+	struct phm_vce_clock_voltage_dependency_record entries[];
 };

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index b2ef76580c6a..7719f52e6d52 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1135,15 +1135,12 @@ static int get_vce_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
 		const ATOM_PPLIB_VCE_Clock_Voltage_Limit_Table *table,
 		const VCEClockInfoArray *array)
 {
-	unsigned long table_size, i;
+	unsigned long i;
 	struct phm_vce_clock_voltage_dependency_table *vce_table = NULL;

-	table_size = sizeof(unsigned long) +
-		sizeof(struct phm_vce_clock_voltage_dependency_table)
-		* table->numEntries;
-
-	vce_table = kzalloc(table_size, GFP_KERNEL);
-	if (NULL == vce_table)
+	vce_table = kzalloc(struct_size(vce_table, entries, table->numEntries),
+			    GFP_KERNEL);
+	if (!vce_table)
 		return -ENOMEM;

 	vce_table->count = table->numEntries;
-- 
2.27.0
[PATCH 06/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_phase_shedding_limits_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_phase_shedding_limits_table, instead of a one-element array,
and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
ptable->ucNumEntries by sizeof(struct phm_phase_shedding_limits_table)
when it should have multiplied it by
sizeof(struct phm_phase_shedding_limits_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c5d36.6pstuzp2hrxaz7im%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                    |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 12 ++++--------
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 361cb1125351..ad614e32079e 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -161,7 +161,7 @@ struct phm_vce_clock_voltage_dependency_record {

 struct phm_phase_shedding_limits_table {
 	uint32_t count;
-	struct phm_phase_shedding_limits_record entries[1];
+	struct phm_phase_shedding_limits_record entries[];
 };

 struct phm_vceclock_voltage_dependency_table {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index a1b198045978..b2ef76580c6a 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1530,16 +1530,12 @@ static int init_phase_shedding_table(struct pp_hwmgr *hwmgr,
 				(((unsigned long)powerplay_table4) +
 				le16_to_cpu(powerplay_table4->usVddcPhaseShedLimitsTableOffset));
 			struct phm_phase_shedding_limits_table *table;
-			unsigned long size, i;
+			unsigned long i;

-			size = sizeof(unsigned long) +
-				(sizeof(struct phm_phase_shedding_limits_table) *
-				ptable->ucNumEntries);
-
-			table = kzalloc(size, GFP_KERNEL);
-
-			if (table == NULL)
+			table = kzalloc(struct_size(table, entries, ptable->ucNumEntries),
+					GFP_KERNEL);
+			if (!table)
 				return -ENOMEM;

 			table->count = (unsigned long)ptable->ucNumEntries;
-- 
2.27.0
[PATCH 05/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_acp_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_acp_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_acp_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(phm_acp_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c5d3c.tyfohg%2fa6jycl6zn%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                    |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index 2f1886bc5535..361cb1125351 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -150,7 +150,7 @@ struct phm_acp_clock_voltage_dependency_record {

 struct phm_acp_clock_voltage_dependency_table {
 	uint32_t count;
-	struct phm_acp_clock_voltage_dependency_record entries[1];
+	struct phm_acp_clock_voltage_dependency_record entries[];
 };

 struct phm_vce_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index 305d95c4162d..a1b198045978 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1194,15 +1194,12 @@ static int get_acp_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
 		struct phm_acp_clock_voltage_dependency_table **ptable,
 		const ATOM_PPLIB_ACPClk_Voltage_Limit_Table *table)
 {
-	unsigned table_size, i;
+	unsigned long i;
 	struct phm_acp_clock_voltage_dependency_table *acp_table;

-	table_size = sizeof(unsigned long) +
-		sizeof(struct phm_acp_clock_voltage_dependency_table) *
-		table->numEntries;
-
-	acp_table = kzalloc(table_size, GFP_KERNEL);
-	if (NULL == acp_table)
+	acp_table = kzalloc(struct_size(acp_table, entries, table->numEntries),
+			    GFP_KERNEL);
+	if (!acp_table)
 		return -ENOMEM;

 	acp_table->count = (unsigned long)table->numEntries;
-- 
2.27.0
[PATCH 03/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_clock_array
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_clock_array, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: https://lore.kernel.org/lkml/5f7c433f.zymd+yuivawihgve%25...@intel.com/ Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/pm/inc/hwmgr.h| 2 +- .../amd/pm/powerplay/hwmgr/process_pptables_v1_0.c| 11 --- .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c | 7 +++ .../amd/pm/powerplay/hwmgr/vega10_processpptables.c | 9 +++-- 4 files changed, 11 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h index d68b547743e6..e84cff09af2d 100644 --- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h +++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h @@ -91,7 +91,7 @@ struct phm_set_power_state_input { struct phm_clock_array { uint32_t count; - uint32_t values[1]; + uint32_t values[]; }; struct phm_clock_voltage_dependency_record { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c index b760f95e7fa7..52188f6cd150 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/process_pptables_v1_0.c @@ -318,19 +318,16 @@ static int get_valid_clk( phm_ppt_v1_clock_voltage_dependency_table const *clk_volt_pp_table ) { - uint32_t table_size, i; + uint32_t i; struct phm_clock_array *table; 
phm_ppt_v1_clock_voltage_dependency_record *dep_record; PP_ASSERT_WITH_CODE((0 != clk_volt_pp_table->count), "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) + - sizeof(uint32_t) * clk_volt_pp_table->count; - - table = kzalloc(table_size, GFP_KERNEL); - - if (NULL == table) + table = kzalloc(struct_size(table, values, clk_volt_pp_table->count), + GFP_KERNEL); + if (!table) return -ENOMEM; table->count = (uint32_t)clk_volt_pp_table->count; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c index d94a7d8e0587..d9bed4df6f65 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c @@ -404,12 +404,11 @@ static int get_valid_clk(struct pp_hwmgr *hwmgr, struct phm_clock_array **ptable, const struct phm_clock_voltage_dependency_table *table) { - unsigned long table_size, i; + unsigned long i; struct phm_clock_array *clock_table; - table_size = sizeof(unsigned long) + sizeof(unsigned long) * table->count; - clock_table = kzalloc(table_size, GFP_KERNEL); - if (NULL == clock_table) + clock_table = kzalloc(struct_size(clock_table, values, table->count), GFP_KERNEL); + if (!clock_table) return -ENOMEM; clock_table->count = (unsigned long)table->count; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c index f29af5ca0aa0..e655c04ccdfb 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/vega10_processpptables.c @@ -875,17 +875,14 @@ static int get_valid_clk( struct phm_clock_array **clk_table, const phm_ppt_v1_clock_voltage_dependency_table *clk_volt_pp_table) { - uint32_t table_size, i; + uint32_t i; struct phm_clock_array *table; PP_ASSERT_WITH_CODE(clk_volt_pp_table->count, "Invalid PowerPlay Table!", return -1); - table_size = sizeof(uint32_t) 
+ - sizeof(uint32_t) * clk_volt_pp_table->count; - - table = kzalloc(table_size, GFP_KERNEL); - + table = kzalloc(struct_size(table, values, clk_volt_pp_table->count), + GFP_KERNEL); if (!table) return -ENOMEM; -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
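For readers who have not met struct_size() before: it computes "header plus n trailing elements" and saturates to SIZE_MAX if the arithmetic would overflow, so the allocation fails instead of coming back too small. The userspace sketch below imitates that behaviour under stated assumptions. demo_struct_size(), demo_clock_array and demo_clock_array_alloc() are hypothetical names for illustration, not the kernel implementation, and calloc() stands in for kzalloc().

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Same shape as struct phm_clock_array after the patch. */
struct demo_clock_array {
	uint32_t count;
	uint32_t values[];
};

/*
 * Hypothetical userspace analogue of struct_size(): returns
 * header + n * elem, or SIZE_MAX if that sum would overflow size_t.
 */
static size_t demo_struct_size(size_t header, size_t elem, size_t n)
{
	if (elem != 0 && n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + n * elem;
}

/* Allocate a zeroed table with room for n values; NULL on failure. */
static struct demo_clock_array *demo_clock_array_alloc(size_t n)
{
	struct demo_clock_array *table;
	size_t bytes;

	/* count is 32 bits wide, so refuse anything it cannot represent. */
	if (n > UINT32_MAX)
		return NULL;

	bytes = demo_struct_size(sizeof(*table), sizeof(table->values[0]), n);
	if (bytes == SIZE_MAX)
		return NULL;

	table = calloc(1, bytes);
	if (!table)
		return NULL;

	table->count = (uint32_t)n;
	return table;
}
```

Taking the element size from the member itself, rather than from a hand-written sizeof(uint32_t), is what keeps the computation correct if the element type ever changes.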
[PATCH 04/14] drm/amd/pm: Replace one-element array with flexible-array in struct phm_uvd_clock_voltage_dependency_table
There is a regular need in the kernel to provide a way to declare having
a dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The older
style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of a flexible-array member in
struct phm_uvd_clock_voltage_dependency_table, instead of a one-element
array, and use the struct_size() helper to calculate the size for the
allocation.

Also, save some heap space as the original code is multiplying
table->numEntries by sizeof(struct phm_uvd_clock_voltage_dependency_table)
when it should have multiplied it by
sizeof(phm_uvd_clock_voltage_dependency_record) instead.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c433e.pxkc6ksn6hn%2fldhj%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h                    |  2 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c  | 11 ++++-------
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index e84cff09af2d..2f1886bc5535 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -140,7 +140,7 @@ struct phm_uvd_clock_voltage_dependency_record {

 struct phm_uvd_clock_voltage_dependency_table {
 	uint8_t count;
-	struct phm_uvd_clock_voltage_dependency_record entries[1];
+	struct phm_uvd_clock_voltage_dependency_record entries[];
 };

 struct phm_acp_clock_voltage_dependency_record {
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
index d9bed4df6f65..305d95c4162d 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c
@@ -1105,15 +1105,12 @@ static int get_uvd_clock_voltage_limit_table(struct pp_hwmgr *hwmgr,
 		const ATOM_PPLIB_UVD_Clock_Voltage_Limit_Table *table,
 		const UVDClockInfoArray *array)
 {
-	unsigned long table_size, i;
+	unsigned long i;
 	struct phm_uvd_clock_voltage_dependency_table *uvd_table;

-	table_size = sizeof(unsigned long) +
-		sizeof(struct phm_uvd_clock_voltage_dependency_table) *
-		table->numEntries;
-
-	uvd_table = kzalloc(table_size, GFP_KERNEL);
-	if (NULL == uvd_table)
+	uvd_table = kzalloc(struct_size(uvd_table, entries, table->numEntries),
+			    GFP_KERNEL);
+	if (!uvd_table)
 		return -ENOMEM;

 	uvd_table->count = table->numEntries;
-- 
2.27.0
Re: [PATCH v2 0/3] drm: commit_work scheduling
On Wed, Oct 7, 2020 at 3:36 AM Qais Yousef wrote: > > On 10/06/20 13:04, Rob Clark wrote: > > On Tue, Oct 6, 2020 at 3:59 AM Qais Yousef wrote: > > > > > > On 10/05/20 16:24, Rob Clark wrote: > > > > > > [...] > > > > > > > > RT planning and partitioning is not easy task for sure. You might > > > > > want to > > > > > consider using affinities too to get stronger guarantees for some > > > > > tasks and > > > > > prevent cross-talking. > > > > > > > > There is some cgroup stuff that is pinning SF and some other stuff to > > > > the small cores, fwiw.. I think the reasoning is that they shouldn't > > > > be doing anything heavy enough to need the big cores. > > > > > > Ah, so you're on big.LITTLE type of system. I have done some work which > > > enables > > > biasing RT tasks towards big cores and control the default boost value if > > > you > > > have util_clamp and schedutil enabled. You can use util_clamp in general > > > to > > > help with DVFS related response time delays. > > > > > > I haven't done any work to try our best to pick a small core first but > > > fallback > > > to big if there's no other alternative. > > > > > > It'd be interesting to know how often you end up on a big core if you > > > remove > > > the affinity. The RT scheduler picks the first cpu in the lowest priority > > > mask. > > > So it should have this bias towards picking smaller cores first if they're > > > in the lower priority mask (ie: not running higher priority RT tasks). > > > > fwiw, the issue I'm looking at is actually at the opposite end of the > > spectrum, less demanding apps that let cpus throttle down to low > > OPPs.. which stretches out the time taken at each step in the path > > towards screen (which seems to improve the odds that we hit priority > > inversion scenarios with SCHED_FIFO things stomping on important CFS > > things) > > So you do have the problem of RT task preempting an important CFS task. 
> > > > > There is a *big* difference in # of cpu cycles per frame between > > highest and lowest OPP.. > > To combat DVFS related delays, you can use util clamp. > > Hopefully this article helps explain it if you didn't come across it before > > https://lwn.net/Articles/762043/ > > You can use sched_setattr() to set SCHED_FLAG_UTIL_CLAMP_MIN for a task. This > will guarantee everytime this task is running it'll appear it has at least > this utilization value, so schedutil governor (which must be used for this to > work) will pick up the right performance point (OPP). > > The scheduler will try its best to make sure that the task will run on a core > that meets the minimum requested performance point (hinted by setting > uclamp_min). Yeah, I think we will end up making some use of uclamp.. there is someone else working on that angle But without it, this is a case that exposes legit prioritization problems with commit_work which we should fix ;-) BR, -R > > Thanks > > -- > Qais Yousef ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
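As a rough illustration of the sched_setattr() / SCHED_FLAG_UTIL_CLAMP_MIN suggestion in the thread above, a userspace caller might look like the sketch below. Assumptions, stated up front: the kernel is built with uclamp support, the flag values are copied from the kernel UAPI header, and the attribute struct is a local copy named demo_sched_attr so it cannot clash with libc headers that define struct sched_attr themselves. The demo_* names are hypothetical; this is not code from the series under discussion.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Flag values as defined in include/uapi/linux/sched.h. */
#define DEMO_SCHED_FLAG_KEEP_POLICY	0x08
#define DEMO_SCHED_FLAG_KEEP_PARAMS	0x10
#define DEMO_SCHED_FLAG_UTIL_CLAMP_MIN	0x20

/* Local copy of the kernel's sched_attr layout, including the uclamp fields. */
struct demo_sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

/*
 * Prepare an attr that only changes the minimum utilization clamp.
 * util_min is on the scheduler's 0..1024 capacity scale. The KEEP_*
 * flags ask the kernel to leave the task's policy and params alone.
 */
static void demo_fill_uclamp_min(struct demo_sched_attr *attr, uint32_t util_min)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->sched_flags = DEMO_SCHED_FLAG_KEEP_POLICY |
			    DEMO_SCHED_FLAG_KEEP_PARAMS |
			    DEMO_SCHED_FLAG_UTIL_CLAMP_MIN;
	attr->sched_util_min = util_min;
}

/* Apply the clamp to a task (pid 0 means the caller); returns 0 or -1 with errno set. */
static int demo_set_uclamp_min(pid_t pid, uint32_t util_min)
{
	struct demo_sched_attr attr;

	demo_fill_uclamp_min(&attr, util_min);
	return (int)syscall(SYS_sched_setattr, pid, &attr, 0);
}
```

Whether the call succeeds depends on the running kernel and on the caller's privileges, so the return value should be checked. As noted in the thread, the hint only influences frequency selection when the schedutil governor is in use.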
[PATCH 02/14] drm/amd/pm: Replace one-element array with flexible-array member in struct vi_dpm_table
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2].

Use a flexible-array member in struct vi_dpm_table instead of a one-element array.

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Build-tested-by: kernel test robot
Link: https://lore.kernel.org/lkml/5f7c433c.ttk9rna+f58kyduy%25...@intel.com/
Signed-off-by: Gustavo A. R. Silva
---
 drivers/gpu/drm/amd/pm/inc/hwmgr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
index a1dbfd5636e6..d68b547743e6 100644
--- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h
@@ -60,7 +60,7 @@ struct vi_dpm_level {
 
 struct vi_dpm_table {
 	uint32_t count;
-	struct vi_dpm_level dpm_level[1];
+	struct vi_dpm_level dpm_level[];
 };
 
 #define PCIE_PERF_REQ_REMOVE_REGISTRY 0
--
2.27.0
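As a small userspace illustration of the change described in the patch above (not part of the patch itself): with a one-element trailing array, sizeof() already counts one entry, whereas with a flexible-array member it counts none, so the allocation has to add every entry explicitly. The demo_ names and the fields of struct demo_level are hypothetical stand-ins, and calloc() is used in place of kzalloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vi_dpm_level. */
struct demo_level {
	uint32_t value;
	uint32_t param1;
};

/* Old style: one-element trailing array, sizeof() includes one entry. */
struct demo_table_old {
	uint32_t count;
	struct demo_level dpm_level[1];
};

/* New style: C99 flexible-array member, sizeof() includes no entries. */
struct demo_table_new {
	uint32_t count;
	struct demo_level dpm_level[];
};

static_assert(sizeof(struct demo_table_old) ==
	      sizeof(struct demo_table_new) + sizeof(struct demo_level),
	      "the one-element array adds exactly one entry to sizeof()");

/*
 * Allocate a zeroed table with room for n entries. Kernel code would use
 * struct_size(), which also guards the multiplication against overflow.
 */
struct demo_table_new *demo_table_alloc(uint32_t n)
{
	struct demo_table_new *t;

	t = calloc(1, sizeof(*t) + (size_t)n * sizeof(t->dpm_level[0]));
	if (t)
		t->count = n;
	return t;
}
```

The caller owns the returned table and releases it with free().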
[PATCH 01/14] drm/amd/pm: Replace one-element array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code according to the use of a flexible-array member in struct phm_clock_voltage_dependency_table, instead of a one-element array, and use the struct_size() helper to calculate the size for the allocation. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays Build-tested-by: kernel test robot Link: https://lore.kernel.org/lkml/5f7c295c.8iqp1ifc6oivdq%2f%2f%25...@intel.com/ Signed-off-by: Gustavo A. R. Silva --- drivers/gpu/drm/amd/pm/inc/hwmgr.h | 4 ++-- drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c | 9 +++-- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c | 5 ++--- 4 files changed, 8 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/inc/hwmgr.h b/drivers/gpu/drm/amd/pm/inc/hwmgr.h index 3898a95ec28b..a1dbfd5636e6 100644 --- a/drivers/gpu/drm/amd/pm/inc/hwmgr.h +++ b/drivers/gpu/drm/amd/pm/inc/hwmgr.h @@ -122,8 +122,8 @@ struct phm_acpclock_voltage_dependency_record { }; struct phm_clock_voltage_dependency_table { - uint32_t count; /* Number of entries. */ - struct phm_clock_voltage_dependency_record entries[1]; /* Dynamically allocate count entries. */ + uint32_t count; /* Number of entries. */ + struct phm_clock_voltage_dependency_record entries[]; /* Dynamically allocate count entries. 
*/ }; struct phm_phase_shedding_limits_record { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c index 719597c5d27d..d94a7d8e0587 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/processpptables.c @@ -377,14 +377,11 @@ static int get_clock_voltage_dependency_table(struct pp_hwmgr *hwmgr, const ATOM_PPLIB_Clock_Voltage_Dependency_Table *table) { - unsigned long table_size, i; + unsigned long i; struct phm_clock_voltage_dependency_table *dep_table; - table_size = sizeof(unsigned long) + - sizeof(struct phm_clock_voltage_dependency_table) - * table->ucNumEntries; - - dep_table = kzalloc(table_size, GFP_KERNEL); + dep_table = kzalloc(struct_size(dep_table, entries, table->ucNumEntries), + GFP_KERNEL); if (NULL == dep_table) return -ENOMEM; diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c index 35ed47ebaf09..ed9b89980184 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c @@ -276,7 +276,7 @@ static int smu8_init_dynamic_state_adjustment_rule_settings( { struct phm_clock_voltage_dependency_table *table_clk_vlt; - table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 7), + table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 8), GFP_KERNEL); if (NULL == table_clk_vlt) { diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c index 60b5ca974356..b485f8b1d6f2 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu_helper.c @@ -492,13 +492,12 @@ int phm_get_sclk_for_voltage_evv(struct pp_hwmgr *hwmgr, */ int phm_initializa_dynamic_state_adjustment_rule_settings(struct pp_hwmgr *hwmgr) { - uint32_t table_size; struct phm_clock_voltage_dependency_table 
*table_clk_vlt; struct phm_ppt_v1_information *pptable_info = (struct phm_ppt_v1_information *)(hwmgr->pptable); /* initialize vddc_dep_on_dal_pwrl table */ - table_size = sizeof(uint32_t) + 4 * sizeof(struct phm_clock_voltage_dependency_record); - table_clk_vlt = kzalloc(table_size, GFP_KERNEL); + table_clk_vlt = kzalloc(struct_size(table_clk_vlt, entries, 4), + GFP_KERNEL); if (NULL == table_clk_vlt) { pr_err("Can not allocate space for vddc_dep_on_dal_pwrl! \n"); -- 2.27.0 ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
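To make the size arithmetic in this patch easier to follow, here is a userspace sketch (an illustration only, not kernel code). demo_struct_size() is a hypothetical helper that mimics what the kernel's struct_size() computes, saturating to SIZE_MAX on overflow; the record fields are invented. It also reproduces the pre-patch formula from get_clock_voltage_dependency_table(), which multiplied the size of the whole table by the entry count, and shows why a count of 7 with the one-element array corresponds to a count of 8 with the flexible array, which appears to be the reason for the 7 to 8 change in smu8_hwmgr.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct phm_clock_voltage_dependency_record. */
struct demo_record {
	uint32_t clk;
	uint32_t v;
};

/* Layout before the patch: one-element trailing array. */
struct demo_dep_table_old {
	uint32_t count;
	struct demo_record entries[1];
};

/* Layout after the patch: flexible-array member. */
struct demo_dep_table {
	uint32_t count;
	struct demo_record entries[];
};

/* header + elem * n, saturating to SIZE_MAX if the result would overflow. */
size_t demo_struct_size(size_t header, size_t elem, size_t n)
{
	if (elem != 0 && n > (SIZE_MAX - header) / elem)
		return SIZE_MAX;
	return header + elem * n;
}

/* Pre-patch formula: multiplies the whole table struct by the entry count. */
size_t demo_old_size(size_t n)
{
	return sizeof(unsigned long) + sizeof(struct demo_dep_table_old) * n;
}

/* Post-patch equivalent of struct_size(dep_table, entries, n). */
size_t demo_new_size(size_t n)
{
	return demo_struct_size(sizeof(struct demo_dep_table),
				sizeof(struct demo_record), n);
}

/* 1 built-in entry + 7 extra (old layout) is the same size as 8 entries (new layout). */
static_assert(sizeof(struct demo_dep_table_old) + 7 * sizeof(struct demo_record) ==
	      sizeof(struct demo_dep_table) + 8 * sizeof(struct demo_record),
	      "old layout with 7 extra entries matches new layout with 8");
```

With these stand-in types, demo_old_size(4) is noticeably larger than demo_new_size(4), which is consistent with the cover letter's remark about saving some heap space.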
[PATCH 00/14] drm/amd/pm: Replace one-element arrays with flexible-array members
Hi all,

This series aims to replace one-element arrays with flexible-array members.

There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2].

Refactor the code according to the use of flexible-array members, instead of one-element arrays, and use the struct_size() helper to calculate the size for the dynamic memory allocation.

Also, save some heap space in the process. More on this in each individual patch.

This series also addresses multiple warnings of the following sort:

drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu8_hwmgr.c:1515:37: warning: array subscript 1 is above array bounds of ‘const struct phm_clock_voltage_dependency_record[1]’ [-Warray-bounds]

which, in this case, are false positives, but nevertheless should be fixed in order to enable -Warray-bounds[3][4].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays
[3] https://git.kernel.org/linus/44720996e2d79e47d508b0abe99b931a726a3197
[4] https://github.com/KSPP/linux/issues/109

Gustavo A. R. Silva (14):
  drm/amd/pm: Replace one-element array with flexible-array member
  drm/amd/pm: Replace one-element array with flexible-array member in struct vi_dpm_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_clock_array
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_uvd_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_acp_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_phase_shedding_limits_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_vce_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_cac_leakage_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_samu_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_mm_clock_voltage_dependency_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_voltage_lookup_table
  drm/amd/pm: Replace one-element array with flexible-array in struct phm_ppt_v1_pcie_table
  drm/amd/pm: Replace one-element array with flexible-array in struct ATOM_Vega10_GFXCLK_Dependency_Table

 drivers/gpu/drm/amd/pm/inc/hwmgr.h            | 20 ++---
 .../drm/amd/pm/powerplay/hwmgr/hwmgr_ppt.h    |  8 +-
 .../powerplay/hwmgr/process_pptables_v1_0.c   | 85 +++---
 .../amd/pm/powerplay/hwmgr/processpptables.c  | 85 +++---
 .../drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c   |  2 +-
 .../drm/amd/pm/powerplay/hwmgr/smu_helper.c   |  5 +-
 .../amd/pm/powerplay/hwmgr/vega10_pptable.h   |  2 +-
 .../powerplay/hwmgr/vega10_processpptables.c  | 88 ++-
 8 files changed, 107 insertions(+), 188 deletions(-)

--
2.27.0
Re: [PATCH RESEND v3 2/6] dt-bindings: display: sun4i: Add LVDS Dual-Link property
On Mon, Oct 05, 2020 at 05:15:40PM +0200, Maxime Ripard wrote:
> The Allwinner SoCs with two TCONs and LVDS output can use both to drive an
> LVDS dual-link. Add a new property to express that link between these two
> TCONs.
>
> Signed-off-by: Maxime Ripard
> ---
> Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml | 6 ++
> 1 file changed, 6 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> index e5344c4ae226..ce407f5466a5 100644
> --- a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> +++ b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
> @@ -115,6 +115,12 @@ properties:
> - const: edp
> - const: lvds
>
> + allwinner,lvds-companion:

We already have 1 vendor property for this. How about 'link-companion'
for something common.

Rob
Re: KASAN: vmalloc-out-of-bounds Write in sys_imageblit
syzbot has found a reproducer for the following issue on: HEAD commit:c85fb28b Merge tag 'arm64-fixes' of git://git.kernel.org/p.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=17406d7050 kernel config: https://syzkaller.appspot.com/x/.config?x=140446ac2aa637e5 dashboard link: https://syzkaller.appspot.com/bug?extid=26dc38a00dc05118a4e6 compiler: gcc (GCC) 10.1.0-syz 20200507 userspace arch: i386 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14788d7050 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15158ee050 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+26dc38a00dc05118a...@syzkaller.appspotmail.com == BUG: KASAN: vmalloc-out-of-bounds in fast_imageblit drivers/video/fbdev/core/sysimgblt.c:229 [inline] BUG: KASAN: vmalloc-out-of-bounds in sys_imageblit+0x117f/0x1290 drivers/video/fbdev/core/sysimgblt.c:275 Write of size 4 at addr c90009911000 by task syz-executor045/8761 CPU: 0 PID: 8761 Comm: syz-executor045 Not tainted 5.9.0-rc8-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x198/0x1fd lib/dump_stack.c:118 print_address_description.constprop.0.cold+0x5/0x497 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530 fast_imageblit drivers/video/fbdev/core/sysimgblt.c:229 [inline] sys_imageblit+0x117f/0x1290 drivers/video/fbdev/core/sysimgblt.c:275 drm_fb_helper_sys_imageblit+0x1c/0x180 drivers/gpu/drm/drm_fb_helper.c:767 bit_putcs_unaligned drivers/video/fbdev/core/bitblit.c:139 [inline] bit_putcs+0x6e1/0xd20 drivers/video/fbdev/core/bitblit.c:188 fbcon_putcs+0x35a/0x450 drivers/video/fbdev/core/fbcon.c:1308 do_update_region+0x399/0x630 drivers/tty/vt/vt.c:675 redraw_screen+0x658/0x790 drivers/tty/vt/vt.c:1034 fbcon_modechanged+0x593/0x6d0 
drivers/video/fbdev/core/fbcon.c:2714 fbcon_update_vcs+0x3a/0x50 drivers/video/fbdev/core/fbcon.c:2759 do_fb_ioctl+0x62e/0x690 drivers/video/fbdev/core/fbmem.c:1106 fb_compat_ioctl+0x17c/0xc30 drivers/video/fbdev/core/fbmem.c:1311 __do_compat_sys_ioctl+0x1d3/0x230 fs/ioctl.c:842 do_syscall_32_irqs_on arch/x86/entry/common.c:78 [inline] __do_fast_syscall_32+0x60/0x90 arch/x86/entry/common.c:137 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:160 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c RIP: 0023:0xf7f58549 Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:f7f531dc EFLAGS: 0246 ORIG_RAX: 0036 RAX: ffda RBX: 0003 RCX: 4601 RDX: 2000 RSI: RDI: RBP: R08: R09: R10: R11: R12: R13: R14: R15: Memory state around the buggy address: c90009910f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c90009910f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >c90009911000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ^ c90009911080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 c90009911100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 == ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM
On Wed, Oct 07, 2020 at 03:34:01PM +0200, Tomasz Figa wrote:

> I think the userptr zero-copy hack should be able to go away indeed,
> given that we now have CMA that allows having carveouts backed by
> struct pages and having the memory represented as DMA-buf normally.

This also needs to figure out how to get references to CMA pages out
of a VMA. IIRC Daniel said these are not pinnable?

> How about the regular userptr use case, though?

Just call pin_user_pages(), that is the easy case.

> Is your intention to drop get_vaddr_frames() or we could still keep
> using it and if vec->is_pfns is true:

get_vaddr_frames() is dangerous, I would like it to go away.

> a) if CONFIG_VIDEO_LEGACY_PFN_USERPTR is set, taint the kernel
> b) otherwise just undo and fail?

For the CONFIG_VIDEO_LEGACY_PFN_USERPTR case all the follow_pfn
related code in get_vaddr_frames() should move back into media and be
hidden under this config.

Jason
Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM
On Wed, Oct 07, 2020 at 04:11:59PM +0200, Tomasz Figa wrote:

> We also need to bring back the vma_open() that somehow disappeared
> around 4.2, as Marek found.

No

Jason
Re: [PATCH 2/2] mm/frame-vec: use FOLL_LONGTERM
On Wed, Oct 07, 2020 at 03:06:17PM +0200, Tomasz Figa wrote:

> Note that vb2_vmalloc is only used for in-kernel CPU usage, e.g. the
> contents being copied by the driver between vb2 buffers and some
> hardware FIFO or other dedicated buffers. The memory does not go to
> any hardware DMA.

That is even worse, the CPU can't just blindly touch VM_IO pages, that
isn't portable.

> Could you elaborate on what "the REQUIRED behavior is"? I can see that
> both follow the get_vaddr_frames() -> frame_vector_to_pages() flow, as
> you mentioned. Perhaps the only change needed is switching to
> pin_user_pages after all?

It is the comment right on top of get_vaddr_frames():

   if @start belongs to VM_IO | VM_PFNMAP vma, we don't touch page
   structures and the caller must make sure pfns aren't reused for
   anything else while he is using them.

Which means excluding every kind of VMA that is not something this
driver understands and then using special knowledge of the
driver-specific VMA to assure the above.

For instance if you could detect the VMA is from a carveout and do
something special like hold the fget() while knowing that the struct
file guarantees the carveout remains reserved - then you could use
follow_pfn. But it would be faster and better to ask the carveout FD
for the vaddr range.

Jason
Re: [PATCH v2 1/7] dt-bindings: display: mxsfb: Convert binding to YAML
On 10/7/20 3:24 AM, Laurent Pinchart wrote:

[...]

> +properties:
> +  compatible:
> +    enum:
> +      - fsl,imx23-lcdif
> +      - fsl,imx28-lcdif
> +      - fsl,imx6sx-lcdif
> +      - fsl,imx8mq-lcdif

There is no fsl,imx8mq-lcdif in drivers/gpu/drm/mxsfb/mxsfb_drv.c, so the DT must specify compatible = "fsl,imx8mq-lcdif", "fsl,imx28-lcdif" (since imx28 is the oldest SoC with LCDIF V4).

Should the compatible be added to drivers/gpu/drm/mxsfb/mxsfb_drv.c, dropped from the YAML file, or neither?

[...]