Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags
Am 13.10.2016 um 02:20 schrieb Lorenzo Stoakes: This patch series adjusts functions in the get_user_pages* family such that desired FOLL_* flags are passed as an argument rather than implied by flags. The purpose of this change is to make the use of FOLL_FORCE explicit so it is easier to grep for and clearer to callers that this flag is being used. The use of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the VMA whose pages we are reading from/writing to, which can result in surprising behaviour. The patch series came out of the discussion around commit 38e0885, which addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE set but having been overridden by FOLL_FORCE. do_numa_page() was run on the assumption the page _must_ be one marked for NUMA node migration as an actual PROT_NONE page would have been dealt with prior to this code path, however FOLL_FORCE introduced a situation where this assumption did not hold. See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal. Lorenzo Stoakes (10): mm: remove write/force parameters from __get_user_pages_locked() mm: remove write/force parameters from __get_user_pages_unlocked() mm: replace get_user_pages_unlocked() write/force parameters with gup_flags mm: replace get_user_pages_locked() write/force parameters with gup_flags mm: replace get_vaddr_frames() write/force parameters with gup_flags mm: replace get_user_pages() write/force parameters with gup_flags mm: replace get_user_pages_remote() write/force parameters with gup_flags mm: replace __access_remote_vm() write parameter with gup_flags mm: replace access_remote_vm() write parameter with gup_flags mm: replace access_process_vm() write parameter with gup_flags Patch number 6 in this series (which touches drivers I co-maintain) is Acked-by: Christian König . In general looks like a very nice cleanup to me, but I'm not enlightened enough to full judge. Regards, Christian. arch/alpha/kernel/ptrace.c | 9 ++-- arch/blackfin/kernel/ptrace.c | 5 ++- arch/cris/arch-v32/drivers/cryptocop.c | 4 +- arch/cris/arch-v32/kernel/ptrace.c | 4 +- arch/ia64/kernel/err_inject.c | 2 +- arch/ia64/kernel/ptrace.c | 14 +++--- arch/m32r/kernel/ptrace.c | 15 --- arch/mips/kernel/ptrace32.c| 5 ++- arch/mips/mm/gup.c | 2 +- arch/powerpc/kernel/ptrace32.c | 5 ++- arch/s390/mm/gup.c | 3 +- arch/score/kernel/ptrace.c | 10 +++-- arch/sh/mm/gup.c | 3 +- arch/sparc/kernel/ptrace_64.c | 24 +++ arch/sparc/mm/gup.c| 3 +- arch/x86/kernel/step.c | 3 +- arch/x86/mm/gup.c | 2 +- arch/x86/mm/mpx.c | 5 +-- arch/x86/um/ptrace_32.c| 3 +- arch/x86/um/ptrace_64.c| 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 7 ++- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 7 ++- drivers/gpu/drm/exynos/exynos_drm_g2d.c| 3 +- drivers/gpu/drm/i915/i915_gem_userptr.c| 6 ++- drivers/gpu/drm/radeon/radeon_ttm.c| 3 +- drivers/gpu/drm/via/via_dmablit.c | 4 +- drivers/infiniband/core/umem.c | 6 ++- drivers/infiniband/core/umem_odp.c | 7 ++- drivers/infiniband/hw/mthca/mthca_memfree.c| 2 +- drivers/infiniband/hw/qib/qib_user_pages.c | 3 +- drivers/infiniband/hw/usnic/usnic_uiom.c | 5 ++- drivers/media/pci/ivtv/ivtv-udma.c | 4 +- drivers/media/pci/ivtv/ivtv-yuv.c | 5 ++- drivers/media/platform/omap/omap_vout.c| 2 +- drivers/media/v4l2-core/videobuf-dma-sg.c | 7 ++- drivers/media/v4l2-core/videobuf2-memops.c | 6 ++- drivers/misc/mic/scif/scif_rma.c | 3 +- drivers/misc/sgi-gru/grufault.c| 2 +- drivers/platform/goldfish/goldfish_pipe.c | 3 +- drivers/rapidio/devices/rio_mport_cdev.c | 3 +- drivers/scsi/st.c | 5 +-- .../interface/vchiq_arm/vchiq_2835_arm.c | 3 +- .../vc04_services/interface/vchiq_arm/vchiq_arm.c | 3 +- drivers/video/fbdev/pvr2fb.c | 4 +- drivers/virt/fsl_hypervisor.c | 4 +- fs/exec.c | 9 +++- fs/proc/base.c | 19 +--- include/linux/mm.h
Re: consolidate swiotlb dma_map implementations
Acked-by: Christian König for the whole series. Regards, Christian. Am 10.01.2018 um 09:09 schrieb Christoph Hellwig: A lot of architectures have essentially identical dma_map_ops implementations to use swiotlb. This series adds new generic swiotlb_alloc/free helpers that take the attrs argument exposed in dma_map_ops, and which do an enhanced direct allocation modelled after x86 and reused from the dma-direct code, and then switches most architectures over to it. The only exceptions are mips, which requires additional cache flushing which will need a new abstraction, and x86 itself which will be handled in a later series with other x86 dma mapping changes. To support the generic code a few architectures that currently use ZONE_DMA/GFP_DMA for <= 32-bit allocations are switched to implement ZONE_DMA32 instead. This series is based on the previously sent series to consolidate the direct dma mapping implementation. A git tree with this series as well as the prerequisites is available here: git://git.infradead.org/users/hch/misc.git swiotlb Gitweb: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/swiotlb
Re: consolidate swiotlb dma_map implementations
Hi Konrad, can you send the first patch to Linus for inclusion in 4.15 if you haven't already done so? I'm still getting reports from people complaining about the error message. Thanks, Christian. Am 16.01.2018 um 08:53 schrieb Christoph Hellwig: I've pulled this into the dma-mapping for-next tree, including the missing free_pages noted. I'd be fine to rebase another day or two for additional reviews or important fixes.
Re: consolidate swiotlb dma_map implementations
Am 16.01.2018 um 09:28 schrieb Christoph Hellwig: On Tue, Jan 16, 2018 at 09:22:52AM +0100, Christian König wrote: Hi Konrad, can you send the first patch to Linus for inclusion in 4.15 if you haven't already done so? It's in the 4.16 queue with a cc to stable. I guess we're ok with a duplicate commit if we have to. Yeah, while it's only a false positive warning it would be really nice to have in 4.15. It affects all drivers using TTM in the system and not just the two I'm the maintainer of. Regards, Christian.
Re: [patch net-next 0/3] net/sched: Improve getting objects by indexes
Am 16.08.2017 um 04:12 schrieb Chris Mi: Using current TC code, it is very slow to insert a lot of rules. In order to improve the rules update rate in TC, we introduced the following two changes: 1) changed cls_flower to use IDR to manage the filters. 2) changed all act_xxx modules to use IDR instead of a small hash table But IDR has a limitation that it uses int. TC handle uses u32. To make sure there is no regression, we also changed IDR to use unsigned long. All clients of IDR are changed to use new IDR API. WOW, wait a second. The idr change is touching a lot of drivers and to be honest doesn't looks correct at all. Just look at the first chunk of your modification: @@ -998,8 +999,9 @@ int bsg_register_queue(struct request_queue *q, struct device *parent, mutex_lock(&bsg_mutex); - ret = idr_alloc(&bsg_minor_idr, bcd, 0, BSG_MAX_DEVS, GFP_KERNEL); - if (ret < 0) { + ret = idr_alloc(&bsg_minor_idr, bcd, &idr_index, 0, BSG_MAX_DEVS, + GFP_KERNEL); + if (ret) { if (ret == -ENOSPC) { printk(KERN_ERR "bsg: too many bsg devices\n"); ret = -EINVAL; The condition "if (ret)" will now always be true after the first allocation and so we always run into the error handling after that. I've never read the bsg code before, but that's certainly not correct. And that incorrect pattern repeats over and over again in this code. Apart from that why the heck do you want to allocate more than 1<<31 handles? Regards, Christian. Chris Mi (3): idr: Use unsigned long instead of int net/sched: Change cls_flower to use IDR net/sched: Change act_api and act_xxx modules to use IDR block/bsg.c | 8 +- block/genhd.c | 12 +- drivers/atm/nicstar.c | 11 +- drivers/block/drbd/drbd_main.c | 31 +-- drivers/block/drbd/drbd_nl.c| 22 ++- drivers/block/drbd/drbd_proc.c | 3 +- drivers/block/drbd/drbd_receiver.c | 15 +- drivers/block/drbd/drbd_state.c | 34 ++-- drivers/block/drbd/drbd_worker.c| 6 +- drivers/block/loop.c| 17 +- drivers/block/nbd.c | 20 +- drivers/block/zram/zram_drv.c | 9 +- drivers/char/tpm/tpm-chip.c | 10 +- drivers/char/tpm/tpm.h | 2 +- drivers/dca/dca-sysfs.c | 9 +- drivers/firewire/core-cdev.c| 18 +- drivers/firewire/core-device.c | 15 +- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 9 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 2 +- drivers/gpu/drm/drm_auth.c | 9 +- drivers/gpu/drm/drm_connector.c | 10 +- drivers/gpu/drm/drm_context.c | 20 +- drivers/gpu/drm/drm_dp_aux_dev.c| 11 +- drivers/gpu/drm/drm_drv.c | 6 +- drivers/gpu/drm/drm_gem.c | 19 +- drivers/gpu/drm/drm_info.c | 2 +- drivers/gpu/drm/drm_mode_object.c | 11 +- drivers/gpu/drm/drm_syncobj.c | 18 +- drivers/gpu/drm/exynos/exynos_drm_ipp.c | 25 ++- drivers/gpu/drm/i915/gvt/display.c | 2 +- drivers/gpu/drm/i915/gvt/kvmgt.c| 2 +- drivers/gpu/drm/i915/gvt/vgpu.c | 9 +- drivers/gpu/drm/i915/i915_debugfs.c | 6 +- drivers/gpu/drm/i915/i915_gem_context.c | 9 +- drivers/gpu/drm/qxl/qxl_cmd.c | 8 +- drivers/gpu/drm/qxl/qxl_release.c | 14 +- drivers/gpu/drm/sis/sis_mm.c| 8 +- drivers/gpu/drm/tegra/drm.c | 10 +- drivers/gpu/drm/tilcdc/tilcdc_slave_compat.c| 3 +- drivers/gpu/drm/vgem/vgem_fence.c | 12 +- drivers/gpu/drm/via/via_mm.c| 8 +- drivers/gpu/drm/virtio/virtgpu_kms.c| 5 +- drivers/gpu/drm/virtio/virtgpu_vq.c | 5 +- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c| 9 +- drivers/i2c/i2c-core-base.c | 19 +- drivers/infiniband/core/cm.c| 8 +- drivers/infiniband/core/cma.c | 12 +- drivers/infiniband/core/rdma_core.c | 9 +- drivers/infiniband/core/sa_query.c | 23 +-- drivers/infiniband/core/ucm.c | 7 +- drivers/infiniband/core/ucma.c | 14 +- drivers/infiniband/hw/cxgb3/iwch.c | 4 +- drivers/infiniband/hw/cxgb3/iwch.h
Re: [PATCH V1 10/10] drm: Remove drm specific kmap_atomic code
Am 30.04.20 um 22:38 schrieb ira.we...@intel.com: From: Ira Weiny kmap_atomic_prot() is now exported by all architectures. Use this function rather than open coding a driver specific kmap_atomic. Signed-off-by: Ira Weiny Ah, yes looking into this once more this was on my TODO list for quite a while as well. Patch is Reviewed-by: Christian König , feel free to push it upstream through whatever channel you like or ping me if I should pick it up into drm-misc-next. Regards, Christian. --- drivers/gpu/drm/ttm/ttm_bo_util.c| 56 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_blit.c | 16 include/drm/ttm/ttm_bo_api.h | 4 -- 3 files changed, 12 insertions(+), 64 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index 52d2b71f1588..f09b096ba4fd 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -257,54 +257,6 @@ static int ttm_copy_io_page(void *dst, void *src, unsigned long page) return 0; } -#ifdef CONFIG_X86 -#define __ttm_kmap_atomic_prot(__page, __prot) kmap_atomic_prot(__page, __prot) -#define __ttm_kunmap_atomic(__addr) kunmap_atomic(__addr) -#else -#define __ttm_kmap_atomic_prot(__page, __prot) vmap(&__page, 1, 0, __prot) -#define __ttm_kunmap_atomic(__addr) vunmap(__addr) -#endif - - -/** - * ttm_kmap_atomic_prot - Efficient kernel map of a single page with - * specified page protection. - * - * @page: The page to map. - * @prot: The page protection. - * - * This function maps a TTM page using the kmap_atomic api if available, - * otherwise falls back to vmap. The user must make sure that the - * specified page does not have an aliased mapping with a different caching - * policy unless the architecture explicitly allows it. Also mapping and - * unmapping using this api must be correctly nested. Unmapping should - * occur in the reverse order of mapping. - */ -void *ttm_kmap_atomic_prot(struct page *page, pgprot_t prot) -{ - if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) - return kmap_atomic(page); - else - return __ttm_kmap_atomic_prot(page, prot); -} -EXPORT_SYMBOL(ttm_kmap_atomic_prot); - -/** - * ttm_kunmap_atomic_prot - Unmap a page that was mapped using - * ttm_kmap_atomic_prot. - * - * @addr: The virtual address from the map. - * @prot: The page protection. - */ -void ttm_kunmap_atomic_prot(void *addr, pgprot_t prot) -{ - if (pgprot_val(prot) == pgprot_val(PAGE_KERNEL)) - kunmap_atomic(addr); - else - __ttm_kunmap_atomic(addr); -} -EXPORT_SYMBOL(ttm_kunmap_atomic_prot); - static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src, unsigned long page, pgprot_t prot) @@ -316,13 +268,13 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src, return -ENOMEM; src = (void *)((unsigned long)src + (page << PAGE_SHIFT)); - dst = ttm_kmap_atomic_prot(d, prot); + dst = kmap_atomic_prot(d, prot); if (!dst) return -ENOMEM; memcpy_fromio(dst, src, PAGE_SIZE); - ttm_kunmap_atomic_prot(dst, prot); + kunmap_atomic(dst); return 0; } @@ -338,13 +290,13 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst, return -ENOMEM; dst = (void *)((unsigned long)dst + (page << PAGE_SHIFT)); - src = ttm_kmap_atomic_prot(s, prot); + src = kmap_atomic_prot(s, prot); if (!src) return -ENOMEM; memcpy_toio(dst, src, PAGE_SIZE); - ttm_kunmap_atomic_prot(src, prot); + kunmap_atomic(src); return 0; } diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c index bb46ca0c458f..94d456a1d1a9 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c @@ -374,12 +374,12 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset); if (unmap_src) { - ttm_kunmap_atomic_prot(d->src_addr, d->src_prot); + kunmap_atomic(d->src_addr); d->src_addr = NULL; } if (unmap_dst) { - ttm_kunmap_atomic_prot(d->dst_addr, d->dst_prot); + kunmap_atomic(d->dst_addr); d->dst_addr = NULL; } @@ -388,8 +388,8 @@ static int vmw_bo_cpu_blit_line(struct vmw_bo_blit_line_data *d, return -EINVAL; d->dst_addr = - ttm_kmap_atomic_prot(d->dst_pages[dst_page], -d->dst_prot); + kmap_atomic_prot(d->dst_pages[dst_page], +
Re: [PATCH v1 12/15] powerpc/uaccess: Refactor get/put_user() and __get/put_user()
The radeon warning is trivial to fix, going to send out a patch in a few moments. Regards, Christian. Am 08.03.21 um 13:14 schrieb Christophe Leroy: +Evgeniy for W1 Dallas +Alex & Christian for RADEON Le 07/03/2021 à 11:23, kernel test robot a écrit : Hi Christophe, I love your patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on v5.12-rc2 next-20210305] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-Cleanup-of-uaccess-h/20210226-015715 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-randconfig-s031-20210307 (attached as .config) compiler: powerpc-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.3-245-gacc5c298-dirty # https://github.com/0day-ci/linux/commit/449bdbf978936e67e4919be8be0eec3e490a65e2 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Christophe-Leroy/powerpc-Cleanup-of-uaccess-h/20210226-015715 git checkout 449bdbf978936e67e4919be8be0eec3e490a65e2 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot The mentioned patch is not the source of the problem, it only allows to spot it. Christophe "sparse warnings: (new ones prefixed by >>)" drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected char [noderef] __user *_pu_addr @@ got char *buf @@ drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: expected char [noderef] __user *_pu_addr drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: got char *buf drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected char const [noderef] __user *_gu_addr @@ got char const *buf @@ drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: expected char const [noderef] __user *_gu_addr drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: got char const *buf -- drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast removes address space '__user' of expression drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast removes address space '__user' of expression drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected unsigned int [noderef] __user *_pu_addr @@ got unsigned int [usertype] * @@ drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: expected unsigned int [noderef] __user *_pu_addr drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: got unsigned int [usertype] * drivers/gpu/drm/radeon/radeon_ttm.c:933:21: sparse: sparse: cast removes address space '__user' of expression vim +342 drivers/w1/slaves/w1_ds28e04.c fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21 338 fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21 339 static ssize_t crccheck_show(struct device *dev, struct device_attribute *attr, fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21 340 char *buf) fbf7f7b4e2ae40 Markus Franke 2012-05-26 341 { fbf7f7b4e2ae40 Markus Franke 2012-05-26 @342 if (put_user(w1_enable_crccheck + 0x30, buf)) fbf7f7b4e2ae40 Markus Franke 2012-05-26 343 return -EFAULT; fbf7f7b4e2ae40 Markus Franke 2012-05-26 344 fbf7f7b4e2ae40 Markus Franke 2012-05-26 345 return sizeof(w1_enable_crccheck); fbf7f7b4e2ae40 Markus Franke 2012-05-26 346 } fbf7f7b4e2ae40 Markus Franke 2012-05-26 347 fa33a65a9cf7e2 Greg Kroah-Hartman 2013-08-21 348 static ssize_t crccheck_store(struct device *dev, struct device_attribute *attr, fbf7f7b4e2ae40 Markus Franke 2012-05-26 349 const char *buf, size_t count) fbf7f7b4e2ae40 Markus Franke 2012-05-26 350 { fbf7f7b4e2ae40 Markus Franke 2012-05-26 351 char val; fbf7f7b4e2ae40 Markus Franke 2012-05-26 352 fbf7f7b4e2ae40 Markus Franke 2012-05-26 353 if (count != 1 || !buf) fbf7f7b4e2ae40 Markus Franke 2012-05-26 354 return -EINVAL; fbf7f7b4e2ae40 Markus Franke 2012-05-26 355 fbf7f7b4e2ae40 Markus Franke 2012-05-26 @356 if (get_user(val, buf)) fbf7f7b4e2ae40 Markus Franke 2012-05-26 357 return -EFAULT; fbf7f7b4e2ae40 Markus Franke 2012-05-26 358 fbf7f7b4e2ae40 Markus Franke
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
Am 28.02.22 um 12:08 schrieb Jakob Koschel: If the list does not contain the expected element, the value of list_for_each_entry() iterator will not point to a valid structure. To avoid type confusion in such case, the list iterator scope will be limited to list_for_each_entry() loop. We explicitly have the list_entry_is_head() macro to test after a loop if the element pointer points to the head of the list instead of a valid list entry. So at least from my side I absolutely don't think that this is a good idea. In preparation to limiting scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element. Determining if an element was found is then simply checking if the pointer is != NULL. Since when do we actually want to do this? Take this code here as an example: diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c index 48afe96ae0f0..6c916416decc 100644 --- a/arch/x86/kernel/cpu/sgx/encl.c +++ b/arch/x86/kernel/cpu/sgx/encl.c @@ -450,7 +450,8 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) { struct sgx_encl_mm *encl_mm = container_of(mn, struct sgx_encl_mm, mmu_notifier); - struct sgx_encl_mm *tmp = NULL; + struct sgx_encl_mm *found_encl_mm = NULL; + struct sgx_encl_mm *tmp; /* * The enclave itself can remove encl_mm. Note, objects can't be moved @@ -460,12 +461,13 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn, list_for_each_entry(tmp, &encl_mm->encl->mm_list, list) { if (tmp == encl_mm) { list_del_rcu(&encl_mm->list); + found_encl_mm = tmp; break; } } spin_unlock(&encl_mm->encl->mm_lock); - if (tmp == encl_mm) { + if (found_encl_mm) { synchronize_srcu(&encl_mm->encl->srcu); mmu_notifier_put(mn); } I don't think that using the extra variable makes the code in any way more reliable or easier to read. Regards, Christian.
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
Am 28.02.22 um 20:56 schrieb Linus Torvalds: On Mon, Feb 28, 2022 at 4:19 AM Christian König wrote: I don't think that using the extra variable makes the code in any way more reliable or easier to read. So I think the next step is to do the attached patch (which requires that "-std=gnu11" that was discussed in the original thread). That will guarantee that the 'pos' parameter of list_for_each_entry() is only updated INSIDE the for_each_list_entry() loop, and can never point to the (wrongly typed) head entry. And I would actually hope that it should actually cause compiler warnings about possibly uninitialized variables if people then use the 'pos' pointer outside the loop. Except (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for inexplicable reasons - possibly because it already expected this behavior (b) when I remove that NULL initializer, I still don't get a warning, because we've disabled -Wno-maybe-uninitialized since it results in so many false positives. Oh well. Anyway, give this patch a look, and at least if it's expanded to do "(pos) = NULL" in the entry statement for the for-loop, it will avoid the HEAD type confusion that Jakob is working on. And I think in a cleaner way than the horrid games he plays. (But it won't avoid possible CPU speculation of such type confusion. That, in my opinion, is a completely different issue) Yes, completely agree. I do wish we could actually poison the 'pos' value after the loop somehow - but clearly the "might be uninitialized" I was hoping for isn't the way to do it. Anybody have any ideas? I think we should look at the use cases why code is touching (pos) after the loop. Just from skimming over the patches to change this and experience with the drivers/subsystems I help to maintain I think the primary pattern looks something like this: list_for_each_entry(entry, head, member) { if (some_condition_checking(entry)) break; } do_something_with(entry); So the solution should probably not be to change all those use cases to use more temporary variables, but rather to add a list_find_entry(..., condition) macro and consistently use that one instead. Regards, Christian. Linus
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
Am 28.02.22 um 21:42 schrieb James Bottomley: On Mon, 2022-02-28 at 21:07 +0100, Christian König wrote: Am 28.02.22 um 20:56 schrieb Linus Torvalds: On Mon, Feb 28, 2022 at 4:19 AM Christian König wrote: [SNIP] Anybody have any ideas? I think we should look at the use cases why code is touching (pos) after the loop. Just from skimming over the patches to change this and experience with the drivers/subsystems I help to maintain I think the primary pattern looks something like this: list_for_each_entry(entry, head, member) { if (some_condition_checking(entry)) break; } do_something_with(entry); Actually, we usually have a check to see if the loop found anything, but in that case it should something like if (list_entry_is_head(entry, head, member)) { return with error; } do_somethin_with(entry); Suffice? The list_entry_is_head() macro is designed to cope with the bogus entry on head problem. That will work and is also what people already do. The key problem is that we let people do the same thing over and over again with slightly different implementations. Out in the wild I've seen at least using a separate variable, using a bool to indicate that something was found and just assuming that the list has an entry. The last case is bogus and basically what can break badly. If we would have an unified macro which search for an entry combined with automated reporting on patches to use that macro I think the potential to introduce such issues will already go down massively without auditing tons of existing code. Regards, Christian. James
Re: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr
Am 28.02.22 um 22:13 schrieb James Bottomley: On Mon, 2022-02-28 at 21:56 +0100, Christian König wrote: Am 28.02.22 um 21:42 schrieb James Bottomley: On Mon, 2022-02-28 at 21:07 +0100, Christian König wrote: Am 28.02.22 um 20:56 schrieb Linus Torvalds: On Mon, Feb 28, 2022 at 4:19 AM Christian König wrote: [SNIP] Anybody have any ideas? I think we should look at the use cases why code is touching (pos) after the loop. Just from skimming over the patches to change this and experience with the drivers/subsystems I help to maintain I think the primary pattern looks something like this: list_for_each_entry(entry, head, member) { if (some_condition_checking(entry)) break; } do_something_with(entry); Actually, we usually have a check to see if the loop found anything, but in that case it should something like if (list_entry_is_head(entry, head, member)) { return with error; } do_somethin_with(entry); Suffice? The list_entry_is_head() macro is designed to cope with the bogus entry on head problem. That will work and is also what people already do. The key problem is that we let people do the same thing over and over again with slightly different implementations. Out in the wild I've seen at least using a separate variable, using a bool to indicate that something was found and just assuming that the list has an entry. The last case is bogus and basically what can break badly. Yes, I understand that. I'm saying we should replace that bogus checks of entry->something against some_value loop termination condition with the list_entry_is_head() macro. That should be a one line and fairly mechanical change rather than the explosion of code changes we seem to have in the patch series. Yes, exactly that's my thinking as well. Christian. James
Re: [RFC 0/2] Add generic FPU api similar to x86
While you already CCed a bunch of people stuff like that needs to go to the appropriate mailing list and not just amd-gfx. Especially LKML so that other core devs can take a look as well. Regards, Christian. Am 19.07.21 um 21:52 schrieb Anson Jacob: This is an attempt to have generic FPU enable/disable calls similar to x86. So that we can simplify gpu/drm/amd/display/dc/os_types.h Also adds FPU correctness logic seen in x86. Anson Jacob (2): ppc/fpu: Add generic FPU api similar to x86 drm/amd/display: Use PPC FPU functions arch/powerpc/include/asm/switch_to.h | 29 ++--- arch/powerpc/kernel/process.c | 130 ++ drivers/gpu/drm/amd/display/dc/os_types.h | 28 + 3 files changed, 139 insertions(+), 48 deletions(-)
Re: [RFC v2 2/2] drm/amd/display: Use PPC FPU functions
Am 21.07.21 um 06:48 schrieb Anson Jacob: Use kernel_fpu_begin & kernel_fpu_end for PPC Depends on "ppc/fpu: Add generic FPU api similar to x86" v2: - Got rid of macro switch for PPC as header file with same name as x86 is added by previous patch in the series Signed-off-by: Anson Jacob CC: Christoph Hellwig CC: Rodrigo Siqueira CC: Harry Wentland CC: Christian König Looks good in general, but question is what about other architectures like ARM? Regards, Christian. --- drivers/gpu/drm/amd/display/dc/os_types.h | 29 --- 1 file changed, 29 deletions(-) diff --git a/drivers/gpu/drm/amd/display/dc/os_types.h b/drivers/gpu/drm/amd/display/dc/os_types.h index 126c2f3a4dd3..47ef434f93d8 100644 --- a/drivers/gpu/drm/amd/display/dc/os_types.h +++ b/drivers/gpu/drm/amd/display/dc/os_types.h @@ -51,38 +51,9 @@ #define dm_error(fmt, ...) DRM_ERROR(fmt, ##__VA_ARGS__) #if defined(CONFIG_DRM_AMD_DC_DCN) -#if defined(CONFIG_X86) #include #define DC_FP_START() kernel_fpu_begin() #define DC_FP_END() kernel_fpu_end() -#elif defined(CONFIG_PPC64) -#include -#include -#define DC_FP_START() { \ - if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ - preempt_disable(); \ - enable_kernel_vsx(); \ - } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ - preempt_disable(); \ - enable_kernel_altivec(); \ - } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ - preempt_disable(); \ - enable_kernel_fp(); \ - } \ -} -#define DC_FP_END() { \ - if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ - disable_kernel_vsx(); \ - preempt_enable(); \ - } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ - disable_kernel_altivec(); \ - preempt_enable(); \ - } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ - disable_kernel_fp(); \ - preempt_enable(); \ - } \ -} -#endif #endif /*
Re: [RFC v2 2/2] drm/amd/display: Use PPC FPU functions
Am 21.07.21 um 08:51 schrieb Christoph Hellwig: On Wed, Jul 21, 2021 at 08:29:43AM +0200, Christian K??nig wrote: Looks good in general, but question is what about other architectures like ARM? DRM_AMD_DC_DCN currently requires X86 || PPC64. And exactly that's the problem I'm noting here. At least officially AMD claims that we support ARM and some very brave still use the hardware together with MIPS as well. Maybe a good think would be to add a new KERNEL_FPU_API Kconfig symbol, selected by x86 and powerpc (I think ppc32 should be fine too now) so that we get these arch dependencies out of the driver. Good idea. Christian.
Re: [PATCH 00/11] Implement generic prot_guest_has() helper function
Am 28.07.21 um 00:26 schrieb Tom Lendacky: This patch series provides a generic helper function, prot_guest_has(), to replace the sme_active(), sev_active(), sev_es_active() and mem_encrypt_active() functions. It is expected that as new protected virtualization technologies are added to the kernel, they can all be covered by a single function call instead of a collection of specific function calls all called from the same locations. The powerpc and s390 patches have been compile tested only. Can the folks copied on this series verify that nothing breaks for them. As GPU driver dev I'm only one end user of this, but at least from the high level point of view that makes totally sense to me. Feel free to add an Acked-by: Christian König . We could run that through the AMD GPU unit tests, but I fear we actually don't test on a system with SEV/SME active. Going to raise that on our team call today. Regards, Christian. Cc: Andi Kleen Cc: Andy Lutomirski Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christian Borntraeger Cc: Daniel Vetter Cc: Dave Hansen Cc: Dave Young Cc: David Airlie Cc: Heiko Carstens Cc: Ingo Molnar Cc: Joerg Roedel Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Michael Ellerman Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Thomas Zimmermann Cc: Vasily Gorbik Cc: VMware Graphics Cc: Will Deacon --- Patches based on: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master commit 79e920060fa7 ("Merge branch 'WIP/fixes'") Tom Lendacky (11): mm: Introduce a function to check for virtualization protection features x86/sev: Add an x86 version of prot_guest_has() powerpc/pseries/svm: Add a powerpc version of prot_guest_has() x86/sme: Replace occurrences of sme_active() with prot_guest_has() x86/sev: Replace occurrences of sev_active() with prot_guest_has() x86/sev: Replace occurrences of sev_es_active() with prot_guest_has() treewide: Replace the use of mem_encrypt_active() with prot_guest_has() mm: Remove the now unused mem_encrypt_active() function x86/sev: Remove the now unused mem_encrypt_active() function powerpc/pseries/svm: Remove the now unused mem_encrypt_active() function s390/mm: Remove the now unused mem_encrypt_active() function arch/Kconfig | 3 ++ arch/powerpc/include/asm/mem_encrypt.h | 5 -- arch/powerpc/include/asm/protected_guest.h | 30 +++ arch/powerpc/platforms/pseries/Kconfig | 1 + arch/s390/include/asm/mem_encrypt.h| 2 - arch/x86/Kconfig | 1 + arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 13 + arch/x86/include/asm/protected_guest.h | 27 ++ arch/x86/kernel/crash_dump_64.c| 4 +- arch/x86/kernel/head64.c | 4 +- arch/x86/kernel/kvm.c | 3 +- arch/x86/kernel/kvmclock.c | 4 +- arch/x86/kernel/machine_kexec_64.c | 19 +++ arch/x86/kernel/pci-swiotlb.c | 9 ++-- arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/kernel/sev.c | 6 +-- arch/x86/kvm/svm/svm.c | 3 +- arch/x86/mm/ioremap.c | 16 +++--- arch/x86/mm/mem_encrypt.c | 60 +++--- arch/x86/mm/mem_encrypt_identity.c | 3 +- arch/x86/mm/pat/set_memory.c | 3 +- arch/x86/platform/efi/efi_64.c | 9 ++-- arch/x86/realmode/init.c | 8 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c| 4 +- drivers/gpu/drm/drm_cache.c| 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c| 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c| 6 +-- drivers/iommu/amd/init.c | 7 +-- drivers/iommu/amd/iommu.c | 3 +- drivers/iommu/amd/iommu_v2.c | 3 +- drivers/iommu/iommu.c | 3 +- fs/proc/vmcore.c | 6 +-- include/linux/mem_encrypt.h| 4 -- include/linux/protected_guest.h| 37 + kernel/dma/swiotlb.c | 4 +- 36 files changed, 218 insertions(+), 104 deletions(-) create mode 100644 arch/powerpc/include/asm/protected_guest.h create mode 100644 arch/x86/include/asm/protected_guest.h create mode 100644 include/linux/protected_guest.h
Re: [PATCH v4 00/15] Unified cross-architecture kernel-mode FPU API
I only skimmed over the platform patches and spend only a few minutes on the amdgpu stuff. From what I've seen this series seems to make perfect sense to me, I just can't fully judge everything. So feel free to add Acked-by: Christian König but I strongly suggest that Harry and Rodrigo take a look as well. Regards, Christian. Am 29.03.24 um 08:18 schrieb Samuel Holland: This series unifies the kernel-mode FPU API across several architectures by wrapping the existing functions (where needed) in consistently-named functions placed in a consistent header location, with mostly the same semantics: they can be called from preemptible or non-preemptible task context, and are not assumed to be reentrant. Architectures are also expected to provide CFLAGS adjustments for compiling FPU-dependent code. For the moment, SIMD/vector units are out of scope for this common API. This allows us to remove the ifdeffery and duplicated Makefile logic at each FPU user. It then implements the common API on RISC-V, and converts a couple of users to the new API: the AMDGPU DRM driver, and the FPU self test. The underlying goal of this series is to allow using newer AMD GPUs (e.g. Navi) on RISC-V boards such as SiFive's HiFive Unmatched. Those GPUs need CONFIG_DRM_AMD_DC_FP to initialize, which requires kernel-mode FPU support. Previous versions: v3: https://lore.kernel.org/linux-kernel/20240327200157.1097089-1-samuel.holl...@sifive.com/ v2: https://lore.kernel.org/linux-kernel/20231228014220.3562640-1-samuel.holl...@sifive.com/ v1: https://lore.kernel.org/linux-kernel/20231208055501.2916202-1-samuel.holl...@sifive.com/ v0: https://lore.kernel.org/linux-kernel/20231122030621.3759313-1-samuel.holl...@sifive.com/ Changes in v4: - Add missed CFLAGS changes for recov_neon_inner.c (fixes arm build failures) - Fix x86 include guard issue (fixes x86 build failures) Changes in v3: - Rebase on v6.9-rc1 - Limit riscv ARCH_HAS_KERNEL_FPU_SUPPORT to 64BIT Changes in v2: - Add documentation explaining the built-time and runtime APIs - Add a linux/fpu.h header for generic isolation enforcement - Remove file name from header comment - Clean up arch/arm64/lib/Makefile, like for arch/arm - Remove RISC-V architecture-specific preprocessor check - Split altivec removal to a separate patch - Use linux/fpu.h instead of asm/fpu.h in consumers - Declare test_fpu() in a header Michael Ellerman (1): drm/amd/display: Only use hard-float, not altivec on powerpc Samuel Holland (14): arch: Add ARCH_HAS_KERNEL_FPU_SUPPORT ARM: Implement ARCH_HAS_KERNEL_FPU_SUPPORT ARM: crypto: Use CC_FLAGS_FPU for NEON CFLAGS arm64: Implement ARCH_HAS_KERNEL_FPU_SUPPORT arm64: crypto: Use CC_FLAGS_FPU for NEON CFLAGS lib/raid6: Use CC_FLAGS_FPU for NEON CFLAGS LoongArch: Implement ARCH_HAS_KERNEL_FPU_SUPPORT powerpc: Implement ARCH_HAS_KERNEL_FPU_SUPPORT x86/fpu: Fix asm/fpu/types.h include guard x86: Implement ARCH_HAS_KERNEL_FPU_SUPPORT riscv: Add support for kernel-mode FPU drm/amd/display: Use ARCH_HAS_KERNEL_FPU_SUPPORT selftests/fpu: Move FP code to a separate translation unit selftests/fpu: Allow building on other architectures Documentation/core-api/floating-point.rst | 78 +++ Documentation/core-api/index.rst | 1 + Makefile | 5 ++ arch/Kconfig | 6 ++ arch/arm/Kconfig | 1 + arch/arm/Makefile | 7 ++ arch/arm/include/asm/fpu.h| 15 arch/arm/lib/Makefile | 3 +- arch/arm64/Kconfig| 1 + arch/arm64/Makefile | 9 ++- arch/arm64/include/asm/fpu.h | 15 arch/arm64/lib/Makefile | 6 +- arch/loongarch/Kconfig| 1 + arch/loongarch/Makefile | 5 +- arch/loongarch/include/asm/fpu.h | 1 + arch/powerpc/Kconfig | 1 + arch/powerpc/Makefile | 5 +- arch/powerpc/include/asm/fpu.h| 28 +++ arch/riscv/Kconfig| 1 + arch/riscv/Makefile | 3 + arch/riscv/include/asm/fpu.h | 16 arch/riscv/kernel/Makefile| 1 + arch/riscv/kernel/kernel_mode_fpu.c | 28 +++ arch/x86/Kconfig | 1 + arch/x86/Makefile | 20 + arch/x86/include/asm/fpu.h| 13 arch/x86/include/asm/fpu/types.h | 6 +- drivers/gpu/drm/amd/display/Kconfig | 2 +- .../gpu/drm/amd/display/amdgpu_dm/dc_fpu.c| 35 + drivers/gpu/drm/amd/display/dc/dml/Makefile | 36 + drivers/gpu/drm/amd/display/dc/dml2