from:"Greg KH"

Re: [linux-next:master] BUILD REGRESSION 088b9c375534d905a4d337c78db3b3bfbb52c4a0

2022-07-07 Thread Greg KH

On Thu, Jul 07, 2022 at 02:56:34PM +0800, kernel test robot wrote:
> tree/branch: 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> branch HEAD: 088b9c375534d905a4d337c78db3b3bfbb52c4a0  Add linux-next 
> specific files for 20220706
> 
> Error/Warning reports:
> 
> https://lore.kernel.org/linux-doc/202207070644.x48xoovs-...@intel.com
> 
> Error/Warning: (recently discovered and may have been fixed)
> 
> Documentation/arm/google/chromebook-boot-flow.rst: WARNING: document isn't 
> included in any toctree
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1108): undefined reference to 
> `__aeabi_ddiv'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1124): undefined reference to 
> `__aeabi_ui2d'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1164): undefined reference to 
> `__aeabi_dmul'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1170): undefined reference to 
> `__aeabi_dadd'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1180): undefined reference to 
> `__aeabi_dsub'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x1190): undefined reference to 
> `__aeabi_d2uiz'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x162c): undefined reference to 
> `__aeabi_d2iz'
> arm-linux-gnueabi-ld: dc_dmub_srv.c:(.text+0x16b0): undefined reference to 
> `__aeabi_i2d'
> dc_dmub_srv.c:(.text+0x10f8): undefined reference to `__aeabi_ui2d'
> dc_dmub_srv.c:(.text+0x464): undefined reference to `__floatunsidf'
> dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x33c): undefined 
> reference to `__floatunsidf'
> drivers/pci/endpoint/functions/pci-epf-vntb.c:975:5: warning: no previous 
> prototype for 'pci_read' [-Wmissing-prototypes]
> drivers/pci/endpoint/functions/pci-epf-vntb.c:984:5: warning: no previous 
> prototype for 'pci_write' [-Wmissing-prototypes]
> drivers/vfio/vfio_iommu_type1.c:2141:35: warning: cast to smaller integer 
> type 'enum iommu_cap' from 'void *' [-Wvoid-pointer-to-enum-cast]
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x34c): 
> undefined reference to `__floatunsidf'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x378): 
> undefined reference to `__divdf3'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x38c): 
> undefined reference to `__muldf3'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x3a0): 
> undefined reference to `__adddf3'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x3b4): 
> undefined reference to `__subdf3'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x3d4): 
> undefined reference to `__fixunsdfsi'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x750): 
> undefined reference to `__fixdfsi'
> mips-linux-ld: dc_dmub_srv.c:(.text.dc_dmub_setup_subvp_dmub_command+0x7c0): 
> undefined reference to `__floatsidf'
> powerpc-linux-ld: drivers/pci/endpoint/functions/pci-epf-vntb.c:174: 
> undefined reference to `ntb_link_event'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x468): undefined reference to 
> `__divdf3'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x46c): undefined reference to 
> `__muldf3'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x470): undefined reference to 
> `__adddf3'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x474): undefined reference to 
> `__subdf3'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x478): undefined reference to 
> `__fixunsdfsi'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x47c): undefined reference to 
> `__fixdfsi'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x480): undefined reference to 
> `__floatsidf'
> xtensa-linux-ld: dc_dmub_srv.c:(.text+0x60c): undefined reference to 
> `__floatunsidf'
> 
> Unverified Error/Warning (likely false positive, please contact us if 
> interested):
> 
> arch/x86/events/core.c:2114 init_hw_perf_events() warn: missing error code 
> 'err'
> drivers/android/binder.c:1481:19-23: ERROR: from is NULL but dereferenced.
> drivers/android/binder.c:2920:29-33: ERROR: target_thread is NULL but 
> dereferenced.
> drivers/android/binder.c:353:25-35: ERROR: node -> proc is NULL but 
> dereferenced.
> drivers/android/binder.c:4888:16-20: ERROR: t is NULL but dereferenced.
> drivers/base/regmap/regmap.c:1996:1: internal compiler error: in arc_ifcvt, 
> at config/arc/arc.c:9637
> drivers/char/random.c:869:1: internal compiler error: in arc_ifcvt, at 
> config/arc/arc.c:9637
> drivers/firmware/arm_scmi/clock.c:394:1: internal compiler error: in 
> arc_ifcvt, at config/arc/arc.c:9637
> drivers/firmware/arm_scmi/powercap.c:376:1: internal compiler error: in 
> arc_ifcvt, at config/arc/arc.c:9637
> drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/vega10_powertune.c:1214:1: 
> internal compiler error: in arc_ifcvt, at config/arc/arc.c:9637
> drivers/gpu/drm/amd/display/dc/os_types.h: drm/drm_print.h is included more 
> than once.
> drivers/gpu/drm/bridge/ite-it66121.c:1398:1: internal compiler error: in 
> arc_ifcvt, at config/arc/arc.c:9637
> drivers/greybus/ope

Re: [PATCH v9 0/8] Add support for HiSilicon PCIe Tune and Trace device

2022-06-27 Thread Greg KH

On Mon, Jun 27, 2022 at 07:18:12PM +0800, Yicong Yang wrote:
> Hi Greg,
> 
> Since the kernel side of this device has been reviewed for 8 versions with
> all comments addressed and no more comment since v9 posted in 5.19-rc1,
> is it ok to merge it first (for Patch 1-3 and 7-8)?

I am not the maintainer of this subsystem, so I do not understand why
you are asking me :(

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] thunderbolt: Make iommu_dma_protection more accurate

2022-03-17 Thread Greg KH

On Thu, Mar 17, 2022 at 04:17:07PM +, Robin Murphy wrote:
> Between me trying to get rid of iommu_present() and Mario wanting to
> support the AMD equivalent of DMAR_PLATFORM_OPT_IN, scrutiny has shown
> that the iommu_dma_protection attribute is being far too optimistic.
> Even if an IOMMU might be present for some PCI segment in the system,
> that doesn't necessarily mean it provides translation for the device(s)
> we care about. Furthermore, all that DMAR_PLATFORM_OPT_IN really does
> is tell us that memory was protected before the kernel was loaded, and
> prevent the user from disabling the intel-iommu driver entirely. What
> actually matters is whether we trust individual devices, based on the
> "external facing" property that we expect firmware to describe for
> Thunderbolt ports.
> 
> Avoid false positives by looking as close as possible to the same PCI
> topology that the IOMMU layer will consider once a Thunderbolt endpoint
> appears. Crucially, we can't assume that IOMMU translation being enabled
> for any reason is sufficient on its own; full (expensive) DMA protection
> will still only be imposed on untrusted devices.
> 
> CC: Mario Limonciello 
> Signed-off-by: Robin Murphy 
> ---
> 
> This supersedes my previous attempt just trying to replace
> iommu_present() at [1], further to the original discussion at [2].
> 
> [1] 
> https://lore.kernel.org/linux-iommu/bl1pr12mb515799c0be396377dbbef055e2...@bl1pr12mb5157.namprd12.prod.outlook.com/T/
> [2] https://lore.kernel.org/linux-iommu/202203160844.lkviwr1q-...@intel.com/T/
> 
>  drivers/thunderbolt/domain.c | 12 +++-
>  drivers/thunderbolt/nhi.c| 35 +++
>  include/linux/thunderbolt.h  |  2 ++
>  3 files changed, 40 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/thunderbolt/domain.c b/drivers/thunderbolt/domain.c
> index 7018d959f775..d5c825e84ac8 100644
> --- a/drivers/thunderbolt/domain.c
> +++ b/drivers/thunderbolt/domain.c
> @@ -7,9 +7,7 @@
>   */
>  
>  #include 
> -#include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -257,13 +255,9 @@ static ssize_t iommu_dma_protection_show(struct device 
> *dev,
>struct device_attribute *attr,
>char *buf)
>  {
> - /*
> -  * Kernel DMA protection is a feature where Thunderbolt security is
> -  * handled natively using IOMMU. It is enabled when IOMMU is
> -  * enabled and ACPI DMAR table has DMAR_PLATFORM_OPT_IN set.
> -  */
> - return sprintf(buf, "%d\n",
> -iommu_present(&pci_bus_type) && dmar_platform_optin());
> + struct tb *tb = container_of(dev, struct tb, dev);
> +
> + return sprintf(buf, "%d\n", tb->nhi->iommu_dma_protection);

sysfs_emit() please.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] Revert "swiotlb: fix info leak with DMA_FROM_DEVICE"

2022-03-04 Thread Greg KH

On Fri, Mar 04, 2022 at 05:34:47PM +0100, Halil Pasic wrote:
> On Fri, 4 Mar 2022 15:28:44 +0100
> Greg KH  wrote:
> 
> > On Fri, Mar 04, 2022 at 02:58:58PM +0100, Halil Pasic wrote:
> > > This reverts commit ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e.  
> > 
> > Why???
> 
> TLDR; We got v4 merged instead of v7

That makes no sense at all to me, you need to describe it in detail.

You know better than this :(

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/2] swiotlb: fix info leak with DMA_FROM_DEVICE

2022-03-04 Thread Greg KH

On Fri, Mar 04, 2022 at 02:58:59PM +0100, Halil Pasic wrote:
> The problem I'm addressing was discovered by the LTP test covering
> cve-2018-1000204.
> 
> A short description of what happens follows:
> 1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
>interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV
>and a corresponding dxferp. The peculiar thing about this is that TUR
>is not reading from the device.
> 2) In sg_start_req() the invocation of blk_rq_map_user() effectively
>bounces the user-space buffer. As if the device was to transfer into
>it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in
>sg_build_indirect()") we make sure this first bounce buffer is
>allocated with GFP_ZERO.
> 3) For the rest of the story we keep ignoring that we have a TUR, so the
>device won't touch the buffer we prepare as if the we had a
>DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
>and the  buffer allocated by SG is mapped by the function
>virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
>scatter-gather and not scsi generics). This mapping involves bouncing
>via the swiotlb (we need swiotlb to do virtio in protected guest like
>s390 Secure Execution, or AMD SEV).
> 4) When the SCSI TUR is done, we first copy back the content of the second
>(that is swiotlb) bounce buffer (which most likely contains some
>previous IO data), to the first bounce buffer, which contains all
>zeros.  Then we copy back the content of the first bounce buffer to
>the user-space buffer.
> 5) The test case detects that the buffer, which it zero-initialized,
>   ain't all zeros and fails.
> 
> This is an swiotlb problem, because the swiotlb should be transparent in
> a sense that it does not affect the outcome (if all other participants
> are well behaved), and without swiotlb we leak all zeros.  Even if
> swiotlb_tbl_map_single() zero-initialised the allocated slot(s) to avoid
> leaking stale data back to the caller later, when it comes to unmap or
> sync_for_cpu it still fundamentally cannot tell how much of the contents
> of the bounce slot have actually changed, therefore if the caller was
> expecting the device to do a partial write, the rest of the mapped
> buffer *will* be corrupted by bouncing the whole thing back again.
> 
> Copying the content of the original buffer into the swiotlb buffer is
> the only way I can think of to make swiotlb transparent in such
> scenarios. So let's do just that.
> 
> The extra bounce is expected to hurt the performance. For the cases
> where the extra bounce is not necessary we could get rid of it, if we
> were told by the client code, that it is not needed. Such optimisations
> are out of scope for this patch, and are likely to be a subject of some
> future work.
> 
> Signed-off-by: Halil Pasic 
> Reported-by: Ali Haider 
> Reviewed-by: Christoph Hellwig 
> ---
>  kernel/dma/swiotlb.c | 22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] Revert "swiotlb: fix info leak with DMA_FROM_DEVICE"

2022-03-04 Thread Greg KH

On Fri, Mar 04, 2022 at 02:58:58PM +0100, Halil Pasic wrote:
> This reverts commit ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e.

Why???

> Signed-off-by: Halil Pasic 

You need a blank line before this one.

also:

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH -next v3 1/2] platform-msi: Save the msg context to desc in platform_msi_write_msg()

2021-09-14 Thread Greg KH

On Sat, Aug 28, 2021 at 02:39:14PM +0800, Bixuan Cui wrote:
> Save the msg context to desc when the msi interrupt is requested.
> The drivers can use it in special scenarios(such as resume).
> 
> Signed-off-by: Bixuan Cui 
> ---
>  drivers/base/platform-msi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/base/platform-msi.c b/drivers/base/platform-msi.c
> index 3d6c8f9caf43..60962a224fcc 100644
> --- a/drivers/base/platform-msi.c
> +++ b/drivers/base/platform-msi.c
> @@ -90,6 +90,9 @@ static void platform_msi_write_msg(struct irq_data *data, 
> struct msi_msg *msg)
>  
>   priv_data = desc->platform.msi_priv_data;
>  
> + desc->msg.address_lo = msg->address_lo;
> + desc->msg.address_hi = msg->address_hi;
> + desc->msg.data = msg->data;
>   priv_data->write_msg(desc, msg);
>  }
>  
> -- 
> 2.17.1
> 

Acked-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V4 05/13] hyperv: Add Write/Read MSR registers via ghcb page

2021-08-27 Thread Greg KH

On Fri, Aug 27, 2021 at 01:21:03PM -0400, Tianyu Lan wrote:
> From: Tianyu Lan 
> 
> Hyperv provides GHCB protocol to write Synthetic Interrupt
> Controller MSR registers in Isolation VM with AMD SEV SNP
> and these registers are emulated by hypervisor directly.
> Hyperv requires to write SINTx MSR registers twice. First
> writes MSR via GHCB page to communicate with hypervisor
> and then writes wrmsr instruction to talk with paravisor
> which runs in VMPL0. Guest OS ID MSR also needs to be set
> via GHCB page.
> 
> Signed-off-by: Tianyu Lan 
> ---
> Change since v1:
>  * Introduce sev_es_ghcb_hv_call_simple() and share code
>between SEV and Hyper-V code.
> Change since v3:
>  * Pass old_msg_type to hv_signal_eom() as parameter.
>* Use HV_REGISTER_* marcro instead of HV_X64_MSR_*
>* Add hv_isolation_type_snp() weak function.
>* Add maros to set syinc register in ARM code.
> ---
>  arch/arm64/include/asm/mshyperv.h |  23 ++
>  arch/x86/hyperv/hv_init.c |  36 ++
>  arch/x86/hyperv/ivm.c | 112 ++
>  arch/x86/include/asm/mshyperv.h   |  80 -
>  arch/x86/include/asm/sev.h|   3 +
>  arch/x86/kernel/sev-shared.c  |  63 ++---
>  drivers/hv/hv.c   | 112 --
>  drivers/hv/hv_common.c|   6 ++
>  include/asm-generic/mshyperv.h|   4 +-
>  9 files changed, 345 insertions(+), 94 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mshyperv.h 
> b/arch/arm64/include/asm/mshyperv.h
> index 20070a847304..ced83297e009 100644
> --- a/arch/arm64/include/asm/mshyperv.h
> +++ b/arch/arm64/include/asm/mshyperv.h
> @@ -41,6 +41,29 @@ static inline u64 hv_get_register(unsigned int reg)
>   return hv_get_vpreg(reg);
>  }
>  
> +#define hv_get_simp(val) { val = hv_get_register(HV_REGISTER_SIMP); }
> +#define hv_set_simp(val) hv_set_register(HV_REGISTER_SIMP, val)
> +
> +#define hv_get_siefp(val){ val = hv_get_register(HV_REGISTER_SIEFP); }
> +#define hv_set_siefp(val)hv_set_register(HV_REGISTER_SIEFP, val)
> +
> +#define hv_get_synint_state(int_num, val) {  \
> + val = hv_get_register(HV_REGISTER_SINT0 + int_num); \
> + }
> +
> +#define hv_set_synint_state(int_num, val)\
> + hv_set_register(HV_REGISTER_SINT0 + int_num, val)
> +
> +#define hv_get_synic_state(val) {\
> + val = hv_get_register(HV_REGISTER_SCONTROL);\
> + }
> +
> +#define hv_set_synic_state(val)  \
> + hv_set_register(HV_REGISTER_SCONTROL, val)
> +
> +#define hv_signal_eom(old_msg_type)   \
> + hv_set_register(HV_REGISTER_EOM, 0)

Please just use real inline functions and not #defines if you really
need it.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V4 04/13] hyperv: Mark vmbus ring buffer visible to host in Isolation VM

2021-08-27 Thread Greg KH

On Fri, Aug 27, 2021 at 01:21:02PM -0400, Tianyu Lan wrote:
> From: Tianyu Lan 
> 
> Mark vmbus ring buffer visible with set_memory_decrypted() when
> establish gpadl handle.
> 
> Signed-off-by: Tianyu Lan 
> ---
> Change since v3:
>* Change vmbus_teardown_gpadl() parameter and put gpadl handle,
>buffer and buffer size in the struct vmbus_gpadl.
> ---
>  drivers/hv/channel.c| 36 -
>  drivers/net/hyperv/hyperv_net.h |  1 +
>  drivers/net/hyperv/netvsc.c | 16 +++
>  drivers/uio/uio_hv_generic.c| 14 +++--
>  include/linux/hyperv.h  |  8 +++-
>  5 files changed, 63 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index f3761c73b074..82650beb3af0 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -17,6 +17,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  
> @@ -474,6 +475,13 @@ static int __vmbus_establish_gpadl(struct vmbus_channel 
> *channel,
>   if (ret)
>   return ret;
>  
> + ret = set_memory_decrypted((unsigned long)kbuffer,
> +HVPFN_UP(size));
> + if (ret) {
> + pr_warn("Failed to set host visibility for new GPADL %d.\n", 
> ret);

dev_warn()?  You have access to a struct device, why not use it?

same for all other instances here.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v10 16/17] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-07-29 Thread Greg KH

On Thu, Jul 29, 2021 at 03:35:02PM +0800, Xie Yongji wrote:
> +/*
> + * The basic configuration of a VDUSE device, which is used by
> + * VDUSE_CREATE_DEV ioctl to create a VDUSE device.
> + */
> +struct vduse_dev_config {

Please document this structure using kernel doc so we know what all the
fields are.

> +#define VDUSE_NAME_MAX   256
> + char name[VDUSE_NAME_MAX]; /* vduse device name, needs to be NUL 
> terminated */
> + __u32 vendor_id; /* virtio vendor id */
> + __u32 device_id; /* virtio device id */
> + __u64 features; /* virtio features */
> + __u32 vq_num; /* the number of virtqueues */
> + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */
> + __u32 reserved[13]; /* for future use */

This HAS to be tested to be all 0, otherwise you can never use it in the
future.  I did not see the code doing that at all.

> + __u32 config_size; /* the size of the configuration space */
> + __u8 config[0]; /* the buffer of the configuration space */

config[]; please instead?  I thought we were getting rid of all of the
0-length arrays in the kernel tree.

> +};
> +
> +/* Create a VDUSE device which is represented by a char device 
> (/dev/vduse/$NAME) */
> +#define VDUSE_CREATE_DEV _IOW(VDUSE_BASE, 0x02, struct vduse_dev_config)
> +
> +/*
> + * Destroy a VDUSE device. Make sure there are no more references
> + * to the char device (/dev/vduse/$NAME).
> + */
> +#define VDUSE_DESTROY_DEV_IOW(VDUSE_BASE, 0x03, char[VDUSE_NAME_MAX])
> +
> +/* The ioctls for VDUSE device (/dev/vduse/$NAME) */
> +
> +/*
> + * The information of one IOVA region, which is retrieved from
> + * VDUSE_IOTLB_GET_FD ioctl.
> + */
> +struct vduse_iotlb_entry {
> + __u64 offset; /* the mmap offset on returned file descriptor */
> + __u64 start; /* start of the IOVA range: [start, last] */
> + __u64 last; /* last of the IOVA range: [start, last] */
> +#define VDUSE_ACCESS_RO 0x1
> +#define VDUSE_ACCESS_WO 0x2
> +#define VDUSE_ACCESS_RW 0x3
> + __u8 perm; /* access permission of this region */
> +};
> +
> +/*
> + * Find the first IOVA region that overlaps with the range [start, last]
> + * and return the corresponding file descriptor. Return -EINVAL means the
> + * IOVA region doesn't exist. Caller should set start and last fields.
> + */
> +#define VDUSE_IOTLB_GET_FD   _IOWR(VDUSE_BASE, 0x10, struct 
> vduse_iotlb_entry)
> +
> +/*
> + * Get the negotiated virtio features. It's a subset of the features in
> + * struct vduse_dev_config which can be accepted by virtio driver. It's
> + * only valid after FEATURES_OK status bit is set.
> + */
> +#define VDUSE_DEV_GET_FEATURES   _IOR(VDUSE_BASE, 0x11, __u64)
> +
> +/*
> + * The information that is used by VDUSE_DEV_SET_CONFIG ioctl to update
> + * device configuration space.
> + */
> +struct vduse_config_data {
> + __u32 offset; /* offset from the beginning of configuration space */
> + __u32 length; /* the length to write to configuration space */
> + __u8 buffer[0]; /* buffer used to write from */

again, buffer[];?

> +};
> +
> +/* Set device configuration space */
> +#define VDUSE_DEV_SET_CONFIG _IOW(VDUSE_BASE, 0x12, struct vduse_config_data)
> +
> +/*
> + * Inject a config interrupt. It's usually used to notify virtio driver
> + * that device configuration space has changed.
> + */
> +#define VDUSE_DEV_INJECT_CONFIG_IRQ  _IO(VDUSE_BASE, 0x13)
> +
> +/*
> + * The basic configuration of a virtqueue, which is used by
> + * VDUSE_VQ_SETUP ioctl to setup a virtqueue.
> + */
> +struct vduse_vq_config {
> + __u32 index; /* virtqueue index */
> + __u16 max_size; /* the max size of virtqueue */
> +};
> +
> +/*
> + * Setup the specified virtqueue. Make sure all virtqueues have been
> + * configured before the device is attached to vDPA bus.
> + */
> +#define VDUSE_VQ_SETUP   _IOW(VDUSE_BASE, 0x14, struct 
> vduse_vq_config)
> +
> +struct vduse_vq_state_split {
> + __u16 avail_index; /* available index */
> +};
> +
> +struct vduse_vq_state_packed {
> + __u16 last_avail_counter:1; /* last driver ring wrap counter observed 
> by device */
> + __u16 last_avail_idx:15; /* device available index */

Bit fields in a user structure?  Are you sure this is going to work
well?  Why not just make this a __u16 and then mask off what you want so
that you do not run into endian issues?

> + __u16 last_used_counter:1; /* device ring wrap counter */
> + __u16 last_used_idx:15; /* used index */
> +};
> +
> +/*
> + * The information of a virtqueue, which is retrieved from
> + * VDUSE_VQ_GET_INFO ioctl.
> + */
> +struct vduse_vq_info {
> + __u32 index; /* virtqueue index */
> + __u32 num; /* the size of virtqueue */
> + __u64 desc_addr; /* address of desc area */
> + __u64 driver_addr; /* address of driver area */
> + __u64 device_addr; /* address of device area */
> + union {
> + struct vduse_vq_state_split split; /* split virtqueue state */
> +

Re: [PATCH v9 16/17] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-07-13 Thread Greg KH

On Wed, Jul 14, 2021 at 02:02:50PM +0800, Jason Wang wrote:
> 
> 在 2021/7/14 下午1:54, Michael S. Tsirkin 写道:
> > On Wed, Jul 14, 2021 at 01:45:39PM +0800, Jason Wang wrote:
> > > > +static int vduse_dev_msg_sync(struct vduse_dev *dev,
> > > > + struct vduse_dev_msg *msg)
> > > > +{
> > > > +   int ret;
> > > > +
> > > > +   init_waitqueue_head(&msg->waitq);
> > > > +   spin_lock(&dev->msg_lock);
> > > > +   msg->req.request_id = dev->msg_unique++;
> > > > +   vduse_enqueue_msg(&dev->send_list, msg);
> > > > +   wake_up(&dev->waitq);
> > > > +   spin_unlock(&dev->msg_lock);
> > > > +
> > > > +   wait_event_killable_timeout(msg->waitq, msg->completed,
> > > > +   VDUSE_REQUEST_TIMEOUT * HZ);
> > > > +   spin_lock(&dev->msg_lock);
> > > > +   if (!msg->completed) {
> > > > +   list_del(&msg->list);
> > > > +   msg->resp.result = VDUSE_REQ_RESULT_FAILED;
> > > > +   }
> > > > +   ret = (msg->resp.result == VDUSE_REQ_RESULT_OK) ? 0 : -EIO;
> > > 
> > > I think we should mark the device as malfunction when there is a timeout 
> > > and
> > > forbid any userspace operations except for the destroy aftwards for 
> > > safety.
> > This looks like if one tried to run gdb on the program the behaviour
> > will change completely because kernel wants it to respond within
> > specific time. Looks like a receipe for heisenbugs.
> > 
> > Let's not build interfaces with arbitrary timeouts like that.
> > Interruptible wait exists for this very reason.
> 
> 
> The problem is. Do we want userspace program like modprobe to be stuck for
> indefinite time and expect the administrator to kill that?

Why would modprobe be stuck for forever?

Is this on the module probe path?
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 6/6] mmc: sdhci-msm: Request non-strict IOMMU mode

2021-06-24 Thread Greg KH

On Mon, Jun 21, 2021 at 04:52:48PM -0700, Douglas Anderson wrote:
> IOMMUs can be run in "strict" mode or in "non-strict" mode. The
> quick-summary difference between the two is that in "strict" mode we
> wait until everything is flushed out when we unmap DMA memory. In
> "non-strict" we don't.
> 
> Using the IOMMU in "strict" mode is more secure/safer but slower
> because we have to sit and wait for flushes while we're unmapping. To
> explain a bit why "non-strict" mode is unsafe, let's imagine two
> examples.
> 
> An example of "non-strict" being insecure when reading from a device:
> a) Linux driver maps memory for DMA.
> b) Linux driver starts DMA on the device.
> c) Device write to RAM subject to bounds checking done by IOMMU.
> d) Device finishes writing to RAM and signals transfer is finished.
> e) Linux driver starts unmapping DMA memory but doesn't flush.

Why doesn't it "flush"?

> f) Linux driver validates that the data in memory looks sane and that
>accessing it won't cause the driver to, for instance, overflow a
>buffer.
> g) Device takes advantage of knowledge of how the Linux driver works
>and sneaks in a modification to the data after the validation but
>before the IOMMU unmap flush finishes.
> h) Device has now caused the Linux driver to access memory it
>shouldn't.

So you are now saying we need to not trust hardware?  If so, that's a
massive switch for how the kernel needs to work right?

And what driver does f) and allows g) to happen?  That would be a normal
bug anyway, why not just fix the driver?

> An example of "non-strict" being insecure when writing to a device:
> a) Linux driver writes data intended for the device to RAM.
> b) Linux driver maps memory for DMA.
> c) Linux driver starts DMA on the device.
> d) Device reads from RAM subject to bounds checking done by IOMMU.
> e) Device finishes reading from RAM and signals transfer is finished.
> f) Linux driver starts unmapping DMA memory but doesn't flush.

Why does it not flush?

What do you mean by "flush"

> g) Linux driver frees memory and returns it to the pool.

What pool?

> h) Memory is allocated for another purpose.

Allocated by what?

We have memory allocators that write over the data when freed, why not
just use this if you are concerned about this type of issue?

> i) Device takes advantage of the period of time before IOMMU flush to
>read memory that it shouldn't have had access to.

What memory would that be?

And if you really care about these issues, are you not able to take the
"hit" for the flush all the time as that is a hardware thing, not a
software thing.  Why not just always take advantage of that, no driver
changes needed?

And this isn't going to work, again, because the "pre_probe" isn't going
to be acceptable, sorry.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 3/6] PCI: Indicate that we want to force strict DMA for untrusted devices

2021-06-24 Thread Greg KH

On Mon, Jun 21, 2021 at 04:52:45PM -0700, Douglas Anderson wrote:
> At the moment the generic IOMMU framework reaches into the PCIe device
> to check the "untrusted" state and uses this information to figure out
> if it should be running the IOMMU in strict or non-strict mode. Let's
> instead set the new boolean in "struct device" to indicate when we
> want forced strictness.
> 
> NOTE: we still continue to set the "untrusted" bit in PCIe since that
> apparently is used for more than just IOMMU strictness. It probably
> makes sense for a later patchset to clarify all of the other needs we
> have for "untrusted" PCIe devices (perhaps add more booleans into the
> "struct device") so we can fully eliminate the need for the IOMMU
> framework to reach into a PCIe device.

It feels like the iommu code should not be messing with pci devices at
all, please don't do this.

Why does this matter?  Why wouldn't a pci device use "strict" iommu at
all times?  What happens if it does not?  Why are PCI devices special?

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/6] drivers: base: Add bits to struct device to control iommu strictness

2021-06-24 Thread Greg KH

On Mon, Jun 21, 2021 at 04:52:44PM -0700, Douglas Anderson wrote:
> How to control the "strictness" of an IOMMU is a bit of a mess right
> now. As far as I can tell, right now:
> * You can set the default to "non-strict" and some devices (right now,
>   only PCI devices) can request to run in "strict" mode.
> * You can set the default to "strict" and no devices in the system are
>   allowed to run as "non-strict".
> 
> I believe this needs to be improved a bit. Specifically:
> * We should be able to default to "strict" mode but let devices that
>   claim to be fairly low risk request that they be run in "non-strict"
>   mode.
> * We should allow devices outside of PCIe to request "strict" mode if
>   the system default is "non-strict".
> 
> I believe the correct way to do this is two bits in "struct
> device". One allows a device to force things to "strict" mode and the
> other allows a device to _request_ "non-strict" mode. The asymmetry
> here is on purpose. Generally if anything in the system makes a
> request for strictness of something then we want it strict. Thus
> drivers can only request (but not force) non-strictness.
> 
> It's expected that the strictness fields can be filled in by the bus
> code like in the patch ("PCI: Indicate that we want to force strict
> DMA for untrusted devices") or by using the new pre_probe concept
> introduced in the patch ("drivers: base: Add the concept of
> "pre_probe" to drivers").
> 
> Signed-off-by: Douglas Anderson 
> ---
> 
>  include/linux/device.h | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index f1a00040fa53..c1b985e10c47 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -449,6 +449,15 @@ struct dev_links_info {
>   *   and optionall (if the coherent mask is large enough) also
>   *   for dma allocations.  This flag is managed by the dma ops
>   *   instance from ->dma_supported.
> + * @force_strict_iommu: If set to %true then we should force this device to
> + *   iommu.strict regardless of the other defaults in the
> + *   system. Only has an effect if an IOMMU is in place.

Why would you ever NOT want to do this?

> + * @request_non_strict_iommu: If set to %true and there are no other known
> + * reasons to make the iommu.strict for this device,
> + * then default to non-strict mode. This implies
> + * some belief that the DMA master for this device
> + * won't abuse the DMA path to compromise the kernel.
> + * Only has an effect if an IOMMU is in place.

This feels in contrast to the previous field you just added, how do they
both interact?  Would an enum work better?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/6] drivers: base: Add the concept of "pre_probe" to drivers

2021-06-24 Thread Greg KH

On Mon, Jun 21, 2021 at 04:52:43PM -0700, Douglas Anderson wrote:
> Right now things are a bit awkward if a driver would like a chance to
> run before some of the more "automatic" things (pinctrl, DMA, IOMMUs,
> ...) happen to a device. This patch aims to fix that problem by
> introducing the concept of a "pre_probe" function that drivers can
> implement to run before the "automatic" stuff.
> 
> Why would you want to run before the "automatic" stuff? The incentive
> in my case is that I want to be able to fill in some boolean flags in
> the "struct device" before the IOMMU init runs. It appears that the
> strictness vs. non-strictness of a device's iommu config is determined
> once at init time and can't be changed afterwards. However, I would
> like to avoid hardcoding the rules for strictness in the IOMMU
> driver. Instead I'd like to let individual drivers be able to make
> informed decisions about the appropriateness of strictness
> vs. non-strictness.
> 
> The desire for running code pre_probe is likely not limited to my use
> case. I believe that the list "qcom_smmu_client_of_match" is hacked
> into the iommu driver specifically because there was no real good
> framework for this. For the existing list it wasn't _quite_ as ugly as
> my needs since the decision could be made solely on compatible string,
> but it still feels like it would have been better for individual
> drivers to run code and setup some state rather than coding up a big
> list in the IOMMU driver.
> 
> Even without this patch, I believe it is possible for a driver to run
> before the "automatic" things by registering for
> "BUS_NOTIFY_BIND_DRIVER" in its init call, though I haven't personally
> tested this. Using the notifier is a bit awkward, though, and I'd
> rather avoid it. Also, using "BUS_NOTIFY_BIND_DRIVER" would require
> drivers to stop using the convenience module_platform_driver() helper
> and roll a bunch of boilerplate code.
> 
> NOTE: the pre_probe here is listed in the driver structure. As a side
> effect of this it will be passed a "struct device *" rather than the
> more specific device type (like the "struct platform_device *" that
> most platform devices get passed to their probe). Presumably this
> won't cause trouble and it's a lot less code to write but if we need
> to make it more symmetric that's also possible by touching more files.

No, please please no.

If a bus really wants to do crud like this, it can do it in it's own
probe callback, the driver core doesn't need to mess with this.

If you need to mess with iommu values in struct device, again, do that
in the bus core for the devices on that specific bus, that's where those
values are supposed to be set anyway, right?

If the iommu drivers need to be run before a specific bus is
initialized, then fix that there, the driver core does not need to care
about this at all.

so a big NACK on this one, sorry.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-30 Thread Greg KH

On Mon, May 31, 2021 at 02:19:37PM +0800, Yongji Xie wrote:
> Hi Greg,
> 
> Thanks a lot for the review!
> 
> On Mon, May 31, 2021 at 12:56 PM Greg KH  wrote:
> >
> > On Mon, May 17, 2021 at 05:55:12PM +0800, Xie Yongji wrote:
> > > +struct vduse_dev {
> > > + struct vduse_vdpa *vdev;
> > > + struct device dev;
> > > + struct cdev cdev;
> >
> > You now have 2 reference counted devices controling the lifespace of a
> > single structure.  A mess that is guaranteed to go wrong.  Please never
> > do this.
> >
> 
> These two are both used by cdev_device_add(). Looks like I didn't find
> any problem. Any suggestions?

Make one of these dynamic and do not have them both control the lifespan
of the structure.

> > > + struct vduse_virtqueue *vqs;
> > > + struct vduse_iova_domain *domain;
> > > + char *name;
> > > + struct mutex lock;
> > > + spinlock_t msg_lock;
> > > + atomic64_t msg_unique;
> >
> > Why do you need an atomic and a lock?
> >
> 
> You are right. We don't need an atomic here.
> 
> > > + wait_queue_head_t waitq;
> > > + struct list_head send_list;
> > > + struct list_head recv_list;
> > > + struct list_head list;
> > > + struct vdpa_callback config_cb;
> > > + struct work_struct inject;
> > > + spinlock_t irq_lock;
> > > + unsigned long api_version;
> > > + bool connected;
> > > + int minor;
> > > + u16 vq_size_max;
> > > + u32 vq_num;
> > > + u32 vq_align;
> > > + u32 config_size;
> > > + u32 device_id;
> > > + u32 vendor_id;
> > > +};
> > > +
> > > +struct vduse_dev_msg {
> > > + struct vduse_dev_request req;
> > > + struct vduse_dev_response resp;
> > > + struct list_head list;
> > > + wait_queue_head_t waitq;
> > > + bool completed;
> > > +};
> > > +
> > > +struct vduse_control {
> > > + unsigned long api_version;
> >
> > u64?
> >
> 
> OK.
> 
> > > +};
> > > +
> > > +static unsigned long max_bounce_size = (64 * 1024 * 1024);
> > > +module_param(max_bounce_size, ulong, 0444);
> > > +MODULE_PARM_DESC(max_bounce_size, "Maximum bounce buffer size. (default: 
> > > 64M)");
> > > +
> > > +static unsigned long max_iova_size = (128 * 1024 * 1024);
> > > +module_param(max_iova_size, ulong, 0444);
> > > +MODULE_PARM_DESC(max_iova_size, "Maximum iova space size (default: 
> > > 128M)");
> > > +
> > > +static bool allow_unsafe_device_emulation;
> > > +module_param(allow_unsafe_device_emulation, bool, 0444);
> > > +MODULE_PARM_DESC(allow_unsafe_device_emulation, "Allow emulating unsafe 
> > > device."
> > > + " We must make sure the userspace device emulation process is 
> > > trusted."
> > > + " Otherwise, don't enable this option. (default: false)");
> > > +
> >
> > This is not the 1990's anymore, please never use module parameters, make
> > these per-device attributes if you really need them.
> >
> 
> These parameters will be used before the device is created. Or do you
> mean add some attributes to the control device?

You need to do something, as no one can mess with a module parameter
easily.  Why do you need them at all, shouldn't it "just work" properly
with no need for userspace interaction?

> > > +static int vduse_init(void)
> > > +{
> > > + int ret;
> > > +
> > > + if (max_bounce_size >= max_iova_size)
> > > + return -EINVAL;
> > > +
> > > + ret = misc_register(&vduse_misc);
> > > + if (ret)
> > > + return ret;
> > > +
> > > + vduse_class = class_create(THIS_MODULE, "vduse");
> >
> > If you have a misc device, you do not need to create a class at the same
> > time.  Why are you doing both here?  Just stick with the misc device, no
> > need for anything else.
> >
> 
> The misc device is the control device represented by
> /dev/vduse/control. Then a VDUSE device represented by
> /dev/vduse/$NAME can be created by the ioctl(VDUSE_CREATE_DEV) on this
> control device.

Ah.  Then how about using the same MAJOR for all of these, and just have
the first minor (0) be your control?  That happens for other device
types (raw, loop, etc.).  Or just document this really well please, as
it was not obvious what you were doing here.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 11/12] vduse: Introduce VDUSE - vDPA Device in Userspace

2021-05-30 Thread Greg KH

On Mon, May 17, 2021 at 05:55:12PM +0800, Xie Yongji wrote:
> +struct vduse_dev {
> + struct vduse_vdpa *vdev;
> + struct device dev;
> + struct cdev cdev;

You now have 2 reference counted devices controling the lifespace of a
single structure.  A mess that is guaranteed to go wrong.  Please never
do this.

> + struct vduse_virtqueue *vqs;
> + struct vduse_iova_domain *domain;
> + char *name;
> + struct mutex lock;
> + spinlock_t msg_lock;
> + atomic64_t msg_unique;

Why do you need an atomic and a lock?

> + wait_queue_head_t waitq;
> + struct list_head send_list;
> + struct list_head recv_list;
> + struct list_head list;
> + struct vdpa_callback config_cb;
> + struct work_struct inject;
> + spinlock_t irq_lock;
> + unsigned long api_version;
> + bool connected;
> + int minor;
> + u16 vq_size_max;
> + u32 vq_num;
> + u32 vq_align;
> + u32 config_size;
> + u32 device_id;
> + u32 vendor_id;
> +};
> +
> +struct vduse_dev_msg {
> + struct vduse_dev_request req;
> + struct vduse_dev_response resp;
> + struct list_head list;
> + wait_queue_head_t waitq;
> + bool completed;
> +};
> +
> +struct vduse_control {
> + unsigned long api_version;

u64?

> +};
> +
> +static unsigned long max_bounce_size = (64 * 1024 * 1024);
> +module_param(max_bounce_size, ulong, 0444);
> +MODULE_PARM_DESC(max_bounce_size, "Maximum bounce buffer size. (default: 
> 64M)");
> +
> +static unsigned long max_iova_size = (128 * 1024 * 1024);
> +module_param(max_iova_size, ulong, 0444);
> +MODULE_PARM_DESC(max_iova_size, "Maximum iova space size (default: 128M)");
> +
> +static bool allow_unsafe_device_emulation;
> +module_param(allow_unsafe_device_emulation, bool, 0444);
> +MODULE_PARM_DESC(allow_unsafe_device_emulation, "Allow emulating unsafe 
> device."
> + " We must make sure the userspace device emulation process is trusted."
> + " Otherwise, don't enable this option. (default: false)");
> +

This is not the 1990's anymore, please never use module parameters, make
these per-device attributes if you really need them.

> +static int vduse_init(void)
> +{
> + int ret;
> +
> + if (max_bounce_size >= max_iova_size)
> + return -EINVAL;
> +
> + ret = misc_register(&vduse_misc);
> + if (ret)
> + return ret;
> +
> + vduse_class = class_create(THIS_MODULE, "vduse");

If you have a misc device, you do not need to create a class at the same
time.  Why are you doing both here?  Just stick with the misc device, no
need for anything else.

> + if (IS_ERR(vduse_class)) {
> + ret = PTR_ERR(vduse_class);
> + goto err_class;
> + }
> + vduse_class->devnode = vduse_devnode;
> +
> + ret = alloc_chrdev_region(&vduse_major, 0, VDUSE_DEV_MAX, "vduse");

Wait, you want a whole major?  What is the misc device for?

> +MODULE_VERSION(DRV_VERSION);

MODULE_VERSION() makes no sense when the code is merged into the kernel
tree, so you can just drop that.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: 5.10.37 won't boot on my system, but 5.10.36 with same config does

2021-05-17 Thread Greg KH

On Tue, May 18, 2021 at 12:30:15AM +0200, Christoph Biedl wrote:
> Greg KH wrote...
> 
> > On Mon, May 17, 2021 at 02:45:01PM +0200, Jack Wang wrote:
> 
> > > So it's caused by this commit[1], and it should be fixed by latest
> > > 5.10.38-rc1 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-5.10.y
> > > [1]https://lore.kernel.org/stable/20210515132855.4bn7ve2ozvdhp...@nabokov.fritz.box/
> > 
> > Hopefully the "real" 5.10.38-rc1 release that is out now should fix
> > this.  If not, please let us know.
> 
> Good news: Fixed with that "real" 5.10.38-rc1.

Wonderful!  Thanks for testing and letting us know.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: 5.10.37 won't boot on my system, but 5.10.36 with same config does

2021-05-17 Thread Greg KH

On Mon, May 17, 2021 at 02:45:01PM +0200, Jack Wang wrote:
> Christoph Biedl  于2021年5月17日周一 下午2:25写道：
> >
> > Jack Wang wrote...
> >
> > > Christoph Biedl  于2021年5月17日周一 
> > > 下午1:52写道：
> > > >
> > > > Christoph Biedl wrote...
> > > >
> > > > > Thanks for taking care, unfortunately no improvement with 5.10.38-rc1 
> > > > > here.
> > > >
> > > > So in case this is related to the .config, I'm attaching it. Hardware 
> > > > is,
> > > > as said before, an old Thinkpad x200, vendor BIOS and no particular 
> > > > modifications.
> > > > After disabling all vga/video/fbcon parameters I see the system suffers
> > > > a kernel panic but unfortunately only the last 25 lines are visible.
> > > > Possibly (typos are mine)
> > > >
> > > > RIP: 0010:__domain_mapping+0xa7/0x3a0
> > > >
> > > > is a hint into the right direction?
> > > This looks intel_iommu related, can you try boot with
> > > "intel_iommu=off" in kernel parameter?
> >
> > Gotcha. System boots up fine then.
> >
> > Christoph
> So it's caused by this commit[1], and it should be fixed by latest
> 5.10.38-rc1 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-5.10.y
> [1]https://lore.kernel.org/stable/20210515132855.4bn7ve2ozvdhp...@nabokov.fritz.box/

Hopefully the "real" 5.10.38-rc1 release that is out now should fix
this.  If not, please let us know.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: dma-api debugfs directory is not created since debugfs is not initialized

2021-04-27 Thread Greg KH

On Tue, Apr 27, 2021 at 01:32:50PM +0100, Robin Murphy wrote:
> On 2021-04-27 12:39, Greg KH wrote:
> > On Tue, Apr 27, 2021 at 01:34:27PM +0200, Corentin Labbe wrote:
> > > Hello
> > > 
> > > I try to debug some DMA problem on next-20210427, and so I have enabled 
> > > CONFIG_DMA_API_DEBUG=y.
> > > But the dma-api directory does show up in debugfs, but lot of other 
> > > directory exists in it.
> > 
> > Does it show up properly in 5.12?
> > 
> > > After debugging it seems due to commit 56348560d495 ("debugfs: do not 
> > > attempt to create a new file before the filesystem is initalized")
> > > Reverting the commit permit to "dma-api" debugfs to be found. (but seems 
> > > not the right way to fix it).
> > 
> > We have had some odd start-up ordering issues that the above commit has
> > caused to show.  Given that this commit is now in stable kernels, with
> > no report of this issue so far, I'm worried that maybe this is a dma
> > subsystem ordering issue?
> 
> Both debugfs_init() and dma_debug_init() do run at core_initcall level, and
> disassembling the vmlinux from my current working tree does indeed suggest
> that they somehow end up in the wrong relative order:
> 
> [...]
> 80001160d0c8 <__initcall__kmod_debug__325_918_dma_debug_init1>:
> 80001160d0c8:   feb0d528.word   0xfeb0d528
> 
> [...]
> 
> 80001160d108 <__initcall__kmod_debugfs__357_848_debugfs_init1>:
> 80001160d108:   fff4326c.word   0xfff4326c
> [...]
> 
> 
> I always had the impression that initcall ordering tended to work out
> roughly alphabetical, such that entries from fs/* might come before
> kernel/*, but I guess it's at the whims of the linker in the end :/

init call ordering happens from link ordering.

> Perhaps the easiest thing to do is split out dma_debug_fs_init() and run
> that at a later level? We do want the dma-debug infrastructure itself to be
> up as early as possible, but I think the debugfs view of its internals can
> happily wait until closer to the time that there's actually a userspace to
> be able to look at it.

That seems like a better idea here, there's no need for "special
treatment" of debugfs.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: dma-api debugfs directory is not created since debugfs is not initialized

2021-04-27 Thread Greg KH

On Tue, Apr 27, 2021 at 01:34:27PM +0200, Corentin Labbe wrote:
> Hello
> 
> I try to debug some DMA problem on next-20210427, and so I have enabled 
> CONFIG_DMA_API_DEBUG=y.
> But the dma-api directory does show up in debugfs, but lot of other directory 
> exists in it.

Does it show up properly in 5.12?

> After debugging it seems due to commit 56348560d495 ("debugfs: do not attempt 
> to create a new file before the filesystem is initalized")
> Reverting the commit permit to "dma-api" debugfs to be found. (but seems not 
> the right way to fix it).

We have had some odd start-up ordering issues that the above commit has
caused to show.  Given that this commit is now in stable kernels, with
no report of this issue so far, I'm worried that maybe this is a dma
subsystem ordering issue?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [Resend RFC PATCH V2 08/12] UIO/Hyper-V: Not load UIO HV driver in the isolation VM.

2021-04-14 Thread Greg KH

On Wed, Apr 14, 2021 at 10:49:41AM -0400, Tianyu Lan wrote:
> From: Tianyu Lan 
> 
> UIO HV driver should not load in the isolation VM for security reason.
> Return ENOTSUPP in the hv_uio_probe() in the isolation VM.
> 
> Signed-off-by: Tianyu Lan 
> ---
>  drivers/uio/uio_hv_generic.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
> index 0330ba99730e..678b021d66f8 100644
> --- a/drivers/uio/uio_hv_generic.c
> +++ b/drivers/uio/uio_hv_generic.c
> @@ -29,6 +29,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "../hv/hyperv_vmbus.h"
>  
> @@ -241,6 +242,10 @@ hv_uio_probe(struct hv_device *dev,
>   void *ring_buffer;
>   int ret;
>  
> + /* UIO driver should not be loaded in the isolation VM.*/
> + if (hv_is_isolation_supported())
> + return -ENOTSUPP;
> + 
>   /* Communicating with host has to be via shared memory not hypercall */
>   if (!channel->offermsg.monitor_allocated) {
>   dev_err(&dev->device, "vmbus channel requires hypercall\n");
> -- 
> 2.25.1
> 

Again you send out known-wrong patches?

:(
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 5.4 v3 1/1] iommu/vt-d: Fix agaw for a supported 48 bit guest address width

2021-04-13 Thread Greg KH

On Mon, Apr 12, 2021 at 01:27:35PM -0700, Saeed Mirzamohammadi wrote:
> The IOMMU driver calculates the guest addressability for a DMA request
> based on the value of the mgaw reported from the IOMMU. However, this
> is a fused value and as mentioned in the spec, the guest width
> should be calculated based on the minimum of supported adjusted guest
> address width (SAGAW) and MGAW.
> 
> This is from specification:
> "Guest addressability for a given DMA request is limited to the
> minimum of the value reported through this field and the adjusted
> guest address width of the corresponding page-table structure.
> (Adjusted guest address widths supported by hardware are reported
> through the SAGAW field)."
> 
> This causes domain initialization to fail and following
> errors appear for EHCI PCI driver:
> 
> [2.486393] ehci-pci :01:00.4: EHCI Host Controller
> [2.486624] ehci-pci :01:00.4: new USB bus registered, assigned bus
> number 1
> [2.489127] ehci-pci :01:00.4: DMAR: Allocating domain failed
> [2.489350] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity
> mapping
> [2.489359] ehci-pci :01:00.4: can't setup: -12
> [2.489531] ehci-pci :01:00.4: USB bus 1 deregistered
> [2.490023] ehci-pci :01:00.4: init :01:00.4 fail, -12
> [2.490358] ehci-pci: probe of :01:00.4 failed with error -12
> 
> This issue happens when the value of the sagaw corresponds to a
> 48-bit agaw. This fix updates the calculation of the agaw based on
> the minimum of IOMMU's sagaw value and MGAW.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Saeed Mirzamohammadi 
> Tested-by: Camille Lu 
> Reviewed-by: Lu Baolu 
> 
> ---

Also when you resend this, please state the commit that this fixes, as
this must be a regression, right?  What kernel version did this previous
work for?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 5.4 v3 1/1] iommu/vt-d: Fix agaw for a supported 48 bit guest address width

2021-04-13 Thread Greg KH

On Tue, Apr 13, 2021 at 11:05:34AM -0700, Saeed Mirzamohammadi wrote:
> Hi Greg,
> 
> I don’t have any commit ID since the fix is not in mainline or any Linus’ 
> tree yet. The driver has completely changed for newer stable versions (and 
> also mainline) and the fix only applies for 5.4, 4.19, and 4.14 stable 
> kernels.

Why can we not just take what is in mainline?

And if not, then you need to document the heck out of this in the
changelog text, and get all of the related maintainers in the area to
sign off on this.  Diverging from Linus's tree creates a big burden over
time, you have to make this really really obvious why you are doing
this, and what you are doing here so that everyone agrees with it.

Remember, 90% of all of these types of "do it differently than Linus's
tree" are buggy and cause problems, be very careful.

Please fix up and resend.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 5.4 v3 1/1] iommu/vt-d: Fix agaw for a supported 48 bit guest address width

2021-04-13 Thread Greg KH



On Mon, Apr 12, 2021 at 01:27:35PM -0700, Saeed Mirzamohammadi wrote:
> The IOMMU driver calculates the guest addressability for a DMA request
> based on the value of the mgaw reported from the IOMMU. However, this
> is a fused value and as mentioned in the spec, the guest width
> should be calculated based on the minimum of supported adjusted guest
> address width (SAGAW) and MGAW.
> 
> This is from specification:
> "Guest addressability for a given DMA request is limited to the
> minimum of the value reported through this field and the adjusted
> guest address width of the corresponding page-table structure.
> (Adjusted guest address widths supported by hardware are reported
> through the SAGAW field)."
> 
> This causes domain initialization to fail and following
> errors appear for EHCI PCI driver:
> 
> [2.486393] ehci-pci :01:00.4: EHCI Host Controller
> [2.486624] ehci-pci :01:00.4: new USB bus registered, assigned bus
> number 1
> [2.489127] ehci-pci :01:00.4: DMAR: Allocating domain failed
> [2.489350] ehci-pci :01:00.4: DMAR: 32bit DMA uses non-identity
> mapping
> [2.489359] ehci-pci :01:00.4: can't setup: -12
> [2.489531] ehci-pci :01:00.4: USB bus 1 deregistered
> [2.490023] ehci-pci :01:00.4: init :01:00.4 fail, -12
> [2.490358] ehci-pci: probe of :01:00.4 failed with error -12
> 
> This issue happens when the value of the sagaw corresponds to a
> 48-bit agaw. This fix updates the calculation of the agaw based on
> the minimum of IOMMU's sagaw value and MGAW.
> 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Saeed Mirzamohammadi 
> Tested-by: Camille Lu 
> Reviewed-by: Lu Baolu 

What is the git commit id of this patch in Linus's tree?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 00/30] DMA: Mundane typo fixes

2021-03-29 Thread Greg KH

On Mon, Mar 29, 2021 at 11:25:11AM +0530, Bhaskar Chowdhury wrote:
> On 07:29 Mon 29 Mar 2021, Christoph Hellwig wrote:
> > I really don't think these typo patchbomb are that useful.  I'm all
> > for fixing typos when working with a subsystem, but I'm not sure these
> > patchbombs help anything.
> > 
> I am sure you are holding the wrong end of the wand and grossly failing to
> understand.

Please stop statements like this, it is not helpful and is doing nothing
but ensure that your patches will not be looked at in the future.

> Anyway, I hope I give a heads up ...find "your way" to fix those damn
> thing...it's glaring

There is no requirement that anyone accept patches that are sent to
them.  When you complain when receiving comments on them, that
shows you do not wish to work with others.

Sorry, but you are now on my local blacklist for a while, and I
encourage other maintainers to just ignore these patches as well.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin

2021-02-09 Thread Greg KH

On Tue, Feb 09, 2021 at 07:58:15PM +0800, Zhou Wang wrote:
> On 2021/2/9 17:37, Greg KH wrote:
> > On Tue, Feb 09, 2021 at 05:17:46PM +0800, Zhou Wang wrote:
> >> On 2021/2/8 6:02, Andy Lutomirski wrote:
> >>>
> >>>
> >>>> On Feb 7, 2021, at 12:31 AM, Zhou Wang  wrote:
> >>>>
> >>>> SVA(share virtual address) offers a way for device to share process 
> >>>> virtual
> >>>> address space safely, which makes more convenient for user space device
> >>>> driver coding. However, IO page faults may happen when doing DMA
> >>>> operations. As the latency of IO page fault is relatively big, DMA
> >>>> performance will be affected severely when there are IO page faults.
> >>>> From a long term view, DMA performance will be not stable.
> >>>>
> >>>> In high-performance I/O cases, accelerators might want to perform
> >>>> I/O on a memory without IO page faults which can result in dramatically
> >>>> increased latency. Current memory related APIs could not achieve this
> >>>> requirement, e.g. mlock can only avoid memory to swap to backup device,
> >>>> page migration can still trigger IO page fault.
> >>>>
> >>>> Various drivers working under traditional non-SVA mode are using
> >>>> their own specific ioctl to do pin. Such ioctl can be seen in v4l2,
> >>>> gpu, infiniband, media, vfio, etc. Drivers are usually doing dma
> >>>> mapping while doing pin.
> >>>>
> >>>> But, in SVA mode, pin could be a common need which isn't necessarily
> >>>> bound with any drivers, and neither is dma mapping needed by drivers
> >>>> since devices are using the virtual address of CPU. Thus, It is better
> >>>> to introduce a new common syscall for it.
> >>>>
> >>>> This patch leverages the design of userfaultfd and adds mempinfd for pin
> >>>> to avoid messing up mm_struct. A fd will be got by mempinfd, then user
> >>>> space can do pin/unpin pages by ioctls of this fd, all pinned pages under
> >>>> one file will be unpinned in file release process. Like pin page cases in
> >>>> other places, can_do_mlock is used to check permission and input
> >>>> parameters.
> >>>
> >>>
> >>> Can you document what the syscall does?
> >>
> >> Will add related document in Documentation/vm.
> > 
> > A manpage is always good, and will be required eventually :)
> 
> manpage is maintained in another repo. Do you mean add a manpage
> patch in this series?

It's good to show how it will be used, don't you think?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: DMA direct mapping fix for 5.4 and earlier stable branches

2021-02-09 Thread Greg KH

On Tue, Feb 09, 2021 at 10:19:07AM +, obayashi.yoshim...@socionext.com 
wrote:
> > How do you judge "mature"?
> 
>   My basic criteria are
> * Function is exist, but bug fix is necessary: "mature"
> * Function extension is necessary: "immature"
> 
> > And again, if a feature isn't present in a specific kernel version, why 
> > would you think that it would be
> > a viable solution for you to use?
> 
>   So, "a feature isn't present in a specific kernel version", 
> also means "immature" according to my criteria.

Great, please use the 5.10 or newer kernel release then.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin

2021-02-09 Thread Greg KH

On Tue, Feb 09, 2021 at 05:17:46PM +0800, Zhou Wang wrote:
> On 2021/2/8 6:02, Andy Lutomirski wrote:
> > 
> > 
> >> On Feb 7, 2021, at 12:31 AM, Zhou Wang  wrote:
> >>
> >> SVA(share virtual address) offers a way for device to share process 
> >> virtual
> >> address space safely, which makes more convenient for user space device
> >> driver coding. However, IO page faults may happen when doing DMA
> >> operations. As the latency of IO page fault is relatively big, DMA
> >> performance will be affected severely when there are IO page faults.
> >> From a long term view, DMA performance will be not stable.
> >>
> >> In high-performance I/O cases, accelerators might want to perform
> >> I/O on a memory without IO page faults which can result in dramatically
> >> increased latency. Current memory related APIs could not achieve this
> >> requirement, e.g. mlock can only avoid memory to swap to backup device,
> >> page migration can still trigger IO page fault.
> >>
> >> Various drivers working under traditional non-SVA mode are using
> >> their own specific ioctl to do pin. Such ioctl can be seen in v4l2,
> >> gpu, infiniband, media, vfio, etc. Drivers are usually doing dma
> >> mapping while doing pin.
> >>
> >> But, in SVA mode, pin could be a common need which isn't necessarily
> >> bound with any drivers, and neither is dma mapping needed by drivers
> >> since devices are using the virtual address of CPU. Thus, It is better
> >> to introduce a new common syscall for it.
> >>
> >> This patch leverages the design of userfaultfd and adds mempinfd for pin
> >> to avoid messing up mm_struct. A fd will be got by mempinfd, then user
> >> space can do pin/unpin pages by ioctls of this fd, all pinned pages under
> >> one file will be unpinned in file release process. Like pin page cases in
> >> other places, can_do_mlock is used to check permission and input
> >> parameters.
> > 
> > 
> > Can you document what the syscall does?
> 
> Will add related document in Documentation/vm.

A manpage is always good, and will be required eventually :)

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: DMA direct mapping fix for 5.4 and earlier stable branches

2021-02-09 Thread Greg KH

On Tue, Feb 09, 2021 at 09:05:40AM +, obayashi.yoshim...@socionext.com 
wrote:
> > > As the drivers are currently under development and Socionext has
> > > chosen 5.4 stable kernel for their development. So I will let
> > > Obayashi-san answer this if it's possible for them to migrate to 5.10
> > > instead?
> 
>   We have started this development project from last August, 
> so we have selected 5.4 as most recent and longest lifetime LTS 
> version at that time.
> 
>   And we have already finished to develop other device drivers, 
> and Video converter and CODEC drivers are now in development.
> 
> > Why pick a kernel that doesn not support the features they require?
> > That seems very odd and unwise.
> 
>   From the view point of ZeroCopy using DMABUF, is 5.4 not 
> mature enough, and is 5.10 enough mature ?
>   This is the most important point for judging migration.

How do you judge "mature"?

And again, if a feature isn't present in a specific kernel version, why
would you think that it would be a viable solution for you to use?

good luck!

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: preserve DMA offsets when using swiotlb v2

2021-02-09 Thread Greg KH

On Tue, Feb 09, 2021 at 09:41:56AM +0100, Christoph Hellwig wrote:
> Sorry for being a little pushy, any chance we could get this reviewed
> in time for the 5.12 merge window?

No objection from me, my ack is already on the patch that you need it
for :)

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/8] driver core: add a min_align_mask field to struct device_dma_parameters

2021-02-04 Thread Greg KH

On Thu, Feb 04, 2021 at 08:30:28PM +0100, Christoph Hellwig wrote:
> From: Jianxiong Gao 
> 
> Some devices rely on the address offset in a page to function
> correctly (NVMe driver as an example). These devices may use
> a different page size than the Linux kernel. The address offset
> has to be preserved upon mapping, and in order to do so, we
> need to record the page_offset_mask first.
> 
> Signed-off-by: Jianxiong Gao 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/linux/device.h  |  1 +
>  include/linux/dma-mapping.h | 16 
>  2 files changed, 17 insertions(+)

Acked-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V2 0/3] SWIOTLB: Preserve swiotlb map offset when needed.

2021-02-02 Thread Greg KH

On Mon, Feb 01, 2021 at 10:30:14AM -0800, Jianxiong Gao wrote:
> NVMe driver and other applications may depend on the data offset
> to operate correctly. Currently when unaligned data is mapped via
> SWIOTLB, the data is mapped as slab aligned with the SWIOTLB. This
> patch adds an option to make sure the mapped data preserves its
> offset of the orginal addrss.
> 
> Without the patch when creating xfs formatted disk on NVMe backends,
> with swiotlb=force in kernel boot option, creates the following error:
> meta-data=/dev/nvme2n1   isize=512agcount=4, agsize=131072 blks
>  =   sectsz=512   attr=2, projid32bit=1
>  =   crc=1finobt=1, sparse=0, rmapbt=0, refl
> ink=0
> data =   bsize=4096   blocks=524288, imaxpct=25
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0 ftype=1
> log  =internal log   bsize=4096   blocks=2560, version=2
>  =   sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none   extsz=4096   blocks=0, rtextents=0
> mkfs.xfs: pwrite failed: Input/output error
> 
> Jianxiong Gao (3):
>   Adding page_offset_mask to device_dma_parameters
>   Add swiotlb offset preserving mapping when
> dma_dma_parameters->page_offset_mask is non zero.
>   Adding device_dma_parameters->offset_preserve_mask to NVMe driver.
> 
>  drivers/nvme/host/pci.c |  4 
>  include/linux/device.h  |  1 +
>  include/linux/dma-mapping.h | 17 +
>  kernel/dma/swiotlb.c| 16 +++-
>  4 files changed, 37 insertions(+), 1 deletion(-)
> 
> -- 
> 2.27.0
> 

You forgot to mention somewhere, what changed from v1 to v2 :(
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 0/3] Adding offset keeping option when mapping data via SWIOTLB.*

2021-01-28 Thread Greg KH

On Wed, Jan 27, 2021 at 04:38:26PM -0800, Jianxiong Gao wrote:
> NVMe driver and other applications may depend on the data offset
> to operate correctly. Currently when unaligned data is mapped via
> SWIOTLB, the data is mapped as slab aligned with the SWIOTLB. This
> patch adds an option to make sure the mapped data preserves its
> offset of the orginal addrss.
> 
> Without the patch when creating xfs formatted disk on NVMe backends,
> with swiotlb=force in kernel boot option, creates the following error:
> meta-data=/dev/nvme2n1   isize=512agcount=4, agsize=131072 blks
>  =   sectsz=512   attr=2, projid32bit=1
>  =   crc=1finobt=1, sparse=0, rmapbt=0, 
> refl
> ink=0
> data =   bsize=4096   blocks=524288, imaxpct=25
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0 ftype=1
> log  =internal log   bsize=4096   blocks=2560, version=2
>  =   sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none   extsz=4096   blocks=0, rtextents=0
> mkfs.xfs: pwrite failed: Input/output error
> 
> Jianxiong Gao (3):
>   Adding page_offset_mask to device_dma_parameters
>   Add swiotlb offset preserving mapping when
> dma_dma_parameters->page_offset_mask is non zero.
>   Adding device_dma_parameters->offset_preserve_mask to NVMe driver.

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You did not write a descriptive Subject: for the patch, allowing Greg,
  and everyone else, to know what this patch is all about.  Please read
  the section entitled "The canonical patch format" in the kernel file,
  Documentation/SubmittingPatches for what a proper Subject: line should
  look like.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/vt-d: gracefully handle DMAR units with no supported address widths

2021-01-20 Thread Greg KH

On Wed, Jan 20, 2021 at 03:55:05PM +, David Woodhouse wrote:
> On Wed, 2021-01-20 at 13:06 +0100, Greg KH wrote:
> > On Wed, Jan 20, 2021 at 09:42:43AM +, David Woodhouse wrote:
> > > On Thu, 2020-09-24 at 15:08 +0100, David Woodhouse wrote:
> > > > From: David Woodhouse 
> > > > 
> > > > Instead of bailing out completely, such a unit can still be used for
> > > > interrupt remapping.
> > > > 
> > > > Signed-off-by: David Woodhouse 
> > > 
> > > Could we have this for stable too please, along with the trivial
> > > subsequent fixup. They are:
> > > 
> > > c40c1018 ("iommu/vt-d: Gracefully handle DMAR units with no supported 
> > > address widths")
> > > 9def3b1a07c4 ("iommu/vt-d: Don't dereference iommu_device if IOMMU_API is 
> > > not built")
> > > 
> > > They apply fairly straightforwardly when backported; let me know if you
> > > want us to send patches.
> > 
> > What stable kernel(s) do you want this in?  The above patches are
> > already in 5.10.
> 
> It's a fairly simple bug fix, to still use a given IOMMU for interrupt
> remapping even if it can't be used for DMA mapping.
> 
> Those features are somewhat orthogonal, and it was wrong for the kernel
> to bail out on the IOMMU hardware completely.
> 
> The interrupt remapping support is what's required for Intel boxes (or
> VMs) to run with more than 255 CPUs. It should be fairly simple to fix
> the same bug at least as far back as 4.14.

I tried applying these to 5.4, 4.19, and 4.14, and they all fail to
build:

drivers/iommu/dmar.c: In function ‘free_iommu’:
drivers/iommu/dmar.c:1140:35: error: ‘struct intel_iommu’ has no member named 
‘drhd’
 1140 |  if (intel_iommu_enabled && !iommu->drhd->ignored) {
  |   ^~

So if you could provide a working set of patches backported, I will be
glad to queue them up.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/vt-d: gracefully handle DMAR units with no supported address widths

2021-01-20 Thread Greg KH

On Wed, Jan 20, 2021 at 09:42:43AM +, David Woodhouse wrote:
> On Thu, 2020-09-24 at 15:08 +0100, David Woodhouse wrote:
> > From: David Woodhouse 
> > 
> > Instead of bailing out completely, such a unit can still be used for
> > interrupt remapping.
> > 
> > Signed-off-by: David Woodhouse 
> 
> Could we have this for stable too please, along with the trivial
> subsequent fixup. They are:
> 
> c40c1018 ("iommu/vt-d: Gracefully handle DMAR units with no supported 
> address widths")
> 9def3b1a07c4 ("iommu/vt-d: Don't dereference iommu_device if IOMMU_API is not 
> built")
> 
> They apply fairly straightforwardly when backported; let me know if you
> want us to send patches.

What stable kernel(s) do you want this in?  The above patches are
already in 5.10.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v3 2/6] swiotlb: Add restricted DMA pool

2021-01-13 Thread Greg KH

On Wed, Jan 13, 2021 at 12:51:26PM +0100, Christoph Hellwig wrote:
> On Wed, Jan 06, 2021 at 08:50:03AM +0100, Greg KH wrote:
> > > --- a/include/linux/device.h
> > > +++ b/include/linux/device.h
> > > @@ -413,6 +413,7 @@ struct dev_links_info {
> > >   * @dma_pools:   Dma pools (if dma'ble device).
> > >   * @dma_mem: Internal for coherent mem override.
> > >   * @cma_area:Contiguous memory area for dma allocations
> > > + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.
> > 
> > Why does this have to be added here?  Shouldn't the platform-specific
> > code handle it instead?
> 
> The whole code added here is pretty generic.  What we need to eventually
> do, though is to add a separate dma_device instead of adding more and more
> bloat to struct device.

I have no objections for that happening!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] dma: mark unmapped DMA scatter/gather invalid

2021-01-11 Thread Greg KH

On Mon, Jan 11, 2021 at 07:43:35AM -0800, Marc Orr wrote:
> This patch updates dma_direct_unmap_sg() to mark each scatter/gather
> entry invalid, after it's unmapped. This fixes two issues:
> 
> 1. It makes the unmapping code able to tolerate a double unmap.
> 2. It prevents the NVMe driver from erroneously treating an unmapped DMA
> address as mapped.
> 
> The bug that motivated this patch was the following sequence, which
> occurred within the NVMe driver, with the kernel flag `swiotlb=force`.
> 
> * NVMe driver calls dma_direct_map_sg()
> * dma_direct_map_sg() fails part way through the scatter gather/list
> * dma_direct_map_sg() calls dma_direct_unmap_sg() to unmap any entries
>   succeeded.
> * NVMe driver calls dma_direct_unmap_sg(), redundantly, leading to a
>   double unmap, which is a bug.
> 
> With this patch, a hadoop workload running on a cluster of three AMD
> SEV VMs, is able to succeed. Without the patch, the hadoop workload
> suffers application-level and even VM-level failures.
> 
> Tested-by: Jianxiong Gao 
> Tested-by: Marc Orr 
> Reviewed-by: Jianxiong Gao 
> Signed-off-by: Marc Orr 
> ---
>  kernel/dma/direct.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v3 2/6] swiotlb: Add restricted DMA pool

2021-01-05 Thread Greg KH

On Wed, Jan 06, 2021 at 11:41:20AM +0800, Claire Chang wrote:
> Add the initialization function to create restricted DMA pools from
> matching reserved-memory nodes in the device tree.
> 
> Signed-off-by: Claire Chang 
> ---
>  include/linux/device.h  |   4 ++
>  include/linux/swiotlb.h |   7 +-
>  kernel/dma/Kconfig  |   1 +
>  kernel/dma/swiotlb.c| 144 ++--
>  4 files changed, 131 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/device.h b/include/linux/device.h
> index 89bb8b84173e..ca6f71ec8871 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -413,6 +413,7 @@ struct dev_links_info {
>   * @dma_pools:   Dma pools (if dma'ble device).
>   * @dma_mem: Internal for coherent mem override.
>   * @cma_area:Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Internal for swiotlb io_tlb_mem override.

Why does this have to be added here?  Shouldn't the platform-specific
code handle it instead?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE

2020-09-25 Thread Greg KH

On Fri, Sep 25, 2020 at 11:45:05AM +, Suravee Suthikulpanit wrote:
> When using 128-bit interrupt-remapping table entry (IRTE) (a.k.a GA mode),
> current driver disables interrupt remapping when it updates the IRTE
> so that the upper and lower 64-bit values can be updated safely.
> 
> However, this creates a small window, where the interrupt could
> arrive and result in IO_PAGE_FAULT (for interrupt) as shown below.
> 
>   IOMMU DriverDevice IRQ
>   ===
>   irte.RemapEn=0
>...
>change IRTEIRQ from device ==> IO_PAGE_FAULT !!
>...
>   irte.RemapEn=1
> 
> This scenario has been observed when changing irq affinity on a system
> running I/O-intensive workload, in which the destination APIC ID
> in the IRTE is updated.
> 
> Instead, use cmpxchg_double() to update the 128-bit IRTE at once without
> disabling the interrupt remapping. However, this means several features,
> which require GA (128-bit IRTE) support will also be affected if cmpxchg16b
> is not supported (which is unprecedented for AMD processors w/ IOMMU).
> 
> Cc: sta...@vger.kernel.org
> Fixes: 880ac60e2538 ("iommu/amd: Introduce interrupt remapping ops structure")
> Reported-by: Sean Osborne 
> Signed-off-by: Suravee Suthikulpanit 
> Tested-by: Erik Rockstrom 
> Reviewed-by: Joao Martins 
> Link: 
> https://lore.kernel.org/r/20200903093822.52012-3-suravee.suthikulpa...@amd.com
> Signed-off-by: Joerg Roedel 
> ---
> Note: This patch is the back-port on top of the stable branch linux-5.4.y
> for the upstream commit e52d58d54a32 ("iommu/amd: Use cmpxchg_double() when
> updating 128-bit IRTE") since the original patch does not apply cleanly.

Now queued up, thanks.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-10 Thread Greg KH

On Thu, Sep 10, 2020 at 11:13:51AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 10, 2020 at 09:53:51AM +0200, Greg KH wrote:
> > >   /*
> > >* Please refer to usb_alloc_dev() to see why we set
> > > -  * dma_mask and dma_pfn_offset.
> > > +  * dma_mask and dma_range_map.
> > >*/
> > >   intf->dev.dma_mask = dev->dev.dma_mask;
> > > - intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> > > + if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> > > + dev_err(&dev->dev, "failed to copy DMA map\n");
> > 
> > We tell the user, but then just keep on running?  Is there anything that
> > we can do here?
> > 
> > If not, why not have dma_direct_copy_range_map() print out the error?
> 
> At least for USB I'm pretty sure this isn't required at all.  I've been
> running with the patch below on my desktop for two days now trying all
> the usb toys I have (in addition to grepping for obvious abuses in
> the drivers).  remoteproc is a different story, but the DMA handling
> seems there is sketchy to start with..
> 
> ---
> >From 8bae3e6833f2ca431dcfcbc8f9cced7d5e972a01 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig 
> Date: Wed, 9 Sep 2020 08:28:59 +0200
> Subject: usb: don't inherity DMA properties for USB devices
> 
> As the comment in usb_alloc_dev correctly states, drivers can't use
> the DMA API on usb device, and at least calling dma_set_mask on them
> is highly dangerous.  Unlike what the comment states upper level drivers
> also can't really use the presence of a dma mask to check for DMA
> support, as the dma_mask is set by default for most busses.
> 
> Remove the copying over of DMA information, and remove the now unused
> dma_direct_copy_range_map export.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/usb/core/message.c |  7 ---
>  drivers/usb/core/usb.c | 13 -
>  kernel/dma/direct.c|  1 -
>  3 files changed, 21 deletions(-)
> 
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 935ee98e049f65..9e45732dc1d1d1 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1954,13 +1954,6 @@ int usb_set_configuration(struct usb_device *dev, int 
> configuration)
>   intf->dev.bus = &usb_bus_type;
>   intf->dev.type = &usb_if_device_type;
>   intf->dev.groups = usb_interface_groups;
> - /*
> -  * Please refer to usb_alloc_dev() to see why we set
> -  * dma_mask and dma_range_map.
> -  */
> - intf->dev.dma_mask = dev->dev.dma_mask;
> - if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> - dev_err(&dev->dev, "failed to copy DMA map\n");
>   INIT_WORK(&intf->reset_ws, __usb_queue_reset_device);
>   intf->minor = -1;
>   device_initialize(&intf->dev);
> diff --git a/drivers/usb/core/usb.c b/drivers/usb/core/usb.c
> index 23d451f6894d70..9b4ac4415f1a47 100644
> --- a/drivers/usb/core/usb.c
> +++ b/drivers/usb/core/usb.c
> @@ -599,19 +599,6 @@ struct usb_device *usb_alloc_dev(struct usb_device 
> *parent,
>   dev->dev.bus = &usb_bus_type;
>   dev->dev.type = &usb_device_type;
>   dev->dev.groups = usb_device_groups;
> - /*
> -  * Fake a dma_mask/offset for the USB device:
> -  * We cannot really use the dma-mapping API (dma_alloc_* and
> -  * dma_map_*) for USB devices but instead need to use
> -  * usb_alloc_coherent and pass data in 'urb's, but some subsystems
> -  * manually look into the mask/offset pair to determine whether
> -  * they need bounce buffers.
> -  * Note: calling dma_set_mask() on a USB device would set the
> -  * mask for the entire HCD, so don't do that.
> -  */
> - dev->dev.dma_mask = bus->sysdev->dma_mask;
> - if (dma_direct_copy_range_map(&dev->dev, bus->sysdev))
> - dev_err(&dev->dev, "failed to copy DMA map\n");
>   set_dev_node(&dev->dev, dev_to_node(bus->sysdev));
>   dev->state = USB_STATE_ATTACHED;
>   dev->lpm_disable_count = 1;
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index fc815f7375e282..3af257571a3b42 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -552,4 +552,3 @@ int dma_direct_copy_range_map(struct device *to, struct 
> device *from)
>   to->dma_range_map = new_map;
>   return 0;
>  }
> -EXPORT_SYMBOL_GPL(dma_direct_copy_range_map);

If you think this is safe to do, great, but for some reason I thought
host controllers wanted this information, and that the scsi layer was
the offending layer that also wanted this type of thing.  But it's been
a really long time so I don't remember for sure, sorry.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 3/3] dma-mapping: introduce DMA range map, supplanting dma_pfn_offset

2020-09-10 Thread Greg KH

On Thu, Sep 10, 2020 at 07:40:38AM +0200, Christoph Hellwig wrote:
> From: Jim Quinlan 
> 
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs.  It subsumes the role of "dev->dma_pfn_offset" which was only
> capable of holding a single uniform offset and had no region bounds
> checking.
> 
> The function of_dma_get_range() has been modified so that it takes a single
> argument -- the device node -- and returns a map, NULL, or an error code.
> The map is an array that holds the information regarding the DMA regions.
> Each range entry contains the address offset, the cpu_start address, the
> dma_start address, and the size of the region.
> 
> of_dma_configure() is the typical manner to set range offsets but there are
> a number of ad hoc assignments to "dev->dma_pfn_offset" in the kernel
> driver code.  These cases now invoke the function
> dma_attach_offset_range(dev, cpu_addr, dma_addr, size).

So if an error happens, we don't do anything?

ice_init(dev->dev);
> diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> index 6197938dcc2d8f..935ee98e049f65 100644
> --- a/drivers/usb/core/message.c
> +++ b/drivers/usb/core/message.c
> @@ -1956,10 +1956,11 @@ int usb_set_configuration(struct usb_device *dev, int 
> configuration)
>   intf->dev.groups = usb_interface_groups;
>   /*
>* Please refer to usb_alloc_dev() to see why we set
> -  * dma_mask and dma_pfn_offset.
> +  * dma_mask and dma_range_map.
>*/
>   intf->dev.dma_mask = dev->dev.dma_mask;
> - intf->dev.dma_pfn_offset = dev->dev.dma_pfn_offset;
> + if (dma_direct_copy_range_map(&intf->dev, &dev->dev))
> + dev_err(&dev->dev, "failed to copy DMA map\n");

We tell the user, but then just keep on running?  Is there anything that
we can do here?

If not, why not have dma_direct_copy_range_map() print out the error?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2] iommu: amd: Fix IO_PAGE_FAULT due to __unmap_single() size overflow

2020-07-30 Thread Greg KH

On Wed, Jun 24, 2020 at 01:58:27PM +0200, Joerg Roedel wrote:
> Hi Greg,
> 
> On Wed, Jun 24, 2020 at 11:09:06AM +0200, Greg KH wrote:
> > So what exact stable kernel version(s) do you want to see this patch
> > applied to?
> 
> It is needed in kernels <= v5.4. Linux v5.5 has replaced the code with
> the bug.

This doesn't apply to any older kernels that I can see :(
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2] iommu: amd: Fix IO_PAGE_FAULT due to __unmap_single() size overflow

2020-06-24 Thread Greg KH

On Wed, Jun 24, 2020 at 08:41:21AM +, Suravee Suthikulpanit wrote:
> Currently, an integer is used to specify the size in unmap_sg().
> With 2GB worth of pages (512k 4k pages), it requires 31 bits
> (i.e. (1 << 19) << 12), which overflows the integer, and ends up
> unmapping more pages than intended. Subsequently, this results in
> IO_PAGE_FAULT.
> 
> Uses size_t instead of int to pass parameter to __unmap_single().
> 
> Please note that this patch is only for the stable-kernels tree
> because the commit be62dbf554c5 ("iommu/amd: Convert AMD iommu driver
> to the dma-iommu api"), which removes the function unmap_sg()
> was introduced in v5.5. This patch is not applicable in subsequent
> kernel versions.

So what exact stable kernel version(s) do you want to see this patch
applied to?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Greg KH

On Tue, Jun 02, 2020 at 05:09:57AM +, Song Bao Hua (Barry Song) wrote:
> > >
> > > Platform devices are NUMA?  That's crazy, and feels like a total abuse
> > > of platform devices and drivers that really should belong on a "real"
> > > bus.
> > 
> > I am not sure if it is an abuse of platform device. But smmu is a platform
> > device,
> > drivers/iommu/arm-smmu-v3.c is a platform driver.
> > In a typical ARM server, there are maybe multiple SMMU devices which can
> > support
> > IO virtual address and page tables for other devices on PCI-like busses.
> > Each different SMMU device might be close to different NUMA node. There is
> > really a hardware topology.
> > 
> > If you have multiple CPU packages in a NUMA server, some platform devices
> > might
> > Belong to CPU0, some other might belong to CPU1.
> 
> Those devices are populated by acpi_iort for an ARM server:
> 
> drivers/acpi/arm64/iort.c:
> 
> static const struct iort_dev_config iort_arm_smmu_v3_cfg __initconst = {
> .name = "arm-smmu-v3",
> .dev_dma_configure = arm_smmu_v3_dma_configure,
> .dev_count_resources = arm_smmu_v3_count_resources,
> .dev_init_resources = arm_smmu_v3_init_resources,
> .dev_set_proximity = arm_smmu_v3_set_proximity,
> };
> 
> void __init acpi_iort_init(void)
> {
> acpi_status status;
> 
> status = acpi_get_table(ACPI_SIG_IORT, 0, &iort_table);
> ...
> iort_check_id_count_workaround(iort_table);
> iort_init_platform_devices();
> }
> 
> static void __init iort_init_platform_devices(void)
> {
> ...
> 
> for (i = 0; i < iort->node_count; i++) {
> if (iort_node >= iort_end) {
> pr_err("iort node pointer overflows, bad table\n");
> return;
> }
> 
> iort_enable_acs(iort_node);
> 
> ops = iort_get_dev_cfg(iort_node);
> if (ops) {
> fwnode = acpi_alloc_fwnode_static();
> if (!fwnode)
> return;
> 
> iort_set_fwnode(iort_node, fwnode);
> 
> ret = iort_add_platform_device(iort_node, ops);
> if (ret) {
> iort_delete_fwnode(iort_node);
> acpi_free_fwnode_static(fwnode);
> return;
> }
> }
> 
> ...
> }
> ...
> }
> 
> NUMA node is got from ACPI:
> 
> static int  __init arm_smmu_v3_set_proximity(struct device *dev,
>   struct acpi_iort_node *node)
> {
> struct acpi_iort_smmu_v3 *smmu;
> 
> smmu = (struct acpi_iort_smmu_v3 *)node->node_data;
> if (smmu->flags & ACPI_IORT_SMMU_V3_PXM_VALID) {
> int dev_node = acpi_map_pxm_to_node(smmu->pxm);
> 
> ...
> 
> set_dev_node(dev, dev_node);
> ...
> }
> return 0;
> }
> 
> Barry

That's fine, but those are "real" devices, not platform devices, right?

What platform device has this issue?  What one will show up this way
with the new patch?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Greg KH

On Tue, Jun 02, 2020 at 03:01:39PM +1200, Barry Song wrote:
> For some platform devices like iommu, particually ARM smmu, users may
> care about the numa locality. for example, if threads and drivers run
> near iommu, they may get much better performance on dma_unmap_sg.
> For other platform devices, users may still want to know the hardware
> topology.
> 
> Cc: Prime Zeng 
> Cc: Robin Murphy 
> Signed-off-by: Barry Song 
> ---
>  drivers/base/platform.c | 26 +-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index b27d0f6c18c9..7794b9a38d82 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1062,13 +1062,37 @@ static ssize_t driver_override_show(struct device 
> *dev,
>  }
>  static DEVICE_ATTR_RW(driver_override);
>  
> +static ssize_t numa_node_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + return sprintf(buf, "%d\n", dev_to_node(dev));
> +}
> +static DEVICE_ATTR_RO(numa_node);
> +
> +static umode_t platform_dev_attrs_visible(struct kobject *kobj, struct 
> attribute *a,
> + int n)
> +{
> + struct device *dev = container_of(kobj, typeof(*dev), kobj);
> +
> + if (a == &dev_attr_numa_node.attr &&
> + dev_to_node(dev) == NUMA_NO_NODE)
> + return 0;
> +
> + return a->mode;
> +}
>  
>  static struct attribute *platform_dev_attrs[] = {
>   &dev_attr_modalias.attr,
> + &dev_attr_numa_node.attr,
>   &dev_attr_driver_override.attr,
>   NULL,
>  };
> -ATTRIBUTE_GROUPS(platform_dev);
> +
> +static struct attribute_group platform_dev_group = {
> + .attrs = platform_dev_attrs,
> + .is_visible = platform_dev_attrs_visible,
> +};
> +__ATTRIBUTE_GROUPS(platform_dev);
>  
>  static int platform_uevent(struct device *dev, struct kobj_uevent_env *env)
>  {

Also you forgot a new entry in Documentation/ABI/ :(

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Greg KH

On Tue, Jun 02, 2020 at 03:01:39PM +1200, Barry Song wrote:
> For some platform devices like iommu, particually ARM smmu, users may
> care about the numa locality. for example, if threads and drivers run
> near iommu, they may get much better performance on dma_unmap_sg.
> For other platform devices, users may still want to know the hardware
> topology.
> 
> Cc: Prime Zeng 
> Cc: Robin Murphy 
> Signed-off-by: Barry Song 
> ---
>  drivers/base/platform.c | 26 +-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index b27d0f6c18c9..7794b9a38d82 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1062,13 +1062,37 @@ static ssize_t driver_override_show(struct device 
> *dev,
>  }
>  static DEVICE_ATTR_RW(driver_override);
>  
> +static ssize_t numa_node_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + return sprintf(buf, "%d\n", dev_to_node(dev));
> +}
> +static DEVICE_ATTR_RO(numa_node);
> +
> +static umode_t platform_dev_attrs_visible(struct kobject *kobj, struct 
> attribute *a,
> + int n)
> +{
> + struct device *dev = container_of(kobj, typeof(*dev), kobj);
> +
> + if (a == &dev_attr_numa_node.attr &&
> + dev_to_node(dev) == NUMA_NO_NODE)
> + return 0;
> +
> + return a->mode;
> +}
>  
>  static struct attribute *platform_dev_attrs[] = {
>   &dev_attr_modalias.attr,
> + &dev_attr_numa_node.attr,
>   &dev_attr_driver_override.attr,
>   NULL,
>  };
> -ATTRIBUTE_GROUPS(platform_dev);
> +
> +static struct attribute_group platform_dev_group = {
> + .attrs = platform_dev_attrs,
> + .is_visible = platform_dev_attrs_visible,
> +};
> +__ATTRIBUTE_GROUPS(platform_dev);
>  
>  static int platform_uevent(struct device *dev, struct kobj_uevent_env *env)
>  {

Platform devices are NUMA?  That's crazy, and feels like a total abuse
of platform devices and drivers that really should belong on a "real"
bus.

What drivers in the kernel today have this issue?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] dma: Fix max PFN arithmetic overflow on 32 bit systems

2020-05-27 Thread Greg KH

On Tue, May 26, 2020 at 07:57:49PM +0200, Alexander Dahl wrote:
> The intermediate result of the old term (4UL * 1024 * 1024 * 1024) is
> 4 294 967 296 or 0x1 which is no problem on 64 bit systems.  The
> patch does not change the later overall result of 0x10 for
> MAX_DMA32_PFN.  The new calculation yields the same result, but does not
> require 64 bit arithmetic.
> 
> On 32 bit systems the old calculation suffers from an arithmetic
> overflow in that intermediate term in braces: 4UL aka unsigned long int
> is 4 byte wide and an arithmetic overflow happens (the 0x1 does
> not fit in 4 bytes), the in braces result is truncated to zero, the
> following right shift does not alter that, so MAX_DMA32_PFN evaluates to
> 0 on 32 bit systems.
> 
> That wrong value is a problem in a comparision against MAX_DMA32_PFN in
> the init code for swiotlb in 'pci_swiotlb_detect_4gb()' to decide if
> swiotlb should be active.  That comparison yields the opposite result,
> when compiling on 32 bit systems.
> 
> This was not possible before 1b7e03ef7570 ("x86, NUMA: Enable emulation
> on 32bit too") when that MAX_DMA32_PFN was first made visible to x86_32
> (and which landed in v3.0).
> 
> In practice this wasn't a problem, unless you activated CONFIG_SWIOTLB
> on x86 (32 bit).
> 
> However for ARCH=x86 (32 bit) and if you have set CONFIG_IOMMU_INTEL,
> since c5a5dc4cbbf4 ("iommu/vt-d: Don't switch off swiotlb if bounce page
> is used") there's a dependency on CONFIG_SWIOTLB, which was not
> necessarily active before.  That landed in v5.4, where we noticed it in
> the fli4l Linux distribution.  We have CONFIG_IOMMU_INTEL active on both
> 32 and 64 bit kernel configs there (I could not find out why, so let's
> just say historical reasons).
> 
> The effect is at boot time 64 MiB (default size) were allocated for
> bounce buffers now, which is a noticeable amount of memory on small
> systems like pcengines ALIX 2D3 with 256 MiB memory, which are still
> frequently used as home routers.
> 
> We noticed this effect when migrating from kernel v4.19 (LTS) to v5.4
> (LTS) in fli4l and got that kernel messages for example:
> 
>   Linux version 5.4.22 (buildroot@buildroot) (gcc version 7.3.0 (Buildroot 
> 2018.02.8)) #1 SMP Mon Nov 26 23:40:00 CET 2018
>   …
>   Memory: 183484K/261756K available (4594K kernel code, 393K rwdata, 1660K 
> rodata, 536K init, 456K bss , 78272K reserved, 0K cma-reserved, 0K highmem)
>   …
>   PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>   software IO TLB: mapped [mem 0x0bb78000-0x0fb78000] (64MB)
> 
> The initial analysis and the suggested fix was done by user 'sourcejedi'
> at stackoverflow and explicitly marked as GPLv2 for inclusion in the
> Linux kernel:
> 
>   https://unix.stackexchange.com/a/520525/50007
> 
> The new calculation, which does not suffer from that overflow, is the
> same as for arch/mips now as suggested by Robin Murphy.
> 
> The fix was tested by fli4l users on round about two dozen different
> systems, including both 32 and 64 bit archs, bare metal and virtualized
> machines.
> 
> Fixes: 1b7e03ef7570 ("x86, NUMA: Enable emulation on 32bit too")
> Fixes: https://web.nettworks.org/bugs/browse/FFL-2560
> Fixes: https://unix.stackexchange.com/q/520065/50007
> Reported-by: Alan Jenkins 
> Suggested-by: Robin Murphy 
> Signed-off-by: Alexander Dahl 
> Cc: sta...@vger.kernel.org

Reviewed-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 02/28] staging: android: ion: use vmap instead of vm_map_ram

2020-04-08 Thread Greg KH

On Wed, Apr 08, 2020 at 01:59:00PM +0200, Christoph Hellwig wrote:
> vm_map_ram can keep mappings around after the vm_unmap_ram.  Using that
> with non-PAGE_KERNEL mappings can lead to all kinds of aliasing issues.
> 
> Signed-off-by: Christoph Hellwig 

Acked-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] device core: fix dma_mask handling in platform_device_register_full

2020-03-11 Thread Greg KH

On Wed, Mar 11, 2020 at 05:15:51PM +0100, Christoph Hellwig wrote:
> On Wed, Mar 11, 2020 at 05:14:23PM +0100, Greg KH wrote:
> > On Wed, Mar 11, 2020 at 05:07:10PM +0100, Christoph Hellwig wrote:
> > > Ever since the generic platform device code started allocating DMA masks
> > > itself the code to allocate and leak a private DMA mask in
> > > platform_device_register_full has been superflous.  More so the fact that
> > > it unconditionally frees the DMA mask allocation in the failure path
> > > can lead to slab corruption if the function fails later on for a device
> > > where it didn't allocate the mask.  Just remove the offending code.
> > > 
> > > Fixes: cdfee5623290 ("driver core: initialize a default DMA mask for 
> > > platform device")
> > > Reported-by: Artem S. Tashkinov 
> > > Tested-by: Artem S. Tashkinov 
> > 
> > No s-o-b from you?  :(
> > 
> > I can take this, or Linus, you can take this now if you want to as well:
> 
> Sorry, here it is:
> 
> Signed-off-by: Christoph Hellwig 

Is this still needed with the patch that Linus just committed to his
tree?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] device core: fix dma_mask handling in platform_device_register_full

2020-03-11 Thread Greg KH

On Wed, Mar 11, 2020 at 05:07:10PM +0100, Christoph Hellwig wrote:
> Ever since the generic platform device code started allocating DMA masks
> itself the code to allocate and leak a private DMA mask in
> platform_device_register_full has been superflous.  More so the fact that
> it unconditionally frees the DMA mask allocation in the failure path
> can lead to slab corruption if the function fails later on for a device
> where it didn't allocate the mask.  Just remove the offending code.
> 
> Fixes: cdfee5623290 ("driver core: initialize a default DMA mask for platform 
> device")
> Reported-by: Artem S. Tashkinov 
> Tested-by: Artem S. Tashkinov 

No s-o-b from you?  :(

I can take this, or Linus, you can take this now if you want to as well:

Reviewed-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: Patch "iommu/vt-d: Don't queue_iova() if there is no flush queue" has been added to the 5.2-stable tree

2019-07-29 Thread Greg KH

On Mon, Jul 29, 2019 at 05:36:09PM +0100, Dmitry Safonov wrote:
> Hi Greg,
> 
> On 7/29/19 5:30 PM, gre...@linuxfoundation.org wrote:
> > 
> > This is a note to let you know that I've just added the patch titled
> > 
> > iommu/vt-d: Don't queue_iova() if there is no flush queue
> > 
> > to the 5.2-stable tree which can be found at:
> > 
> > http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> > 
> > The filename of the patch is:
> >  iommu-vt-d-don-t-queue_iova-if-there-is-no-flush-queue.patch
> > and it can be found in the queue-5.2 subdirectory.
> > 
> > If you, or anyone else, feels it should not be added to the stable tree,
> > please let  know about it.
> 
> Please, make sure to apply the following patch too (kudos to Joerg).
> [Sorry for any inconvenience cause by my copy-n-paste mistake]
> 
> commit 201c1db90cd6
> Author: Joerg Roedel 
> Date:   Tue Jul 23 09:51:00 2019 +0200
> 
> iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA
> 
> The stub function for !CONFIG_IOMMU_IOVA needs to be
> 'static inline'.
> 
> Fixes: effa467870c76 ('iommu/vt-d: Don't queue_iova() if there is no
> flush queue')
> Signed-off-by: Joerg Roedel 
> 

Now also queued up to 5.2.y, thanks.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 12/16] staging/comedi: mark as broken

2019-06-14 Thread Greg KH

On Fri, Jun 14, 2019 at 04:48:57PM +0200, Christoph Hellwig wrote:
> On Fri, Jun 14, 2019 at 04:02:39PM +0200, Greg KH wrote:
> > Perhaps a hint as to how we can fix this up?  This is the first time
> > I've heard of the comedi code not handling dma properly.
> 
> It can be fixed by:
> 
>  a) never calling virt_to_page (or vmalloc_to_page for that matter)
> on dma allocation
>  b) never remapping dma allocation with conflicting cache modes
> (no remapping should be doable after a) anyway).

Ok, fair enough, have any pointers of drivers/core code that does this
correctly?  I can put it on my todo list, but might take a week or so...

Ian, want to look into doing this sooner?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 12/16] staging/comedi: mark as broken

2019-06-14 Thread Greg KH

On Fri, Jun 14, 2019 at 03:47:22PM +0200, Christoph Hellwig wrote:
> comedi_buf.c abuse the DMA API in gravely broken ways, as it assumes it
> can call virt_to_page on the result, and the just remap it as uncached
> using vmap.  Disable the driver until this API abuse has been fixed.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  drivers/staging/comedi/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/staging/comedi/Kconfig b/drivers/staging/comedi/Kconfig
> index 049b659fa6ad..e7c021d76cfa 100644
> --- a/drivers/staging/comedi/Kconfig
> +++ b/drivers/staging/comedi/Kconfig
> @@ -1,6 +1,7 @@
>  # SPDX-License-Identifier: GPL-2.0
>  config COMEDI
>   tristate "Data acquisition support (comedi)"
> + depends on BROKEN

Um, that's a huge sledgehammer.

Perhaps a hint as to how we can fix this up?  This is the first time
I've heard of the comedi code not handling dma properly.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: How to resolve an issue in swiotlb environment?

2019-06-13 Thread Greg KH

On Thu, Jun 13, 2019 at 01:16:32PM -0400, Alan Stern wrote:
> On Thu, 13 Jun 2019, Christoph Hellwig wrote:
> 
> > On Wed, Jun 12, 2019 at 10:43:11AM -0400, Alan Stern wrote:
> > > Would it be okay to rely on the assumption that USB block devices never 
> > > have block size < 512?  (We could even add code to the driver to 
> > > enforce this, although refusing to handle such devices at all might be 
> > > worse than getting an occasional error.)
> > 
> > sd.c only supports a few specific sector size, and none of them is
> > < 512 bytes:
> > 
> > if (sector_size != 512 &&
> > sector_size != 1024 &&
> > sector_size != 2048 &&
> > sector_size != 4096) {
> > ...
> > sdkp->capacity = 0;
> 
> Great!  So all we have to do is fix vhci-hcd.  Then we can remove all 
> the virt_boundary_mask stuff from usb-storage and uas entirely.
> 
> (I'm assuming wireless USB isn't a genuine issue.  As far as I know, it 
> is pretty much abandoned at this point.)

It is, I need to just move it to staging and delete the thing.  I don't
know of any hardware anymore.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: Postpone DMA tear-down until after devres release for probe failure

2019-04-03 Thread Greg KH

On Wed, Apr 03, 2019 at 09:02:36AM +0100, John Garry wrote:
> On 28/03/2019 10:08, John Garry wrote:
> > In commit 376991db4b64 ("driver core: Postpone DMA tear-down until after
> > devres release"), we changed the ordering of tearing down the device DMA
> > ops and releasing all the device's resources; this was because the DMA ops
> > should be maintained until we release the device's managed DMA memories.
> > 
> 
> Hi all,
> 
> A friendly reminder on this patch... I didn't see any update.
> 
> I thought that it had some importance.
> 
> Thanks,
> John
> 
> > However, we have seen another crash on an arm64 system when a
> > device driver probe fails:
> > 
> >   hisi_sas_v3_hw :74:02.0: Adding to iommu group 2
> >   scsi host1: hisi_sas_v3_hw
> >   BUG: Bad page state in process swapper/0  pfn:313f5
> >   page:7ec4fd40 count:1 mapcount:0
> >   mapping: index:0x0
> >   flags: 0xfffe0001000(reserved)
> >   raw: 0fffe0001000 7ec4fd48 7ec4fd48
> > 
> >   raw:   0001
> > 
> >   page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> >   bad because of flags: 0x1000(reserved)
> >   Modules linked in:
> >   CPU: 49 PID: 1 Comm: swapper/0 Not tainted
> > 5.1.0-rc1-43081-g22d97fd-dirty #1433
> >   Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI
> > RC0 - V1.12.01 01/29/2019
> >   Call trace:
> >   dump_backtrace+0x0/0x118
> >   show_stack+0x14/0x1c
> >   dump_stack+0xa4/0xc8
> >   bad_page+0xe4/0x13c
> >   free_pages_check_bad+0x4c/0xc0
> >   __free_pages_ok+0x30c/0x340
> >   __free_pages+0x30/0x44
> >   __dma_direct_free_pages+0x30/0x38
> >   dma_direct_free+0x24/0x38
> >   dma_free_attrs+0x9c/0xd8
> >   dmam_release+0x20/0x28
> >   release_nodes+0x17c/0x220
> >   devres_release_all+0x34/0x54
> >   really_probe+0xc4/0x2c8
> >   driver_probe_device+0x58/0xfc
> >   device_driver_attach+0x68/0x70
> >   __driver_attach+0x94/0xdc
> >   bus_for_each_dev+0x5c/0xb4
> >   driver_attach+0x20/0x28
> >   bus_add_driver+0x14c/0x200
> >   driver_register+0x6c/0x124
> >   __pci_register_driver+0x48/0x50
> >   sas_v3_pci_driver_init+0x20/0x28
> >   do_one_initcall+0x40/0x25c
> >   kernel_init_freeable+0x2b8/0x3c0
> >   kernel_init+0x10/0x100
> >   ret_from_fork+0x10/0x18
> >   Disabling lock debugging due to kernel taint
> >   BUG: Bad page state in process swapper/0  pfn:313f6
> >   page:7ec4fd80 count:1 mapcount:0
> > mapping: index:0x0
> > [   89.322983] flags: 0xfffe0001000(reserved)
> >   raw: 0fffe0001000 7ec4fd88 7ec4fd88
> > 
> >   raw:   0001
> > 
> > 
> > The crash occurs for the same reason.
> > 
> > In this case, on the really_probe() failure path, we are still clearing
> > the DMA ops prior to releasing the device's managed memories.
> > 
> > This patch fixes this issue by reordering the DMA ops teardown and the
> > call to devres_release_all() on the failure path.
> > 
> > Reported-by: Xiang Chen 
> > Tested-by: Xiang Chen 
> > Signed-off-by: John Garry 

So does this "fix" 376991db4b64?  If so, should this be added to the
patch and also backported to the stable trees?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 2/3] HYPERV/IOMMU: Add Hyper-V stub IOMMU driver

2019-01-31 Thread Greg KH

On Thu, Jan 31, 2019 at 06:17:32PM +0800, lantianyu1...@gmail.com wrote:
> --- /dev/null
> +++ b/drivers/iommu/hyperv-iommu.c
> @@ -0,0 +1,189 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#define pr_fmt(fmt) "HYPERV-IR: " fmt

Minor nit, you never do any pr_*() calls, so this isn't needed, right?

> +static cpumask_t ioapic_max_cpumask = { CPU_BITS_NONE };
> +struct irq_domain *ioapic_ir_domain;

Global?  Why?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/3] x86/Hyper-V: Set x2apic destination mode to physical when x2apic is available

2019-01-31 Thread Greg KH

On Thu, Jan 31, 2019 at 06:17:31PM +0800, lantianyu1...@gmail.com wrote:
> From: Lan Tianyu 
> 
> Hyper-V doesn't provide irq remapping for IO-APIC. To enable x2apic,
> set x2apic destination mode to physcial mode when x2apic is available
> and Hyper-V IOMMU driver makes sure cpus assigned with IO-APIC irqs have
> 8-bit APIC id.
> 
> Signed-off-by: Lan Tianyu 
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 14 ++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index e81a2db..9d62f33 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -36,6 +36,8 @@
>  struct ms_hyperv_info ms_hyperv;
>  EXPORT_SYMBOL_GPL(ms_hyperv);
>  
> +extern int x2apic_phys;

Shouldn't this be in a .h file somewhere instead?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu/io-pgtable-arm-v7s: only kmemleak_ignore L2 tables

2019-01-28 Thread Greg KH

On Mon, Jan 28, 2019 at 05:43:01PM +0800, Nicolas Boichat wrote:
> L1 tables are allocated with __get_dma_pages, and therefore already
> ignored by kmemleak.
> 
> Without this, the kernel would print this error message on boot,
> when the first L1 table is allocated:
> 
> [2.810533] kmemleak: Trying to color unknown object at 0xffd652388000 
> as Black
> [2.818190] CPU: 5 PID: 39 Comm: kworker/5:0 Tainted: G S
> 4.19.16 #8
> [2.831227] Workqueue: events deferred_probe_work_func
> [2.836353] Call trace:
> ...
> [2.852532]  paint_ptr+0xa0/0xa8
> [2.855750]  kmemleak_ignore+0x38/0x6c
> [2.859490]  __arm_v7s_alloc_table+0x168/0x1f4
> [2.863922]  arm_v7s_alloc_pgtable+0x114/0x17c
> [2.868354]  alloc_io_pgtable_ops+0x3c/0x78
> ...
> 
> Fixes: e5fc9753b1a8314 ("iommu/io-pgtable: Add ARMv7 short descriptor 
> support")
> Signed-off-by: Nicolas Boichat 
> ---
> 
> I only tested this on top of my other series
> (https://patchwork.kernel.org/patch/10720495/), but I think the same fix
> applies. I'm still a bit confused as to why this only shows up now, as IIUC,
> the kmemleak_ignore call was always wrong with L1 tables.
> 
>  drivers/iommu/io-pgtable-arm-v7s.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: usb HC busted?

2018-07-17 Thread Greg KH

On Tue, Jul 17, 2018 at 10:31:38AM -0400, Alan Stern wrote:
> On Tue, 17 Jul 2018, Greg KH wrote:
> 
> > > From: Sudip Mukherjee 
> > > Date: Tue, 10 Jul 2018 09:50:00 +0100
> > > Subject: [PATCH] hacky solution to mem-corruption
> > > 
> > > Signed-off-by: Sudip Mukherjee 
> > > ---
> > >  drivers/usb/core/message.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > > index 7cd4ec33dbf4..7fdf7a27611d 100644
> > > --- a/drivers/usb/core/message.c
> > > +++ b/drivers/usb/core/message.c
> > > @@ -1398,7 +1398,8 @@ int usb_set_interface(struct usb_device *dev, int 
> > > interface, int alternate)
> > >   remove_intf_ep_devs(iface);
> > >   usb_remove_sysfs_intf_files(iface);
> > >   }
> > > - usb_disable_interface(dev, iface, true);
> > > + if (!(iface->cur_altsetting && alt))
> > > + usb_disable_interface(dev, iface, true);
> > 
> > 
> > 
> > This feels like a "correct" patch anyway, why would a driver keep
> > calling set_interface to an interface that it was already set to?
> > 
> > But can't we check for this higher up in the function?  This hack will
> > just not disable an interface but it will do all of the other stuff
> > being asked for.  Does the patch below also solve this for you?  It's
> > not a good solution of course, but it might work around the problem a
> > bit better.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> > diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
> > index 1a15392326fc..0f718f1a1ca3 100644
> > --- a/drivers/usb/core/message.c
> > +++ b/drivers/usb/core/message.c
> > @@ -1376,6 +1376,14 @@ int usb_set_interface(struct usb_device *dev, int 
> > interface, int alternate)
> > return -EINVAL;
> > }
> >  
> > +   if (iface->cur_altsetting == alt) {
> > +   /*
> > +* foolish bluetooth stack, don't try to set a setting you are
> > +* already set to...
> > +*/
> > +   return 0;
> > +   }
> > +
> > /* Make sure we have enough bandwidth for this alternate interface.
> >  * Remove the current alt setting and add the new alt setting.
> >  */
> 
> No, neither of these is right.  It's possible to use 
> usb_set_interface() as a kind of "soft" reset.  Even when the new 
> altsetting is specified to be the same as the current one, we still 
> have to tell the lower-layer drivers and hardware about it.

You are right, it's a hacky soft reset, I was just trying to figure out
what the bluetooth driver was trying to do.  I wouldn't expect it to be
calling that function a lot, but I guess it does :(

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: usb HC busted?

2018-07-17 Thread Greg KH

On Tue, Jul 17, 2018 at 02:20:00PM +0100, Sudip Mukherjee wrote:
> Hi Greg,
> 
> On Tue, Jul 17, 2018 at 02:04:11PM +0200, Greg KH wrote:
> > On Tue, Jul 17, 2018 at 12:41:04PM +0100, Sudip Mukherjee wrote:
> > > Hi Mathias,
> > > 
> > > On Sat, Jun 30, 2018 at 10:07:04PM +0100, Sudip Mukherjee wrote:
> > > > Hi Mathias,
> > > > 
> > > > On Fri, Jun 29, 2018 at 02:41:13PM +0300, Mathias Nyman wrote:
> > > > > On 27.06.2018 14:59, Sudip Mukherjee wrote:
> > > > > > > > Can you share a bit more details on the platform you are using, 
> > > > > > > > and what types of test you are running.
> > > > > > > 
> > > 
> > > > Then to track what is going on, I added the slub debugging and :(
> > > > I have attached part of dmesg for you to check.
> > > > Will appreciate your help in finding out the problem.
> > > 
> > > I did some more debugging. Tested with a KASAN enabled kernel and that
> > > shows the problem. The report is attached.
> > > 
> > > To my understanding:
> > > 
> > > btusb_work() is calling usb_set_interface() with alternate = 0. which
> > > again calls usb_hcd_alloc_bandwidth() and that frees the rings by
> > > xhci_free_endpoint_ring(). But then usb_set_interface() continues and
> > > calls usb_disable_interface() -> usb_hcd_flush_endpoint()->unlink1()->
> > > xhci_urb_dequeue() which at the end gives the command to stop endpoint.
> > > 
> > > In all the cycles I have tested I see that only in the fail case
> > > handle_cmd_completion() gets called, but in the cycles where the error
> > > is not there handle_cmd_completion() is not called with that command.
> > > 
> > > I am not sure what is happening, and you are the best person to understand
> > > what is happening. :)
> > > 
> > > But for now (untill you are back from holiday and suggest a proper 
> > > solution),
> > > I made a hacky patch (attached) which is working and I donot get any
> > > corruption after that. Both KASAN and slub debug are also happy.
> > > 
> > > So, now waiting for you to analyze what is going on and suggest a proper
> > > fix.
> > > 
> > > Thanks in advance.
> > > 
> > > --
> > > Regards
> > > Sudip
> > 
> > > [  236.814156] 
> > > ==
> > > [  236.814187] BUG: KASAN: use-after-free in 
> > > xhci_trb_virt_to_dma+0x2e/0x74 [xhci_hcd]
> > > [  236.814193] Read of size 8 at addr 8800789329c8 by task weston/138
> > > 
> > > [  236.814203] CPU: 0 PID: 138 Comm: weston Tainted: G U  W  O
> > > 4.14.47-20180606+ #7
> > > [  236.814206] Hardware name: xxx, BIOS 2017.01-00087-g43e04de 08/30/2017
> > > [  236.814209] Call Trace:
> > > [  236.814214]  
> > > [  236.814226]  dump_stack+0x46/0x59
> > > [  236.814238]  print_address_description+0x6b/0x23b
> > > [  236.814255]  ? xhci_trb_virt_to_dma+0x2e/0x74 [xhci_hcd]
> > > [  236.814262]  kasan_report+0x220/0x246
> > > [  236.814278]  xhci_trb_virt_to_dma+0x2e/0x74 [xhci_hcd]
> > > [  236.814294]  trb_in_td+0x3b/0x1cd [xhci_hcd]
> > > [  236.814311]  handle_cmd_completion+0x1181/0x2c9b [xhci_hcd]
> > > [  236.814329]  ? xhci_queue_new_dequeue_state+0x5d9/0x5d9 [xhci_hcd]
> > > [  236.814337]  ? drm_handle_vblank+0x4ec/0x590
> > > [  236.814352]  xhci_irq+0x529/0x3294 [xhci_hcd]
> > > [  236.814362]  ? __accumulate_pelt_segments+0x24/0x33
> > > [  236.814378]  ? finish_td.isra.40+0x223/0x223 [xhci_hcd]
> > > [  236.814384]  ? __accumulate_pelt_segments+0x24/0x33
> > > [  236.814390]  ? __accumulate_pelt_segments+0x24/0x33
> > > [  236.814405]  ? xhci_irq+0x3294/0x3294 [xhci_hcd]
> > > [  236.814412]  __handle_irq_event_percpu+0x149/0x3db
> > > [  236.814421]  handle_irq_event_percpu+0x65/0x109
> > > [  236.814428]  ? __handle_irq_event_percpu+0x3db/0x3db
> > > [  236.814436]  ? ttwu_do_wakeup.isra.18+0x3a2/0x3ce
> > > [  236.814442]  handle_irq_event+0xa8/0x10a
> > > [  236.814449]  handle_edge_irq+0x4b2/0x538
> > > [  236.814458]  handle_irq+0x3e/0x45
> > > [  236.814465]  do_IRQ+0x5c/0x126
> > > [  236.814474]  common_interrupt+0x7a/0x7a
> > > [  236.814478]  
> > > [  236.814483] RIP: 0023:0xf79d3d82
> > > [  236.814486] RSP: 002b:ffc

Re: usb HC busted?

2018-07-17 Thread Greg KH

On Tue, Jul 17, 2018 at 12:41:04PM +0100, Sudip Mukherjee wrote:
> Hi Mathias,
> 
> On Sat, Jun 30, 2018 at 10:07:04PM +0100, Sudip Mukherjee wrote:
> > Hi Mathias,
> > 
> > On Fri, Jun 29, 2018 at 02:41:13PM +0300, Mathias Nyman wrote:
> > > On 27.06.2018 14:59, Sudip Mukherjee wrote:
> > > > > > Can you share a bit more details on the platform you are using, and 
> > > > > > what types of test you are running.
> > > > > 
> 
> > Then to track what is going on, I added the slub debugging and :(
> > I have attached part of dmesg for you to check.
> > Will appreciate your help in finding out the problem.
> 
> I did some more debugging. Tested with a KASAN enabled kernel and that
> shows the problem. The report is attached.
> 
> To my understanding:
> 
> btusb_work() is calling usb_set_interface() with alternate = 0. which
> again calls usb_hcd_alloc_bandwidth() and that frees the rings by
> xhci_free_endpoint_ring(). But then usb_set_interface() continues and
> calls usb_disable_interface() -> usb_hcd_flush_endpoint()->unlink1()->
> xhci_urb_dequeue() which at the end gives the command to stop endpoint.
> 
> In all the cycles I have tested I see that only in the fail case
> handle_cmd_completion() gets called, but in the cycles where the error
> is not there handle_cmd_completion() is not called with that command.
> 
> I am not sure what is happening, and you are the best person to understand
> what is happening. :)
> 
> But for now (untill you are back from holiday and suggest a proper solution),
> I made a hacky patch (attached) which is working and I donot get any
> corruption after that. Both KASAN and slub debug are also happy.
> 
> So, now waiting for you to analyze what is going on and suggest a proper
> fix.
> 
> Thanks in advance.
> 
> --
> Regards
> Sudip

> [  236.814156] 
> ==
> [  236.814187] BUG: KASAN: use-after-free in xhci_trb_virt_to_dma+0x2e/0x74 
> [xhci_hcd]
> [  236.814193] Read of size 8 at addr 8800789329c8 by task weston/138
> 
> [  236.814203] CPU: 0 PID: 138 Comm: weston Tainted: G U  W  O
> 4.14.47-20180606+ #7
> [  236.814206] Hardware name: xxx, BIOS 2017.01-00087-g43e04de 08/30/2017
> [  236.814209] Call Trace:
> [  236.814214]  
> [  236.814226]  dump_stack+0x46/0x59
> [  236.814238]  print_address_description+0x6b/0x23b
> [  236.814255]  ? xhci_trb_virt_to_dma+0x2e/0x74 [xhci_hcd]
> [  236.814262]  kasan_report+0x220/0x246
> [  236.814278]  xhci_trb_virt_to_dma+0x2e/0x74 [xhci_hcd]
> [  236.814294]  trb_in_td+0x3b/0x1cd [xhci_hcd]
> [  236.814311]  handle_cmd_completion+0x1181/0x2c9b [xhci_hcd]
> [  236.814329]  ? xhci_queue_new_dequeue_state+0x5d9/0x5d9 [xhci_hcd]
> [  236.814337]  ? drm_handle_vblank+0x4ec/0x590
> [  236.814352]  xhci_irq+0x529/0x3294 [xhci_hcd]
> [  236.814362]  ? __accumulate_pelt_segments+0x24/0x33
> [  236.814378]  ? finish_td.isra.40+0x223/0x223 [xhci_hcd]
> [  236.814384]  ? __accumulate_pelt_segments+0x24/0x33
> [  236.814390]  ? __accumulate_pelt_segments+0x24/0x33
> [  236.814405]  ? xhci_irq+0x3294/0x3294 [xhci_hcd]
> [  236.814412]  __handle_irq_event_percpu+0x149/0x3db
> [  236.814421]  handle_irq_event_percpu+0x65/0x109
> [  236.814428]  ? __handle_irq_event_percpu+0x3db/0x3db
> [  236.814436]  ? ttwu_do_wakeup.isra.18+0x3a2/0x3ce
> [  236.814442]  handle_irq_event+0xa8/0x10a
> [  236.814449]  handle_edge_irq+0x4b2/0x538
> [  236.814458]  handle_irq+0x3e/0x45
> [  236.814465]  do_IRQ+0x5c/0x126
> [  236.814474]  common_interrupt+0x7a/0x7a
> [  236.814478]  
> [  236.814483] RIP: 0023:0xf79d3d82
> [  236.814486] RSP: 002b:ffc588e8 EFLAGS: 00200282 ORIG_RAX: 
> ffdc
> [  236.814493] RAX:  RBX: f7bebd5c RCX: 
> 
> [  236.814496] RDX: 08d4197c RSI:  RDI: 
> f746c020
> [  236.814499] RBP: ffc588e8 R08:  R09: 
> 
> [  236.814503] R10:  R11: 00200206 R12: 
> 
> [  236.814506] R13:  R14:  R15: 
> 
> 
> [  236.814513] Allocated by task 2082:
> [  236.814521]  kasan_kmalloc.part.1+0x51/0xc7
> [  236.814526]  kmem_cache_alloc_trace+0x178/0x187
> [  236.814540]  xhci_segment_alloc.isra.11+0x9d/0x3bf [xhci_hcd]
> [  236.814553]  xhci_alloc_segments_for_ring+0x9e/0x176 [xhci_hcd]
> [  236.814566]  xhci_ring_alloc.constprop.16+0x197/0x4ba [xhci_hcd]
> [  236.814579]  xhci_endpoint_init+0x77a/0x9ba [xhci_hcd]
> [  236.814592]  xhci_add_endpoint+0x3bc/0x43b [xhci_hcd]
> [  236.814615]  usb_hcd_alloc_bandwidth+0x7ef/0x857 [usbcore]
> [  236.814637]  usb_set_interface+0x294/0x681 [usbcore]
> [  236.814645]  btusb_work+0x2e6/0x981 [btusb]
> [  236.814651]  process_one_work+0x579/0x9e9
> [  236.814656]  worker_thread+0x68f/0x804
> [  236.814662]  kthread+0x31c/0x32b
> [  236.814668]  ret_from_fork+0x35/0x40
> 
> [  236.814672] Freed by task 1533:
> [  236.814678]  kasan_slab

Re: [PATCH v8 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-06-05 Thread Greg KH

On Tue, Jun 05, 2018 at 12:01:41PM -0500, Gary R Hook wrote:
> > > +/**
> > > + * iommu_debugfs_new_driver_dir - create a vendor directory under 
> > > debugfs/iommu
> > > + * @vendor: name of the vendor-specific subdirectory to create
> > > + *
> > > + * This function is called by an IOMMU driver to create the top-level 
> > > debugfs
> > > + * directory for that driver.
> > > + *
> > > + * Return: upon success, a pointer to the dentry for the new directory.
> > > + * NULL in case of failure.
> > > + */
> > > +struct dentry *iommu_debugfs_new_driver_dir(const char *vendor)
> > > +{
> > > + struct dentry *d_new;
> > > +
> > > + d_new = debugfs_create_dir(vendor, iommu_debugfs_dir);
> > > +
> > > + return d_new;
> > > +}
> > > +EXPORT_SYMBOL_GPL(iommu_debugfs_new_driver_dir);
> > 
> > Why are you wrapping a debugfs call?  Why not just export
> > iommu_debugfs_dir instead?
> 
> It was a choice, as I stated in my other post. It is not a requirement.
> If you resolutely reject this approach, that's fine. I'll change it, no
> worries.

Either is fine, but if it stays, it should stay a single line function
:)

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v8 2/2] iommu/amd: Add basic debugfs infrastructure for AMD IOMMU

2018-06-05 Thread Greg KH

On Tue, Jun 05, 2018 at 11:58:13AM -0500, Gary R Hook wrote:
> On 05/29/2018 01:39 PM, Greg KH wrote:
> > On Tue, May 29, 2018 at 01:23:23PM -0500, Gary R Hook wrote:
> > > Implement a skeleton framework for debugfs support in the
> > > AMD IOMMU. Add a hidden boolean to Kconfig that is defined
> > > for the AMD IOMMU when general IOMMY DebugFS support is
> > > enabled.
> > > 
> > > Signed-off-by: Gary R Hook 
> > > ---
> > >   drivers/iommu/Kconfig |4 
> > >   drivers/iommu/Makefile|1 +
> > >   drivers/iommu/amd_iommu_debugfs.c |   39 
> > > +
> > >   drivers/iommu/amd_iommu_init.c|6 --
> > >   drivers/iommu/amd_iommu_proto.h   |6 ++
> > >   drivers/iommu/amd_iommu_types.h   |5 +
> > >   6 files changed, 59 insertions(+), 2 deletions(-)
> > >   create mode 100644 drivers/iommu/amd_iommu_debugfs.c
> > > 
> > > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > > index f9af25ac409f..ec223f6f4ad4 100644
> > > --- a/drivers/iommu/Kconfig
> > > +++ b/drivers/iommu/Kconfig
> > > @@ -137,6 +137,10 @@ config AMD_IOMMU
> > > your BIOS for an option to enable it or if you have an IVRS 
> > > ACPI
> > > table.
> > > +config AMD_IOMMU_DEBUGFS
> > > + def_bool y
> > 
> > Why default y?  Can you not boot a box without this?  If not, it should
> > not be Y.
> 
> Again, apologies for not seeing this sooner.
> 
> Yes, the system can boot without this. The idea of a hidden option was
> surfaced by Robin, and after my first approach was shot down, I tried this.
> 
> Logic: If the over-arching IOMMU debugfs option is enabled, then
> AMD_IOMMU_DEBUGFS gets defined, and AMD IOMMU code gets included.
> 
> This issue was discussed a few weeks ago. No single approach appears to
> satisfy everyone. I like this because it depends upon one switch: Do you
> want DebugFS support enabled in the IOMMU driver, period? Vendor-specific
> code can then choose to implement support or not, and a builder doesn't have
> to worry about enabling/disabling multiple Kconfig options.
> 
> At least, that was my line of reasoning.
> 
> I'm not married to any approach, and I don't find clever use of Kconfig
> options too terribly challenging. And I'm not defending, I'm just
> explaining.

The issue is, no one sets Kconfig options except a very tiny subset of
kernel developers.  Distros allways enable everything, as they have to
do that.

If you are creating something here that is so dangerous that you spam
the kernel log with big warning messages, you should not be making it
easy to enable, let alone be enabled by default :)

Just make it an option, have it rely on the kernel debugging option, and
say "DO NOT ENABLE THIS UNLESS YOU REALLY REALLY REALLY KNOW WHAT YOU
ARE DOING!"

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v8 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-05-29 Thread Greg KH

On Tue, May 29, 2018 at 01:23:14PM -0500, Gary R Hook wrote:
> Provide base enablement for using debugfs to expose internal data of an
> IOMMU driver. When called, create the /sys/kernel/debug/iommu directory.
> 
> Emit a strong warning at boot time to indicate that this feature is
> enabled.
> 
> This function is called from iommu_init, and creates the initial DebugFS
> directory. Drivers may then call iommu_debugfs_new_driver_dir() to
> instantiate a device-specific directory to expose internal data.
> It will return a pointer to the new dentry structure created in
> /sys/kernel/debug/iommu, or NULL in the event of a failure.
> 
> Since the IOMMU driver can not be removed from the running system, there
> is no need for an "off" function.
> 
> Signed-off-by: Gary R Hook 
> ---
>  drivers/iommu/Kconfig |   10 ++
>  drivers/iommu/Makefile|1 +
>  drivers/iommu/iommu-debugfs.c |   72 
> +
>  drivers/iommu/iommu.c |2 +
>  include/linux/iommu.h |   11 ++
>  5 files changed, 96 insertions(+)
>  create mode 100644 drivers/iommu/iommu-debugfs.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c76157e57f6b..f9af25ac409f 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -60,6 +60,16 @@ config IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
>  
>  endmenu
>  
> +config IOMMU_DEBUGFS
> + bool "Export IOMMU internals in DebugFS"
> + depends on DEBUG_FS
> + help
> +   Allows exposure of IOMMU device internals. This option enables
> +   the use of debugfs by IOMMU drivers as required. Devices can,
> +   at initialization time, cause the IOMMU code to create a top-level
> +   debug/iommu directory, and then populate a subdirectory with
> +   entries as required.

You didn't put a big enough warning here, like I asked last time :(



> +
>  config IOMMU_IOVA
>   tristate
>  
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 1fb695854809..74cfbc392862 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -2,6 +2,7 @@
>  obj-$(CONFIG_IOMMU_API) += iommu.o
>  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
> +obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o
>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
> diff --git a/drivers/iommu/iommu-debugfs.c b/drivers/iommu/iommu-debugfs.c
> new file mode 100644
> index ..bb1bf2d0ac51
> --- /dev/null
> +++ b/drivers/iommu/iommu-debugfs.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * IOMMU debugfs core infrastructure
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + *
> + * Author: Gary R Hook 
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +static struct dentry *iommu_debugfs_dir;
> +
> +/**
> + * iommu_debugfs_setup - create the top-level iommu directory in debugfs
> + *
> + * Provide base enablement for using debugfs to expose internal data of an
> + * IOMMU driver. When called, this function creates the
> + * /sys/kernel/debug/iommu directory.
> + *
> + * Emit a strong warning at boot time to indicate that this feature is
> + * enabled.
> + *
> + * This function is called from iommu_init; drivers may then call
> + * iommu_debugfs_new_driver_dir() to instantiate a vendor-specific
> + * directory to be used to expose internal data.
> + */
> +void iommu_debugfs_setup(void)
> +{
> + if (!iommu_debugfs_dir) {
> + iommu_debugfs_dir = debugfs_create_dir("iommu", NULL);
> + if (iommu_debugfs_dir) {

Again, no, never test a return value from a debugfs call.  You do not
care, you should do the same exact thing if it is enabled or disabled or
somehow failing.  Just keep moving on.

> + pr_warn("\n");
> + 
> pr_warn("*\n");
> + pr_warn("** NOTICE NOTICE NOTICE NOTICE NOTICE 
> NOTICE NOTICE**\n");
> + pr_warn("** 
> **\n");
> + pr_warn("**  IOMMU DebugFS SUPPORT HAS BEEN ENABLED IN 
> THIS KERNEL  **\n");
> + pr_warn("** 
> **\n");
> + pr_warn("** This means that this kernel is built to 
> expose internal **\n");
> + pr_warn("** IOMMU data structures, which may compromise 
> security on **\n");
> + pr_warn("** your system.
> **\n");
> + pr_warn("** 
> **\n");
> + pr_warn("** If you see this message and you are not 
> debugging the   **\n");
> + pr_warn("** kernel, r

Re: [PATCH v8 2/2] iommu/amd: Add basic debugfs infrastructure for AMD IOMMU

2018-05-29 Thread Greg KH

On Tue, May 29, 2018 at 01:23:23PM -0500, Gary R Hook wrote:
> Implement a skeleton framework for debugfs support in the
> AMD IOMMU. Add a hidden boolean to Kconfig that is defined
> for the AMD IOMMU when general IOMMY DebugFS support is
> enabled.
> 
> Signed-off-by: Gary R Hook 
> ---
>  drivers/iommu/Kconfig |4 
>  drivers/iommu/Makefile|1 +
>  drivers/iommu/amd_iommu_debugfs.c |   39 
> +
>  drivers/iommu/amd_iommu_init.c|6 --
>  drivers/iommu/amd_iommu_proto.h   |6 ++
>  drivers/iommu/amd_iommu_types.h   |5 +
>  6 files changed, 59 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/iommu/amd_iommu_debugfs.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index f9af25ac409f..ec223f6f4ad4 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -137,6 +137,10 @@ config AMD_IOMMU
> your BIOS for an option to enable it or if you have an IVRS ACPI
> table.
>  
> +config AMD_IOMMU_DEBUGFS
> + def_bool y

Why default y?  Can you not boot a box without this?  If not, it should
not be Y.

> + depends on AMD_IOMMU && IOMMU_DEBUGFS
> +
>  config AMD_IOMMU_V2
>   tristate "AMD IOMMU Version 2 driver"
>   depends on AMD_IOMMU
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 74cfbc392862..47fd6ea9de2d 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -11,6 +11,7 @@ obj-$(CONFIG_IOMMU_IOVA) += iova.o
>  obj-$(CONFIG_OF_IOMMU)   += of_iommu.o
>  obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
>  obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o
> +obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
>  obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
>  obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
>  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
> diff --git a/drivers/iommu/amd_iommu_debugfs.c 
> b/drivers/iommu/amd_iommu_debugfs.c
> new file mode 100644
> index ..6dff98552969
> --- /dev/null
> +++ b/drivers/iommu/amd_iommu_debugfs.c
> @@ -0,0 +1,39 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * AMD IOMMU driver
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + *
> + * Author: Gary R Hook 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include "amd_iommu_proto.h"
> +#include "amd_iommu_types.h"
> +
> +static struct dentry *amd_iommu_debugfs;
> +static DEFINE_MUTEX(amd_iommu_debugfs_lock);
> +
> +#define  MAX_NAME_LEN20
> +
> +void amd_iommu_debugfs_setup(struct amd_iommu *iommu)
> +{
> + char name[MAX_NAME_LEN + 1];
> +
> + mutex_lock(&amd_iommu_debugfs_lock);
> + if (!amd_iommu_debugfs)
> + amd_iommu_debugfs = iommu_debugfs_new_driver_dir("amd");
> + mutex_unlock(&amd_iommu_debugfs_lock);
> +
> + if (amd_iommu_debugfs) {

Do not test this.

> + snprintf(name, MAX_NAME_LEN, "iommu%02d", iommu->index);
> + iommu->debugfs = debugfs_create_dir(name,
> + amd_iommu_debugfs);
> + if (!iommu->debugfs) {

No, you don't care about the return value of any debugfs call.  Just
save the directory and move on.  If it's not correct, find, nothing is
wrong, just keep going.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 1/2] iommu - Enable debugfs exposure of IOMMU driver internals

2018-05-24 Thread Greg KH

On Mon, May 14, 2018 at 12:20:20PM -0500, Gary R Hook wrote:
> Provide base enablement for using debugfs to expose internal data of an
> IOMMU driver. When called, create the /sys/kernel/debug/iommu directory.
> 
> Emit a strong warning at boot time to indicate that this feature is
> enabled.
> 
> This function is called from iommu_init, and creates the initial DebugFS
> directory. Drivers may then call iommu_debugfs_new_driver_dir() to
> instantiate a device-specific directory to expose internal data.
> It will return a pointer to the new dentry structure created in
> /sys/kernel/debug/iommu, or NULL in the event of a failure.
> 
> Since the IOMMU driver can not be removed from the running system, there
> is no need for an "off" function.
> 
> Signed-off-by: Gary R Hook 
> ---
>  drivers/iommu/Kconfig |   10 ++
>  drivers/iommu/Makefile|1 +
>  drivers/iommu/iommu-debugfs.c |   72 
> +
>  drivers/iommu/iommu.c |2 +
>  include/linux/iommu.h |   11 ++
>  5 files changed, 96 insertions(+)
>  create mode 100644 drivers/iommu/iommu-debugfs.c
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index f3a21343e636..2eab6a849a0a 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -60,6 +60,16 @@ config IOMMU_IO_PGTABLE_ARMV7S_SELFTEST
>  
>  endmenu
>  
> +config IOMMU_DEBUGFS
> + bool "Export IOMMU internals in DebugFS"
> + depends on DEBUG_FS
> + help
> +   Allows exposure of IOMMU device internals. This option enables
> +   the use of debugfs by IOMMU drivers as required. Devices can,
> +   at initialization time, cause the IOMMU code to create a top-level
> +   debug/iommu directory, and then populate a subdirectory with
> +   entries as required.

You need a big "DO NOT ENABLE THIS OPTION UNLESS YOU REALLY REALLY KNOW
WHAT YOU ARE DOING!!!" line here.  To match up with your crazy kernel
log warning.

Otherwise distros will turn this on, I guarantee it.

> +
>  config IOMMU_IOVA
>   tristate
>  
> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
> index 1fb695854809..74cfbc392862 100644
> --- a/drivers/iommu/Makefile
> +++ b/drivers/iommu/Makefile
> @@ -2,6 +2,7 @@
>  obj-$(CONFIG_IOMMU_API) += iommu.o
>  obj-$(CONFIG_IOMMU_API) += iommu-traces.o
>  obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
> +obj-$(CONFIG_IOMMU_DEBUGFS) += iommu-debugfs.o
>  obj-$(CONFIG_IOMMU_DMA) += dma-iommu.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE) += io-pgtable.o
>  obj-$(CONFIG_IOMMU_IO_PGTABLE_ARMV7S) += io-pgtable-arm-v7s.o
> diff --git a/drivers/iommu/iommu-debugfs.c b/drivers/iommu/iommu-debugfs.c
> new file mode 100644
> index ..bb1bf2d0ac51
> --- /dev/null
> +++ b/drivers/iommu/iommu-debugfs.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * IOMMU debugfs core infrastructure
> + *
> + * Copyright (C) 2018 Advanced Micro Devices, Inc.
> + *
> + * Author: Gary R Hook 
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +static struct dentry *iommu_debugfs_dir;
> +
> +/**
> + * iommu_debugfs_setup - create the top-level iommu directory in debugfs
> + *
> + * Provide base enablement for using debugfs to expose internal data of an
> + * IOMMU driver. When called, this function creates the
> + * /sys/kernel/debug/iommu directory.
> + *
> + * Emit a strong warning at boot time to indicate that this feature is
> + * enabled.
> + *
> + * This function is called from iommu_init; drivers may then call
> + * iommu_debugfs_new_driver_dir() to instantiate a vendor-specific
> + * directory to be used to expose internal data.
> + */
> +void iommu_debugfs_setup(void)
> +{
> + if (!iommu_debugfs_dir) {
> + iommu_debugfs_dir = debugfs_create_dir("iommu", NULL);
> + if (iommu_debugfs_dir) {

No need to care about the return value, just keep going.  Nothing you
should, or should not, do depending on the return value of a debugfs
call.

> + pr_warn("\n");
> + 
> pr_warn("*\n");
> + pr_warn("** NOTICE NOTICE NOTICE NOTICE NOTICE 
> NOTICE NOTICE**\n");
> + pr_warn("** 
> **\n");
> + pr_warn("**  IOMMU DebugFS SUPPORT HAS BEEN ENABLED IN 
> THIS KERNEL  **\n");
> + pr_warn("** 
> **\n");
> + pr_warn("** This means that this kernel is built to 
> expose internal **\n");
> + pr_warn("** IOMMU data structures, which may compromise 
> security on **\n");
> + pr_warn("** your system.
> **\n");
> + pr_warn("** 
> **\n");
> + pr_warn("** If you see

Re: [PATCH v4 5/6] bus: fsl-mc: supoprt dma configure for devices on fsl-mc bus

2018-05-14 Thread Greg KH

On Mon, Apr 30, 2018 at 11:57:20AM +0530, Nipun Gupta wrote:
> Signed-off-by: Nipun Gupta 
> ---

I can't take patches without any changelog text at all, sorry.  Please
fix up and resend.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] dma-mapping: move dma configuration to bus infrastructure

2018-03-30 Thread Greg KH

On Fri, Mar 30, 2018 at 09:15:30AM +0200, Christoph Hellwig wrote:
> Can you resend the current state of affairs so we can get it in for
> 4.17?

Changes to the driver core and 3 different busses 2 days before 4.16 is
out, preventing any testing at all in linux-next?  This needs to wait
for the next merge window, sorry.  I'm away all this weekend, and can't
test on my end, sorry.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 2/2] drivers: remove force dma flag from buses

2018-03-21 Thread Greg KH

On Wed, Mar 21, 2018 at 04:28:46PM +, Nipun Gupta wrote:
> 
> 
> > -Original Message-
> > From: Greg KH [mailto:gre...@linuxfoundation.org]
> > Sent: Wednesday, March 21, 2018 15:05
> > To: Nipun Gupta 
> > Cc: robin.mur...@arm.com; h...@lst.de; li...@armlinux.org.uk;
> > m.szyprow...@samsung.com; bhelg...@google.com; zaj...@gmail.com;
> > andy.gr...@linaro.org; david.br...@linaro.org; dan.j.willi...@intel.com;
> > vinod.k...@intel.com; thierry.red...@gmail.com; robh...@kernel.org;
> > frowand.l...@gmail.com; jarkko.sakki...@linux.intel.com;
> > rafael.j.wyso...@intel.com; dmitry.torok...@gmail.com; jo...@kernel.org;
> > msucha...@suse.de; linux-ker...@vger.kernel.org; iommu@lists.linux-
> > foundation.org; linux-wirel...@vger.kernel.org; linux-arm-
> > m...@vger.kernel.org; linux-...@vger.kernel.org; dmaeng...@vger.kernel.org;
> > dri-de...@lists.freedesktop.org; linux-te...@vger.kernel.org;
> > devicet...@vger.kernel.org; linux-...@vger.kernel.org; Bharat Bhushan
> > ; Leo Li 
> > Subject: Re: [PATCH v2 2/2] drivers: remove force dma flag from buses
> > 
> > On Wed, Mar 21, 2018 at 12:25:23PM +0530, Nipun Gupta wrote:
> > > With each bus implementing its own DMA configuration callback,
> > > there is no need for bus to explicitly have force_dma in its
> > > global structure. This patch modifies of_dma_configure API to
> > > accept an input parameter which specifies if implicit DMA
> > > configuration is required even when it is not described by the
> > > firmware.
> > 
> > Having to "remember" what that bool variable means on the end of the
> > function call is a royal pain over time, right?
> > 
> > Why not just create a new function:
> > dma_common_configure_force(dma)
> > that always does this?  Leave "dma_common_configure()" alone, and then
> > wrap the old code with these two helper functions that call the 'core'
> > code with the bool set properly?
> > 
> > That way you do not have to "know" what that parameter is, the function
> > name just documents it automatically, so when you see it in the
> > bus-specific code, no need to go and have to hunt for anything.  And if
> > you are reading the dma core code, it's obvious what is happening as the
> > functions are all right there.
> 
> How about we do not pass any flag in 'dma_common_configure()', and inside this
> API we pass "true" to 'of_dma_configure()'? I am saying this because currently
> both the busses (platform and AMBA) which uses 'dma_common_configure()' passes
> "true" value. If we create additional 'dma_common_configure_force()', then
> 'dma_common_configure()' will not be used anytime and will become redundant.
> 
> If someday new busses come and they needs to use similar functionality which
> 'dma_common_configure()' provides, but with passing "false" to 
> 'of_dma_configure()',
> then what you suggests of having two separate such API's will be more 
> reasonable
> and can be implemented?

If that makes things "simple", yes, sounds good.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 2/2] drivers: remove force dma flag from buses

2018-03-21 Thread Greg KH

On Wed, Mar 21, 2018 at 12:25:23PM +0530, Nipun Gupta wrote:
> With each bus implementing its own DMA configuration callback,
> there is no need for bus to explicitly have force_dma in its
> global structure. This patch modifies of_dma_configure API to
> accept an input parameter which specifies if implicit DMA
> configuration is required even when it is not described by the
> firmware.

Having to "remember" what that bool variable means on the end of the
function call is a royal pain over time, right?

Why not just create a new function:
dma_common_configure_force(dma)
that always does this?  Leave "dma_common_configure()" alone, and then
wrap the old code with these two helper functions that call the 'core'
code with the bool set properly?

That way you do not have to "know" what that parameter is, the function
name just documents it automatically, so when you see it in the
bus-specific code, no need to go and have to hunt for anything.  And if
you are reading the dma core code, it's obvious what is happening as the
functions are all right there.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/2] dma-mapping: move dma configuration to bus infrastructure

2018-03-21 Thread Greg KH

On Wed, Mar 21, 2018 at 12:25:22PM +0530, Nipun Gupta wrote:
> It's bus specific aspect to map a given device on the bus and
> relevant firmware description of its DMA configuration.
> So, this change introduces '/dma_configure/' as bus callback
> giving flexibility to busses for implementing its own dma
> configuration function.
> 
> The change eases the addition of new busses w.r.t. adding the dma
> configuration functionality.
> 
> This patch also updates the PCI, Platform, ACPI and host1x bus to
> use new introduced callbacks.
> 
> Suggested-by: Christoph Hellwig 
> Signed-off-by: Nipun Gupta 
> ---
>  - The patches are based on the comments on:
>https://patchwork.kernel.org/patch/10259087/
> 
> Changes in v2:
>   - Do not have dma_deconfigure callback
>   - Have '/dma_common_configure/' API to provide a common DMA
> configuration which can be used by busses if it suits them.
>   - Platform and ACPI bus to use '/dma_common_configure/' in
> '/dma_configure/' callback.
>   - Updated commit message
>   - Updated pci_dma_configure API with changes suggested by Robin
> 
>  drivers/amba/bus.c  |  7 +++
>  drivers/base/dma-mapping.c  | 35 +++
>  drivers/base/platform.c |  6 ++
>  drivers/gpu/host1x/bus.c|  9 +
>  drivers/pci/pci-driver.c| 32 
>  include/linux/device.h  |  4 
>  include/linux/dma-mapping.h |  1 +
>  7 files changed, 74 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index 594c228..2fa1e8b 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -171,6 +172,11 @@ static int amba_pm_runtime_resume(struct device *dev)
>  }
>  #endif /* CONFIG_PM */
>  
> +static int amba_dma_configure(struct device *dev)
> +{
> + return dma_common_configure(dev);
> +}
> +
>  static const struct dev_pm_ops amba_pm = {
>   .suspend= pm_generic_suspend,
>   .resume = pm_generic_resume,
> @@ -194,6 +200,7 @@ struct bus_type amba_bustype = {
>   .dev_groups = amba_dev_groups,
>   .match  = amba_match,
>   .uevent = amba_uevent,
> + .dma_configure  = amba_dma_configure,
>   .pm = &amba_pm,
>   .force_dma  = true,
>  };
> diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
> index 3b11835..48f9af0 100644
> --- a/drivers/base/dma-mapping.c
> +++ b/drivers/base/dma-mapping.c
> @@ -331,38 +331,33 @@ void dma_common_free_remap(void *cpu_addr, size_t size, 
> unsigned long vm_flags)
>  #endif
>  
>  /*
> - * Common configuration to enable DMA API use for a device
> + * Common configuration to enable DMA API use for a device.
> + * A bus can use this function in its 'dma_configure' callback, if
> + * suitable for the bus.
>   */
> -#include 
> -
> -int dma_configure(struct device *dev)
> +int dma_common_configure(struct device *dev)
>  {
> - struct device *bridge = NULL, *dma_dev = dev;
>   enum dev_dma_attr attr;
>   int ret = 0;
>  
> - if (dev_is_pci(dev)) {
> - bridge = pci_get_host_bridge_device(to_pci_dev(dev));
> - dma_dev = bridge;
> - if (IS_ENABLED(CONFIG_OF) && dma_dev->parent &&
> - dma_dev->parent->of_node)
> - dma_dev = dma_dev->parent;
> - }
> -
> - if (dma_dev->of_node) {
> - ret = of_dma_configure(dev, dma_dev->of_node);
> - } else if (has_acpi_companion(dma_dev)) {
> - attr = acpi_get_dma_attr(to_acpi_device_node(dma_dev->fwnode));
> + if (dev->of_node) {
> + ret = of_dma_configure(dev, dev->of_node);
> + } else if (has_acpi_companion(dev)) {
> + attr = acpi_get_dma_attr(to_acpi_device_node(dev->fwnode));
>   if (attr != DEV_DMA_NOT_SUPPORTED)
>   ret = acpi_dma_configure(dev, attr);
>   }
>  
> - if (bridge)
> - pci_put_host_bridge_device(bridge);
> -
>   return ret;
>  }
>  
> +int dma_configure(struct device *dev)
> +{
> + if (dev->bus->dma_configure)
> + return dev->bus->dma_configure(dev);
> +
> + return 0;
> +}
>  void dma_deconfigure(struct device *dev)

Empty line after this new function?  Sorry, couldn't help it :)

>  {
>   of_dma_deconfigure(dev);
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index f1bf7b3..d2d5891 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -1130,6 +1130,11 @@ int platform_pm_restore(struct device *dev)
>  
>  #endif /* CONFIG_HIBERNATE_CALLBACKS */
>  
> +static int platform_dma_configure(struct device *dev)
> +{
> + return dma_common_configure(dev);
> +}
> +
>  static const struct dev_pm_ops platform_dev_pm_ops = {
>   .runtime_suspend = pm_generic_runtime_suspend,
>   .runtime_resume = pm_generic_runtime_resume,
> @@ -

Re: [PATCH] drivers: Flag buses which demand DMA configuration

2017-10-17 Thread Greg KH

On Tue, Oct 17, 2017 at 10:17:01AM +0200, Christoph Hellwig wrote:
> Looks fine:
> 
> Reviewed-by: Christoph Hellwig 
> 
> Greg, do you want to take this or should I queue it up in the
> dma-mapping tree?

You can take it, thanks!

Acked-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Greg KH

On Fri, May 05, 2017 at 02:56:00PM -0400, Rob Clark wrote:
> On Fri, May 5, 2017 at 2:24 PM, Greg KH  wrote:
> > On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
> >> It looks like it *used* to make sense to free the device.  But now it is
> >> embedded in 'struct iommu' (which is allocated or embedded in something
> >> that the device allocated).
> >>
> >> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
> >>
> >> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
> >> Signed-off-by: Rob Clark 
> >> ---
> >>  drivers/iommu/iommu-sysfs.c | 1 -
> >>  1 file changed, 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
> >> index c58351e..ad19cbb 100644
> >> --- a/drivers/iommu/iommu-sysfs.c
> >> +++ b/drivers/iommu/iommu-sysfs.c
> >> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] 
> >> = {
> >>
> >>  static void iommu_release_device(struct device *dev)
> >>  {
> >> - kfree(dev);
> >>  }
> >
> > As per the documentation in the kernel tree, I now get to make fun of
> > you for doing such a crazh and foolish thing!
> >
> > Come on, don't do that, a release function _HAS_ to free the memory
> > involved.  If not, then it is really broken...
> 
> There are shenanigans going on.. so release isn't counterpoint to a
> _probe() which allocates some memory.  See iommu_device_sysfs_add().
> So I'm not the one you get to make fun of ;-)
> 
> This the correct thing to do.  Whether the way the extra fake device
> embedded in something allocated in the iommu driver's probe (and
> free'd it *it's* _release()) stuff for iommu sysfs stuff works is
> bonkers or not, I'll let someone else decide..  it was like that when
> I got here.

If you have multiple reference counts in the same structure, your code
is wrong.  That is the root issue here that needs to be resolved.  Yes,
your patch papers over that, but again, it isn't right either.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Greg KH

On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
> It looks like it *used* to make sense to free the device.  But now it is
> embedded in 'struct iommu' (which is allocated or embedded in something
> that the device allocated).
> 
> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
> 
> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
> Signed-off-by: Rob Clark 
> ---
>  drivers/iommu/iommu-sysfs.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
> index c58351e..ad19cbb 100644
> --- a/drivers/iommu/iommu-sysfs.c
> +++ b/drivers/iommu/iommu-sysfs.c
> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
>  
>  static void iommu_release_device(struct device *dev)
>  {
> - kfree(dev);
>  }

As per the documentation in the kernel tree, I now get to make fun of
you for doing such a crazh and foolish thing!

Come on, don't do that, a release function _HAS_ to free the memory
involved.  If not, then it is really broken...

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH 1/2] vfio: Move vfio.c vfio_core.c

2015-10-09 Thread Greg KH

On Fri, Oct 09, 2015 at 12:41:03PM -0600, Alex Williamson wrote:
> This allows us to more easily create a "vfio" module that includes
> multiple files.  No code change, rename and Makefile update only.
> 
> Signed-off-by: Alex Williamson 
> ---
>  drivers/vfio/Makefile|1 
>  drivers/vfio/vfio.c  | 1640 
> --
>  drivers/vfio/vfio_core.c | 1640 
> ++
>  3 files changed, 1641 insertions(+), 1640 deletions(-)
>  delete mode 100644 drivers/vfio/vfio.c
>  create mode 100644 drivers/vfio/vfio_core.c

Use -S to git format-patch so we can see the rename, not a delete/add
patch please.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH V2] debugfs: don't assume sizeof(bool) to be 4 bytes

2015-09-14 Thread Greg KH

On Mon, Sep 14, 2015 at 09:17:08PM +0530, Viresh Kumar wrote:
> On 14-09-15, 17:39, Arnd Bergmann wrote:
> > On Monday 14 September 2015 09:21:54 Viresh Kumar wrote:
> > > diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
> > > index b4c216bab22b..bea8e425a8de 100644
> > > --- a/drivers/acpi/ec_sys.c
> > > +++ b/drivers/acpi/ec_sys.c
> > > @@ -128,7 +128,7 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec, 
> > > unsigned int ec_device_count)
> > >   if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 *)&first_ec->gpe))
> > >   goto error;
> > >   if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
> > > -  (u32 *)&first_ec->global_lock))
> > > +  &first_ec->global_lock))
> > >   goto error;
> > >  
> > >   if (write_support)
> > 
> > This one might need a separate patch that can be backported to stable, as
> > the original code is already broken on big-endian 64-bit machines:
> > global_lock is 'unsigned long'.
> 
> Hmmm, so you suggest a single patch that will do s/unsigned long/u32
> and that will be followed with this patch (almost) as is. Right?
> 
> Also, regarding the stable thing, we will surely not be able to
> backport it straight away. But we should get it backported for sure.
> 
> @Greg: What all kernel versions you want this to be backported for?

What ever ones you think it is relevant for :)
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: platform: add device binding path 'driver_override'

2014-07-08 Thread Greg KH

On Mon, Jun 02, 2014 at 07:42:58PM -0500, Kim Phillips wrote:
> Needed by platform device drivers, such as the upcoming
> vfio-platform driver, in order to bypass the existing OF, ACPI,
> id_table and name string matches, and successfully be able to be
> bound to any device, like so:
> 
> echo vfio-platform > 
> /sys/bus/platform/devices/fff51000.ethernet/driver_override
> echo fff51000.ethernet > 
> /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
> echo fff51000.ethernet > /sys/bus/platform/drivers_probe
> 
> This mimics "PCI: Introduce new device binding path using
> pci_dev.driver_override", which is an interface enhancement
> for more deterministic PCI device binding, e.g., when in the
> presence of hotplug.
> 
> Reviewed-by: Alex Williamson 
> Reviewed-by: Alexander Graf 
> Reviewed-by: Stuart Yoder 
> Signed-off-by: Kim Phillips 
> ---
> Greg,
> 
> This is largely identical to the PCI version of the same that has
> been accepted for v3.16 and ack'd by you:
> 
> https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009674.html
> 
> and applied to Bjorn Helgaas' PCI tree:
> 
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization&id=782a985d7af26db39e86070d28f987cad21313c0
> 
> You are the platform driver core maintainer: can you apply this to
> your driver-core tree now?

Sorry for the very long delay, it's now merged in my tree.

Thanks for being persistant.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 09/24] drivers/iommu: check actual error on iommu_init

2014-06-17 Thread Greg KH

On Tue, Jun 17, 2014 at 05:21:54PM +0200, Joerg Roedel wrote:
> On Tue, Jun 17, 2014 at 10:29:19PM +0800, Jeff Liu wrote:
> > From: Jie Liu 
> > 
> > kset_create_and_add() has been fixed to return the actual error
> > code ptr rather than NULL, so update iommu_init() to check the
> > return value via IS_ERR() accordingly.
> > 
> > Cc: Joerg Roedel 
> > Signed-off-by: Jie Liu 
> 
> Acked-by: Joerg Roedel 

Nope, please do not take this patch, it is not needed and as the kset
call is not changing, it will be incorrect.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [stable PATCH] iommu/vt-d: Fix missing IOTLB flush in intel_iommu_unmap()

2014-06-09 Thread Greg KH

On Mon, Jun 09, 2014 at 02:09:53PM +0100, David Woodhouse wrote:
> This missing IOTLB flush was added as a minor, inconsequential bug-fix
> in commit ea8ea460c ("iommu/vt-d: Clean up and fix page table clear/free
> behaviour") in 3.15. It wasn't originally intended for -stable but a
> couple of users have reported issues which turn out to be fixed by
> adding the missing flush.
> 
> Signed-off-by: David Woodhouse 
> ---
> For 3.14 and earlier. As noted, this fix is in 3.15 already.

What is the commit id of the patch for this in 3.15?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v2 1/4] dma-mapping: Add devm_ interface for dma_map_single()

2014-06-06 Thread Greg KH

On Fri, Jun 06, 2014 at 02:45:06PM +0300, Eli Billauer wrote:
> Hello Joerg.
> 
> 
> On 05/06/14 00:25, Joerg Roedel wrote:
> >
> >What you are trying to do should work with dma_alloc_noncoherent(). The
> >API allows partial syncs on this memory, so you should be fine.
> Please try to put yourself in my position: I have a driver that I care
> about, which works fine for a few years. It's based upon dma_map_single(),
> which seems to be the common way to get non-coherent memory, even for the
> driver's entire lifespan. I realize that dma_alloc_* was the intended way to
> do it, but fact is that dma_map_* has become the common choice.

Is your driver in the kernel tree?  If not, you really are on your own :(

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] driver core: platform: add device binding path 'driver_override'

2014-06-02 Thread Greg KH

On Mon, Jun 02, 2014 at 07:42:58PM -0500, Kim Phillips wrote:
> Needed by platform device drivers, such as the upcoming
> vfio-platform driver, in order to bypass the existing OF, ACPI,
> id_table and name string matches, and successfully be able to be
> bound to any device, like so:
> 
> echo vfio-platform > 
> /sys/bus/platform/devices/fff51000.ethernet/driver_override
> echo fff51000.ethernet > 
> /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
> echo fff51000.ethernet > /sys/bus/platform/drivers_probe
> 
> This mimics "PCI: Introduce new device binding path using
> pci_dev.driver_override", which is an interface enhancement
> for more deterministic PCI device binding, e.g., when in the
> presence of hotplug.
> 
> Reviewed-by: Alex Williamson 
> Reviewed-by: Alexander Graf 
> Reviewed-by: Stuart Yoder 
> Signed-off-by: Kim Phillips 
> ---
> Greg,
> 
> This is largely identical to the PCI version of the same that has
> been accepted for v3.16 and ack'd by you:
> 
> https://lists.cs.columbia.edu/pipermail/kvmarm/2014-May/009674.html
> 
> and applied to Bjorn Helgaas' PCI tree:
> 
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization&id=782a985d7af26db39e86070d28f987cad21313c0
> 
> You are the platform driver core maintainer: can you apply this to
> your driver-core tree now?

Yes, I will after this merge window ends, it's too late for 3.16-rc1
with the window opening up a week early, sorry.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] PCI: Introduce new device binding path using pci_dev.driver_override

2014-05-28 Thread Greg KH

On Tue, May 27, 2014 at 09:07:42PM -0600, Bjorn Helgaas wrote:
> On Tue, May 20, 2014 at 08:53:21AM -0600, Alex Williamson wrote:
> > The driver_override field allows us to specify the driver for a device
> > rather than relying on the driver to provide a positive match of the
> > device.  This shortcuts the existing process of looking up the vendor
> > and device ID, adding them to the driver new_id, binding the device,
> > then removing the ID, but it also provides a couple advantages.
> > 
> > First, the above existing process allows the driver to bind to any
> > device matching the new_id for the window where it's enabled.  This is
> > often not desired, such as the case of trying to bind a single device
> > to a meta driver like pci-stub or vfio-pci.  Using driver_override we
> > can do this deterministically using:
> > 
> > echo pci-stub > /sys/bus/pci/devices/:03:00.0/driver_override
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Previously we could not invoke drivers_probe after adding a device
> > to new_id for a driver as we get non-deterministic behavior whether
> > the driver we intend or the standard driver will claim the device.
> > Now it becomes a deterministic process, only the driver matching
> > driver_override will probe the device.
> > 
> > To return the device to the standard driver, we simply clear the
> > driver_override and reprobe the device:
> > 
> > echo > /sys/bus/pci/devices/:03:00.0/driver_override
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Another advantage to this approach is that we can specify a driver
> > override to force a specific binding or prevent any binding.  For
> > instance when an IOMMU group is exposed to userspace through VFIO
> > we require that all devices within that group are owned by VFIO.
> > However, devices can be hot-added into an IOMMU group, in which case
> > we want to prevent the device from binding to any driver (override
> > driver = "none") or perhaps have it automatically bind to vfio-pci.
> > With driver_override it's a simple matter for this field to be set
> > internally when the device is first discovered to prevent driver
> > matches.
> > 
> > Signed-off-by: Alex Williamson 
> > Cc: Greg Kroah-Hartman 
> 
> Greg, are you going to weigh in on this?  It does seem to solve some real
> problems.  ISTR you had an opinion once, but I don't know your current
> thoughts.

This looks good to me:

Acked-by: Greg Kroah-Hartman 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] PCI: Introduce new device binding path using pci_dev.driver_override

2014-05-27 Thread Greg KH

On Tue, May 27, 2014 at 09:07:42PM -0600, Bjorn Helgaas wrote:
> On Tue, May 20, 2014 at 08:53:21AM -0600, Alex Williamson wrote:
> > The driver_override field allows us to specify the driver for a device
> > rather than relying on the driver to provide a positive match of the
> > device.  This shortcuts the existing process of looking up the vendor
> > and device ID, adding them to the driver new_id, binding the device,
> > then removing the ID, but it also provides a couple advantages.
> > 
> > First, the above existing process allows the driver to bind to any
> > device matching the new_id for the window where it's enabled.  This is
> > often not desired, such as the case of trying to bind a single device
> > to a meta driver like pci-stub or vfio-pci.  Using driver_override we
> > can do this deterministically using:
> > 
> > echo pci-stub > /sys/bus/pci/devices/:03:00.0/driver_override
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Previously we could not invoke drivers_probe after adding a device
> > to new_id for a driver as we get non-deterministic behavior whether
> > the driver we intend or the standard driver will claim the device.
> > Now it becomes a deterministic process, only the driver matching
> > driver_override will probe the device.
> > 
> > To return the device to the standard driver, we simply clear the
> > driver_override and reprobe the device:
> > 
> > echo > /sys/bus/pci/devices/:03:00.0/driver_override
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Another advantage to this approach is that we can specify a driver
> > override to force a specific binding or prevent any binding.  For
> > instance when an IOMMU group is exposed to userspace through VFIO
> > we require that all devices within that group are owned by VFIO.
> > However, devices can be hot-added into an IOMMU group, in which case
> > we want to prevent the device from binding to any driver (override
> > driver = "none") or perhaps have it automatically bind to vfio-pci.
> > With driver_override it's a simple matter for this field to be set
> > internally when the device is first discovered to prevent driver
> > matches.
> > 
> > Signed-off-by: Alex Williamson 
> > Cc: Greg Kroah-Hartman 
> 
> Greg, are you going to weigh in on this?  It does seem to solve some real
> problems.  ISTR you had an opinion once, but I don't know your current
> thoughts.

Give me a few more days, still digging through my patch backlog...

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH] PCI: Introduce new device binding path using pci_dev.driver_override

2014-04-01 Thread Greg KH

On Tue, Apr 01, 2014 at 06:52:12PM -0500, Kim Phillips wrote:
> On Tue, 01 Apr 2014 10:28:54 -0600
> Alex Williamson  wrote:
> 
> > The driver_override field allows us to specify the driver for a device
> > rather than relying on the driver to provide a positive match of the
> > device.  This shortcuts the existing process of looking up the vendor
> > and device ID, adding them to the driver new_id, binding the device,
> > then removing the ID, but it also provides a couple advantages.
> > 
> > First, the above process allows the driver to bind to any device
> > matching the new_id for the window where it's enabled.  This is often
> > not desired, such as the case of trying to bind a single device to a
> > meta driver like pci-stub or vfio-pci.  Using driver_override we can
> > do this deterministically using:
> > 
> > echo pci-stub > /sys/bus/pci/devices/:03:00.0/driver_override
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Previously we could not invoke drivers_probe after adding a device
> > to new_id for a driver as we get non-deterministic behavior whether
> > the driver we intend or the standard driver will claim the device.
> > Now it becomes a deterministic process, only the driver matching
> > driver_override will probe the device.
> > 
> > To return the device to the standard driver, we simply clear the
> > driver_override and reprobe the device, ex:
> > 
> > echo > /sys/bus/pci/devices/:03:00.0/preferred_driver
> > echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> > echo :03:00.0 > /sys/bus/pci/drivers_probe
> > 
> > Another advantage to this approach is that we can specify a driver
> > override to force a specific binding or prevent any binding.  For
> > instance when an IOMMU group is exposed to userspace through VFIO
> > we require that all devices within that group are owned by VFIO.
> > However, devices can be hot-added into an IOMMU group, in which case
> > we want to prevent the device from binding to any driver (preferred
> > driver = "none") or perhaps have it automatically bind to vfio-pci.
> > With driver_override it's a simple matter for this field to be set
> > internally when the device is first discovered to prevent driver
> > matches.
> > 
> > Signed-off-by: Alex Williamson 
> > ---
> > 
> > Apologies for the exceptionally long cc list, this is a follow-up to
> > Stuart's "Subject: mechanism to allow a driver to bind to any device"
> > thread.  This is effectively a v2 of the proof-of-concept patch I
> > posted in that thread.  This version changes to use a dummy id struct
> > to return on an "override" match, which removes the collateral damage
> > and greatly simplifies the patch.  This feels fairly well baked for
> > PCI and I would expect that platform drivers could do a similar
> > implementation.  From there perhaps we can discuss whether there's
> > any advantage to placing driver_override on struct device.  The logic
> > for incorporating it into the match still needs to happen per bus
> > driver, so it might only contribute to consistency of the show/store
> > sysfs attributes to move it up to struct device.  Please comment.
> 
> Sounds like Greg likes this approach more than {drv,dev}_sysfs_only.

I have made no such judgement, I only pointed out that if you
modify/add/remove a sysfs file, it needs to have documentation for it.

> The diff below is the result of duplicating and converting this patch
> for platform devices, and, indeed, binding a device to the
> vfio-platform driver succeeds with:
> 
> echo vfio-platform > 
> /sys/bus/platform/devices/fff51000.ethernet/driver_override
> echo fff51000.ethernet > 
> /sys/bus/platform/devices/fff51000.ethernet/driver/unbind
> echo fff51000.ethernet > /sys/bus/platform/drivers_probe
> 
> However, it's almost pure duplication modulo the bus match code.  The
> only other place I can see where to put the common bus check is
> drivers/base/base.h:driver_match_device(), which I'm guessing is
> off-limits?  So should we leave this as per-bus code, and somehow
> refactor driver_override_{show,store}?

If you can provide a way for this to be done in a bus-independant way,
like we did for new_id and the like, I'd be open to reviewing it.


greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH] PCI: Introduce new device binding path using pci_dev.driver_override

2014-04-01 Thread Greg KH

On Tue, Apr 01, 2014 at 10:28:54AM -0600, Alex Williamson wrote:
> The driver_override field allows us to specify the driver for a device
> rather than relying on the driver to provide a positive match of the
> device.  This shortcuts the existing process of looking up the vendor
> and device ID, adding them to the driver new_id, binding the device,
> then removing the ID, but it also provides a couple advantages.
> 
> First, the above process allows the driver to bind to any device
> matching the new_id for the window where it's enabled.  This is often
> not desired, such as the case of trying to bind a single device to a
> meta driver like pci-stub or vfio-pci.  Using driver_override we can
> do this deterministically using:
> 
> echo pci-stub > /sys/bus/pci/devices/:03:00.0/driver_override
> echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> echo :03:00.0 > /sys/bus/pci/drivers_probe
> 
> Previously we could not invoke drivers_probe after adding a device
> to new_id for a driver as we get non-deterministic behavior whether
> the driver we intend or the standard driver will claim the device.
> Now it becomes a deterministic process, only the driver matching
> driver_override will probe the device.
> 
> To return the device to the standard driver, we simply clear the
> driver_override and reprobe the device, ex:
> 
> echo > /sys/bus/pci/devices/:03:00.0/preferred_driver
> echo :03:00.0 > /sys/bus/pci/devices/:03:00.0/driver/unbind
> echo :03:00.0 > /sys/bus/pci/drivers_probe
> 
> Another advantage to this approach is that we can specify a driver
> override to force a specific binding or prevent any binding.  For
> instance when an IOMMU group is exposed to userspace through VFIO
> we require that all devices within that group are owned by VFIO.
> However, devices can be hot-added into an IOMMU group, in which case
> we want to prevent the device from binding to any driver (preferred
> driver = "none") or perhaps have it automatically bind to vfio-pci.
> With driver_override it's a simple matter for this field to be set
> internally when the device is first discovered to prevent driver
> matches.
> 
> Signed-off-by: Alex Williamson 
> ---
> 
> Apologies for the exceptionally long cc list, this is a follow-up to
> Stuart's "Subject: mechanism to allow a driver to bind to any device"
> thread.  This is effectively a v2 of the proof-of-concept patch I
> posted in that thread.  This version changes to use a dummy id struct
> to return on an "override" match, which removes the collateral damage
> and greatly simplifies the patch.  This feels fairly well baked for
> PCI and I would expect that platform drivers could do a similar
> implementation.  From there perhaps we can discuss whether there's
> any advantage to placing driver_override on struct device.  The logic
> for incorporating it into the match still needs to happen per bus
> driver, so it might only contribute to consistency of the show/store
> sysfs attributes to move it up to struct device.  Please comment.
> Thanks,
> 
> Alex
> 
>  drivers/pci/pci-driver.c |   25 ++---
>  drivers/pci/pci-sysfs.c  |   40 
>  include/linux/pci.h  |1 +
>  3 files changed, 63 insertions(+), 3 deletions(-)

No Documentation/ABI/ update to reflect the ABI you are creating?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: mechanism to allow a driver to bind to any device

2014-03-31 Thread Greg KH

On Mon, Mar 31, 2014 at 06:47:51PM +, Stuart Yoder wrote:
> I also, was at the point where I thought we should perhaps just
> go with current mechanisms and implement new_id for the platform
> bus...but Greg's recent response is 'platform devices suck' and it sounds
> like he would reject a new_id patch for the platform bus.  So it kind
> of feels like we are stuck.

ids mean nothing in the platform device model, so having a new_id file
for them makes no sense.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: mechanism to allow a driver to bind to any device

2014-03-27 Thread Greg KH

On Wed, Mar 26, 2014 at 10:39:57PM +0100, Antonios Motakis wrote:
> 
> Of note is that new_id doesn't work particularly well for platform devices.

Nor should it.  Platform devices suck horribly, and "ids" mean nothing
to them, so you shouldn't even try this.  Use a "real" bus and it should
be fine.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

2014-02-20 Thread Greg KH

On Thu, Feb 20, 2014 at 10:34:35PM +, Stuart Yoder wrote:
> 
> 
> > -Original Message-
> > From: Yoder Stuart-B08248
> > Sent: Saturday, February 15, 2014 12:19 PM
> > To: 'Greg KH'
> > Cc: Antonios Motakis; alex.william...@redhat.com;
> > kvm...@lists.cs.columbia.edu; iommu@lists.linux-foundation.org; linux-
> > ker...@vger.kernel.org; t...@virtualopensystems.com;
> > a.r...@virtualopensystems.com; kim.phill...@linaro.org;
> > jan.kis...@siemens.com; k...@vger.kernel.org; Bhushan Bharat-R65777; Wood
> > Scott-B07421; christoffer.d...@linaro.org; ag...@suse.de; Sethi Varun-
> > B16395; will.dea...@arm.com; Tejun Heo; Rafael J. Wysocki; Guenter Roeck;
> > Toshi Kani; Joe Perches; Dmitry Kasatkin; Michal Hocko; Bjorn Helgaas
> > Subject: RE: [RFC PATCH v4 01/10] driver core: export
> > driver_probe_device()
> > 
> > 
> > 
> > > -Original Message-
> > > From: Greg KH [mailto:gre...@linuxfoundation.org]
> > > Sent: Saturday, February 15, 2014 11:34 AM
> > > To: Yoder Stuart-B08248
> > > Cc: Antonios Motakis; alex.william...@redhat.com;
> > > kvm...@lists.cs.columbia.edu; iommu@lists.linux-foundation.org; linux-
> > > ker...@vger.kernel.org; t...@virtualopensystems.com;
> > > a.r...@virtualopensystems.com; kim.phill...@linaro.org;
> > > jan.kis...@siemens.com; k...@vger.kernel.org; Bhushan Bharat-R65777;
> > Wood
> > > Scott-B07421; christoffer.d...@linaro.org; ag...@suse.de; Sethi Varun-
> > > B16395; will.dea...@arm.com; Tejun Heo; Rafael J. Wysocki; Guenter
> > Roeck;
> > > Toshi Kani; Joe Perches; Dmitry Kasatkin; Michal Hocko; Bjorn Helgaas
> > > Subject: Re: [RFC PATCH v4 01/10] driver core: export
> > > driver_probe_device()
> > >
> > > On Sat, Feb 15, 2014 at 04:33:44PM +, Stuart Yoder wrote:
> > > > > > Why?  driver_probe_device() allows a driver to explicitly bind
> > > > > > to a specific device.   What is conceptually wrong with allowing
> > > > > > that?
> > > > >
> > > > > Because that's not how a bus should work, and the fact that no
> > other
> > > > > subsystem in the kernel does that might be a hint you are trying to
> > > do
> > > > > something a bit "wrong" here.
> > > >
> > > > Let me try to succinctly as I can describe the problem we are trying
> > to
> > > > solve here...
> > > >
> > > > The vfio mechanism in the kernel (e.g. vfio-pci) allows devices to be
> > > > exposed user space (via file descriptors), enabling user space
> > > > drivers.  So, for example to export an e1000 card to user space, I do
> > > > this:
> > > >
> > > >echo 0001:03:00.0 >
> > /sys/bus/pci/devices/0001:03:00.0/driver/unbind
> > > >echo 8086 10d3 > /sys/bus/pci/drivers/vfio-pci/new_id
> > >
> > > What's wrong with using the "bind" file instead?  That picks a specific
> > > device and binds it to a specific driver.  Or have we been down this
> > > path before?  :)
> > 
> > Yes we have :)  The "bind" file does not bypass device ID checks, so
> > it wouldn't work without new_id or a wildcard match of some kind.
> > 
> > > And that is for a PCI "driver" not a totally separate bus, which it
> > > looks like you are wanting to do here.
> > 
> > vfio-pci is a PCI driver, not a bus (drivers/vfio/pci/vfio_pci.c).
> > 
> > > > The first step unbinds the target device (0001:03:00.0) from the
> > normal
> > > > e1000 driver.
> > > >
> > > > The second step causes the vfio-pci driver to bind to device
> > > 0001:03:00.0.
> > > > This second step tells vfio-pci that it now handles e1000 device IDs,
> > > > and the vfio-pci drivers registers with the PCI bus to handle '8086
> > > 10d3'.
> > > >
> > > > That works, but it is ugly.  We now have 2 active drivers handling
> > > > the same device type...which introduces various possible race
> > > conditions.
> > > >
> > > > We never want vfio-pci to auto-bind to any new device that shows up
> > > > on the PCI bus.  Binding a device to vfio-pci must be an explicit
> > > > action by an administrator.
> > >
> > > Then use the "bind" file.
> > 
> > See above.
> > 
> > > > You mentioned previously that u

Re: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

2014-02-15 Thread Greg KH

On Sat, Feb 15, 2014 at 04:33:44PM +, Stuart Yoder wrote:
> > > Why?  driver_probe_device() allows a driver to explicitly bind
> > > to a specific device.   What is conceptually wrong with allowing
> > > that?
> > 
> > Because that's not how a bus should work, and the fact that no other
> > subsystem in the kernel does that might be a hint you are trying to do
> > something a bit "wrong" here.
> 
> Let me try to succinctly as I can describe the problem we are trying to
> solve here...
> 
> The vfio mechanism in the kernel (e.g. vfio-pci) allows devices to be
> exposed user space (via file descriptors), enabling user space
> drivers.  So, for example to export an e1000 card to user space, I do
> this:
> 
>echo 0001:03:00.0 > /sys/bus/pci/devices/0001:03:00.0/driver/unbind
>echo 8086 10d3 > /sys/bus/pci/drivers/vfio-pci/new_id

What's wrong with using the "bind" file instead?  That picks a specific
device and binds it to a specific driver.  Or have we been down this
path before?  :)

And that is for a PCI "driver" not a totally separate bus, which it
looks like you are wanting to do here.

> The first step unbinds the target device (0001:03:00.0) from the normal
> e1000 driver.
> 
> The second step causes the vfio-pci driver to bind to device 0001:03:00.0.
> This second step tells vfio-pci that it now handles e1000 device IDs,
> and the vfio-pci drivers registers with the PCI bus to handle '8086 10d3'. 
> 
> That works, but it is ugly.  We now have 2 active drivers handling
> the same device type...which introduces various possible race conditions.
> 
> We never want vfio-pci to auto-bind to any new device that shows up
> on the PCI bus.  Binding a device to vfio-pci must be an explicit
> action by an administrator.

Then use the "bind" file.

> You mentioned previously that user space can sort out the problem
> of multiple drivers registered for handling the same device type.
> That is true, but doesn't help here.   We don't want vfio-pci
> to handle _all_ e1000 cards, just explicitly selected e1000 cards.
> 
> We want the normal e1000 driver to be loaded and to bind to new
> devices that may be hot-plugged.

I want a pony too...

> There are 2 proposed mechanisms that have been put forth, both of
> which you have now rejected:
> 
>1.  sysfs_bind_only flag was proposed which would allow a vfio
>driver (like vfio-pci) to only bind by explicit request through
>the sysfs 'bind' file.

Why did I reject this?  What did the patch look like?

>2.  Have the vfio driver call driver_probe_device() to explicitly bind
>a particular device instance to the driver.  Only change we need
>here is the EXPORT_SYMBOL.

How are you going to prevent the driver from being bound to the device
in the core with this change?  How are you going to call this function?
When?  On what action of the user?

> Are you in principle opposed to any mechanism that would allow 2 drivers
> to be resident/active and allow a sysadmin to explicitly bind a 
> particular device instance to the driver of their choice?

No, that works today with the bind/unbind/new_id files, it's just that
you don't like it :)

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

2014-02-14 Thread Greg KH

On Fri, Feb 14, 2014 at 11:00:31PM +, Stuart Yoder wrote:
> 
> 
> > -Original Message-
> > From: Greg KH [mailto:gre...@linuxfoundation.org]
> > Sent: Friday, February 14, 2014 4:27 PM
> > To: Antonios Motakis
> > Cc: alex.william...@redhat.com; kvm...@lists.cs.columbia.edu;
> > iommu@lists.linux-foundation.org; linux-ker...@vger.kernel.org;
> > t...@virtualopensystems.com; a.r...@virtualopensystems.com; Yoder Stuart-
> > B08248; kim.phill...@linaro.org; jan.kis...@siemens.com;
> > k...@vger.kernel.org; Bhushan Bharat-R65777; Wood Scott-B07421;
> > christoffer.d...@linaro.org; ag...@suse.de; Sethi Varun-B16395;
> > will.dea...@arm.com; Tejun Heo; Rafael J. Wysocki; Guenter Roeck; Toshi
> > Kani; Joe Perches; Dmitry Kasatkin; Michal Hocko; Bjorn Helgaas
> > Subject: Re: [RFC PATCH v4 01/10] driver core: export
> > driver_probe_device()
> > 
> > On Sat, Feb 08, 2014 at 06:29:31PM +0100, Antonios Motakis wrote:
> > > From: Kim Phillips 
> > >
> > > Needed by drivers, such as the vfio platform driver [1], seeking to
> > > bypass bind_store()'s driver_match_device(), and bind to any device
> > > via a private sysfs bind file.
> > >
> > > [1] https://lkml.org/lkml/2013/12/11/522
> > >
> > > note: the EXPORT_SYMBOL is needed because vfio-platform can be built
> > > as a module.
> > 
> > No code outside of drivers/base/ should be calling this function
> 
> Why?  driver_probe_device() allows a driver to explicitly bind
> to a specific device.   What is conceptually wrong with allowing
> that?

Because that's not how a bus should work, and the fact that no other
subsystem in the kernel does that might be a hint you are trying to do
something a bit "wrong" here.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC PATCH v4 01/10] driver core: export driver_probe_device()

2014-02-14 Thread Greg KH

On Sat, Feb 08, 2014 at 06:29:31PM +0100, Antonios Motakis wrote:
> From: Kim Phillips 
> 
> Needed by drivers, such as the vfio platform driver [1], seeking to
> bypass bind_store()'s driver_match_device(), and bind to any device
> via a private sysfs bind file.
> 
> [1] https://lkml.org/lkml/2013/12/11/522
> 
> note: the EXPORT_SYMBOL is needed because vfio-platform can be built
> as a module.

No code outside of drivers/base/ should be calling this function, you
are doing something wrong in your bus if you want to do this, please fix
your bus code.

sorry, I can't accept this at all.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCHv7 04/12] driver/core: populate devices in order for IOMMUs

2013-12-12 Thread Greg KH

On Thu, Dec 12, 2013 at 11:39:20AM +, Grant Likely wrote:
> On Thu, 12 Dec 2013 09:57:05 +0200, Hiroshi Doyu  wrote:
> > IOMMU devices on the bus need to be poplulated first, then iommu
> > master devices are done later.
> > 
> > With CONFIG_OF_IOMMU, "iommus=" DT binding would be used to identify
> > whether a device can be an iommu msater or not. If a device can, we'll
> > defer to populate that device till an iommu device is populated. Then,
> > those deferred iommu master devices are populated and configured with
> > help of the already populated iommu device.
> > 
> > Signed-off-by: Hiroshi Doyu 
> > Cc: Greg Kroah-Hartman 
> > ---
> > This is related to the following discussion:
> >   [RFC PATCH] Documentation: devicetree: add description for generic bus 
> > properties
> >   
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/215042.html
> > 
> > v6:
> > Spinned off only driver core part from:
> >   [PATCHv5 2/9] driver/core: populate devices in order for IOMMUs
> > 
> > v5:
> > Use "iommus=" binding instread of arm,smmu's "#stream-id-cells".
> > 
> > v4:
> > This is newly added, and the successor of the following RFC:
> >   [RFC][PATCHv3+ 1/2] driver/core: Add of_iommu_attach()
> >   http://lists.linuxfoundation.org/pipermail/iommu/2013-November/006914.html
> > 
> > Signed-off-by: Hiroshi Doyu 
> > ---
> >  drivers/base/dd.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 0605176..0605f52 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -25,6 +25,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include "base.h"
> >  #include "power/power.h"
> > @@ -273,6 +274,10 @@ static int really_probe(struct device *dev, struct 
> > device_driver *drv)
> >  
> > dev->driver = drv;
> >  
> > +   ret = of_iommu_attach(dev);
> > +   if (ret)
> > +   goto probe_failed;
> > +
> 
> As discussed before, I really don't think hooking in to dd.c is the
> right thing to do here, and certainly not as a device tree specific
> function. ACPI or PCI described devices may have the same constraints
> and those won't have DT descriptions.

I agree, this shouldn't be in the driver core.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH] intel-iommu: Fix leaks in pagetable freeing

2013-10-05 Thread Greg KH

On Wed, Oct 02, 2013 at 10:44:31AM +0200, Borislav Petkov wrote:
> On Sat, Jun 15, 2013 at 10:27:19AM -0600, Alex Williamson wrote:
> > At best the current code only seems to free the leaf pagetables and
> > the root.  If you're unlucky enough to have a large gap (like any
> > QEMU guest with more than 3G of memory), only the first chunk of leaf
> > pagetables are freed (plus the root).  This is a massive memory leak.
> > This patch re-writes the pagetable freeing function to use a
> > recursive algorithm and manages to not only free all the pagetables,
> > but does it without any apparent performance loss versus the current
> > broken version.
> > 
> > Signed-off-by: Alex Williamson 
> > Cc: sta...@vger.kernel.org
> > ---
> > 
> > Suggesting for stable, would like to see some soak time, but it's
> > hard to imagine this being any worse than the current code.
> 
> Btw, I have a backport for the 3.0.x series which builds fine here, in
> case you guys are interested :)

Thanks, now applied.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: IO_PAGE_FAULTs on unity mapped regions during amd_iommu_init() in Linux 3.4

2013-02-11 Thread Greg KH

On Wed, Feb 06, 2013 at 07:40:50PM -0700, Shuah Khan wrote:
> On Wed, 2013-02-06 at 13:12 +0100, Joerg Roedel wrote:
> > On Tue, Feb 05, 2013 at 06:57:21AM -0700, Shuah Khan wrote:
> > > Thanks much. I will hang on to this test system for testing your fix.
> > 
> > Okay, here is the simple fix for v3.8-rc6. I guess it is not
> > straighforward to port it to v3.4, but it should be doable.
> > 
> > From 2ecf57c85e67e0243b36b787d0490c0b47202ba8 Mon Sep 17 00:00:00 2001
> > From: Joerg Roedel 
> > Date: Wed, 6 Feb 2013 12:55:23 +0100
> > Subject: [PATCH] iommu/amd: Initialize device table after dma_ops
> > 
> > When dma_ops are initialized the unity mappings are
> > created. The init_device_table_dma() function makes sure DMA
> > from all devices is blocked by default. This opens a short
> > window in time where DMA to unity mapped regions is blocked
> > by the IOMMU. Make sure this does not happen by initializing
> > the device table after dma_ops.
> > 
> > Signed-off-by: Joerg Roedel 
> 
> Joerg,
> 
> I tested your patch on 3.8. I was able to reproduce the problem and then
> apply your patch to verify that the problem is fixed. This patch applies
> cleanly to 3.7.6, however I could not reproduce the problem on 3.7.6
> without the patch. But the window exists on 3.7 as well. Your patch can
> be applied to 3.7.6 as is.
> 
> I back-ported the patch to 3.4 and 3.0 and tested. I am sending those
> patches after this email.
> 
> On 3.4.29 and 3.0.62 I was able to reproduce the problem and then
> applied the back-ported patch to verify that the problem is fixed.
> 
> Thanks again for the fix.

I'm lost here, why isn't this patch in Linus's tree already?  You seem
to be sending backports for something that isn't there to backport yet.

Please resend these patches to sta...@vger.kernel.org, with the upstream
git commit id, when they are in Linus's tree, there's nothing I can do
with them for now, sorry.

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/1] iommu/tegra: smmu: Fix deadly typo

2012-10-25 Thread Greg KH

On Thu, Oct 25, 2012 at 09:06:02AM +0300, Hiroshi Doyu wrote:
> On Wed, 24 Oct 2012 17:01:21 +0200
> Joerg Roedel  wrote:
> 
> > On Thu, Oct 18, 2012 at 08:35:10AM +0300, Hiroshi Doyu wrote:
> > > From: Hiro Sugawara 
> > > 
> > > Fix a deadly typo in macro definition.
> > > 
> > > Signed-off-by: Hiro Sugawara 
> > > Signed-off-by: Hiroshi Doyu 
> > 
> > Applied, thanks.
> 
> Putting stable@ on To:. The original ptach attached.



This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read Documentation/stable_kernel_rules.txt
for how to do this properly.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 01/13] driver core: Add iommu_group tracking to struct device

2012-05-16 Thread Greg KH

On Fri, May 11, 2012 at 05:58:01PM -0600, Alex Williamson wrote:
> On Fri, 2012-05-11 at 16:38 -0700, Greg KH wrote:
> > On Fri, May 11, 2012 at 04:55:35PM -0600, Alex Williamson wrote:
> > > IOMMU groups allow IOMMU drivers to represent DMA visibility
> > > and isolation of devices.  Multiple devices may be grouped
> > > together for the purposes of DMA.  Placing a pointer on
> > > struct device enable easy access for things like streaming
> > > DMA programming and drivers like VFIO.
> > > 
> > > Signed-off-by: Alex Williamson 
> > 
> > Can't you get this today from the iommu_ops pointer that is on the bus
> > that the device is associated with?  Or can devices on a bus have
> > different iommu_group pointers?
> 
> The latter, each device on a bus might be it's own group.  This is often
> the case on x86 unless PCIe-to-PCI bridges obscure the device
> visibility.  Thanks,

Ah, ok, then I have no objection to add this to struct device:

Acked-by: Greg Kroah-Hartman 

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 02/13] iommu: IOMMU Groups

2012-05-16 Thread Greg KH

On Fri, May 11, 2012 at 04:55:41PM -0600, Alex Williamson wrote:
> IOMMU device groups are currently a rather vague associative notion
> with assembly required by the user or user level driver provider to
> do anything useful.  This patch intends to grow the IOMMU group concept
> into something a bit more consumable.
> 
> To do this, we first create an object representing the group, struct
> iommu_group.  This structure is allocated (iommu_group_alloc) and
> filled (iommu_group_add_device) by the iommu driver.  The iommu driver
> is free to add devices to the group using it's own set of policies.
> This allows inclusion of devices based on physical hardware or topology
> limitations of the platform, as well as soft requirements, such as
> multi-function trust levels or peer-to-peer protection of the
> interconnects.  Each device may only belong to a single iommu group,
> which is linked from struct device.iommu_group.  IOMMU groups are
> maintained using kobject reference counting, allowing for automatic
> removal of empty, unreferenced groups.  It is the responsibility of
> the iommu driver to remove devices from the group
> (iommu_group_remove_device).
> 
> IOMMU groups also include a userspace representation in sysfs under
> /sys/kernel/iommu_groups.  When allocated, each group is given a
> dynamically assign ID (int).  The ID is managed by the core IOMMU group
> code to support multiple heterogeneous iommu drivers, which could
> potentially collide in group naming/numbering.  This also keeps group
> IDs to small, easily managed values.  A directory is created under
> /sys/kernel/iommu_groups for each group.  A further subdirectory named
> "devices" contains links to each device within the group.  The iommu_group
> file in the device's sysfs directory, which formerly contained a group
> number when read, is now a link to the iommu group.  Example:
> 
> $ ls -l /sys/kernel/iommu_groups/26/devices/



As you are creating new sysfs files and directories, you need to also
add the proper Documentation/ABI/ files at the same time.

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 01/13] driver core: Add iommu_group tracking to struct device

2012-05-16 Thread Greg KH

On Fri, May 11, 2012 at 04:55:35PM -0600, Alex Williamson wrote:
> IOMMU groups allow IOMMU drivers to represent DMA visibility
> and isolation of devices.  Multiple devices may be grouped
> together for the purposes of DMA.  Placing a pointer on
> struct device enable easy access for things like streaming
> DMA programming and drivers like VFIO.
> 
> Signed-off-by: Alex Williamson 

Can't you get this today from the iommu_ops pointer that is on the bus
that the device is associated with?  Or can devices on a bus have
different iommu_group pointers?

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

99 matches

Mail list logo