Re: [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug
Hi Janusz, On 9/3/19 3:41 PM, Janusz Krzysztofik wrote: Hi Baolu, On Tuesday, September 3, 2019 3:29:40 AM CEST Lu Baolu wrote: Hi Janusz, On 9/2/19 4:37 PM, Janusz Krzysztofik wrote: I am not saying that keeping data is not acceptable. I just want to check whether there are any other solutions. Then reverting 458b7c8e0dde and applying this patch still resolves the issue for me. No errors appear when mappings are unmapped on device close after the device has been removed, and domain info preserved on device removal is successfully reused on device re-plug. This patch doesn't look good to me although I agree that keeping data is acceptable. It updates dev->archdata.iommu, but leaves the hardware context/pasid table unchanged. This might cause problems somewhere. Is there anything else I can do to help? Can you please tell me how to reproduce the problem? The most simple way to reproduce the issue, assuming there are no non-Intel graphics adapters installed, is to run the following shell commands: #!/bin/sh # load i915 module modprobe i915 # open an i915 device and keep it open in background cat /dev/dri/card0 >/dev/null & sleep 2 # simulate device unplug echo 1 >/sys/class/drm/card0/device/remove # make the background process close the device on exit kill $! I tried to reproduce the issue with above instructions. I got below kernel messages after above operation. Is it the same as what you've seen? I can't find anything explicitly related to iommu except an IOMMU fault generated after device got removed and the DMA translation structures got torn down. Can you please help me to understand how IOMMU driver triggers the issue? [ 182.217672] [ cut here ] [ 182.217679] WARNING: CPU: 1 PID: 1285 at drivers/gpu/drm/drm_mode_config.c:495 drm_mode_config_cleanup+0x2cb/0x2e0 [ 182.217680] Modules linked in: nls_iso8859_1 snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core intel_rapl_msr snd_hda_codec_hdmi snd_hda_codec_realtek snd_compress snd_hda_codec_generic ledtrig_audio ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep snd_pcm intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_seq_midi irqbypass snd_seq_midi_event snd_rawmidi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlmvm snd_seq mac80211 libarc4 aesni_intel snd_seq_device snd_timer crypto_simd ipu3_cio2 cryptd glue_helper iwlwifi v4l2_fwnode intel_cstate videobuf2_dma_sg videobuf2_memops intel_rapl_perf asix videobuf2_v4l2 videobuf2_common usbnet mii cfg80211 input_leds serio_raw intel_wmi_thunderbolt snd mei_me videodev soundcore intel_xhci_usb_role_switch mei 8250_dw mc roles intel_pch_thermal hid_sensor_magn_3d hid_sensor_accel_3d hid_sensor_press [ 182.217709] hid_sensor_incl_3d hid_sensor_als hid_sensor_rotation hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio intel_vbtn mac_hid intel_hid acpi_pad sparse_keymap sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables hid_sensor_custom hid_sensor_hub intel_ishtp_hid hid_generic dwc3 ulpi udc_core i2c_designware_platform i2c_designware_core e1000e psmouse i2c_i801 tg3 usbhid dwc3_pci hid intel_ish_ipc intel_lpss_pci intel_ishtp intel_lpss wmi pinctrl_sunrisepoint pinctrl_intel [ 182.217727] CPU: 1 PID: 1285 Comm: bash Not tainted 5.4.0-rc2+ #2704 [ 182.217728] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B105.B00.1512051901 12/05/2015 [ 182.217731] RIP: 0010:drm_mode_config_cleanup+0x2cb/0x2e0 [ 182.217732] Code: eb 0c 48 8b 70 48 4c 89 e7 e8 31 f4 ff ff 48 8d 7d a0 e8 d8 95 ff ff 48 85 c0 75 e6 48 8d 7d a0 e8 1a 87 ff ff e9 ef fd ff ff <0f> 0b e9 ed fe ff ff 0f 0b eb 98 e8 05 ec 98 ff 0f 1f 44 00 00 0f [ 182.217733] RSP: 0018:a1c481d37c78 EFLAGS: 00010216 [ 182.217735] RAX: 95c7da326e08 RBX: 95c7dcf6 RCX: 801e [ 182.217735] RDX: 95c7dcf603b8 RSI: 801e RDI: 95c7dcf60390 [ 182.217736] RBP: a1c481d37cd8 R08: R09: 9e035e00 [ 182.217737] R10: a1c481d37be0 R11: 0001 R12: 95c7dcf60250 [ 182.217737] R13: 95c7dcf603b8 R14: a1c481d37ee8 R15: fff2 [ 182.217739] FS: 7f804af7f740() GS:95c7e7a8() knlGS: [ 182.217739] CS: 0010 DS: ES: CR0: 80050033 [ 182.217740] CR2: 7ffebc530cec CR3: 000137e0e003 CR4: 003606e0 [ 182.217741] DR0: DR1: DR2: [ 182.217742] DR3: DR6: fffe0ff0 DR7: 0400 [ 182.217742] Call Trace: [ 182.217748] ? _cond_resched+0x19/0x40 [ 182.217752] intel_modeset_driver_remove+0xd1/0x150 [ 182.217754] i915_driver_remove+0xb8/0x120 [ 182.217756] i915_pci_remove+0
Re: [PATCH v2] dma-mapping: Move vmap address checks into dma_map_single()
Hi Kees, I love your patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.4-rc2 next-20191010] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Kees-Cook/dma-mapping-Move-vmap-address-checks-into-dma_map_single/20191005-073954 config: sh-magicpanelr2_defconfig (attached as .config) compiler: sh4-linux-gcc (GCC) 7.4.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.4.0 make.cross ARCH=sh If you fix the issue, kindly add following tag Reported-by: kbuild test robot All errors (new ones prefixed by >>): In file included from arch/sh/include/asm/bug.h:112:0, from include/linux/bug.h:5, from include/linux/thread_info.h:12, from include/asm-generic/preempt.h:5, from ./arch/sh/include/generated/asm/preempt.h:1, from include/linux/preempt.h:78, from include/linux/spinlock.h:51, from include/linux/seqlock.h:36, from include/linux/time.h:6, from include/linux/stat.h:19, from include/linux/module.h:10, from arch/sh/kernel/io.c:8: include/linux/dma-mapping.h: In function 'dma_map_single_attrs': >> include/linux/dma-mapping.h:588:9: error: format '%lu' expects argument of >> type 'long unsigned int', but argument 4 has type 'size_t {aka unsigned >> int}' [-Werror=format=] "%s %s: driver maps %lu bytes from vmalloc area\n", ^ include/asm-generic/bug.h:92:17: note: in definition of macro '__WARN_printf' __warn_printk(arg); \ ^~~ include/asm-generic/bug.h:155:3: note: in expansion of macro 'WARN' WARN(1, format);\ ^~~~ include/linux/dma-mapping.h:587:6: note: in expansion of macro 'WARN_ONCE' if (WARN_ONCE(is_vmalloc_addr(ptr), ^ cc1: all warnings being treated as errors vim +588 include/linux/dma-mapping.h 582 583 static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, 584 size_t size, enum dma_data_direction dir, unsigned long attrs) 585 { 586 /* DMA must never operate on areas that might be remapped. */ 587 if (WARN_ONCE(is_vmalloc_addr(ptr), > 588"%s %s: driver maps %lu bytes from vmalloc > area\n", 589dev ? dev_driver_string(dev) : "unknown driver", 590dev ? dev_name(dev) : "unknown device", size)) 591 return DMA_MAPPING_ERROR; 592 593 debug_dma_map_single(dev, ptr, size); 594 return dma_map_page_attrs(dev, virt_to_page(ptr), offset_in_page(ptr), 595 size, dir, attrs); 596 } 597 --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH v3 0/6] ACPI / utils: add new helper for HID/UID match
On Tuesday, October 1, 2019 4:27:19 PM CEST Andy Shevchenko wrote: > There are few users outside of ACPI realm that re-introduce a custom > solution to match ACPI device against HID/UID. Add a generic helper for > them. > > The series is supposed to go via linux-pm tree. > > In v3: > - correct logic in sdhci-acpi for qcom devices (Adrian) > - add Mika's Ack > > In v2: > - add patch 2 due to latent issue in the header (lkp) > - get rid of match_hid_uid() completely in patch 6 > > Andy Shevchenko (6): > ACPI / utils: Describe function parameters in kernel-doc > ACPI / utils: Move acpi_dev_get_first_match_dev() under CONFIG_ACPI > ACPI / utils: Introduce acpi_dev_hid_uid_match() helper > ACPI / LPSS: Switch to use acpi_dev_hid_uid_match() > mmc: sdhci-acpi: Switch to use acpi_dev_hid_uid_match() > iommu/amd: Switch to use acpi_dev_hid_uid_match() > > drivers/acpi/acpi_lpss.c | 21 +++ > drivers/acpi/utils.c | 32 +++ > drivers/iommu/amd_iommu.c | 30 - > drivers/mmc/host/sdhci-acpi.c | 49 --- > include/acpi/acpi_bus.h | 8 +++--- > include/linux/acpi.h | 6 + > 6 files changed, 67 insertions(+), 79 deletions(-) > > Applying the series (with the tags given so far) as 5.5 material, thanks! ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] iommu/arm-smmu: Add an optional "input-address-size" property
On 2019-10-11 4:46 am, Nicolin Chen wrote: This series of patches add an optional DT property to allow an SoC to specify how many bits being physically connected to its SMMU instance, depending on the SoC design. This has come up before, and it doesn't work in general because a single SMMU instance can have many master interfaces, with potentially different sizes of address bus wired up to each. It's also a conceptually-wrong approach anyway, since this isn't a property of the SMMU; it's a property of the interconnect(s) upstream of the SMMU. IIRC you were working on Tegra - if so, Thierry already has a plan, see this thread: https://lore.kernel.org/linux-arm-kernel/20190930133510.GA1904140@ulmo/ Robin. Nicolin Chen (2): dt-bindings: arm-smmu: Add an optional "input-address-size" property iommu/arm-smmu: Read optional "input-address-size" property Documentation/devicetree/bindings/iommu/arm,smmu.txt | 7 +++ drivers/iommu/arm-smmu.c | 10 -- 2 files changed, 15 insertions(+), 2 deletions(-) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 1/2] dma-mapping: Add vmap checks to dma_map_single()
On 2019-10-11 6:02 am, Greg Kroah-Hartman wrote: On Thu, Oct 10, 2019 at 03:28:28PM -0700, Kees Cook wrote: As we've seen from USB and other areas[1], we need to always do runtime checks for DMA operating on memory regions that might be remapped. This adds vmap checks (similar to those already in USB but missing in other places) into dma_map_single() so all callers benefit from the checking. [1] https://git.kernel.org/linus/3840c5b78803b2b6cc1ff820100a74a092c40cbb Suggested-by: Laura Abbott Signed-off-by: Kees Cook --- include/linux/dma-mapping.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 4a1c4fca475a..ff4e91c66f44 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -583,6 +583,12 @@ static inline unsigned long dma_get_merge_boundary(struct device *dev) static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, size_t size, enum dma_data_direction dir, unsigned long attrs) { + /* DMA must never operate on areas that might be remapped. */ + if (unlikely(is_vmalloc_addr(ptr))) { + dev_warn_once(dev, "bad map: %zu bytes in vmalloc\n", size); Can we get a bit better error text here? In USB we were at least giving people a hint as to what went wrong, "bad map" might not really make that much sense to a USB developer as to what they needed to do to fix this issue. Agreed, something along the lines of "dma_map_single of vmalloc'ed buffer..." might be clearer. Also I'd be inclined to use dev_WARN_ONCE() to include a diagnostically-useful backtrace, as the existing USB code would. Thanks, Robin.
Re: [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug
Hi Baolu, On Friday, October 11, 2019 8:54:09 AM CEST Lu Baolu wrote: > Hi Janusz, > > On 9/3/19 3:41 PM, Janusz Krzysztofik wrote: > > Hi Baolu, > > > > On Tuesday, September 3, 2019 3:29:40 AM CEST Lu Baolu wrote: > >> Hi Janusz, > >> > >> On 9/2/19 4:37 PM, Janusz Krzysztofik wrote: > I am not saying that keeping data is not acceptable. I just want to > check whether there are any other solutions. > >>> Then reverting 458b7c8e0dde and applying this patch still resolves the > > issue > >>> for me. No errors appear when mappings are unmapped on device close after > > the > >>> device has been removed, and domain info preserved on device removal is > >>> successfully reused on device re-plug. > >> This patch doesn't look good to me although I agree that keeping data is > >> acceptable. It updates dev->archdata.iommu, but leaves the hardware > >> context/pasid table unchanged. This might cause problems somewhere. > >> > >>> Is there anything else I can do to help? > >> Can you please tell me how to reproduce the problem? > > The most simple way to reproduce the issue, assuming there are no non-Intel > > graphics adapters installed, is to run the following shell commands: > > > > #!/bin/sh > > # load i915 module > > modprobe i915 > > # open an i915 device and keep it open in background > > cat /dev/dri/card0 >/dev/null & > > sleep 2 > > # simulate device unplug > > echo 1 >/sys/class/drm/card0/device/remove > > # make the background process close the device on exit > > kill $! > > > > I tried to reproduce the issue with above instructions. I got below > kernel messages after above operation. Is it the same as what you've > seen? I can't find anything explicitly related to iommu except an IOMMU > fault generated after device got removed and the DMA translation > structures got torn down. Can you please help me to understand how IOMMU > driver triggers the issue? > > > [ 182.217672] [ cut here ] > [ 182.217679] WARNING: CPU: 1 PID: 1285 at > drivers/gpu/drm/drm_mode_config.c:495 drm_mode_config_cleanup+0x2cb/0x2e0 > [ 182.217680] Modules linked in: nls_iso8859_1 snd_soc_skl > snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core > snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core intel_rapl_msr > snd_hda_codec_hdmi snd_hda_codec_realtek snd_compress > snd_hda_codec_generic ledtrig_audio ac97_bus snd_pcm_dmaengine > snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep > snd_pcm intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp > kvm_intel kvm snd_seq_midi irqbypass snd_seq_midi_event snd_rawmidi > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel iwlmvm snd_seq > mac80211 libarc4 aesni_intel snd_seq_device snd_timer crypto_simd > ipu3_cio2 cryptd glue_helper iwlwifi v4l2_fwnode intel_cstate > videobuf2_dma_sg videobuf2_memops intel_rapl_perf asix videobuf2_v4l2 > videobuf2_common usbnet mii cfg80211 input_leds serio_raw > intel_wmi_thunderbolt snd mei_me videodev soundcore > intel_xhci_usb_role_switch mei 8250_dw mc roles intel_pch_thermal > hid_sensor_magn_3d hid_sensor_accel_3d hid_sensor_press > [ 182.217709] hid_sensor_incl_3d hid_sensor_als hid_sensor_rotation > hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer > kfifo_buf hid_sensor_iio_common industrialio intel_vbtn mac_hid > intel_hid acpi_pad sparse_keymap sch_fq_codel parport_pc ppdev lp > parport ip_tables x_tables hid_sensor_custom hid_sensor_hub > intel_ishtp_hid hid_generic dwc3 ulpi udc_core i2c_designware_platform > i2c_designware_core e1000e psmouse i2c_i801 tg3 usbhid dwc3_pci hid > intel_ish_ipc intel_lpss_pci intel_ishtp intel_lpss wmi > pinctrl_sunrisepoint pinctrl_intel > [ 182.217727] CPU: 1 PID: 1285 Comm: bash Not tainted 5.4.0-rc2+ #2704 > [ 182.217728] Hardware name: Intel Corporation Skylake Client > platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B105.B00.1512051901 > 12/05/2015 > [ 182.217731] RIP: 0010:drm_mode_config_cleanup+0x2cb/0x2e0 > [ 182.217732] Code: eb 0c 48 8b 70 48 4c 89 e7 e8 31 f4 ff ff 48 8d 7d > a0 e8 d8 95 ff ff 48 85 c0 75 e6 48 8d 7d a0 e8 1a 87 ff ff e9 ef fd ff > ff <0f> 0b e9 ed fe ff ff 0f 0b eb 98 e8 05 ec 98 ff 0f 1f 44 00 00 0f > [ 182.217733] RSP: 0018:a1c481d37c78 EFLAGS: 00010216 > [ 182.217735] RAX: 95c7da326e08 RBX: 95c7dcf6 RCX: > 801e > [ 182.217735] RDX: 95c7dcf603b8 RSI: 801e RDI: > 95c7dcf60390 > [ 182.217736] RBP: a1c481d37cd8 R08: R09: > 9e035e00 > [ 182.217737] R10: a1c481d37be0 R11: 0001 R12: > 95c7dcf60250 > [ 182.217737] R13: 95c7dcf603b8 R14: a1c481d37ee8 R15: > fff2 > [ 182.217739] FS: 7f804af7f740() GS:95c7e7a8() > knlGS: > [ 182.217739] CS: 0010 DS: ES: CR0: 80050033 > [ 182.217740] CR2: 7ffebc530cec CR3: 000137e0e003 CR4: > 003606e0 > [ 182.217741] DR0: 000
Re: [PATCH 1/3] iommu/ipmmu-vmsa: Remove some unused register declarations
Hi Shimoda-san, Thanks for your patch! On Wed, Oct 9, 2019 at 10:27 AM Yoshihiro Shimoda wrote: > To support different registers memory mapping hardware easily > in the future, this patch removes some unused register > declarations. > > Signed-off-by: Yoshihiro Shimoda Reviewed-by: Geert Uytterhoeven While I can confirm the removed definitions are unused, they were still valid (but see comments below). Perhaps it would be better to add comments, to state clearly to which SoCs or SoC families they apply? Or do you think this would be futile, and would add too much clutter to the source file in the near future? > --- a/drivers/iommu/ipmmu-vmsa.c > +++ b/drivers/iommu/ipmmu-vmsa.c > @@ -104,8 +104,6 @@ static struct ipmmu_vmsa_device *to_ipmmu(struct device > *dev) > #define IMCTR 0x > #define IMCTR_TRE (1 << 17) > #define IMCTR_AFE (1 << 16) > -#define IMCTR_RTSEL_MASK (3 << 4) FWIW, this is valid for R-Car Gen2 only. On R-Car Gen3, the field contains 3 bits. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH 2/3] iommu/ipmmu-vmsa: Calculate context registers' offset instead of a macro
Hi Shimoda-san, On Wed, Oct 9, 2019 at 10:27 AM Yoshihiro Shimoda wrote: > Since we will have changed memory mapping of the IPMMU in the future, > this patch uses ipmmu_features values instead of a macro to > calculate context registers offset. No behavior change. > > Signed-off-by: Yoshihiro Shimoda Thanks for your patch! > --- a/drivers/iommu/ipmmu-vmsa.c > +++ b/drivers/iommu/ipmmu-vmsa.c > @@ -50,6 +50,8 @@ struct ipmmu_features { > bool twobit_imttbcr_sl0; > bool reserved_context; > bool cache_snoop; > + u32 ctx_offset_base; > + u32 ctx_offset_stride; > }; > > struct ipmmu_vmsa_device { > @@ -99,8 +101,6 @@ static struct ipmmu_vmsa_device *to_ipmmu(struct device > *dev) > > #define IM_NS_ALIAS_OFFSET 0x800 > > -#define IM_CTX_SIZE0x40 > - > #define IMCTR 0x > #define IMCTR_TRE (1 << 17) > #define IMCTR_AFE (1 << 16) > @@ -253,18 +253,25 @@ static void ipmmu_write(struct ipmmu_vmsa_device *mmu, > unsigned int offset, > iowrite32(data, mmu->base + offset); > } > > +static u32 ipmmu_ctx_reg(struct ipmmu_vmsa_device *mmu, unsigned int > context_id, > +unsigned int reg) > +{ > + return mmu->features->ctx_offset_base + > + context_id * mmu->features->ctx_offset_stride + reg; > +} > + > static u32 ipmmu_ctx_read_root(struct ipmmu_vmsa_domain *domain, >unsigned int reg) > { > return ipmmu_read(domain->mmu->root, > - domain->context_id * IM_CTX_SIZE + reg); > + ipmmu_ctx_reg(domain->mmu, domain->context_id, > reg)); For consistency: ipmmu_ctx_reg(domain->mmu->root, ...) but in practice the features for domain->mmu and domain->mmu->root are identical anyway. > } > > static void ipmmu_ctx_write_root(struct ipmmu_vmsa_domain *domain, > unsigned int reg, u32 data) > { > ipmmu_write(domain->mmu->root, > - domain->context_id * IM_CTX_SIZE + reg, data); > + ipmmu_ctx_reg(domain->mmu, domain->context_id, reg), > data); Likewise: ipmmu_ctx_reg(domain->mmu->root, ...)? I find these ipmmu_{read,write}() a bit hard too read, with passing the mmu to both ipmmu_{read,write}() and ipmmu_ctx_reg(). What do you think about providing two helpers ipmmu_ctx_{read,write}(), so all users can just use e.g. ipmmu_ctx_write(mmu, context_id, reg, data); instead of ipmmu_write(mmu, ipmmu_ctx_reg(mmu, context_id, reg), data); ? Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH 3/3] iommu/ipmmu-vmsa: Add utlb_offset_base
Hi Shimoda-san, On Wed, Oct 9, 2019 at 10:27 AM Yoshihiro Shimoda wrote: > Since we will have changed memory mapping of the IPMMU in the future, > this patch adds a utlb_offset_base into struct ipmmu_features > for IMUCTR and IMUASID registers. > No behavior change. > > Signed-off-by: Yoshihiro Shimoda Thanks for your patch! > --- a/drivers/iommu/ipmmu-vmsa.c > +++ b/drivers/iommu/ipmmu-vmsa.c > @@ -52,6 +52,7 @@ struct ipmmu_features { > bool cache_snoop; > u32 ctx_offset_base; > u32 ctx_offset_stride; > + u32 utlb_offset_base; > }; > > struct ipmmu_vmsa_device { > @@ -285,6 +286,11 @@ static void ipmmu_ctx_write_all(struct ipmmu_vmsa_domain > *domain, > ipmmu_ctx_write_root(domain, reg, data); > } > > +static u32 ipmmu_utlb_reg(struct ipmmu_vmsa_device *mmu, unsigned int reg) > +{ > + return mmu->features->utlb_offset_base + reg; > +} > + > /* > - > * TLB and microTLB Management > */ > @@ -330,9 +336,9 @@ static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain > *domain, > */ > > /* TODO: What should we set the ASID to ? */ > - ipmmu_write(mmu, IMUASID(utlb), 0); > + ipmmu_write(mmu, ipmmu_utlb_reg(mmu, IMUASID(utlb)), 0); > /* TODO: Do we need to flush the microTLB ? */ > - ipmmu_write(mmu, IMUCTR(utlb), > + ipmmu_write(mmu, ipmmu_utlb_reg(mmu, IMUCTR(utlb)), > IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_FLUSH | > IMUCTR_MMUEN); Like in [PATCH 2/3], I think providing two helpers would make this more readable: ipmmu_imuasid_write(mmu, utlb, 0); ipmmu_imuctr_write(mmu, utlb, data); Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
Re: [PATCH v2 3/4] iommu/mediatek: Use writel for TLB range invalidation
On Wed, Oct 09, 2019 at 09:19:02PM +0800, Yong Wu wrote: > Use writel for the register F_MMU_INV_RANGE which is for triggering the > HW work. We expect all the setting(iova_start/iova_end...) have already > been finished before F_MMU_INV_RANGE. > > Signed-off-by: Anan.Sun > Signed-off-by: Yong Wu > --- > This is a improvement rather than fixing a issue. > --- > drivers/iommu/mtk_iommu.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > index 24a13a6..607f92c 100644 > --- a/drivers/iommu/mtk_iommu.c > +++ b/drivers/iommu/mtk_iommu.c > @@ -187,8 +187,7 @@ static void mtk_iommu_tlb_add_flush(unsigned long iova, > size_t size, > writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A); > writel_relaxed(iova + size - 1, > data->base + REG_MMU_INVLD_END_A); > - writel_relaxed(F_MMU_INV_RANGE, > -data->base + REG_MMU_INVALIDATE); > + writel(F_MMU_INV_RANGE, data->base + REG_MMU_INVALIDATE); I don't understand this change. Why is it an "improvement" and which accesses are you ordering with the writel? Will
Re: [PATCH 0/2] iommu/arm-smmu: Add an optional "input-address-size" property
On Fri, Oct 11, 2019 at 10:16:28AM +0100, Robin Murphy wrote: > On 2019-10-11 4:46 am, Nicolin Chen wrote: > > This series of patches add an optional DT property to allow an SoC to > > specify how many bits being physically connected to its SMMU instance, > > depending on the SoC design. > > This has come up before, and it doesn't work in general because a single > SMMU instance can have many master interfaces, with potentially different > sizes of address bus wired up to each. It's also a conceptually-wrong > approach anyway, since this isn't a property of the SMMU; it's a property of > the interconnect(s) upstream of the SMMU. > > IIRC you were working on Tegra - if so, Thierry already has a plan, see this > thread: > https://lore.kernel.org/linux-arm-kernel/20190930133510.GA1904140@ulmo/ Thanks for the reply and link!
[PATCH 1/2] dma-mapping: Add dma_addr_is_phys_addr()
From: Thiago Jung Bauermann In order to safely use the DMA API, virtio needs to know whether DMA addresses are in fact physical addresses and for that purpose, dma_addr_is_phys_addr() is introduced. cc: Benjamin Herrenschmidt cc: David Gibson cc: Michael Ellerman cc: Paul Mackerras cc: Michael Roth cc: Alexey Kardashevskiy cc: Paul Burton cc: Robin Murphy cc: Bartlomiej Zolnierkiewicz cc: Marek Szyprowski cc: Christoph Hellwig Suggested-by: Michael S. Tsirkin Signed-off-by: Ram Pai Signed-off-by: Thiago Jung Bauermann --- arch/powerpc/include/asm/dma-mapping.h | 21 + arch/powerpc/platforms/pseries/Kconfig | 1 + include/linux/dma-mapping.h| 20 kernel/dma/Kconfig | 3 +++ 4 files changed, 45 insertions(+) diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h index 565d6f7..f92c0a4b 100644 --- a/arch/powerpc/include/asm/dma-mapping.h +++ b/arch/powerpc/include/asm/dma-mapping.h @@ -5,6 +5,8 @@ #ifndef _ASM_DMA_MAPPING_H #define _ASM_DMA_MAPPING_H +#include + static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus) { /* We don't handle the NULL dev case for ISA for now. We could @@ -15,4 +17,23 @@ static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus) return NULL; } +#ifdef CONFIG_ARCH_HAS_DMA_ADDR_IS_PHYS_ADDR +/** + * dma_addr_is_phys_addr - check whether a device DMA address is a physical + * address + * @dev: device to check + * + * Returns %true if any DMA address for this device happens to also be a valid + * physical address (not necessarily of the same page). + */ +static inline bool dma_addr_is_phys_addr(struct device *dev) +{ + /* +* Secure guests always use the SWIOTLB, therefore DMA addresses are +* actually the physical address of the bounce buffer. +*/ + return is_secure_guest(); +} +#endif + #endif /* _ASM_DMA_MAPPING_H */ diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 9e35cdd..0108150 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -152,6 +152,7 @@ config PPC_SVM select SWIOTLB select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED + select ARCH_HAS_DMA_ADDR_IS_PHYS_ADDR help There are certain POWER platforms which support secure guests using the Protected Execution Facility, with the help of an Ultravisor diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index f7d1eea..6df5664 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -693,6 +693,26 @@ static inline bool dma_addressing_limited(struct device *dev) dma_get_required_mask(dev); } +#ifndef CONFIG_ARCH_HAS_DMA_ADDR_IS_PHYS_ADDR +/** + * dma_addr_is_phys_addr - check whether a device DMA address is a physical + * address + * @dev: device to check + * + * Returns %true if any DMA address for this device happens to also be a valid + * physical address (not necessarily of the same page). + */ +static inline bool dma_addr_is_phys_addr(struct device *dev) +{ + /* +* Except in very specific setups, DMA addresses exist in a different +* address space from CPU physical addresses and cannot be directly used +* to reference system memory. +*/ + return false; +} +#endif + #ifdef CONFIG_ARCH_HAS_SETUP_DMA_OPS void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size, const struct iommu_ops *iommu, bool coherent); diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig index 9decbba..6209b46 100644 --- a/kernel/dma/Kconfig +++ b/kernel/dma/Kconfig @@ -51,6 +51,9 @@ config ARCH_HAS_DMA_MMAP_PGPROT config ARCH_HAS_FORCE_DMA_UNENCRYPTED bool +config ARCH_HAS_DMA_ADDR_IS_PHYS_ADDR + bool + config DMA_NONCOHERENT_CACHE_SYNC bool -- 1.8.3.1
[PATCH 2/2] virtio_ring: Use DMA API if memory is encrypted
From: Thiago Jung Bauermann Normally, virtio enables DMA API with VIRTIO_F_IOMMU_PLATFORM, which must be set by both device and guest driver. However, as a hack, when DMA API returns physical addresses, guest driver can use the DMA API; even though device does not set VIRTIO_F_IOMMU_PLATFORM and just uses physical addresses. Doing this works-around POWER secure guests for which only the bounce buffer is accessible to the device, but which don't set VIRTIO_F_IOMMU_PLATFORM due to a set of hypervisor and architectural bugs. To guard against platform changes, breaking any of these assumptions down the road, we check at probe time and fail if that's not the case. cc: Benjamin Herrenschmidt cc: David Gibson cc: Michael Ellerman cc: Paul Mackerras cc: Michael Roth cc: Alexey Kardashevskiy cc: Jason Wang cc: Christoph Hellwig Suggested-by: Michael S. Tsirkin Signed-off-by: Ram Pai Signed-off-by: Thiago Jung Bauermann --- drivers/virtio/virtio.c | 18 ++ drivers/virtio/virtio_ring.c | 8 include/linux/virtio_config.h | 14 ++ 3 files changed, 40 insertions(+) diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c index a977e32..77a3baf 100644 --- a/drivers/virtio/virtio.c +++ b/drivers/virtio/virtio.c @@ -4,6 +4,7 @@ #include #include #include +#include #include /* Unique numbering for virtio devices. */ @@ -245,6 +246,23 @@ static int virtio_dev_probe(struct device *_d) if (err) goto err; + /* +* If memory is encrypted, but VIRTIO_F_IOMMU_PLATFORM is not set, then +* the device is broken: DMA API is required for these platforms, but +* the only way using the DMA API is going to work at all is if the +* device is ready for it. So we need a flag on the virtio device, +* exposed by the hypervisor (or hardware for hw virtio devices) that +* says: hey, I'm real, don't take a shortcut. +* +* There's one exception where guest can make things work, and that is +* when DMA API is guaranteed to always return physical addresses. +*/ + if (mem_encrypt_active() && !virtio_can_use_dma_api(dev)) { + dev_err(_d, "virtio: device unable to access encrypted memory\n"); + err = -EINVAL; + goto err; + } + err = drv->probe(dev); if (err) goto err; diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index c8be1c4..9c56b61 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -255,6 +255,14 @@ static bool vring_use_dma_api(struct virtio_device *vdev) if (xen_domain()) return true; + /* +* Also, if guest memory is encrypted the host can't access it +* directly. We need to either use an IOMMU or do bounce buffering. +* Both are done via the DMA API. +*/ + if (mem_encrypt_active() && virtio_can_use_dma_api(vdev)) + return true; + return false; } diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index bb4cc49..57bc25c 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -4,6 +4,7 @@ #include #include +#include #include #include #include @@ -174,6 +175,19 @@ static inline bool virtio_has_iommu_quirk(const struct virtio_device *vdev) return !virtio_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); } +/** + * virtio_can_use_dma_api - determine whether the DMA API can be used + * @vdev: the device + * + * The DMA API can be used either when the device doesn't have the IOMMU quirk, + * or when the DMA API is guaranteed to always return physical addresses. + */ +static inline bool virtio_can_use_dma_api(const struct virtio_device *vdev) +{ + return !virtio_has_iommu_quirk(vdev) || + dma_addr_is_phys_addr(vdev->dev.parent); +} + static inline struct virtqueue *virtio_find_single_vq(struct virtio_device *vdev, vq_callback_t *c, const char *n) -- 1.8.3.1
[PATCH 0/2] virtio: Support encrypted memory on powerpc secure guests
**We would like the patches to be merged through the virtio tree. Please review, and ack merging the DMA mapping change through that tree. Thanks!** The memory of powerpc secure guests can't be accessed by the hypervisor / virtio device except for a few memory regions designated as 'shared'. At the moment, Linux uses bounce-buffering to communicate with the hypervisor, with a bounce buffer marked as shared. This is how the DMA API is implemented on this platform. In particular, the most convenient way to use virtio on this platform is by making virtio use the DMA API: in fact, this is exactly what happens if the virtio device exposes the flag VIRTIO_F_ACCESS_PLATFORM. However, bugs in the hypervisor on the powerpc platform do not allow setting this flag, with some hypervisors already in the field that don't set this flag. At the moment they are forced to use emulated devices when guest is in secure mode; virtio is only useful when guest is not secure. Normally, both device and driver must support VIRTIO_F_ACCESS_PLATFORM: if one of them doesn't, the other mustn't assume it for communication to work. However, a guest-side work-around is possible to enable virtio for these hypervisors with guest in secure mode: it so happens that on powerpc secure platform the DMA address is actually a physical address - that of the bounce buffer. For these platforms we can make the virtio driver go through the DMA API even though the device itself ignores the DMA API. These patches implement this work around for virtio: we detect that - secure guest mode is enabled - so we know that since we don't share most memory and Hypervisor has not enabled VIRTIO_F_ACCESS_PLATFORM, regular virtio code won't work. - DMA API is giving us addresses that are actually also physical addresses. - Hypervisor has not enabled VIRTIO_F_ACCESS_PLATFORM. and if all conditions are true, we force all data through the bounce buffer. To put it another way, from hypervisor's point of view DMA API is not required: hypervisor would be happy to get access to all of guest memory. That's why it does not set VIRTIO_F_ACCESS_PLATFORM. However, guest decides that it does not trust the hypervisor and wants to force a bounce buffer for its own reasons. Thiago Jung Bauermann (2): dma-mapping: Add dma_addr_is_phys_addr() virtio_ring: Use DMA API if memory is encrypted arch/powerpc/include/asm/dma-mapping.h | 21 + arch/powerpc/platforms/pseries/Kconfig | 1 + drivers/virtio/virtio.c| 18 ++ drivers/virtio/virtio_ring.c | 8 include/linux/dma-mapping.h| 20 include/linux/virtio_config.h | 14 ++ kernel/dma/Kconfig | 3 +++ 7 files changed, 85 insertions(+) -- 1.8.3.1
Re: [PATCH 0/2] virtio: Support encrypted memory on powerpc secure guests
Hmm.. git-send-email forgot to CC Michael Tsirkin, and Thiago; the original author, who is on vacation. Adding them now. RP On Fri, Oct 11, 2019 at 06:25:17PM -0700, Ram Pai wrote: > **We would like the patches to be merged through the virtio tree. Please > review, and ack merging the DMA mapping change through that tree. Thanks!** > > The memory of powerpc secure guests can't be accessed by the hypervisor / > virtio device except for a few memory regions designated as 'shared'. > > At the moment, Linux uses bounce-buffering to communicate with the > hypervisor, with a bounce buffer marked as shared. This is how the DMA API > is implemented on this platform. > > In particular, the most convenient way to use virtio on this platform is by > making virtio use the DMA API: in fact, this is exactly what happens if the > virtio device exposes the flag VIRTIO_F_ACCESS_PLATFORM. However, bugs in > the > hypervisor on the powerpc platform do not allow setting this flag, with some > hypervisors already in the field that don't set this flag. At the moment they > are forced to use emulated devices when guest is in secure mode; virtio is > only useful when guest is not secure. > > Normally, both device and driver must support VIRTIO_F_ACCESS_PLATFORM: > if one of them doesn't, the other mustn't assume it for communication > to work. > > However, a guest-side work-around is possible to enable virtio > for these hypervisors with guest in secure mode: it so happens that on > powerpc secure platform the DMA address is actually a physical address - > that of the bounce buffer. For these platforms we can make the virtio > driver go through the DMA API even though the device itself ignores > the DMA API. > > These patches implement this work around for virtio: we detect that > - secure guest mode is enabled - so we know that since we don't share >most memory and Hypervisor has not enabled VIRTIO_F_ACCESS_PLATFORM, >regular virtio code won't work. > - DMA API is giving us addresses that are actually also physical >addresses. > - Hypervisor has not enabled VIRTIO_F_ACCESS_PLATFORM. > > and if all conditions are true, we force all data through the bounce > buffer. > > To put it another way, from hypervisor's point of view DMA API is > not required: hypervisor would be happy to get access to all of guest > memory. That's why it does not set VIRTIO_F_ACCESS_PLATFORM. However, > guest decides that it does not trust the hypervisor and wants to force > a bounce buffer for its own reasons. > > > Thiago Jung Bauermann (2): > dma-mapping: Add dma_addr_is_phys_addr() > virtio_ring: Use DMA API if memory is encrypted > > arch/powerpc/include/asm/dma-mapping.h | 21 + > arch/powerpc/platforms/pseries/Kconfig | 1 + > drivers/virtio/virtio.c| 18 ++ > drivers/virtio/virtio_ring.c | 8 > include/linux/dma-mapping.h| 20 > include/linux/virtio_config.h | 14 ++ > kernel/dma/Kconfig | 3 +++ > 7 files changed, 85 insertions(+) > > -- > 1.8.3.1 -- Ram Pai
Re: [PATCH v2 3/4] iommu/mediatek: Use writel for TLB range invalidation
On Fri, 2019-10-11 at 17:29 +0100, Will Deacon wrote: > On Wed, Oct 09, 2019 at 09:19:02PM +0800, Yong Wu wrote: > > Use writel for the register F_MMU_INV_RANGE which is for triggering the > > HW work. We expect all the setting(iova_start/iova_end...) have already > > been finished before F_MMU_INV_RANGE. > > > > Signed-off-by: Anan.Sun > > Signed-off-by: Yong Wu > > --- > > This is a improvement rather than fixing a issue. > > --- > > drivers/iommu/mtk_iommu.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c > > index 24a13a6..607f92c 100644 > > --- a/drivers/iommu/mtk_iommu.c > > +++ b/drivers/iommu/mtk_iommu.c > > @@ -187,8 +187,7 @@ static void mtk_iommu_tlb_add_flush(unsigned long iova, > > size_t size, > > writel_relaxed(iova, data->base + REG_MMU_INVLD_START_A); > > writel_relaxed(iova + size - 1, > >data->base + REG_MMU_INVLD_END_A); > > - writel_relaxed(F_MMU_INV_RANGE, > > - data->base + REG_MMU_INVALIDATE); > > + writel(F_MMU_INV_RANGE, data->base + REG_MMU_INVALIDATE); > > I don't understand this change. > > Why is it an "improvement" and which accesses are you ordering with the > writel? The register(F_MMU_INV_RANGE) will trigger HW to begin flush range. HW expect the other register iova_start/end/flush_type always is ready before trigger. thus I'd like use writel to guarantee the previous register has been finished. I didn't see the writel_relaxed cause some error in practice, we only think writel is necessary here in theory. so call it "improvement". > > Will