Re: [RFC PATCH V4 1/1] swiotlb: Split up single swiotlb lock

2022-06-22 Thread Dongli Zhang
I will build the next RFC version of the 64-bit swiotlb on top of this patch
(or the next version of this patch), so that it will render a more finalized
view of 32-bit/64-bit plus multiple areas.

Thank you very much!

Dongli Zhang

On 6/22/22 3:54 AM, Christoph Hellwig wrote:
> Thanks,
> 
> this looks pretty good to me.  A few comments below:
> 
> On Fri, Jun 17, 2022 at 10:47:41AM -0400, Tianyu Lan wrote:
>> +/**
>> + * struct io_tlb_area - IO TLB memory area descriptor
>> + *
>> + * This is a single area with a single lock.
>> + *
>> + * @used:   The number of used IO TLB blocks.
>> + * @index:  The slot index to start searching in this area for next round.
>> + * @lock:   The lock to protect the above data structures in the map and
>> + *  unmap calls.
>> + */
>> +struct io_tlb_area {
>> +unsigned long used;
>> +unsigned int index;
>> +spinlock_t lock;
>> +};
> 
> This can go into swiotlb.c.
> 
>> +void __init swiotlb_adjust_nareas(unsigned int nareas);
> 
> And this should be marked static.
> 
>> +#define DEFAULT_NUM_AREAS 1
> 
> I'd drop this define; the magic 1 and a > 1 comparison convey how it is
> used much better, as the checks aren't about default or not, but about
> being larger than one.
> 
> I also think that we want some good way to size the default, e.g.
> by number of CPUs or memory size.
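
A minimal sketch of that sizing idea (hypothetical, not part of this patch;
the helper name and exact policy are illustrative):

static unsigned int __init swiotlb_nareas_default(void)
{
        /* one area per possible CPU, rounded up to a power of two */
        return roundup_pow_of_two(num_possible_cpus());
}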
> 
>> +void __init swiotlb_adjust_nareas(unsigned int nareas)
>> +{
>> +if (!is_power_of_2(nareas)) {
>> +pr_err("swiotlb: Invalid areas parameter %d.\n", nareas);
>> +return;
>> +}
>> +
>> +default_nareas = nareas;
>> +
>> +pr_info("area num %d.\n", nareas);
>> +/* Round up number of slabs to the next power of 2.
>> + * The last area is going be smaller than the rest if
>> + * default_nslabs is not power of two.
>> + */
> 
> Please follow the normal kernel comment style with a /* on its own line.
> 


Re: [PATCH v1 4/4] swiotlb: panic if nslabs is too small

2022-06-13 Thread Dongli Zhang



On 6/11/22 1:25 AM, Dongli Zhang wrote:
> Panic on purpose if nslabs is too small, in order to sync with the remap
> retry logic.
> 
> In addition, print the number of bytes for tlb alloc failure.
> 
> Signed-off-by: Dongli Zhang 
> ---
>  kernel/dma/swiotlb.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index fd21f4162f4b..1758b724c7a8 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -242,6 +242,9 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>   if (swiotlb_force_disable)
>   return;
>  
> + if (nslabs < IO_TLB_MIN_SLABS)
> + panic("%s: nslabs = %lu too small\n", __func__, nslabs);
> +
>   /*
>* By default allocate the bounce buffer memory from low memory, but
>* allow to pick a location everywhere for hypervisors with guest
> @@ -254,7 +257,8 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
>   else
>   tlb = memblock_alloc_low(bytes, PAGE_SIZE);
>   if (!tlb) {
> - pr_warn("%s: failed to allocate tlb structure\n", __func__);
> + pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
> + __func__, bytes);

Indeed I have a question about this pr_warn(). (I was going to send another
patch to retry with nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE).)

Why not retry with nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE), or panic here?

If the QEMU machine type of my VM is i440fx, the boot almost always fails
even though this is only a pr_warn(). Why not sync with the remap failure
handling, as sketched below:

1. retry with nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE), and
2. finally panic if nslabs is too small.
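
A hedged sketch of that fallback (it mirrors the remap retry logic already
in swiotlb_init_remap(); illustrative only, not the code as merged):

while (!(tlb = memblock_alloc_low(bytes, PAGE_SIZE))) {
        nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
        if (nslabs < IO_TLB_MIN_SLABS)
                panic("%s: Failed to allocate %zu bytes tlb structure\n",
                      __func__, bytes);
        /* recompute the allocation size before retrying */
        bytes = nslabs << IO_TLB_SHIFT;
}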

Thank you very much!

Dongli Zhang

>   return;
>   }
>  
> 


[PATCH v1 4/4] swiotlb: panic if nslabs is too small

2022-06-11 Thread Dongli Zhang
Panic on purpose if nslabs is too small, in order to sync with the remap
retry logic.

In addition, print the number of bytes for tlb alloc failure.

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index fd21f4162f4b..1758b724c7a8 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -242,6 +242,9 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
if (swiotlb_force_disable)
return;
 
+   if (nslabs < IO_TLB_MIN_SLABS)
+   panic("%s: nslabs = %lu too small\n", __func__, nslabs);
+
/*
 * By default allocate the bounce buffer memory from low memory, but
 * allow to pick a location everywhere for hypervisors with guest
@@ -254,7 +257,8 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
else
tlb = memblock_alloc_low(bytes, PAGE_SIZE);
if (!tlb) {
-   pr_warn("%s: failed to allocate tlb structure\n", __func__);
+   pr_warn("%s: Failed to allocate %zu bytes tlb structure\n",
+   __func__, bytes);
return;
}
 
-- 
2.17.1



[PATCH v1 3/4] x86/swiotlb: fix param usage in boot-options.rst

2022-06-11 Thread Dongli Zhang
Fix the usage of swiotlb param in kernel doc.

Signed-off-by: Dongli Zhang 
---
 Documentation/x86/x86_64/boot-options.rst | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
index 03ec9cf01181..cbd14124a667 100644
--- a/Documentation/x86/x86_64/boot-options.rst
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -287,11 +287,13 @@ iommu options only relevant to the AMD GART hardware IOMMU:
 iommu options only relevant to the software bounce buffering (SWIOTLB) IOMMU
 implementation:
 
-swiotlb=<int>[,force]
-  <int>
-Prereserve that many 128K pages for the software IO bounce buffering.
+swiotlb=<int>[,force,noforce]
+  <int>
+Prereserve that many 2K slots for the software IO bounce buffering.
   force
 Force all IO through the software TLB.
+  noforce
+Do not initialize the software TLB.
 
 
 Miscellaneous
-- 
2.17.1



[PATCH v1 0/4] swiotlb: some cleanup

2022-06-11 Thread Dongli Zhang
Hello,

The patch 1-2 are to cleanup unused variable and return type.

The patch 3 is to fix the param usage description in kernel doc.

The patch 4 is to panic on purpose when nslabs is too small.


Dongli Zhang:
  [PATCH v1 1/4] swiotlb: remove unused swiotlb_force
  [PATCH v1 2/4] swiotlb: remove useless return
  [PATCH v1 3/4] x86/swiotlb: fix param usage in boot-options.rst
  [PATCH v1 4/4] swiotlb: panic if nslabs is too small


Thank you very much!

Dongli Zhang




[PATCH v1 2/4] swiotlb: remove useless return

2022-06-11 Thread Dongli Zhang
Both swiotlb_init_remap() and swiotlb_init() have return type void.

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index cb50f8d38360..fd21f4162f4b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -282,7 +282,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
 
 void __init swiotlb_init(bool addressing_limit, unsigned int flags)
 {
-   return swiotlb_init_remap(addressing_limit, flags, NULL);
+   swiotlb_init_remap(addressing_limit, flags, NULL);
 }
 
 /*
-- 
2.17.1



[PATCH v1 1/4] swiotlb: remove unused swiotlb_force

2022-06-11 Thread Dongli Zhang
'swiotlb_force' has been removed since commit c6af2aa9ffc9 ("swiotlb: make
the swiotlb_init interface more useful"), so drop the stale extern declaration.

Signed-off-by: Dongli Zhang 
---
 include/linux/swiotlb.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7ed35dd3de6e..bdc58a0e20b1 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -60,7 +60,6 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
size_t size, enum dma_data_direction dir, unsigned long attrs);
 
 #ifdef CONFIG_SWIOTLB
-extern enum swiotlb_force swiotlb_force;
 
 /**
  * struct io_tlb_mem - IO TLB Memory Pool Descriptor
-- 
2.17.1



Re: [PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

2022-06-10 Thread Dongli Zhang
Hi Christoph,

On 6/8/22 10:05 PM, Christoph Hellwig wrote:
> All this really needs to be hidden under the hood.
> 

Since this patch file has 200+ lines, would you please clarify what 'this'
indicates?

The idea of this patch:

1. Convert the functions that initialize the swiotlb into callees. Each
callee accepts 'true' or 'false' as an argument to indicate whether it is for
the default or the new swiotlb buffer, e.g., swiotlb_init_remap() becomes
__swiotlb_init_remap().

2. The caller side decides whether to call the callee to create the extra
buffer, as in the sketch below.
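
A rough sketch of that split (hedged: signatures are simplified, and
__swiotlb_init_remap() is the hypothetical callee):

static void __init __swiotlb_init_remap(bool high, unsigned int flags,
        int (*remap)(void *tlb, unsigned long nslabs, bool high))
{
        /* allocate/initialize either the default or the extra buffer */
}

void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
        int (*remap)(void *tlb, unsigned long nslabs, bool high))
{
        __swiotlb_init_remap(false, flags, remap);
        if (high_nslabs)
                __swiotlb_init_remap(true, flags | SWIOTLB_ANY, remap);
}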

Do you mean the callee is still *too high level* and all of the
decision/allocation/initialization should be done at *lower level functions*?

E.g., if I rework "struct io_tlb_mem" to make the 64-bit buffer the 2nd array
in io_tlb_mem->slots[32_or_64][index] (sketched below), the user will only
see the default 'io_tlb_default_mem', but will not be able to tell whether an
allocation came from the 32-bit or the 64-bit buffer.
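
A purely illustrative sketch of that alternative layout (the enum and field
names are hypothetical):

enum { SWIOTLB_32BIT, SWIOTLB_64BIT, SWIOTLB_NR };

struct io_tlb_mem {
        unsigned long nslabs[SWIOTLB_NR];
        struct io_tlb_slot *slots[SWIOTLB_NR];  /* slots[32_or_64][index] */
};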

Thank you very much for the feedback.

Dongli Zhang


[PATCH RFC v1 0/7] swiotlb: extra 64-bit buffer for dev->dma_io_tlb_mem

2022-06-08 Thread Dongli Zhang
Hello,

I previously sent out a patchset on a 64-bit buffer and people thought it was
the same as Restricted DMA. However, the 64-bit buffer is still not supported.

https://lore.kernel.org/all/20210203233709.19819-1-dongli.zh...@oracle.com/

This RFC is to introduce the extra swiotlb buffer with SWIOTLB_ANY flag,
to support 64-bit swiotlb.

The core ideas are:

1. Create an extra io_tlb_mem with SWIOTLB_ANY flags.

2. The dev->dma_io_tlb_mem is set to either default or the extra io_tlb_mem,
   depending on dma mask.


Would you please help suggest for below questions in the RFC?

- Is it fine to create the extra io_tlb_mem?

- Which one is better: to create a separate variable for the extra
  io_tlb_mem, or to make it an array of two io_tlb_mem?

- Should I set dev->dma_io_tlb_mem in each driver (e.g., the virtio driver
  as in this patchset) based on the value of
  min_not_zero(*dev->dma_mask, dev->bus_dma_limit), or at a higher level
  (e.g., after the PCI driver), as in the sketch below?
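
For the last question, a hedged sketch of doing it at a higher level (the
helper is hypothetical; swiotlb_use_high() is from patch 5 of this series):

static void pci_dev_set_io_tlb_mem(struct pci_dev *pdev)
{
        struct device *dev = &pdev->dev;
        u64 mask = min_not_zero(*dev->dma_mask, dev->bus_dma_limit);

        /* a fully 64-bit capable device may bounce via the high buffer */
        if (mask == DMA_BIT_MASK(64))
                swiotlb_use_high(dev);
}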


This patchset is to demonstrate that the idea works. Since this is just an
RFC, I have only tested virtio-blk on qemu-7.0 by enforcing swiotlb. It has
not been tested in an AMD SEV environment.

qemu-system-x86_64 -cpu host -name debug-threads=on \
-smp 8 -m 16G -machine q35,accel=kvm -vnc :5 -hda boot.img \
-kernel mainline-linux/arch/x86_64/boot/bzImage \
-append "root=/dev/sda1 init=/sbin/init text console=ttyS0 loglevel=7 
swiotlb=327680,3145728,force" \
-device 
virtio-blk-pci,id=vblk0,num-queues=8,drive=drive0,disable-legacy=on,iommu_platform=true
 \
-drive file=test.raw,if=none,id=drive0,cache=none \
-net nic -net user,hostfwd=tcp::5025-:22 -serial stdio


The kernel command line "swiotlb=327680,3145728,force" allocates 6GB for
the extra swiotlb.
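
For reference, each swiotlb slot is 1 << IO_TLB_SHIFT = 2KB, so the two
values work out as:

  327680 slots  * 2KB = 640MB  (default swiotlb)
  3145728 slots * 2KB = 6144MB (extra high swiotlb)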

[2.826676] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[2.826693] software IO TLB: default mapped [mem 0x37000000-0x5f000000] (640MB)
[2.826697] software IO TLB: high mapped [mem 0x2edc80000-0x46dc80000] (6144MB)

The highmem swiotlb is being used by virtio-blk.

$ cat /sys/kernel/debug/swiotlb/swiotlb-hi/io_tlb_nslabs 
3145728
$ cat /sys/kernel/debug/swiotlb/swiotlb-hi/io_tlb_used 
8960


Dongli Zhang (7):
  swiotlb: introduce the highmem swiotlb buffer
  swiotlb: change the signature of remap function
  swiotlb-xen: support highmem for xen specific code
  swiotlb: to implement io_tlb_high_mem
  swiotlb: add interface to set dev->dma_io_tlb_mem
  virtio: use io_tlb_high_mem if it is active
  swiotlb: fix the slot_addr() overflow

arch/powerpc/kernel/dma-swiotlb.c  |   8 +-
arch/x86/include/asm/xen/swiotlb-xen.h |   2 +-
arch/x86/kernel/pci-dma.c  |   5 +-
drivers/virtio/virtio.c|   8 ++
drivers/xen/swiotlb-xen.c  |  16 +++-
include/linux/swiotlb.h|  14 ++-
kernel/dma/swiotlb.c   | 136 +---
7 files changed, 145 insertions(+), 44 deletions(-)

Thank you very much for feedback and suggestion!

Dongli Zhang




[PATCH RFC v1 3/7] swiotlb-xen: support highmem for xen specific code

2022-06-08 Thread Dongli Zhang
While most of the time swiotlb-xen relies on the generic swiotlb API to
initialize and use the swiotlb, this patch supports the highmem swiotlb
in swiotlb-xen specific code.

E.g., the xen_swiotlb_fixup() may request the hypervisor to provide
64-bit memory pages as swiotlb buffer.

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 drivers/xen/swiotlb-xen.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 339f46e21053..d15321e9f9db 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -38,7 +38,8 @@
 #include 
 
 #include 
-#define MAX_DMA_BITS 32
+#define MAX_DMA32_BITS 32
+#define MAX_DMA64_BITS 64
 
 /*
  * Quick lookup value of the bus address of the IOTLB.
@@ -109,19 +110,25 @@ int xen_swiotlb_fixup(void *buf, unsigned long nslabs, bool high)
int rc;
unsigned int order = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT);
unsigned int i, dma_bits = order + PAGE_SHIFT;
+   unsigned int max_dma_bits = MAX_DMA32_BITS;
dma_addr_t dma_handle;
phys_addr_t p = virt_to_phys(buf);
 
BUILD_BUG_ON(IO_TLB_SEGSIZE & (IO_TLB_SEGSIZE - 1));
BUG_ON(nslabs % IO_TLB_SEGSIZE);
 
+   if (high) {
+   dma_bits = MAX_DMA64_BITS;
+   max_dma_bits = MAX_DMA64_BITS;
+   }
+
i = 0;
do {
do {
rc = xen_create_contiguous_region(
p + (i << IO_TLB_SHIFT), order,
dma_bits, &dma_handle);
-   } while (rc && dma_bits++ < MAX_DMA_BITS);
+   } while (rc && dma_bits++ < max_dma_bits);
if (rc)
return rc;
 
@@ -381,7 +388,8 @@ xen_swiotlb_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
 static int
 xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
-   return xen_phys_to_dma(hwdev, io_tlb_default_mem.end - 1) <= mask;
+   return xen_phys_to_dma(hwdev, io_tlb_default_mem.end - 1) <= mask ||
+  xen_phys_to_dma(hwdev, io_tlb_high_mem.end - 1) <= mask;
 }
 
 const struct dma_map_ops xen_swiotlb_dma_ops = {
-- 
2.17.1



[PATCH RFC v1 7/7] swiotlb: fix the slot_addr() overflow

2022-06-08 Thread Dongli Zhang
Since the type of the swiotlb slot index is a signed integer, the
"((idx) << IO_TLB_SHIFT)" returns an incorrect value. As a result,
slot_addr() returns a value which is smaller than the expected one.

E.g., the 'tlb_addr' generated in swiotlb_tbl_map_single() may be smaller
than expected. As a result, swiotlb_bounce() will access a wrong swiotlb
slot.
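
A worked example of the overflow (assuming IO_TLB_SHIFT == 11, i.e. 2KB
slots; the index value is illustrative):

/*
 * With a 6GB pool there are 3145728 slots, so idx can exceed 2^21:
 *
 *   int idx = 0x200000;                 // slot 2097152
 *   idx << IO_TLB_SHIFT                 // 2^32: overflows a 32-bit int
 *   (unsigned long)idx << IO_TLB_SHIFT  // 0x100000000: as intended
 */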

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 0dcdd25ea95d..c64e557de55c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -531,7 +531,8 @@ static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size
}
 }
 
-#define slot_addr(start, idx)  ((start) + ((idx) << IO_TLB_SHIFT))
+#define slot_addr(start, idx)  ((start) + \
+   (((unsigned long)(idx)) << IO_TLB_SHIFT))
 
 /*
  * Carefully handle integer overflow which can occur when boundary_mask == 
~0UL.
-- 
2.17.1



[PATCH RFC v1 1/7] swiotlb: introduce the highmem swiotlb buffer

2022-06-08 Thread Dongli Zhang
Currently, the virtio driver is not able to use 4+ GB memory when the
swiotlb is enforced, e.g., when AMD SEV is involved.

Fortunately, the SWIOTLB_ANY flag has been introduced since
commit 8ba2ed1be90f ("swiotlb: add a SWIOTLB_ANY flag to lift the low
memory restriction") to allocate swiotlb buffer from high memory.

While the default swiotlb is 'io_tlb_default_mem', the extra
'io_tlb_high_mem' is introduced and will be allocated with the SWIOTLB_ANY
flag in future patches. E.g., the user may configure the extra highmem
swiotlb buffer via "swiotlb=327680,4194304" to allocate 8GB of memory.

In the future, the driver will be able to decide whether to use
'io_tlb_default_mem' or 'io_tlb_high_mem'.

The highmem swiotlb is enabled by the user if io_tlb_high_mem is set. It can
be actively used if swiotlb_high_active() returns true.

The kernel command line "swiotlb=32768,3145728,force" is to allocate 64MB
for default swiotlb, and 6GB for the extra highmem swiotlb.

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 include/linux/swiotlb.h |  2 ++
 kernel/dma/swiotlb.c| 16 
 2 files changed, 18 insertions(+)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 7ed35dd3de6e..e67e605af2dd 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -109,6 +109,7 @@ struct io_tlb_mem {
} *slots;
 };
 extern struct io_tlb_mem io_tlb_default_mem;
+extern struct io_tlb_mem io_tlb_high_mem;
 
 static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
@@ -164,6 +165,7 @@ static inline void swiotlb_adjust_size(unsigned long size)
 }
 #endif /* CONFIG_SWIOTLB */
 
+extern bool swiotlb_high_active(void);
 extern void swiotlb_print_info(void);
 
 #ifdef CONFIG_DMA_RESTRICTED_POOL
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index cb50f8d38360..569bc30e7b7a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -66,10 +66,12 @@ static bool swiotlb_force_bounce;
 static bool swiotlb_force_disable;
 
 struct io_tlb_mem io_tlb_default_mem;
+struct io_tlb_mem io_tlb_high_mem;
 
 phys_addr_t swiotlb_unencrypted_base;
 
 static unsigned long default_nslabs = IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
+static unsigned long high_nslabs;
 
 static int __init
 setup_io_tlb_npages(char *str)
@@ -81,6 +83,15 @@ setup_io_tlb_npages(char *str)
}
if (*str == ',')
++str;
+
+   if (isdigit(*str)) {
+   /* avoid tail segment of size < IO_TLB_SEGSIZE */
+   high_nslabs =
+   ALIGN(simple_strtoul(str, &str, 0), IO_TLB_SEGSIZE);
+   }
+   if (*str == ',')
+   ++str;
+
if (!strcmp(str, "force"))
swiotlb_force_bounce = true;
else if (!strcmp(str, "noforce"))
@@ -90,6 +101,11 @@ setup_io_tlb_npages(char *str)
 }
 early_param("swiotlb", setup_io_tlb_npages);
 
+bool swiotlb_high_active(void)
+{
+   return high_nslabs && io_tlb_high_mem.nslabs;
+}
+
 unsigned int swiotlb_max_segment(void)
 {
if (!io_tlb_default_mem.nslabs)
-- 
2.17.1



[PATCH RFC v1 2/7] swiotlb: change the signature of remap function

2022-06-08 Thread Dongli Zhang
Add a new argument 'high' to the remap function, so that it will be able to
remap the swiotlb buffer based on whether the swiotlb is 32-bit or 64-bit.

Currently the only remap function is xen_swiotlb_fixup().

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 arch/x86/include/asm/xen/swiotlb-xen.h | 2 +-
 drivers/xen/swiotlb-xen.c  | 2 +-
 include/linux/swiotlb.h| 4 ++--
 kernel/dma/swiotlb.c   | 8 
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/xen/swiotlb-xen.h b/arch/x86/include/asm/xen/swiotlb-xen.h
index 77a2d19cc990..a54eae15605e 100644
--- a/arch/x86/include/asm/xen/swiotlb-xen.h
+++ b/arch/x86/include/asm/xen/swiotlb-xen.h
@@ -8,7 +8,7 @@ extern int pci_xen_swiotlb_init_late(void);
 static inline int pci_xen_swiotlb_init_late(void) { return -ENXIO; }
 #endif
 
-int xen_swiotlb_fixup(void *buf, unsigned long nslabs);
+int xen_swiotlb_fixup(void *buf, unsigned long nslabs, bool high);
 int xen_create_contiguous_region(phys_addr_t pstart, unsigned int order,
unsigned int address_bits,
dma_addr_t *dma_handle);
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 67aa74d20162..339f46e21053 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -104,7 +104,7 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
dma_addr_t dma_addr)
 }
 
 #ifdef CONFIG_X86
-int xen_swiotlb_fixup(void *buf, unsigned long nslabs)
+int xen_swiotlb_fixup(void *buf, unsigned long nslabs, bool high)
 {
int rc;
unsigned int order = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e67e605af2dd..e61c074c55eb 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -36,9 +36,9 @@ struct scatterlist;
 
 unsigned long swiotlb_size_or_default(void);
 void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
-   int (*remap)(void *tlb, unsigned long nslabs));
+   int (*remap)(void *tlb, unsigned long nslabs, bool high));
 int swiotlb_init_late(size_t size, gfp_t gfp_mask,
-   int (*remap)(void *tlb, unsigned long nslabs));
+   int (*remap)(void *tlb, unsigned long nslabs, bool high));
 extern void __init swiotlb_update_mem_attributes(void);
 
 phys_addr_t swiotlb_tbl_map_single(struct device *hwdev, phys_addr_t phys,
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 569bc30e7b7a..793ca7f9 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -245,7 +245,7 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
  * structures for the software IO TLB used to implement the DMA API.
  */
 void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
-   int (*remap)(void *tlb, unsigned long nslabs))
+   int (*remap)(void *tlb, unsigned long nslabs, bool high))
 {
struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs = default_nslabs;
@@ -274,7 +274,7 @@ void __init swiotlb_init_remap(bool addressing_limit, unsigned int flags,
return;
}
 
-   if (remap && remap(tlb, nslabs) < 0) {
+   if (remap && remap(tlb, nslabs, false) < 0) {
memblock_free(tlb, PAGE_ALIGN(bytes));
 
nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
@@ -307,7 +307,7 @@ void __init swiotlb_init(bool addressing_limit, unsigned int flags)
  * This should be just like above, but with some error catching.
  */
 int swiotlb_init_late(size_t size, gfp_t gfp_mask,
-   int (*remap)(void *tlb, unsigned long nslabs))
+   int (*remap)(void *tlb, unsigned long nslabs, bool high))
 {
struct io_tlb_mem *mem = &io_tlb_default_mem;
unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
@@ -337,7 +337,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
return -ENOMEM;
 
if (remap)
-   rc = remap(vstart, nslabs);
+   rc = remap(vstart, nslabs, false);
if (rc) {
free_pages((unsigned long)vstart, order);
 
-- 
2.17.1



[PATCH RFC v1 4/7] swiotlb: to implement io_tlb_high_mem

2022-06-08 Thread Dongli Zhang
This patch implements the extra 'io_tlb_high_mem'. In the future, device
drivers may choose to use either 'io_tlb_default_mem' or 'io_tlb_high_mem'
as dev->dma_io_tlb_mem.

The highmem buffer is regarded as active if
(high_nslabs && io_tlb_high_mem.nslabs) returns true.

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 arch/powerpc/kernel/dma-swiotlb.c |   8 ++-
 arch/x86/kernel/pci-dma.c |   5 +-
 include/linux/swiotlb.h   |   2 +-
 kernel/dma/swiotlb.c  | 103 +-
 4 files changed, 84 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index ba256c37bcc0..f18694881264 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,9 +20,11 @@ void __init swiotlb_detect_4g(void)
 
 static int __init check_swiotlb_enabled(void)
 {
-   if (ppc_swiotlb_enable)
-   swiotlb_print_info();
-   else
+   if (ppc_swiotlb_enable) {
+   swiotlb_print_info(false);
+   if (swiotlb_high_active())
+   swiotlb_print_info(true);
+   } else
swiotlb_exit();
 
return 0;
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index 30bbe4abb5d6..1504b349b312 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -196,7 +196,10 @@ static int __init pci_iommu_init(void)
/* An IOMMU turned us off. */
if (x86_swiotlb_enable) {
pr_info("PCI-DMA: Using software bounce buffering for IO 
(SWIOTLB)\n");
-   swiotlb_print_info();
+
+   swiotlb_print_info(false);
+   if (swiotlb_high_active())
+   swiotlb_print_info(true);
} else {
swiotlb_exit();
}
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index e61c074c55eb..8196bf961aab 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -166,7 +166,7 @@ static inline void swiotlb_adjust_size(unsigned long size)
 #endif /* CONFIG_SWIOTLB */
 
 extern bool swiotlb_high_active(void);
-extern void swiotlb_print_info(void);
+extern void swiotlb_print_info(bool high);
 
 #ifdef CONFIG_DMA_RESTRICTED_POOL
 struct page *swiotlb_alloc(struct device *dev, size_t size);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 793ca7f9..ff82b281ce01 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -101,6 +101,21 @@ setup_io_tlb_npages(char *str)
 }
 early_param("swiotlb", setup_io_tlb_npages);
 
+static struct io_tlb_mem *io_tlb_mem_get(bool high)
+{
return high ? &io_tlb_high_mem : &io_tlb_default_mem;
+}
+
+static unsigned long nslabs_get(bool high)
+{
+   return high ? high_nslabs : default_nslabs;
+}
+
+static char *swiotlb_name_get(bool high)
+{
+   return high ? "high" : "default";
+}
+
 bool swiotlb_high_active(void)
 {
return high_nslabs && io_tlb_high_mem.nslabs;
@@ -133,17 +148,18 @@ void __init swiotlb_adjust_size(unsigned long size)
pr_info("SWIOTLB bounce buffer size adjusted to %luMB", size >> 20);
 }
 
-void swiotlb_print_info(void)
+void swiotlb_print_info(bool high)
 {
-   struct io_tlb_mem *mem = &io_tlb_default_mem;
+   struct io_tlb_mem *mem = io_tlb_mem_get(high);
 
if (!mem->nslabs) {
pr_warn("No low mem\n");
return;
}
 
-   pr_info("mapped [mem %pa-%pa] (%luMB)\n", >start, >end,
-  (mem->nslabs << IO_TLB_SHIFT) >> 20);
+   pr_info("%s mapped [mem %pa-%pa] (%luMB)\n",
+   swiotlb_name_get(high), >start, >end,
+   (mem->nslabs << IO_TLB_SHIFT) >> 20);
 }
 
 static inline unsigned long io_tlb_offset(unsigned long val)
@@ -184,15 +200,9 @@ static void *swiotlb_mem_remap(struct io_tlb_mem *mem, unsigned long bytes)
 }
 #endif
 
-/*
- * Early SWIOTLB allocation may be too early to allow an architecture to
- * perform the desired operations.  This function allows the architecture to
- * call SWIOTLB when the operations are possible.  It needs to be called
- * before the SWIOTLB memory is used.
- */
-void __init swiotlb_update_mem_attributes(void)
+static void __init __swiotlb_update_mem_attributes(bool high)
 {
-   struct io_tlb_mem *mem = &io_tlb_default_mem;
+   struct io_tlb_mem *mem = io_tlb_mem_get(high);
void *vaddr;
unsigned long bytes;
 
@@ -207,6 +217,19 @@ void __init swiotlb_update_mem_attributes(void)
mem->vaddr = vaddr;
 }
 
+/*
+ * Early SWIOTLB allocation may be too early to allow an architecture to
+ * perform the desired operations.  This function allows the architecture to
+ * call SWIOTLB when the operations are possible.  It needs to be called
+ * before the SWIOTLB

[PATCH RFC v1 6/7] virtio: use io_tlb_high_mem if it is active

2022-06-08 Thread Dongli Zhang
When the swiotlb is enforced (e.g., when AMD SEV is involved), the virtio
driver will not be able to use 4+ GB memory. Therefore, the virtio driver
uses 'io_tlb_high_mem' as its swiotlb.

Cc: Konrad Wilk 
Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 drivers/virtio/virtio.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index ef04a96942bf..d9ebe3940e2d 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -5,6 +5,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
 /* Unique numbering for virtio devices. */
@@ -241,6 +243,12 @@ static int virtio_dev_probe(struct device *_d)
u64 device_features;
u64 driver_features;
u64 driver_features_legacy;
+   struct device *parent = dev->dev.parent;
+   u64 dma_mask = min_not_zero(*parent->dma_mask,
+   parent->bus_dma_limit);
+
+   if (dma_mask == DMA_BIT_MASK(64))
+   swiotlb_use_high(parent);
 
/* We have a driver! */
virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER);
-- 
2.17.1



Re: [PATCH 12/15] swiotlb: provide swiotlb_init variants that remap the buffer

2022-04-04 Thread Dongli Zhang
> int swiotlb_init_late(size_t size, gfp_t gfp_mask,
>   int (*remap)(void *tlb, unsigned long nslabs))
>  {
>   unsigned long nslabs = ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE);
>   unsigned long bytes;
> @@ -303,6 +322,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>   if (swiotlb_force_disable)
>   return 0;
>  
> +retry:
>   order = get_order(nslabs << IO_TLB_SHIFT);
>   nslabs = SLABS_PER_PAGE << order;
>   bytes = nslabs << IO_TLB_SHIFT;
> @@ -323,6 +343,16 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask)
>   (PAGE_SIZE << order) >> 20);
>   nslabs = SLABS_PER_PAGE << order;
>   }
> + if (remap)
> + rc = remap(vstart, nslabs);
> + if (rc) {
> + free_pages((unsigned long)vstart, order);
> + 

"warning: 1 line adds whitespace errors." above when I was applying the patch
for test.

Dongli Zhang

> + nslabs = ALIGN(nslabs >> 1, IO_TLB_SEGSIZE);
> + if (nslabs < IO_TLB_MIN_SLABS)
> + return rc;
> + goto retry;
> + }
>   rc = swiotlb_late_init_with_tbl(vstart, nslabs);
>   if (rc)
>   free_pages((unsigned long)vstart, order);
> 


Re: [PATCH 10/12] swiotlb: add a SWIOTLB_ANY flag to lift the low memory restriction

2022-03-04 Thread Dongli Zhang
Hi Michael,

On 3/4/22 10:12 AM, Michael Kelley (LINUX) wrote:
> From: Christoph Hellwig  Sent: Tuesday, March 1, 2022 2:53 AM
>>
>> Power SVM wants to allocate a swiotlb buffer that is not restricted to low
>> memory for the trusted hypervisor scheme.  Consolidate the support for
>> this into the swiotlb_init interface by adding a new flag.
> 
> Hyper-V Isolated VMs want to do the same thing of not restricting the swiotlb
> buffer to low memory.  That's what Tianyu Lan's patch set[1] is proposing.
> Hyper-V synthetic devices have no DMA addressing limitations, and the
> likelihood of using a PCI pass-thru device with addressing limitations in an
> Isolated VM seems vanishingly small.
> 
> So could use of the SWIOTLB_ANY flag be generalized?  Let Hyper-V init
> code set the flag before swiotlb_init() is called.  Or provide a CONFIG
> variable that Hyper-V Isolated VMs could set.

I previously sent a 64-bit swiotlb patchset, and at that time people thought
it was the same as the Restricted DMA patchset.

https://lore.kernel.org/all/20210203233709.19819-1-dongli.zh...@oracle.com/

However, I do not think the Restricted DMA patchset is going to support
64-bit (or high memory) DMA. Is this what you are looking for?

Dongli Zhang

> 
> Michael
> 
> [1] https://lore.kernel.org/lkml/20220209122302.213882-1-ltyker...@gmail.com/
> 
>>
>> Signed-off-by: Christoph Hellwig 
>> ---
>>  arch/powerpc/include/asm/svm.h   |  4 
>>  arch/powerpc/include/asm/swiotlb.h   |  1 +
>>  arch/powerpc/kernel/dma-swiotlb.c|  1 +
>>  arch/powerpc/mm/mem.c|  5 +
>>  arch/powerpc/platforms/pseries/svm.c | 26 +-
>>  include/linux/swiotlb.h  |  1 +
>>  kernel/dma/swiotlb.c |  9 +++--
>>  7 files changed, 12 insertions(+), 35 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/svm.h b/arch/powerpc/include/asm/svm.h
>> index 7546402d796af..85580b30aba48 100644
>> --- a/arch/powerpc/include/asm/svm.h
>> +++ b/arch/powerpc/include/asm/svm.h
>> @@ -15,8 +15,6 @@ static inline bool is_secure_guest(void)
>>  return mfmsr() & MSR_S;
>>  }
>>
>> -void __init svm_swiotlb_init(void);
>> -
>>  void dtl_cache_ctor(void *addr);
>>  #define get_dtl_cache_ctor()	(is_secure_guest() ? dtl_cache_ctor : NULL)
>>
>> @@ -27,8 +25,6 @@ static inline bool is_secure_guest(void)
>>  return false;
>>  }
>>
>> -static inline void svm_swiotlb_init(void) {}
>> -
>>  #define get_dtl_cache_ctor() NULL
>>
>>  #endif /* CONFIG_PPC_SVM */
>> diff --git a/arch/powerpc/include/asm/swiotlb.h b/arch/powerpc/include/asm/swiotlb.h
>> index 3c1a1cd161286..4203b5e0a88ed 100644
>> --- a/arch/powerpc/include/asm/swiotlb.h
>> +++ b/arch/powerpc/include/asm/swiotlb.h
>> @@ -9,6 +9,7 @@
>>  #include 
>>
>>  extern unsigned int ppc_swiotlb_enable;
>> +extern unsigned int ppc_swiotlb_flags;
>>
>>  #ifdef CONFIG_SWIOTLB
>>  void swiotlb_detect_4g(void);
>> diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
>> index fc7816126a401..ba256c37bcc0f 100644
>> --- a/arch/powerpc/kernel/dma-swiotlb.c
>> +++ b/arch/powerpc/kernel/dma-swiotlb.c
>> @@ -10,6 +10,7 @@
>>  #include 
>>
>>  unsigned int ppc_swiotlb_enable;
>> +unsigned int ppc_swiotlb_flags;
>>
>>  void __init swiotlb_detect_4g(void)
>>  {
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index e1519e2edc656..a4d65418c30a9 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -249,10 +249,7 @@ void __init mem_init(void)
>>   * back to to-down.
>>   */
>>  memblock_set_bottom_up(true);
>> -if (is_secure_guest())
>> -svm_swiotlb_init();
>> -else
>> -swiotlb_init(ppc_swiotlb_enable, 0);
>> +swiotlb_init(ppc_swiotlb_enable, ppc_swiotlb_flags);
>>  #endif
>>
>>  high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
>> diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
>> index c5228f4969eb2..3b4045d508ec8 100644
>> --- a/arch/powerpc/platforms/pseries/svm.c
>> +++ b/arch/powerpc/platforms/pseries/svm.c
>> @@ -28,7 +28,7 @@ static int __init init_svm(void)
>>   * need to use the SWIOTLB buffer for DMA even if dma_capable() says
>>   * otherwise.
>>

[PATCH RFC v1 6/6] xen-swiotlb: enable 64-bit xen-swiotlb

2021-02-03 Thread Dongli Zhang
This patch is to enable the 64-bit xen-swiotlb buffer.

For Xen PVM DMA addresses, a 64-bit device will be able to allocate from
the 64-bit swiotlb buffer.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 drivers/xen/swiotlb-xen.c | 117 --
 1 file changed, 74 insertions(+), 43 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index e18cae693cdc..c9ab07809e32 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -108,27 +108,36 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
dma_addr_t dma_addr)
unsigned long bfn = XEN_PFN_DOWN(dma_to_phys(dev, dma_addr));
unsigned long xen_pfn = bfn_to_local_pfn(bfn);
phys_addr_t paddr = (phys_addr_t)xen_pfn << XEN_PAGE_SHIFT;
+   int i;
 
/* If the address is outside our domain, it CAN
 * have the same virtual address as another address
 * in our domain. Therefore _only_ check address within our domain.
 */
-   if (pfn_valid(PFN_DOWN(paddr))) {
-   return paddr >= virt_to_phys(xen_io_tlb_start[SWIOTLB_LO]) &&
-  paddr < virt_to_phys(xen_io_tlb_end[SWIOTLB_LO]);
-   }
+   if (!pfn_valid(PFN_DOWN(paddr)))
+   return 0;
+
+   for (i = 0; i < swiotlb_nr; i++)
+   if (paddr >= virt_to_phys(xen_io_tlb_start[i]) &&
+   paddr < virt_to_phys(xen_io_tlb_end[i]))
+   return 1;
+
return 0;
 }
 
 static int
-xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
+xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs,
+ enum swiotlb_t type)
 {
int i, rc;
int dma_bits;
dma_addr_t dma_handle;
phys_addr_t p = virt_to_phys(buf);
 
-   dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
+   if (type == SWIOTLB_HI)
+   dma_bits = max_dma_bits[SWIOTLB_HI];
+   else
+   dma_bits = get_order(IO_TLB_SEGSIZE << IO_TLB_SHIFT) + PAGE_SHIFT;
 
i = 0;
do {
@@ -139,7 +148,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
p + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
dma_bits, &dma_handle);
-   } while (rc && dma_bits++ < max_dma_bits[SWIOTLB_LO]);
+   } while (rc && dma_bits++ < max_dma_bits[type]);
if (rc)
return rc;
 
@@ -147,16 +156,17 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
} while (i < nslabs);
return 0;
 }
-static unsigned long xen_set_nslabs(unsigned long nr_tbl)
+
+static unsigned long xen_set_nslabs(unsigned long nr_tbl, enum swiotlb_t type)
 {
if (!nr_tbl) {
-   xen_io_tlb_nslabs[SWIOTLB_LO] = (64 * 1024 * 1024 >> IO_TLB_SHIFT);
-   xen_io_tlb_nslabs[SWIOTLB_LO] = ALIGN(xen_io_tlb_nslabs[SWIOTLB_LO],
+   xen_io_tlb_nslabs[type] = (64 * 1024 * 1024 >> IO_TLB_SHIFT);
+   xen_io_tlb_nslabs[type] = ALIGN(xen_io_tlb_nslabs[type],
  IO_TLB_SEGSIZE);
} else
-   xen_io_tlb_nslabs[SWIOTLB_LO] = nr_tbl;
+   xen_io_tlb_nslabs[type] = nr_tbl;
 
-   return xen_io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
+   return xen_io_tlb_nslabs[type] << IO_TLB_SHIFT;
 }
 
 enum xen_swiotlb_err {
@@ -180,23 +190,24 @@ static const char *xen_swiotlb_error(enum xen_swiotlb_err err)
}
return "";
 }
-int __ref xen_swiotlb_init(int verbose, bool early)
+
+static int xen_swiotlb_init_type(int verbose, bool early, enum swiotlb_t type)
 {
unsigned long bytes, order;
int rc = -ENOMEM;
enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
unsigned int repeat = 3;
 
-   xen_io_tlb_nslabs[SWIOTLB_LO] = swiotlb_nr_tbl(SWIOTLB_LO);
+   xen_io_tlb_nslabs[type] = swiotlb_nr_tbl(type);
 retry:
-   bytes = xen_set_nslabs(xen_io_tlb_nslabs[SWIOTLB_LO]);
-   order = get_order(xen_io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
+   bytes = xen_set_nslabs(xen_io_tlb_nslabs[type], type);
+   order = get_order(xen_io_tlb_nslabs[type] << IO_TLB_SHIFT);
 
/*
 * IO TLB memory already allocated. Just use it.
 */
-   if (io_tlb_start[SWIOTLB_LO] != 0) {
-   xen_io_tlb_start[SWIOTLB_LO] = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
+   if (io_tlb_start[type] != 0) {
+   xen_io_tlb_start[type] = phys_to_virt(io_tlb_start[type]);
goto end;
}
 
@@ -204,81 +215,95 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 * Get IO TLB memory from any location.
 */
if (early) {
-  

[PATCH RFC v1 5/6] xen-swiotlb: convert variables to arrays

2021-02-03 Thread Dongli Zhang
This patch converts several xen-swiotlb related variables to arrays, in
order to maintain stats/status for different swiotlb buffers. Here are the
variables involved:

- xen_io_tlb_start and xen_io_tlb_end
- xen_io_tlb_nslabs
- MAX_DMA_BITS

There is no functional change and this is to prepare to enable 64-bit
xen-swiotlb.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 drivers/xen/swiotlb-xen.c | 75 +--
 1 file changed, 40 insertions(+), 35 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 662638093542..e18cae693cdc 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -39,15 +39,17 @@
 #include 
 
 #include 
-#define MAX_DMA_BITS 32
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
 * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
 * API.
  */
 
-static char *xen_io_tlb_start, *xen_io_tlb_end;
-static unsigned long xen_io_tlb_nslabs;
+static char *xen_io_tlb_start[SWIOTLB_MAX], *xen_io_tlb_end[SWIOTLB_MAX];
+static unsigned long xen_io_tlb_nslabs[SWIOTLB_MAX];
+
+static int max_dma_bits[] = {32, 64};
+
 /*
  * Quick lookup value of the bus address of the IOTLB.
  */
@@ -112,8 +114,8 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr)
 * in our domain. Therefore _only_ check address within our domain.
 */
if (pfn_valid(PFN_DOWN(paddr))) {
-   return paddr >= virt_to_phys(xen_io_tlb_start) &&
-  paddr < virt_to_phys(xen_io_tlb_end);
+   return paddr >= virt_to_phys(xen_io_tlb_start[SWIOTLB_LO]) &&
+  paddr < virt_to_phys(xen_io_tlb_end[SWIOTLB_LO]);
}
return 0;
 }
@@ -137,7 +139,7 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
p + (i << IO_TLB_SHIFT),
get_order(slabs << IO_TLB_SHIFT),
dma_bits, &dma_handle);
-   } while (rc && dma_bits++ < MAX_DMA_BITS);
+   } while (rc && dma_bits++ < max_dma_bits[SWIOTLB_LO]);
if (rc)
return rc;
 
@@ -148,12 +150,13 @@ xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 static unsigned long xen_set_nslabs(unsigned long nr_tbl)
 {
if (!nr_tbl) {
-   xen_io_tlb_nslabs = (64 * 1024 * 1024 >> IO_TLB_SHIFT);
-   xen_io_tlb_nslabs = ALIGN(xen_io_tlb_nslabs, IO_TLB_SEGSIZE);
+   xen_io_tlb_nslabs[SWIOTLB_LO] = (64 * 1024 * 1024 >> IO_TLB_SHIFT);
+   xen_io_tlb_nslabs[SWIOTLB_LO] = ALIGN(xen_io_tlb_nslabs[SWIOTLB_LO],
+ IO_TLB_SEGSIZE);
} else
-   xen_io_tlb_nslabs = nr_tbl;
+   xen_io_tlb_nslabs[SWIOTLB_LO] = nr_tbl;
 
-   return xen_io_tlb_nslabs << IO_TLB_SHIFT;
+   return xen_io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT;
 }
 
 enum xen_swiotlb_err {
@@ -184,16 +187,16 @@ int __ref xen_swiotlb_init(int verbose, bool early)
enum xen_swiotlb_err m_ret = XEN_SWIOTLB_UNKNOWN;
unsigned int repeat = 3;
 
-   xen_io_tlb_nslabs = swiotlb_nr_tbl(SWIOTLB_LO);
+   xen_io_tlb_nslabs[SWIOTLB_LO] = swiotlb_nr_tbl(SWIOTLB_LO);
 retry:
-   bytes = xen_set_nslabs(xen_io_tlb_nslabs);
-   order = get_order(xen_io_tlb_nslabs << IO_TLB_SHIFT);
+   bytes = xen_set_nslabs(xen_io_tlb_nslabs[SWIOTLB_LO]);
+   order = get_order(xen_io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT);
 
/*
 * IO TLB memory already allocated. Just use it.
 */
if (io_tlb_start[SWIOTLB_LO] != 0) {
-   xen_io_tlb_start = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
+   xen_io_tlb_start[SWIOTLB_LO] = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
goto end;
}
 
@@ -201,76 +204,78 @@ int __ref xen_swiotlb_init(int verbose, bool early)
 * Get IO TLB memory from any location.
 */
if (early) {
-   xen_io_tlb_start = memblock_alloc(PAGE_ALIGN(bytes),
+   xen_io_tlb_start[SWIOTLB_LO] = memblock_alloc(PAGE_ALIGN(bytes),
  PAGE_SIZE);
-   if (!xen_io_tlb_start)
+   if (!xen_io_tlb_start[SWIOTLB_LO])
panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
  __func__, PAGE_ALIGN(bytes), PAGE_SIZE);
} else {
 #define SLABS_PER_PAGE (1 << (PAGE_SHIFT - IO_TLB_SHIFT))
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
while ((SLABS_PER_PAGE << order) > IO_TLB_MIN_SLABS) {
-   xen_io_tlb_start = (void *)xen_get_swiotlb_free_pages(order);
-   if (xen_io_tlb_start)
+   

[PATCH RFC v1 4/6] swiotlb: enable 64-bit swiotlb

2021-02-03 Thread Dongli Zhang
This patch is to enable the 64-bit swiotlb buffer.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy devices. Considering that
most devices nowadays support 64-bit DMA and an IOMMU is available, the
swiotlb is not used most of the time, except:

1. The Xen PVM domain requires the DMA addresses to be both (1) less than
the dma mask, and (2) continuous in machine address. Therefore, a 64-bit
device may still require the swiotlb on a PVM domain.

2. From the source code, AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from the swiotlb buffer even if the device
dma mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;

Therefore, this patch introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, swiotlb_tbl_map_single() allocates from the 2nd 64-bit
buffer if the device DMA mask
min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit) is 64-bit.

The example to configure 64-bit swiotlb is "swiotlb=65536,524288,force"
or "swiotlb=,524288,force", where 524288 is the size of 64-bit buffer.

With the patch, the kernel will be able to allocate >4GB swiotlb buffer.
This patch is only for swiotlb, not including xen-swiotlb.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 arch/mips/cavium-octeon/dma-octeon.c |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c|   2 +-
 arch/powerpc/platforms/pseries/svm.c |   2 +-
 arch/x86/kernel/pci-swiotlb.c|   5 +-
 arch/x86/pci/sta2x11-fixup.c |   2 +-
 drivers/gpu/drm/i915/gem/i915_gem_internal.c |   4 +-
 drivers/gpu/drm/i915/i915_scatterlist.h  |   2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c|   2 +-
 drivers/mmc/host/sdhci.c |   2 +-
 drivers/pci/xen-pcifront.c   |   2 +-
 drivers/xen/swiotlb-xen.c|   9 +-
 include/linux/swiotlb.h  |  28 +-
 kernel/dma/swiotlb.c | 339 +++
 13 files changed, 238 insertions(+), 164 deletions(-)

diff --git a/arch/mips/cavium-octeon/dma-octeon.c b/arch/mips/cavium-octeon/dma-octeon.c
index df70308db0e6..3480555d908a 100644
--- a/arch/mips/cavium-octeon/dma-octeon.c
+++ b/arch/mips/cavium-octeon/dma-octeon.c
@@ -245,6 +245,7 @@ void __init plat_swiotlb_setup(void)
panic("%s: Failed to allocate %zu bytes align=%lx\n",
  __func__, swiotlbsize, PAGE_SIZE);
 
-   if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs, 1) == -ENOMEM)
+   if (swiotlb_init_with_tbl(octeon_swiotlb, swiotlb_nslabs,
+ SWIOTLB_LO, 1) == -ENOMEM)
panic("Cannot allocate SWIOTLB buffer");
 }
diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index fc7816126a40..88113318c53f 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -20,7 +20,7 @@ void __init swiotlb_detect_4g(void)
 static int __init check_swiotlb_enabled(void)
 {
if (ppc_swiotlb_enable)
-   swiotlb_print_info();
+   swiotlb_print_info(SWIOTLB_LO);
else
swiotlb_exit();
 
diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 9f8842d0da1f..77910e4ffad8 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -52,7 +52,7 @@ void __init svm_swiotlb_init(void)
bytes = io_tlb_nslabs << IO_TLB_SHIFT;
 
vstart = memblock_alloc(PAGE_ALIGN(bytes), PAGE_SIZE);
-   if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
+   if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, SWIOTLB_LO, false))
return;
 
if (io_tlb_start[SWIOTLB_LO])
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index c2cfa5e7c152..950624fd95a4 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -67,12 +67,15 @@ void __init pci_swiotlb_init(void)
 
 void __init pci_swiotlb_late_init(void)
 {
+   int i;
+
/* An IOMMU turned us off. */
if (!swiotlb)
swiotlb_exit();
else {
printk(KERN_INFO "PCI-DMA: "
   "Using software bounce buffering for IO (SWIOTLB)\n");
-   swiotlb_print_info();
+   for (i = 0; i < swiotlb_nr; i++)
+   swiotlb_print_info(i);
}
 }
diff --git a/arch/x86/pci/sta2x11-fixup.c b/arch/x86/pci/sta2x11-fixup.c
index 7d2525691854..c440520b2055 100644
--- a/arch/x86/pci/sta2x11-fixup.c
+++ b/arch/x86/pci/sta2x11-fixup.c
@@ -57,7 +57,7 @@ static void sta2x11_new_instance(struct pci_dev *pdev)
int size = STA2X11_SWIOTLB_SIZE;
/* First instance

[PATCH RFC v1 0/6] swiotlb: 64-bit DMA buffer

2021-02-03 Thread Dongli Zhang
This RFC is to introduce the 2nd swiotlb buffer for 64-bit DMA access.  The
prototype is based on v5.11-rc6.

The state of the art swiotlb pre-allocates <=32-bit memory in order to meet
the DMA mask requirement for some 32-bit legacy devices. Considering that
most devices nowadays support 64-bit DMA and an IOMMU is available, the
swiotlb is not used most of the time, except:

1. The xen PVM domain requires the DMA addresses to be both (1) <= the
device dma mask, and (2) continuous in machine address. Therefore, a 64-bit
device may still require the swiotlb on a PVM domain.

2. From the source code, AMD SME/SEV will enable SWIOTLB_FORCE. As a result
it is always required to allocate from the swiotlb buffer even if the device
dma mask is 64-bit.

sme_early_init()
-> if (sev_active())
   swiotlb_force = SWIOTLB_FORCE;


Therefore, this RFC introduces the 2nd swiotlb buffer for 64-bit DMA
access. For instance, swiotlb_tbl_map_single() allocates from the 2nd
64-bit buffer if the device DMA mask min_not_zero(*hwdev->dma_mask,
hwdev->bus_dma_limit) is 64-bit. With the RFC, Xen/AMD will be able to
allocate a >4GB swiotlb buffer. See the selection sketch below.
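
A hedged sketch of that selection (illustrative only; SWIOTLB_LO/SWIOTLB_HI
are the enum values introduced by this series):

static enum swiotlb_t swiotlb_get_target(struct device *hwdev)
{
        u64 mask = min_not_zero(*hwdev->dma_mask, hwdev->bus_dma_limit);

        /* fully 64-bit capable devices bounce via the high buffer */
        return mask == DMA_BIT_MASK(64) ? SWIOTLB_HI : SWIOTLB_LO;
}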

With it being 64-bit, this could (not in this patch set, but certainly
possible) be allocated at runtime, meaning the size could change depending
on the device MMIO buffers, etc.


I have tested the patch set on Xen PVM dom0 boot via QEMU. The dom0 is
booted via:

qemu-system-x86_64 -smp 8 -m 20G -enable-kvm -vnc :9 \
-net nic -net user,hostfwd=tcp::5029-:22 \
-hda disk.img \
-device nvme,drive=nvme0,serial=deudbeaf1,max_ioqpairs=16 \
-drive file=test.qcow2,if=none,id=nvme0 \
-serial stdio

The "swiotlb=65536,1048576,force" is to configure 32-bit swiotlb as 128MB
and 64-bit swiotlb as 2048MB. The swiotlb is enforced.

vm# cat /proc/cmdline 
placeholder root=UUID=4e942d60-c228-4caf-b98e-f41c365d9703 ro text
swiotlb=65536,1048576,force quiet splash

[5.119877] Booting paravirtualized kernel on Xen
... ...
[5.190423] software IO TLB: Low Mem mapped [mem 0x234e00000-0x23ce00000] (128MB)
[6.276161] software IO TLB: High Mem mapped [mem 0x166f33000-0x1e6f33000] (2048MB)

0x234e00000 is mapped to 0x001c0000 (32-bit machine address)
0x23ce00000-1 is mapped to 0x0ff30000 (32-bit machine address)
0x166f33000 is mapped to 0x4b7280000 (64-bit machine address)
0x1e6f33000-1 is mapped to 0x33a070000 (64-bit machine address)


While running fio against the emulated NVMe device, the swiotlb allocates
from the 64-bit buffer, as io_tlb_used-highmem shows.

vm# cat /sys/kernel/debug/swiotlb/io_tlb_nslabs
65536
vm# cat /sys/kernel/debug/swiotlb/io_tlb_used
258
vm# cat /sys/kernel/debug/swiotlb/io_tlb_nslabs-highmem
1048576
vm# cat /sys/kernel/debug/swiotlb/io_tlb_used-highmem
58880


I also tested virtio-scsi (with "disable-legacy=on,iommu_platform=true") on
VM with AMD SEV enabled.

qemu-system-x86_64 -enable-kvm -machine q35 -smp 36 -m 20G \
-drive if=pflash,format=raw,unit=0,file=OVMF_CODE.pure-efi.fd,readonly \
-drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd \
-hda ol7-uefi.qcow2 -serial stdio -vnc :9 \
-net nic -net user,hostfwd=tcp::5029-:22 \
-cpu EPYC -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \
-machine memory-encryption=sev0 \
-device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true \
-device scsi-hd,drive=disk0 \
-drive file=test.qcow2,if=none,id=disk0,format=qcow2

The "swiotlb=65536,1048576" is to configure 32-bit swiotlb as 128MB and
64-bit swiotlb as 2048MB. We do not need to force swiotlb because AMD SEV
will set SWIOTLB_FORCE.

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-5.11.0-rc6swiotlb+ root=/dev/mapper/ol-root ro
crashkernel=auto rd.lvm.lv=ol/root rd.lvm.lv=ol/swap rhgb quiet
LANG=en_US.UTF-8 swiotlb=65536,1048576

[0.729790] AMD Memory Encryption Features active: SEV
... ...
[2.113147] software IO TLB: Low Mem mapped [mem 0x73e1e000-0x7be1e000] (128MB)
[2.113151] software IO TLB: High Mem mapped [mem 0x4e8400000-0x568400000] (2048MB)

While running fio against virtio-scsi, the swiotlb allocates from the
64-bit buffer, as io_tlb_used-highmem shows.

vm# cat /sys/kernel/debug/swiotlb/io_tlb_nslabs
65536
vm# cat /sys/kernel/debug/swiotlb/io_tlb_used
0
vm# cat /sys/kernel/debug/swiotlb/io_tlb_nslabs-highmem
1048576
vm# cat /sys/kernel/debug/swiotlb/io_tlb_used-highmem
64647


Please let me know if there is any feedback for this idea and RFC.


Dongli Zhang (6):
   swiotlb: define new enumerated type
   swiotlb: convert variables to arrays
   swiotlb: introduce swiotlb_get_type() to calculate swiotlb buffer type
   swiotlb: enable 64-bit swiotlb
   xen-swiotlb: convert variables to arrays
   xen-swiotlb: enable 64-bit xen-swiotlb

 arch/mips/cavium-octeon/dma-octeon.c |   3 +-
 arch/powerpc/kernel/dma-swiotlb.c|   2 +-
 arch/powerpc/platforms/pseries/svm.c |   8 +-
 arch/x86/kernel/pci-swiotlb.c  

[PATCH RFC v1 2/6] swiotlb: convert variables to arrays

2021-02-03 Thread Dongli Zhang
This patch converts several swiotlb related variables to arrays, in
order to maintain stats/status for different swiotlb buffers. Here are the
variables involved:

- io_tlb_start and io_tlb_end
- io_tlb_nslabs and io_tlb_used
- io_tlb_list
- io_tlb_index
- max_segment
- io_tlb_orig_addr
- no_iotlb_memory

There is no functional change and this is to prepare to enable 64-bit
swiotlb.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 arch/powerpc/platforms/pseries/svm.c |   6 +-
 drivers/xen/swiotlb-xen.c|   4 +-
 include/linux/swiotlb.h  |   5 +-
 kernel/dma/swiotlb.c | 257 ++-
 4 files changed, 140 insertions(+), 132 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c
index 7b739cc7a8a9..9f8842d0da1f 100644
--- a/arch/powerpc/platforms/pseries/svm.c
+++ b/arch/powerpc/platforms/pseries/svm.c
@@ -55,9 +55,9 @@ void __init svm_swiotlb_init(void)
if (vstart && !swiotlb_init_with_tbl(vstart, io_tlb_nslabs, false))
return;
 
-   if (io_tlb_start)
-   memblock_free_early(io_tlb_start,
-   PAGE_ALIGN(io_tlb_nslabs << IO_TLB_SHIFT));
+   if (io_tlb_start[SWIOTLB_LO])
+   memblock_free_early(io_tlb_start[SWIOTLB_LO],
+   PAGE_ALIGN(io_tlb_nslabs[SWIOTLB_LO] << IO_TLB_SHIFT));
panic("SVM: Cannot allocate SWIOTLB buffer");
 }
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..3261880ad859 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -192,8 +192,8 @@ int __ref xen_swiotlb_init(int verbose, bool early)
/*
 * IO TLB memory already allocated. Just use it.
 */
-   if (io_tlb_start != 0) {
-   xen_io_tlb_start = phys_to_virt(io_tlb_start);
+   if (io_tlb_start[SWIOTLB_LO] != 0) {
+   xen_io_tlb_start = phys_to_virt(io_tlb_start[SWIOTLB_LO]);
goto end;
}
 
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index ca125c1b1281..777046cd4d1b 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -76,11 +76,12 @@ dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
 
 #ifdef CONFIG_SWIOTLB
 extern enum swiotlb_force swiotlb_force;
-extern phys_addr_t io_tlb_start, io_tlb_end;
+extern phys_addr_t io_tlb_start[], io_tlb_end[];
 
 static inline bool is_swiotlb_buffer(phys_addr_t paddr)
 {
-   return paddr >= io_tlb_start && paddr < io_tlb_end;
+   return paddr >= io_tlb_start[SWIOTLB_LO] &&
+  paddr < io_tlb_end[SWIOTLB_LO];
 }
 
 void __init swiotlb_exit(void);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 7c42df6e6100..1fbb65daa2dd 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -69,38 +69,38 @@ enum swiotlb_force swiotlb_force;
 * swiotlb_tbl_sync_single_*, to see if the memory was in fact allocated by this
 * API.
  */
-phys_addr_t io_tlb_start, io_tlb_end;
+phys_addr_t io_tlb_start[SWIOTLB_MAX], io_tlb_end[SWIOTLB_MAX];
 
 /*
  * The number of IO TLB blocks (in groups of 64) between io_tlb_start and
  * io_tlb_end.  This is command line adjustable via setup_io_tlb_npages.
  */
-static unsigned long io_tlb_nslabs;
+static unsigned long io_tlb_nslabs[SWIOTLB_MAX];
 
 /*
  * The number of used IO TLB block
  */
-static unsigned long io_tlb_used;
+static unsigned long io_tlb_used[SWIOTLB_MAX];
 
 /*
  * This is a free list describing the number of free entries available from
  * each index
  */
-static unsigned int *io_tlb_list;
-static unsigned int io_tlb_index;
+static unsigned int *io_tlb_list[SWIOTLB_MAX];
+static unsigned int io_tlb_index[SWIOTLB_MAX];
 
 /*
  * Max segment that we can provide which (if pages are contingous) will
  * not be bounced (unless SWIOTLB_FORCE is set).
  */
-static unsigned int max_segment;
+static unsigned int max_segment[SWIOTLB_MAX];
 
 /*
  * We need to save away the original address corresponding to a mapped entry
  * for the sync operations.
  */
 #define INVALID_PHYS_ADDR (~(phys_addr_t)0)
-static phys_addr_t *io_tlb_orig_addr;
+static phys_addr_t *io_tlb_orig_addr[SWIOTLB_MAX];
 
 /*
  * Protect the above data structures in the map and unmap calls
@@ -113,9 +113,9 @@ static int __init
 setup_io_tlb_npages(char *str)
 {
if (isdigit(*str)) {
-   io_tlb_nslabs = simple_strtoul(str, &str, 0);
+   io_tlb_nslabs[SWIOTLB_LO] = simple_strtoul(str, &str, 0);
/* avoid tail segment of size < IO_TLB_SEGSIZE */
-   io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
+   io_tlb_nslabs[SWIOTLB_LO] = ALIGN(io_tlb_nslabs[SWIOTLB_LO], IO_TLB_SEGSIZE);
}
if (*str == ',')
++str;
@@ -123,40 +123,40 @@ setup_io_tlb_npages(char *str)
swiotlb_fo

[PATCH RFC v1 1/6] swiotlb: define new enumerated type

2021-02-03 Thread Dongli Zhang
This just defines a new enumerated type, with no functional change.

'SWIOTLB_LO' indexes the legacy 32-bit swiotlb buffer, while 'SWIOTLB_HI'
indexes the 64-bit buffer.

This is to prepare to enable 64-bit swiotlb.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 include/linux/swiotlb.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index d9c9fc9ca5d2..ca125c1b1281 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -17,6 +17,12 @@ enum swiotlb_force {
SWIOTLB_NO_FORCE,   /* swiotlb=noforce */
 };
 
+enum swiotlb_t {
+   SWIOTLB_LO,
+   SWIOTLB_HI,
+   SWIOTLB_MAX,
+};
+
 /*
  * Maximum allowable number of contiguous slabs to map,
  * must be a power of 2.  What is the appropriate value ?
-- 
2.17.1
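
For reference, the follow-on patches in this series index the existing
globals by this enum; a minimal sketch of the resulting layout, compressed
from the diffs quoted elsewhere in this thread (illustration, not a hunk of
this patch):

    /* one bounce buffer per type, indexed by enum swiotlb_t */
    phys_addr_t io_tlb_start[SWIOTLB_MAX], io_tlb_end[SWIOTLB_MAX];
    static unsigned long io_tlb_nslabs[SWIOTLB_MAX];

    /* number of buffers actually populated; 1 until SWIOTLB_HI exists */
    int swiotlb_nr = 1;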



[PATCH RFC v1 3/6] swiotlb: introduce swiotlb_get_type() to calculate swiotlb buffer type

2021-02-03 Thread Dongli Zhang
This patch introduces swiotlb_get_type() to determine which swiotlb
buffer a given DMA address belongs to.

This is to prepare to enable 64-bit swiotlb.

Cc: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 include/linux/swiotlb.h | 14 ++++++++++++++
 kernel/dma/swiotlb.c    |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 777046cd4d1b..3d5980d77810 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -3,6 +3,7 @@
 #define __LINUX_SWIOTLB_H
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -23,6 +24,8 @@ enum swiotlb_t {
SWIOTLB_MAX,
 };
 
+extern int swiotlb_nr;
+
 /*
  * Maximum allowable number of contiguous slabs to map,
  * must be a power of 2.  What is the appropriate value ?
@@ -84,6 +87,17 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
   paddr < io_tlb_end[SWIOTLB_LO];
 }
 
+static inline int swiotlb_get_type(phys_addr_t paddr)
+{
+   int i;
+
+   for (i = 0; i < swiotlb_nr; i++)
+   if (paddr >= io_tlb_start[i] && paddr < io_tlb_end[i])
+   return i;
+
+   return -ENOENT;
+}
+
 void __init swiotlb_exit(void);
 unsigned int swiotlb_max_segment(void);
 size_t swiotlb_max_mapping_size(struct device *dev);
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1fbb65daa2dd..c91d3d2c3936 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -109,6 +109,8 @@ static DEFINE_SPINLOCK(io_tlb_lock);
 
 static int late_alloc;
 
+int swiotlb_nr = 1;
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
-- 
2.17.1
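
A hedged sketch of how a caller might consume swiotlb_get_type(); note that
reworking is_swiotlb_buffer() this way is an assumption made here for
illustration, not a hunk of this patch:

    static inline bool is_swiotlb_buffer(phys_addr_t paddr)
    {
        /* a non-negative type means paddr falls inside some buffer */
        return swiotlb_get_type(paddr) >= 0;
    }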



Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-15 Thread Dongli Zhang



On 04/15/2019 07:26 PM, Konrad Rzeszutek Wilk wrote:
> 
> 
> On Mon, Apr 15, 2019, 2:50 AM Dongli Zhang wrote:
> 
> As the patch to be fixed is still in Konrad's own tree, I will send the 
> v2 for
> the patch to be fixed, instead of an incremental fix.
> 
> 
> I squashed it in.

Thank you very much!

I saw it is at:

https://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb.git/commit/?h=devel/for-linus-5.1&id=53b29c336830db48ad3dc737f88b8c065b1f0851

Dongli Zhang


Re: [PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-15 Thread Dongli Zhang
As the patch being fixed is still in Konrad's own tree, I will send a v2 of
that patch instead of an incremental fix.

Dongli Zhang

On 4/12/19 10:13 PM, Joe Jin wrote:
> I'm good to have this patch, which helps identify whether the cause of
> failure is fragmentation or the buffer really being used up.
> 
> On 4/12/19 4:38 AM, Dongli Zhang wrote:
>> When swiotlb is full, the kernel prints io_tlb_used. However, the value
>> might be inaccurate at that time, because it is read after leaving the
>> critical section protected by the spinlock.
>>
>> Therefore, back up io_tlb_used into a local variable before leaving the
>> critical section.
>>
>> Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer 
>> is full")
>> Suggested-by: Håkon Bugge 
>> Signed-off-by: Dongli Zhang 
> 
> Reviewed-by: Joe Jin  
> 
> Thanks,
> Joe
> 

[PATCH 1/1] swiotlb: save io_tlb_used to local variable before leaving critical section

2019-04-12 Thread Dongli Zhang
When swiotlb is full, the kernel prints io_tlb_used. However, the value
might be inaccurate at that time, because it is read after leaving the
critical section protected by the spinlock.

Therefore, back up io_tlb_used into a local variable before leaving the
critical section.

Fixes: 83ca25948940 ("swiotlb: dump used and total slots when swiotlb buffer is full")
Suggested-by: Håkon Bugge 
Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 857e823..6f7619c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -452,6 +452,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
unsigned long mask;
unsigned long offset_slots;
unsigned long max_slots;
+   unsigned long tmp_io_tlb_used;
 
if (no_iotlb_memory)
panic("Can not allocate SWIOTLB buffer earlier and can't now 
provide you with the DMA bounce buffer");
@@ -538,10 +539,12 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
} while (index != wrap);
 
 not_found:
+   tmp_io_tlb_used = io_tlb_used;
+
	spin_unlock_irqrestore(&io_tlb_lock, flags);
	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu, used %lu\n",
-			 size, io_tlb_nslabs, io_tlb_used);
+		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu (slots), used %lu (slots)\n",
+			 size, io_tlb_nslabs, tmp_io_tlb_used);
return DMA_MAPPING_ERROR;
 found:
io_tlb_used += nslots;
-- 
2.7.4
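
The pattern being applied, sketched generically (lock, shared_used and
snapshot are placeholder names, not swiotlb identifiers): a shared counter
that will be reported after the lock is dropped must be copied while the
lock is still held.

    spin_lock_irqsave(&lock, flags);
    /* ... allocation attempt fails ... */
    snapshot = shared_used;               /* copy under the lock */
    spin_unlock_irqrestore(&lock, flags);
    pr_warn("used %lu\n", snapshot);      /* cannot race any more */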


[PATCH 1/1] swiotlb: dump used and total slots when swiotlb buffer is full

2019-04-04 Thread Dongli Zhang
So far the kernel only prints the requested size when the swiotlb buffer is
full. It is not possible to tell whether the buffer is simply exhausted, or
whether swiotlb cannot allocate a buffer of the requested size due to
fragmentation.

As 'io_tlb_used' is available since commit 71602fe6d4e9 ("swiotlb: add
debugfs to track swiotlb buffer usage"), both 'io_tlb_used' and
'io_tlb_nslabs' are now printed when the swiotlb buffer is full.

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 53012db..3f43b37 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -540,7 +540,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 not_found:
	spin_unlock_irqrestore(&io_tlb_lock, flags);
	if (!(attrs & DMA_ATTR_NO_WARN) && printk_ratelimit())
-		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", size);
+		dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes), total %lu, used %lu\n",
+			 size, io_tlb_nslabs, io_tlb_used);
return DMA_MAPPING_ERROR;
 found:
io_tlb_used += nslots;
-- 
2.7.4
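
To interpret the new message: each slot covers 1 << IO_TLB_SHIFT bytes
(2 KiB with the usual IO_TLB_SHIFT of 11), and a mapping needs a
contiguous, aligned run of slots. A rough sketch of the slot math, under
that assumption (for reading the numbers, not part of the patch):

    /* slots needed for a mapping of 'size' bytes */
    unsigned long nslots = ALIGN(size, 1 << IO_TLB_SHIFT) >> IO_TLB_SHIFT;

    /*
     * e.g. a 256 KiB request needs 128 contiguous slots, so it can fail
     * due to fragmentation even while "used" is far below "total".
     */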



[PATCH v3 3/3] swiotlb: checking whether swiotlb buffer is full with io_tlb_used

2019-01-17 Thread Dongli Zhang
This patch uses io_tlb_used to help check whether the swiotlb buffer is
full. io_tlb_used is no longer used only for debugfs; it also helps
optimize swiotlb_tbl_map_single().

Suggested-by: Joe Jin 
Signed-off-by: Dongli Zhang 
---
Changed since v2:
  * move #ifdef folding to previous patch (suggested by Konrad Rzeszutek Wilk)

 kernel/dma/swiotlb.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index bedc9f9..a01b83e 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -483,6 +483,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 * request and allocate a buffer from that IO TLB pool.
 */
	spin_lock_irqsave(&io_tlb_lock, flags);
+
+   if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
+   goto not_found;
+
index = ALIGN(io_tlb_index, stride);
if (index >= io_tlb_nslabs)
index = 0;
-- 
2.7.4
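
Restating the new hunk with comments, plus a made-up worked example of the
case it short-circuits (the numbers are for illustration only):

    /*
     * Example: io_tlb_nslabs = 32768 and io_tlb_used = 32700 leave only
     * 68 free slots, so a request needing 128 slots (256 KiB) is rejected
     * here without walking the free list at all.  The check is a cheap
     * necessary condition, not a sufficient one: enough slots may be free
     * in total yet not contiguous.
     */
    if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
        goto not_found;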



[PATCH v3 1/3] swiotlb: fix comment on swiotlb_bounce()

2019-01-17 Thread Dongli Zhang
Fix the comment: swiotlb_bounce() copies from the original DMA location to
the swiotlb buffer during swiotlb_tbl_map_single(), and from the swiotlb
buffer back to the original DMA location during swiotlb_tbl_unmap_single().

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1fb6fd6..1d8b377 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -385,7 +385,7 @@ void __init swiotlb_exit(void)
 }
 
 /*
- * Bounce: copy the swiotlb buffer back to the original dma location
+ * Bounce: copy the swiotlb buffer from or back to the original dma location
  */
 static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
   size_t size, enum dma_data_direction dir)
-- 
2.7.4



[PATCH v3 2/3] swiotlb: add debugfs to track swiotlb buffer usage

2019-01-17 Thread Dongli Zhang
A device driver will not be able to do DMA operations once the swiotlb
buffer is full, either because the driver holds too many IO TLB blocks in
flight, or because of a memory leak in the device driver. Exporting the
swiotlb buffer usage via debugfs helps the user estimate the swiotlb
buffer size to pre-allocate, or analyze a device driver memory leak.

Signed-off-by: Dongli Zhang 
---
Changed since v1:
  * init debugfs with late_initcall (suggested by Robin Murphy)
  * create debugfs entries with debugfs_create_ulong (suggested by Robin Murphy)

Changed since v2:
  * some #ifdef folding (suggested by Konrad Rzeszutek Wilk)

 kernel/dma/swiotlb.c | 44 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 1d8b377..bedc9f9 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -34,6 +34,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include 
 #include 
@@ -73,6 +76,11 @@ phys_addr_t io_tlb_start, io_tlb_end;
 static unsigned long io_tlb_nslabs;
 
 /*
+ * The number of used IO TLB block
+ */
+static unsigned long io_tlb_used;
+
+/*
  * This is a free list describing the number of free entries available from
  * each index
  */
@@ -524,6 +532,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return DMA_MAPPING_ERROR;
 found:
+   io_tlb_used += nslots;
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -584,6 +593,8 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 */
	for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
+
+   io_tlb_used -= nslots;
}
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -662,3 +673,36 @@ swiotlb_dma_supported(struct device *hwdev, u64 mask)
 {
return __phys_to_dma(hwdev, io_tlb_end - 1) <= mask;
 }
+
+#ifdef CONFIG_DEBUG_FS
+
+static int __init swiotlb_create_debugfs(void)
+{
+   static struct dentry *d_swiotlb_usage;
+   struct dentry *ent;
+
+   d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
+
+   if (!d_swiotlb_usage)
+   return -ENOMEM;
+
+   ent = debugfs_create_ulong("io_tlb_nslabs", 0400,
+  d_swiotlb_usage, &io_tlb_nslabs);
+   if (!ent)
+   goto fail;
+
+   ent = debugfs_create_ulong("io_tlb_used", 0400,
+  d_swiotlb_usage, &io_tlb_used);
+   if (!ent)
+   goto fail;
+
+   return 0;
+
+fail:
+   debugfs_remove_recursive(d_swiotlb_usage);
+   return -ENOMEM;
+}
+
+late_initcall(swiotlb_create_debugfs);
+
+#endif
-- 
2.7.4
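
A userspace sketch of consuming the new entries, assuming debugfs is
mounted at the conventional /sys/kernel/debug (the two file names come from
the patch; everything else here is illustration):

    #include <stdio.h>

    static int read_ulong(const char *path, unsigned long *val)
    {
        FILE *f = fopen(path, "r");
        int ok;

        if (!f)
            return -1;
        ok = (fscanf(f, "%lu", val) == 1);
        fclose(f);
        return ok ? 0 : -1;
    }

    int main(void)
    {
        unsigned long used, nslabs;

        if (read_ulong("/sys/kernel/debug/swiotlb/io_tlb_used", &used) ||
            read_ulong("/sys/kernel/debug/swiotlb/io_tlb_nslabs", &nslabs))
            return 1;

        printf("swiotlb: %lu of %lu slots in use\n", used, nslabs);
        return 0;
    }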



Re: [PATCH v2 1/2] swiotlb: add debugfs to track swiotlb buffer usage

2019-01-03 Thread Dongli Zhang
Hi Konrad,

Would you please take a look on those two patches?

In addition, there is another patch correcting an error in comment.

https://lkml.org/lkml/2018/12/5/1721

Thank you very much!

Dongli Zhang

On 12/11/2018 05:05 AM, Joe Jin wrote:
> On 12/10/18 12:00 PM, Tim Chen wrote:
>>> @@ -528,6 +538,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
>>> dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
>>> size);
>>> return SWIOTLB_MAP_ERROR;
>>>  found:
>>> +#ifdef CONFIG_DEBUG_FS
>>> +   io_tlb_used += nslots;
>>> +#endif
>> One nit I have about this patch is there are too many CONFIG_DEBUG_FS.
>>
>> For example here, instead of io_tlb_used, we can have a macro defined,
>> perhaps something like inc_iotlb_used(nslots).  It can be placed in the
>> same section that swiotlb_create_debugfs is defined so there's a single
>> place where all the CONFIG_DEBUG_FS stuff is located.
>>
>> Then define inc_iotlb_used to be null when we don't have
>> CONFIG_DEBUG_FS.
>>
> 
> Dongli had removed above ifdef/endif on his next patch, "[PATCH v2 2/2]
> swiotlb: checking whether swiotlb buffer is full with io_tlb_used"
> 
> Thanks,
> Joe
> 


[PATCH v2 2/2] swiotlb: checking whether swiotlb buffer is full with io_tlb_used

2018-12-09 Thread Dongli Zhang
This patch uses io_tlb_used to help check whether the swiotlb buffer is
full. io_tlb_used is no longer used only for debugfs; it also helps
optimize swiotlb_tbl_map_single().

Suggested-by: Joe Jin 
Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 3979c2c..9300341 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -76,12 +76,10 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
-#ifdef CONFIG_DEBUG_FS
 /*
  * The number of used IO TLB block
  */
 static unsigned long io_tlb_used;
-#endif
 
 /*
  * This is a free list describing the number of free entries available from
@@ -489,6 +487,10 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
 * request and allocate a buffer from that IO TLB pool.
 */
	spin_lock_irqsave(&io_tlb_lock, flags);
+
+   if (unlikely(nslots > io_tlb_nslabs - io_tlb_used))
+   goto not_found;
+
index = ALIGN(io_tlb_index, stride);
if (index >= io_tlb_nslabs)
index = 0;
@@ -538,9 +540,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return SWIOTLB_MAP_ERROR;
 found:
-#ifdef CONFIG_DEBUG_FS
io_tlb_used += nslots;
-#endif
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -602,9 +602,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
	for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
 
-#ifdef CONFIG_DEBUG_FS
io_tlb_used -= nslots;
-#endif
}
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
-- 
2.7.4



[PATCH v2 1/2] swiotlb: add debugfs to track swiotlb buffer usage

2018-12-09 Thread Dongli Zhang
A device driver will not be able to do DMA operations once the swiotlb
buffer is full, either because the driver holds too many IO TLB blocks in
flight, or because of a memory leak in the device driver. Exporting the
swiotlb buffer usage via debugfs helps the user estimate the swiotlb
buffer size to pre-allocate, or analyze a device driver memory leak.

Signed-off-by: Dongli Zhang 
---
Changed since v1:
  * init debugfs with late_initcall (suggested by Robin Murphy)
  * create debugfs entries with debugfs_create_ulong (suggested by Robin Murphy)

 kernel/dma/swiotlb.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..3979c2c 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -35,6 +35,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include 
 #include 
@@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
+#ifdef CONFIG_DEBUG_FS
+/*
+ * The number of used IO TLB block
+ */
+static unsigned long io_tlb_used;
+#endif
+
 /*
  * This is a free list describing the number of free entries available from
  * each index
@@ -528,6 +538,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return SWIOTLB_MAP_ERROR;
 found:
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used += nslots;
+#endif
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -588,6 +601,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 */
	for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
+
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used -= nslots;
+#endif
}
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
@@ -883,3 +900,36 @@ const struct dma_map_ops swiotlb_dma_ops = {
.dma_supported  = dma_direct_supported,
 };
 EXPORT_SYMBOL(swiotlb_dma_ops);
+
+#ifdef CONFIG_DEBUG_FS
+
+static int __init swiotlb_create_debugfs(void)
+{
+   static struct dentry *d_swiotlb_usage;
+   struct dentry *ent;
+
+   d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
+
+   if (!d_swiotlb_usage)
+   return -ENOMEM;
+
+   ent = debugfs_create_ulong("io_tlb_nslabs", 0400,
+  d_swiotlb_usage, &io_tlb_nslabs);
+   if (!ent)
+   goto fail;
+
+   ent = debugfs_create_ulong("io_tlb_used", 0400,
+   d_swiotlb_usage, &io_tlb_used);
+   if (!ent)
+   goto fail;
+
+   return 0;
+
+fail:
+   debugfs_remove_recursive(d_swiotlb_usage);
+   return -ENOMEM;
+}
+
+late_initcall(swiotlb_create_debugfs);
+
+#endif
-- 
2.7.4



Re: [PATCH RFC 1/1] swiotlb: add debugfs to track swiotlb buffer usage

2018-12-06 Thread Dongli Zhang



On 12/07/2018 12:12 AM, Joe Jin wrote:
> Hi Dongli,
> 
> Maybe move d_swiotlb_usage declare into swiotlb_create_debugfs():

I assume calls to swiotlb_tbl_map_single() might be frequent in some
situations, e.g., when 'swiotlb=force'.

That's why I declare d_swiotlb_usage outside of any function and use "if
(unlikely(!d_swiotlb_usage))".

I think "if (unlikely(!d_swiotlb_usage))" incurs less performance overhead
than calling swiotlb_create_debugfs() every time just to check whether the
debugfs entry has been created. I would declare d_swiotlb_usage statically
inside swiotlb_create_debugfs() if the performance overhead is acceptable
(it is trivial indeed).


That is the reason I tagged the patch as RFC: I am not sure whether the
on-demand creation of the debugfs entry is acceptable to
maintainers/reviewers. If swiotlb pages are never allocated, we would not
be able to see the debugfs entry.

I would prefer to keep the modification within swiotlb and not touch any
other files.

The drawback is that there is no single place to create or delete the
debugfs entry, because the swiotlb buffer can be initialized and
uninitialized at a very early stage.

> 
> void swiotlb_create_debugfs(void)
> {
> #ifdef CONFIG_DEBUG_FS
>   static struct dentry *d_swiotlb_usage = NULL;
> 
>   if (d_swiotlb_usage)
>   return;
> 
>   d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
> 
>   if (!d_swiotlb_usage)
>   return;
> 
>   debugfs_create_file("usage", 0600, d_swiotlb_usage,
>   NULL, &swiotlb_usage_fops);
> #endif
> }
> 
> And for io_tlb_used, possibly add a check at the beginning of
> swiotlb_tbl_map_single(): if there are no free slots, or not enough
> slots, return failure directly?

This would optimize the slots allocation path. I will follow this in next
version after I got more suggestions and confirmations from maintainers.


Thank you very much!

Dongli Zhang

> 
> Thanks,
> Joe
> On 12/5/18 7:59 PM, Dongli Zhang wrote:
>> The device driver will not be able to do dma operations once swiotlb buffer
>> is full, either because the driver is using so many IO TLB blocks inflight,
>> or because there is memory leak issue in device driver. To export the
>> swiotlb buffer usage via debugfs would help the user estimate the size of
>> swiotlb buffer to pre-allocate or analyze device driver memory leak issue.
>>
>> As the swiotlb can be initialized at very early stage when debugfs cannot
>> register successfully, this patch creates the debugfs entry on demand.
>>
>> Signed-off-by: Dongli Zhang 
>> ---
>>  kernel/dma/swiotlb.c | 57 
>> 
>>  1 file changed, 57 insertions(+)
>>
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index 045930e..d3c8aa4 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -35,6 +35,9 @@
>>  #include 
>>  #include 
>>  #include 
>> +#ifdef CONFIG_DEBUG_FS
>> +#include 
>> +#endif
>>  
>>  #include 
>>  #include 
>> @@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end;
>>   */
>>  static unsigned long io_tlb_nslabs;
>>  
>> +#ifdef CONFIG_DEBUG_FS
>> +/*
>> + * The number of used IO TLB block
>> + */
>> +static unsigned long io_tlb_used;
>> +#endif
>> +
>>  /*
>>   * This is a free list describing the number of free entries available from
>>   * each index
>> @@ -100,6 +110,41 @@ static DEFINE_SPINLOCK(io_tlb_lock);
>>  
>>  static int late_alloc;
>>  
>> +#ifdef CONFIG_DEBUG_FS
>> +
>> +static struct dentry *d_swiotlb_usage;
>> +
>> +static int swiotlb_usage_show(struct seq_file *m, void *v)
>> +{
>> +seq_printf(m, "%lu\n%lu\n", io_tlb_used, io_tlb_nslabs);
>> +return 0;
>> +}
>> +
>> +static int swiotlb_usage_open(struct inode *inode, struct file *filp)
>> +{
>> +return single_open(filp, swiotlb_usage_show, NULL);
>> +}
>> +
>> +static const struct file_operations swiotlb_usage_fops = {
>> +.open   = swiotlb_usage_open,
>> +.read   = seq_read,
>> +.llseek = seq_lseek,
>> +.release= single_release,
>> +};
>> +
>> +void swiotlb_create_debugfs(void)
>> +{
>> +d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
>> +
>> +if (!d_swiotlb_usage)
>> +return;
>> +
>> +debugfs_create_file("usage", 0600, d_swiotlb_usage,
>> +NULL, &swiotlb_usage_fops);
>> +}
>> +
>> 

[PATCH 1/1] swiotlb: fix comment on swiotlb_bounce()

2018-12-05 Thread Dongli Zhang
Fix the comment: swiotlb_bounce() copies from the original DMA location to
the swiotlb buffer during swiotlb_tbl_map_single(), and from the swiotlb
buffer back to the original DMA location during swiotlb_tbl_unmap_single().

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..0a5ec94 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -389,7 +389,7 @@ static int is_swiotlb_buffer(phys_addr_t paddr)
 }
 
 /*
- * Bounce: copy the swiotlb buffer back to the original dma location
+ * Bounce: copy the swiotlb buffer from or back to the original dma location
  */
 static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
   size_t size, enum dma_data_direction dir)
-- 
2.7.4



[PATCH RFC 1/1] swiotlb: add debugfs to track swiotlb buffer usage

2018-12-05 Thread Dongli Zhang
A device driver will not be able to do DMA operations once the swiotlb
buffer is full, either because the driver holds too many IO TLB blocks in
flight, or because of a memory leak in the device driver. Exporting the
swiotlb buffer usage via debugfs helps the user estimate the swiotlb
buffer size to pre-allocate, or analyze a device driver memory leak.

As swiotlb can be initialized at a very early stage, when debugfs cannot
yet register successfully, this patch creates the debugfs entry on demand.

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 045930e..d3c8aa4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -35,6 +35,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_DEBUG_FS
+#include <linux/debugfs.h>
+#endif
 
 #include 
 #include 
@@ -73,6 +76,13 @@ static phys_addr_t io_tlb_start, io_tlb_end;
  */
 static unsigned long io_tlb_nslabs;
 
+#ifdef CONFIG_DEBUG_FS
+/*
+ * The number of used IO TLB block
+ */
+static unsigned long io_tlb_used;
+#endif
+
 /*
  * This is a free list describing the number of free entries available from
  * each index
@@ -100,6 +110,41 @@ static DEFINE_SPINLOCK(io_tlb_lock);
 
 static int late_alloc;
 
+#ifdef CONFIG_DEBUG_FS
+
+static struct dentry *d_swiotlb_usage;
+
+static int swiotlb_usage_show(struct seq_file *m, void *v)
+{
+   seq_printf(m, "%lu\n%lu\n", io_tlb_used, io_tlb_nslabs);
+   return 0;
+}
+
+static int swiotlb_usage_open(struct inode *inode, struct file *filp)
+{
+   return single_open(filp, swiotlb_usage_show, NULL);
+}
+
+static const struct file_operations swiotlb_usage_fops = {
+   .open   = swiotlb_usage_open,
+   .read   = seq_read,
+   .llseek = seq_lseek,
+   .release = single_release,
+};
+
+void swiotlb_create_debugfs(void)
+{
+   d_swiotlb_usage = debugfs_create_dir("swiotlb", NULL);
+
+   if (!d_swiotlb_usage)
+   return;
+
+   debugfs_create_file("usage", 0600, d_swiotlb_usage,
+   NULL, &swiotlb_usage_fops);
+}
+
+#endif
+
 static int __init
 setup_io_tlb_npages(char *str)
 {
@@ -449,6 +494,11 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
pr_warn_once("%s is active and system is using DMA bounce 
buffers\n",
 sme_active() ? "SME" : "SEV");
 
+#ifdef CONFIG_DEBUG_FS
+   if (unlikely(!d_swiotlb_usage))
+   swiotlb_create_debugfs();
+#endif
+
mask = dma_get_seg_boundary(hwdev);
 
tbl_dma_addr &= mask;
@@ -528,6 +578,9 @@ phys_addr_t swiotlb_tbl_map_single(struct device *hwdev,
dev_warn(hwdev, "swiotlb buffer is full (sz: %zd bytes)\n", 
size);
return SWIOTLB_MAP_ERROR;
 found:
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used += nslots;
+#endif
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 
/*
@@ -588,6 +641,10 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 */
	for (i = index - 1; (OFFSET(i, IO_TLB_SEGSIZE) != IO_TLB_SEGSIZE - 1) && io_tlb_list[i]; i--)
io_tlb_list[i] = ++count;
+
+#ifdef CONFIG_DEBUG_FS
+   io_tlb_used -= nslots;
+#endif
}
	spin_unlock_irqrestore(&io_tlb_lock, flags);
 }
-- 
2.7.4



[PATCH 1/1] swiotlb: fix comment on swiotlb_bounce()

2018-08-28 Thread Dongli Zhang
Fix the comment: swiotlb_bounce() copies from the original DMA location to
the swiotlb buffer during swiotlb_tbl_map_single(), and from the swiotlb
buffer back to the original DMA location during swiotlb_tbl_unmap_single().

Signed-off-by: Dongli Zhang 
---
 kernel/dma/swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 4f8a6db..8fd4cfe 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -435,7 +435,7 @@ int is_swiotlb_buffer(phys_addr_t paddr)
 }
 
 /*
- * Bounce: copy the swiotlb buffer back to the original dma location
+ * Bounce: copy the swiotlb buffer from or back to the original dma location
  */
 static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
   size_t size, enum dma_data_direction dir)
-- 
2.7.4
