Re: [PATCH v2] powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs

2023-05-02 Thread Gaurav Batra

Hello Alexey,

I recently joined the IOMMU team. The test team reported a bug where the
Mellanox driver was timing out during configuration. I proposed a fix
for it, which is included below in this email.


You suggested a fix for the problem Srikar reported. Either fix resolves
both Srikar's issue and the Mellanox driver issue. The problem is with
the 2MB DDW.


Since you have extensive knowledge of IOMMU design and code, in your 
opinion, which patch should we adopt?


Thanks a lot

Gaurav

On 4/20/23 2:45 PM, Gaurav Batra wrote:

Hello Michael,

I was looking into the Bug: 199106 
(https://bugzilla.linux.ibm.com/show_bug.cgi?id=199106).


In the bug, the Mellanox driver was timing out when enabling an SR-IOV device.

I tested Alexey's patch and it fixes the issue with the Mellanox driver.
The downside to Alexey's fix is that even a small memory request by the
driver will be aligned up to 2MB. In my test, the Mellanox driver issues
multiple requests of 64K each. All of these get aligned up to 2MB, which
is quite a waste of resources.

In any case, both patches work. Let me know which approach you prefer.
If we decide to go with my patch, I just realized that I need to fix
nio_pages in iommu_free_coherent() as well.
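
To put a number on the overhead mentioned above, here is a small sketch of
the arithmetic. It assumes the rounding is applied to each allocation as
described; the helper names and constants are illustrative only and are not
taken from either patch.

```c
#include <assert.h>

/* Illustrative constants: a 64K request and a 2MB IOMMU page (TCE). */
#define REQ_SIZE	(64UL * 1024)
#define TCE_SHIFT	21UL		/* 2MB == 1 << 21 */

/* Round size up to the next multiple of (1 << shift). Hypothetical helper. */
unsigned long align_up(unsigned long size, unsigned long shift)
{
	unsigned long page;

	assert(shift < 8 * sizeof(unsigned long));
	page = 1UL << shift;
	return (size + page - 1) & ~(page - 1);
}

/* Bytes reserved beyond what the caller asked for. */
unsigned long rounding_overhead(unsigned long size, unsigned long shift)
{
	return align_up(size, shift) - size;
}
```

For a 64K request this reserves 2MB, of which 1984K is padding.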


Thanks,

Gaurav

On 4/20/23 10:21 AM, Michael Ellerman wrote:

Gaurav Batra  writes:

When DMA window is backed by 2MB TCEs, the DMA address for the mapped
page should be the offset of the page relative to the 2MB TCE. The code
was incorrectly setting the DMA address to the beginning of the TCE
range.

Mellanox driver is reporting timeout trying to ENABLE_HCA for an SR-IOV
ethernet port, when DMA window is backed by 2MB TCEs.

I assume this is similar or related to the bug Srikar reported?

https://lore.kernel.org/linuxppc-dev/20230323095333.gi1005...@linux.vnet.ibm.com/

In that thread Alexey suggested a patch, have you tried his patch? He
suggested rounding up the allocation size, rather than adjusting the
dma_handle.


Fixes: 3872731187141d5d0a5c4fb30007b8b9ec36a44d

That's not the right syntax, it's described in the documentation how to
generate it.

It should be:

   Fixes: 387273118714 ("powerps/pseries/dma: Add support for 2M IOMMU page size")


cheers


diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ee95937bdaf1..ca57526ce47a 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -517,7 +517,7 @@ int ppc_iommu_map_sg(struct device *dev, struct iommu_table *tbl,
 		/* Convert entry to a dma_addr_t */
 		entry += tbl->it_offset;
 		dma_addr = entry << tbl->it_page_shift;
-		dma_addr |= (s->offset & ~IOMMU_PAGE_MASK(tbl));
+		dma_addr |= (vaddr & ~IOMMU_PAGE_MASK(tbl));

 		DBG("  - %lu pages, entry: %lx, dma_addr: %lx\n",
 			    npages, entry, dma_addr);
@@ -904,6 +904,7 @@ void *iommu_alloc_coherent(struct device *dev, struct iommu_table *tbl,
 	unsigned int order;
 	unsigned int nio_pages, io_order;
 	struct page *page;
+	int tcesize = (1 << tbl->it_page_shift);

 	size = PAGE_ALIGN(size);
 	order = get_order(size);
@@ -930,7 +931,8 @@ void *iommu_alloc_coherent(struct device *dev, struct iommu_table *tbl,
 	memset(ret, 0, size);

 	/* Set up tces to cover the allocated range */
-	nio_pages = size >> tbl->it_page_shift;
+	nio_pages = IOMMU_PAGE_ALIGN(size, tbl) >> tbl->it_page_shift;
+
 	io_order = get_iommu_order(size, tbl);
 	mapping = iommu_alloc(dev, tbl, ret, nio_pages, DMA_BIDIRECTIONAL,
 			      mask >> tbl->it_page_shift, io_order, 0);
@@ -938,7 +940,8 @@ void *iommu_alloc_coherent(struct device *dev, struct iommu_table *tbl,
 		free_pages((unsigned long)ret, order);
 		return NULL;
 	}
-	*dma_handle = mapping;
+
+	*dma_handle = mapping | ((u64)ret & (tcesize - 1));
 	return ret;
 }
  --
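
The two changes to iommu_alloc_coherent() in the diff above can be checked
with plain arithmetic. The following is a user-space sketch, not kernel code:
the helper names and sample addresses are made up for illustration, and a 2MB
TCE (page shift 21) is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define TCE_SHIFT 21u	/* assumed: 2MB IOMMU page */

/* Number of TCEs, truncating: the calculation the patch removes. */
uint64_t nio_pages_truncated(uint64_t size, unsigned int shift)
{
	assert(shift < 64);
	return size >> shift;
}

/* Number of TCEs, rounding size up to a whole TCE first. */
uint64_t nio_pages_rounded(uint64_t size, unsigned int shift)
{
	uint64_t tcesize;

	assert(shift < 64);
	tcesize = 1ULL << shift;
	return ((size + tcesize - 1) & ~(tcesize - 1)) >> shift;
}

/*
 * DMA handle for a buffer at CPU address vaddr whose TCE maps to the
 * TCE-aligned DMA address tce_base: keep the buffer's offset within the TCE.
 */
uint64_t dma_handle_with_offset(uint64_t tce_base, uint64_t vaddr,
				unsigned int shift)
{
	uint64_t tcesize;

	assert(shift < 64);
	tcesize = 1ULL << shift;
	return tce_base | (vaddr & (tcesize - 1));
}
```

With a 64K allocation, the truncating calculation yields zero TCEs while the
rounded one yields one. A buffer that sits 0x150000 bytes into its 2MB TCE
gets a DMA handle carrying the same offset instead of pointing at the start
of the TCE.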


[PATCH RFC] rcu: torture: shorten the time between forward-progress tests

2023-05-02 Thread zhouzhouyi
From: Zhouyi Zhou 

Currently, the default time between rcutorture forward-progress tests is
60 HZ (60 seconds). Under this configuration, the false positive caused by
__stack_chk_fail [1] is difficult to reproduce (it needs 5*420 seconds on
average for SRCU-P), which means one has to invoke [2] 5 times on average
to make [1] appear.

With the time between rcutorture forward-progress tests set to 1 HZ, the
above phenomenon is reproduced within 3 minutes, which means we can
reproduce [1] every time we invoke [2].

Although [1] is a false positive, this change will make future true bugs
easier to discover.
   
[1] Link: 
https://lore.kernel.org/lkml/CAABZP2yS5=zuwezq7ihkv0wdm_hgo8k-teahyjrzhavzkda...@mail.gmail.com/T/
[2] tools/testing/selftests/rcutorture/bin/torture.sh

Tested in a PPC VM at the Open Source Lab of Oregon State University.

Signed-off-by: Zhouyi Zhou 
---
 tools/testing/selftests/rcutorture/configs/rcu/SRCU-N.boot  | 1 +
 tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot  | 1 +
 tools/testing/selftests/rcutorture/configs/rcu/TRACE02.boot | 1 +
 tools/testing/selftests/rcutorture/configs/rcu/TREE02.boot  | 1 +
 tools/testing/selftests/rcutorture/configs/rcu/TREE10.boot  | 1 +
 5 files changed, 5 insertions(+)

diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-N.boot b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-N.boot
index ce0694fd9b92..982582bff041 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-N.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-N.boot
@@ -1,2 +1,3 @@
 rcutorture.torture_type=srcu
 rcutorture.fwd_progress=3
+rcutorture.fwd_progress_holdoff=1
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot
index 2db39f298d18..18f5d7361d8a 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-P.boot
@@ -1,4 +1,5 @@
 rcutorture.torture_type=srcud
 rcupdate.rcu_self_test=1
 rcutorture.fwd_progress=3
+rcutorture.fwd_progress_holdoff=1
 srcutree.big_cpu_lim=5
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TRACE02.boot b/tools/testing/selftests/rcutorture/configs/rcu/TRACE02.boot
index c70b5db6c2ae..b86bc7df7603 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TRACE02.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TRACE02.boot
@@ -1,2 +1,3 @@
 rcutorture.torture_type=tasks-tracing
 rcutorture.fwd_progress=2
+rcutorture.fwd_progress_holdoff=1
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE02.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE02.boot
index dd914fa8f690..933302f885df 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE02.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE02.boot
@@ -1 +1,2 @@
 rcutorture.fwd_progress=2
+rcutorture.fwd_progress_holdoff=1
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE10.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE10.boot
index dd914fa8f690..933302f885df 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE10.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE10.boot
@@ -1 +1,2 @@
 rcutorture.fwd_progress=2
+rcutorture.fwd_progress_holdoff=1
-- 
2.34.1
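
As a rough illustration of why a shorter holdoff makes the problem show up
sooner, the sketch below computes an upper bound on how many forward-progress
tests fit into one run. It assumes the holdoff is expressed in seconds, that
420 seconds (the figure quoted above) is the length of one SRCU-P run, and it
ignores the time each test itself takes. The function name is hypothetical.

```c
#include <assert.h>

/* Upper bound on forward-progress tests in one run; ignores test duration. */
unsigned int max_fwd_progress_tests(unsigned int run_seconds,
				    unsigned int holdoff_seconds)
{
	assert(holdoff_seconds > 0);
	return run_seconds / holdoff_seconds;
}
```

Under these assumptions a 60-second holdoff allows at most 7 tests per run,
while a 1-second holdoff allows up to 420.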



[Bug 217390] use after free in spufs_switch_log_poll

2023-05-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217390

James Kim (james010...@gmail.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|high                        |normal

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 217390] New: use after free in spufs_switch_log_poll

2023-05-02 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217390

Bug ID: 217390
   Summary: use after free in spufs_switch_log_poll
   Product: Platform Specific/Hardware
   Version: 2.5
  Hardware: PPC-32
OS: Linux
Status: NEW
  Severity: high
  Priority: P3
 Component: PPC-32
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: james010...@gmail.com
Regression: No

When T2 (poll) and T3 (release) run concurrently through file_operations
calls, a use-after-free can occur because ctx->switch_log is accessed
without holding the proper lock.

(ALLOC)
https://elixir.bootlin.com/linux/latest/source/arch/powerpc/platforms/cell/spufs/file.c#L2298
T1. open
2298 static int spufs_switch_log_open(struct inode *inode, struct file *file) {
2300     struct spu_context *ctx = SPUFS_I(inode)->i_ctx;
2301     int rc;
...
2312     ctx->switch_log = kmalloc(struct_size(ctx->switch_log, log,
2313             SWITCH_LOG_BUFSIZE), GFP_KERNEL); // ALLOC-site
...
2327 }

spufs_switch_log_open() allocates ctx->switch_log, which is shared through
the spu_context.

T2. poll
2431 static __poll_t spufs_switch_log_poll(struct file *file, poll_table *wait)
2432 {
2433     struct inode *inode = file_inode(file);
2434     struct spu_context *ctx = SPUFS_I(inode)->i_ctx;
2435     __poll_t mask = 0;
2436     int rc;

2438     poll_wait(file, &ctx->switch_log->wait, wait); // delayed by the 'wait' callback
2439     // ctx->switch_log can be freed by T3 here
2440     rc = spu_acquire(ctx);
2441     if (rc)
2442         return rc;

2444     if (spufs_switch_log_used(ctx) > 0) // USE-site
2445         mask |= EPOLLIN;

2447     spu_release(ctx);

2449     return mask;
2450 }

static inline void poll_wait(struct file * filp, wait_queue_head_t * wait_address, poll_table *p)
{
    if (p && p->_qproc && wait_address)
        p->_qproc(filp, wait_address, p);  // the callback can introduce a delay
}

T3. release
https://elixir.bootlin.com/linux/latest/source/arch/powerpc/platforms/cell/spufs/file.c#L2329
2329 static int spufs_switch_log_release(struct inode *inode, struct file *file)
2330 {
2331     struct spu_context *ctx = SPUFS_I(inode)->i_ctx;
2332     int rc;

2334     rc = spu_acquire(ctx);
2335     if (rc)
2336         return rc;

2338     kfree(ctx->switch_log);   // FREE-site
2339     ctx->switch_log = NULL;
2340     spu_release(ctx);

2342     return 0;
2343 }


Fix could maybe be something like:

--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -2435,9 +2435,9 @@ static __poll_t spufs_switch_log_poll(struct file *fil
 	__poll_t mask = 0;
 	int rc;

+	rc = spu_acquire(ctx);
 	poll_wait(file, &ctx->switch_log->wait, wait);

-	rc = spu_acquire(ctx);
 	if (rc)
 		return rc;

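
The idea behind the suggested reordering, touching ctx->switch_log only while
holding the context lock and only if it is still allocated, can be sketched in
user space as follows. This is an analogy using a pthread mutex, not the spufs
code: every name is hypothetical, and it does not model poll_wait() or the
wait queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures in the report. */
struct switch_log {
	unsigned int head, tail;
};

struct ctx {
	pthread_mutex_t lock;		/* plays the role of spu_acquire()/spu_release() */
	struct switch_log *log;		/* freed and set to NULL by log_release() */
};

/* Allocate the log and publish it under the lock. Returns 0 on success. */
int log_open(struct ctx *c)
{
	struct switch_log *log;

	assert(c != NULL);
	log = calloc(1, sizeof(*log));
	if (!log)
		return -1;
	pthread_mutex_lock(&c->lock);
	c->log = log;
	pthread_mutex_unlock(&c->lock);
	return 0;
}

/* Free the log under the lock and clear the pointer. */
void log_release(struct ctx *c)
{
	assert(c != NULL);
	pthread_mutex_lock(&c->lock);
	free(c->log);
	c->log = NULL;
	pthread_mutex_unlock(&c->lock);
}

/* Returns 1 if entries are pending, 0 otherwise (including after release). */
int log_poll(struct ctx *c)
{
	int ready = 0;

	assert(c != NULL);
	pthread_mutex_lock(&c->lock);
	/* Dereference only under the lock, and only if still allocated. */
	if (c->log && c->log->head != c->log->tail)
		ready = 1;
	pthread_mutex_unlock(&c->lock);
	return ready;
}
```

Because the pointer is read and dereferenced inside the same critical section
that the release path uses to free it, a concurrent release either completes
before the check (and the poll sees NULL) or waits until the poll is done.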


Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-02 Thread Michael Ellerman
Christian Zigotzky  writes:
> Hello,
>
> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
>
> The kernel hangs right after the booting Linux via __start() @ 
> 0x ...
>
> I was able to revert the PowerPC updates 6.4-1 [2] with the following 
> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
>
> After a re-compiling, the kernel boots without any problems without the 
> PowerPC updates 6.4-1 [2].
>
> Could you please explain me, what you have done in the boot area?

There's a few possibilities, but nothing obvious.

To begin with can you please test the following commits?

77e69ee7ce07
e4ab08be5b49
eeac8ede1755

cheers


Re: [PASEMI NEMO] Boot issue with the PowerPC updates 6.4-1

2023-05-02 Thread Christophe Leroy
Hello,

On 02/05/2023 at 04:22, Christian Zigotzky wrote:
> Hello,
> 
> Our PASEMI Nemo board [1] doesn't boot with the PowerPC updates 6.4-1 [2].
> 
> The kernel hangs right after the booting Linux via __start() @ 
> 0x ...
> 
> I was able to revert the PowerPC updates 6.4-1 [2] with the following 
> command: git revert 70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7 -m 1
> 
> After a re-compiling, the kernel boots without any problems without the 
> PowerPC updates 6.4-1 [2].

You are reverting the entire set of powerpc changes, which doesn't help
narrow down the cause.

Can you do a bisect?

Thanks
Christophe

> 
> Could you please explain me, what you have done in the boot area?
> 
> Please find attached the kernel config.
> 
> Thanks,
> Christian
> 
> 
> [1] https://en.wikipedia.org/wiki/AmigaOne_X1000
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=70cc1b5307e8ee3076fdf2ecbeb89eb973aa0ff7