date:20230522

Re: [PATCH v3 02/12] powerpc/ptrace: Add missing include

2023-05-22 Thread Andrew Donnellan

On Fri, 2023-05-19 at 15:02 +1000, Benjamin Gray wrote:
> ptrace-decl.h uses user_regset_get2_fn (among other things) from
> regset.h. While all current users of ptrace-decl.h include regset.h
> before it anyway, it adds an implicit ordering dependency and breaks
> source tooling that tries to inspect ptrace-decl.h by itself.
> 
> Signed-off-by: Benjamin Gray 
> Reviewed-by: Russell Currey 

Reviewed-by: Andrew Donnellan 

-- 
Andrew DonnellanOzLabs, ADL Canberra
a...@linux.ibm.com   IBM Australia Limited

Re: [PATCH v3 01/12] powerpc/book3s: Add missing include

2023-05-22 Thread Andrew Donnellan

On Fri, 2023-05-19 at 15:02 +1000, Benjamin Gray wrote:
> The functions here use struct thread_struct fields, so need to import
> the full definition from . The  header
> that defines current only forward declares struct thread_struct.

AFAICT, struct thread_struct is defined in
arch/powerpc/include/asm/processor.h? I think you mean struct
task_struct (which is what's forward declared in asm/current.h, not
thread_struct).

> 
> Failing to include this  header leads to a compilation
> error when a translation unit does not also include 
> indirectly.
> 
> Signed-off-by: Benjamin Gray 
> Reviewed-by: Nicholas Piggin 
> Reviewed-by: Russell Currey 
> 
> ---
> 
> v3: * Add ruscur reviewed-by
> v1: * Add npiggin reviewed-by
> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h
> b/arch/powerpc/include/asm/book3s/64/kup.h
> index 54cf46808157..84c09e546115 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
> @@ -194,6 +194,7 @@
>  #else /* !__ASSEMBLY__ */
>  
>  #include 
> +#include 
>  
>  DECLARE_STATIC_KEY_FALSE(uaccess_flush_key);
>  

-- 
Andrew DonnellanOzLabs, ADL Canberra
a...@linux.ibm.com   IBM Australia Limited

Re: [PATCH v2 0/3] Remove iommu_group_remove_device() from fsl

2023-05-22 Thread Joerg Roedel

On Tue, May 16, 2023 at 09:35:25PM -0300, Jason Gunthorpe wrote:
> With POWER SPAPR now having a real iommu driver and using the normal group
> lifecycle stuff fixing FSL will leave only VFIO's no-iommu support as a
> user for the iommu_group_add/remove_device() calls. This will help
> simplify the understanding of what the core code should be doing for these
> functions.
> 
> Fix FSL to not need to call iommu_group_remove_device() at all.
> 
> v2:
>  - Change the approach to use driver_managed_dma
>  - Really simplify fsl_pamu_device_group() and just put everything in one
>function
>  - New patch to make missing OF properties a probe failure
> v1: 
> https://lore.kernel.org/r/0-v1-1421774b874b+167-ppc_device_group_...@nvidia.com
> 
> Jason Gunthorpe (3):
>   iommu/fsl: Always allocate a group for non-pci devices
>   iommu/fsl: Move ENODEV to fsl_pamu_probe_device()
>   iommu/fsl: Use driver_managed_dma to allow VFIO to work
> 
>  arch/powerpc/sysdev/fsl_pci.c   |   1 +
>  drivers/iommu/fsl_pamu_domain.c | 123 +---
>  2 files changed, 36 insertions(+), 88 deletions(-)

Any chance someone can test this on real hardware?

Regards,

Joerg

Re: [PATCH v2 03/11] asm-generic/mmiowb: Mark accesses to fix KCSAN warnings

2023-05-22 Thread Rohan McLure

On Wed May 10, 2023 at 1:31 PM AEST, Rohan McLure wrote:
> Prior to this patch, data races are detectable by KCSAN of the following
> forms:
> 
> [1] Asynchronous calls to mmiowb_set_pending() from an interrupt context
>or otherwise outside of a critical section
> [2] Interrupted critical sections, where the interrupt will itself
>acquire a lock
> 
> In case [1], calling context does not need an mmiowb() call to be
> issued, otherwise it would do so itself. Such calls to
> mmiowb_set_pending() are either idempotent or no-ops.
> 
> In case [2], irrespective of when the interrupt occurs, the interrupt
> will acquire and release its locks prior to its return, nesting_count
> will continue balanced. In the worst case, the interrupted critical
> section during a mmiowb_spin_unlock() call observes an mmiowb to be
> pending and afterward is interrupted, leading to an extraneous call to
> mmiowb(). This data race is clearly innocuous.
> 
> Mark all potentially asynchronous memory accesses with READ_ONCE or
> WRITE_ONCE, including increments and decrements to nesting_count. This
> has the effect of removing KCSAN warnings at consumer's callsites.
> 
> Signed-off-by: Rohan McLure 
> Reported-by: Michael Ellerman 
> Reported-by: Gautam Menghani 
> Tested-by: Gautam Menghani 
> Acked-by: Arnd Bergmann 
> ---
> v2: Remove extraneous READ_ONCE in mmiowb_set_pending for nesting_count
> ---
> include/asm-generic/mmiowb.h | 14 +-
> 1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> index 5698fca3bf56..6dea28c8835b 100644
> --- a/include/asm-generic/mmiowb.h
> +++ b/include/asm-generic/mmiowb.h
> @@ -37,25 +37,29 @@ static inline void mmiowb_set_pending(void)
>   struct mmiowb_state *ms = __mmiowb_state();
> 
>   if (likely(ms->nesting_count))
> - ms->mmiowb_pending = ms->nesting_count;
> + WRITE_ONCE(ms->mmiowb_pending, ms->nesting_count);
> }
> 
> static inline void mmiowb_spin_lock(void)
> {
>   struct mmiowb_state *ms = __mmiowb_state();
> - ms->nesting_count++;
> +
> + /* Increment need not be atomic. Nestedness is balanced over 
> interrupts. */
> + WRITE_ONCE(ms->nesting_count, READ_ONCE(ms->nesting_count) + 1);
> }
> 
> static inline void mmiowb_spin_unlock(void)
> {
>   struct mmiowb_state *ms = __mmiowb_state();
> + u16 pending = READ_ONCE(ms->mmiowb_pending);
> 
> - if (unlikely(ms->mmiowb_pending)) {
> - ms->mmiowb_pending = 0;
> + WRITE_ONCE(ms->mmiowb_pending, 0);
> + if (unlikely(pending)) {
>   mmiowb();
>   }
> 
> - ms->nesting_count--;
> + /* Decrement need not be atomic. Nestedness is balanced over 
> interrupts. */
> + WRITE_ONCE(ms->nesting_count, READ_ONCE(ms->nesting_count) - 1);

Still think the nesting_counts don't need WRITE_ONCE/READ_ONCE.
data_race() maybe but I don't know if it's even classed as a data
race. How does KCSAN handle/annotate preempt_count, for example?

Thanks,
Nick

[PATCH] soc: fsl: qe: Replace all non-returning strlcpy with strscpy

2023-05-22 Thread Azeem Shaikh

strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().
No return values were used, so direct replacement is safe.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Signed-off-by: Azeem Shaikh 
---
 drivers/soc/fsl/qe/qe.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/qe/qe.c b/drivers/soc/fsl/qe/qe.c
index b3c226eb5292..58746e570d14 100644
--- a/drivers/soc/fsl/qe/qe.c
+++ b/drivers/soc/fsl/qe/qe.c
@@ -524,7 +524,7 @@ int qe_upload_firmware(const struct qe_firmware *firmware)
 * saved microcode information and put in the new.
 */
memset(&qe_firmware_info, 0, sizeof(qe_firmware_info));
-   strlcpy(qe_firmware_info.id, firmware->id, sizeof(qe_firmware_info.id));
+   strscpy(qe_firmware_info.id, firmware->id, sizeof(qe_firmware_info.id));
qe_firmware_info.extended_modes = be64_to_cpu(firmware->extended_modes);
memcpy(qe_firmware_info.vtraps, firmware->vtraps,
sizeof(firmware->vtraps));
@@ -599,7 +599,7 @@ struct qe_firmware_info *qe_get_firmware_info(void)
/* Copy the data into qe_firmware_info*/
sprop = of_get_property(fw, "id", NULL);
if (sprop)
-   strlcpy(qe_firmware_info.id, sprop,
+   strscpy(qe_firmware_info.id, sprop,
sizeof(qe_firmware_info.id));
 
of_property_read_u64(fw, "extended-modes",

Re: [PATCH v2 03/11] asm-generic/mmiowb: Mark accesses to fix KCSAN warnings

2023-05-22 Thread Rohan McLure

On 23 May 2023, at 10:28 am, Rohan McLure  wrote:
> 
> On Wed May 10, 2023 at 1:31 PM AEST, Rohan McLure wrote:
>> Prior to this patch, data races are detectable by KCSAN of the following
>> forms:
>> 
>> [1] Asynchronous calls to mmiowb_set_pending() from an interrupt context
>>or otherwise outside of a critical section
>> [2] Interrupted critical sections, where the interrupt will itself
>>acquire a lock
>> 
>> In case [1], calling context does not need an mmiowb() call to be
>> issued, otherwise it would do so itself. Such calls to
>> mmiowb_set_pending() are either idempotent or no-ops.
>> 
>> In case [2], irrespective of when the interrupt occurs, the interrupt
>> will acquire and release its locks prior to its return, nesting_count
>> will continue balanced. In the worst case, the interrupted critical
>> section during a mmiowb_spin_unlock() call observes an mmiowb to be
>> pending and afterward is interrupted, leading to an extraneous call to
>> mmiowb(). This data race is clearly innocuous.
>> 
>> Mark all potentially asynchronous memory accesses with READ_ONCE or
>> WRITE_ONCE, including increments and decrements to nesting_count. This
>> has the effect of removing KCSAN warnings at consumer's callsites.
>> 
>> Signed-off-by: Rohan McLure 
>> Reported-by: Michael Ellerman 
>> Reported-by: Gautam Menghani 
>> Tested-by: Gautam Menghani 
>> Acked-by: Arnd Bergmann 
>> ---
>> v2: Remove extraneous READ_ONCE in mmiowb_set_pending for nesting_count
>> ---
>> include/asm-generic/mmiowb.h | 14 +-
>> 1 file changed, 9 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
>> index 5698fca3bf56..6dea28c8835b 100644
>> --- a/include/asm-generic/mmiowb.h
>> +++ b/include/asm-generic/mmiowb.h
>> @@ -37,25 +37,29 @@ static inline void mmiowb_set_pending(void)
>> struct mmiowb_state *ms = __mmiowb_state();
>> 
>> if (likely(ms->nesting_count))
>> - ms->mmiowb_pending = ms->nesting_count;
>> + WRITE_ONCE(ms->mmiowb_pending, ms->nesting_count);
>> }
>> 
>> static inline void mmiowb_spin_lock(void)
>> {
>> struct mmiowb_state *ms = __mmiowb_state();
>> - ms->nesting_count++;
>> +
>> + /* Increment need not be atomic. Nestedness is balanced over interrupts. */
>> + WRITE_ONCE(ms->nesting_count, READ_ONCE(ms->nesting_count) + 1);
>> }
>> 
>> static inline void mmiowb_spin_unlock(void)
>> {
>> struct mmiowb_state *ms = __mmiowb_state();
>> + u16 pending = READ_ONCE(ms->mmiowb_pending);
>> 
>> - if (unlikely(ms->mmiowb_pending)) {
>> - ms->mmiowb_pending = 0;
>> + WRITE_ONCE(ms->mmiowb_pending, 0);
>> + if (unlikely(pending)) {
>> mmiowb();
>> }
>> 
>> - ms->nesting_count--;
>> + /* Decrement need not be atomic. Nestedness is balanced over interrupts. */
>> + WRITE_ONCE(ms->nesting_count, READ_ONCE(ms->nesting_count) - 1);
> 
> Still think the nesting_counts don't need WRITE_ONCE/READ_ONCE.
> data_race() maybe but I don't know if it's even classed as a data
> race. How does KCSAN handle/annotate preempt_count, for example?

Wow sorry my mail client has some unhelpful keybindings - I don’t know why it
thought I’d want to resend your last item!

Yeah I agree, we don’t need the compiler guarantees of WRITE_ONCE/READ_ONCE, and
yet it’s also not a real data-race. I think I’ll apply data_race() and comment 
as
I’m still seeing KCSAN warnings here.

Just from inspection, it appears as if __preempt_count_{add,sub} are unmarked 
and
so likely to generate KCSAN warnings also, but also asm-generic/preempt.h I 
think
hasn’t been updated to address any such warnings.

> 
> Thanks,
> Nick

Re: [PATCH] scsi: ibmvscsi: Replace all non-returning strlcpy with strscpy

2023-05-22 Thread Kees Cook

On Wed, 17 May 2023 14:34:09 +, Azeem Shaikh wrote:
> strlcpy() reads the entire source buffer first.
> This read may exceed the destination size limit.
> This is both inefficient and can lead to linear read
> overflows if a source string is not NUL-terminated [1].
> In an effort to remove strlcpy() completely [2], replace
> strlcpy() here with strscpy().
> No return values were used, so direct replacement is safe.
> 
> [...]

Applied to for-next/hardening, thanks!

[1/1] scsi: ibmvscsi: Replace all non-returning strlcpy with strscpy
  https://git.kernel.org/kees/c/015f6618194e

-- 
Kees Cook

Re: [PATCH] powerpc/iommu: limit number of TCEs to 512 for H_STUFF_TCE hcall

2023-05-22 Thread Gaurav Batra



On 5/17/23 7:19 AM, Michael Ellerman wrote:

Gaurav Batra  writes:

Hello Michael,

System test hit the crash. I believe, it was PHYP that resulted in it
due to number of TCEs passed in to be >512.

OK. It's always good to spell out in the change log whether it's a
theoretical/unlikely bug, or one that's actually been hit in testing or
the field.
I will submit another version of the patch with some changes in the log 
once I figure out how to Tag it for stable (as mentioned below).

I was wondering about the Fixes tag as well. But, this interface, in
it's current form, is there from the day the file was created. So, in
this case, should I mention the first commit which created this source file?

If it really goes back to the origin commit, then it's probably better
to just say so and tag it for stable, rather than pointing to 1da177e4.
How to do I tag it for stable? Will it be part of the "Fixes:" tag or 
some other tag?


I wonder though is there something else that changed that means this bug
is now being hit but wasn't before? Or maybe it's just that we are
testing on systems with large enough amounts of memory to hit this but
which aren't using a direct mapping?


From the details in Bugzilla, it does seems like the HCALL was 
previously taking long as well but PHYP was more relaxed about it. Now, 
PHYP is limiting on how long can an HCALL take.


Below are some excerpts from the Bug: 202349

Linux is passing too many counts in H_STUFF_TCE. The higher the counts, 
the longer the HCALL takes. From a Hypervisor perspective, we cannot 
stop Linux from doing this or it will violate the rules in the PAPR 
(which then would cause Linux to crash). The dispatcher team has 
"tightened the screws" on long running HCALLs by causing this trap to 
fire. From our discussions, they will not put the limits back where they 
were before.



Thanks

Gaurav



cheers

Re: [PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

2023-05-22 Thread Vladimir Oltean

On Mon, May 22, 2023 at 10:42:04AM -0400, Sean Anderson wrote:
> Have you had a chance to review this driver?

Partially / too little (and no, I don't have an answer yet). I am
debugging a SERDES protocol change procedure from XFI to SGMII.

Re: [PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

2023-05-22 Thread Sean Anderson

Hi Vladmir,

On 5/1/23 11:03, Sean Anderson wrote:
> On 4/29/23 13:24, Vladimir Oltean wrote:
>> On Wed, Apr 26, 2023 at 10:50:17AM -0400, Sean Anderson wrote:
>>> > I need to catch up with 14 rounds of patches from you and with the
>>> > discussions that took place on each version, and understand how you
>>> > responded to feedback like "don't remove PHY interrupts without finding
>>> > out why they don't work"
>>> 
>>> All I can say is that
>>> 
>>> - It doesn't work on my board
>>> - The traces are on the bottom of the PCB
>>> - The signal goes through an FPGA which (unlike the LS1046ARDB) is 
>>> closed-source
>> 
>> I don't understand the distinction you are making here. Are the sources
>> for QIXIS bit streams public for any Layerscape board?
> 
> Correct. The sources for the LS1046ARDB QIXIS are available for download.
> 
>>> - The alternative is polling once a second (not terribly intensive)
>> 
>> It makes a difference to performance (forwarded packets per second), believe 
>> it or not.
> 
> I don't. Please elaborate how link status latency from the phy affects 
> performance.
> 
>>> 
>>> I think it's very reasonable to make this change. Anyway, it's in a separate
>>> patch so that it can be applied independently.
>> 
>> Perhaps better phrased: "discussed separately"...
>> 
>>> > Even if the SERDES and PLL drivers "work for you" in the current form,
>>> > I doubt the usefulness of a PLL driver if you have to disconnect the
>>> > SoC's reset request signal on the board to not be stuck in a reboot loop.
>>> 
>>> I would like to emphasize that this has *nothing to do with this driver*.
>>> This behavior is part of the boot ROM (or something like it) and occurs 
>>> before
>>> any user code has ever executed. The problem of course is that certain RCWs
>>> expect the reference clocks to be in certain (incompatible) configurations,
>>> and will fail the boot without a lock. I think this is rather silly (since
>>> you only need PLL lock when you actually want to use the serdes), but that's
>>> how it is. And of course, this is only necessary because I was unable to get
>>> major reconfiguration to work. In an ideal world, you could always boot with
>>> the same RCW (with PLL config matching the board) and choose the major 
>>> protocol
>>> at runtime.
>> 
>> Could you please tell me what are the reference clock frequencies that
>> your board provides at boot time to the 2 PLLs, and which SERDES
>> protocol out of those 2 (1133 and ) boots correctly (no RESET_REQ
>> hacks necessary) with those refclks? I will try to get a LS1046A-QDS
>> where I boot from the same refclk + SERDES protocol configuration as
>> you, and use PBI commands in the RCW to reconfigure the lanes (PLL
>> selection and protocol registers) for the other mode, while keeping the
>> FRATE_SEL of the PLLs unmodified.
> 
>  From table 31-1 in the RM, the PLL mapping for 1133 is 2211, and the
>  PLL mapping for  is . As a consequence, for 1133, PLL 2 must be
>  156.25 MHz and PLL 1 must be either 100 or 125 MHz. And for , PLL 2
>  must be either 100 or 125 MHz, and PLL 1 should be shut down (as it is
>  unused). This conflict for PLL 2 means that the same reference clock
>  configuration cannot work for both 1133 and . In one of the
>  configurations, SRDS_RST_RR will be set in RSTRQSR1. On our board,
>  reference clock 1 is 156.25 MHz, and reference clock 2 is 125 MHz.
>  Therefore,  will fail to boot. Unfortunately, this reset request
>  occurs before any user-configurable code has run (except the RCW), so
>  it is not possible to fix this issue with e.g. PBI.

Have you had a chance to review this driver?

--Sean

Re: [PATCH v2] powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs

2023-05-22 Thread Gaurav Batra


Hello Alexey,

No worries. I resolved the issue with Michael's help. The patch is 
merged upstream and it fixes the issue.


Here is the link

https://lore.kernel.org/all/20230504175913.83844-1-gba...@linux.vnet.ibm.com/

Thanks,

Gaurav

On 5/21/23 7:08 PM, Alexey Kardashevskiy wrote:

Hi Gaurav,

Sorry I missed this. Please share the link to the your fix, I do not 
see it in my mail. In general, the problem can probably be solved by 
using huge pages (anything more than 64K) only for 1:1 mapping.



On 03/05/2023 13:25, Gaurav Batra wrote:

Hello Alexey,

I recently joined IOMMU team. There was a bug reported by test team 
where Mellanox driver was timing out during configuration. I proposed 
a fix for the same, which is below in the email.


You suggested a fix for Srikar's reported problem. Basically, both 
these fixes will resolve Srikar and Mellanox driver issues. The 
problem is with 2MB DDW.


Since you have extensive knowledge of IOMMU design and code, in your 
opinion, which patch should we adopt?


Thanks a lot

Gaurav

On 4/20/23 2:45 PM, Gaurav Batra wrote:

Hello Michael,

I was looking into the Bug: 199106 
(https://bugzilla.linux.ibm.com/show_bug.cgi?id=199106).


In the Bug, Mellanox driver was timing out when enabling SRIOV device.

I tested, Alexey's patch and it fixes the issue with Mellanox 
driver. The down side


to Alexey's fix is that even a small memory request by the driver 
will be aligned up


to 2MB. In my test, the Mellanox driver is issuing multiple requests 
of 64K size.


All these will get aligned up to 2MB, which is quite a waste of 
resources.



In any case, both the patches work. Let me know which approach you 
prefer. In case


we decide to go with my patch, I just realized that I need to fix 
nio_pages in


iommu_free_coherent() as well.


Thanks,

Gaurav

On 4/20/23 10:21 AM, Michael Ellerman wrote:

Gaurav Batra  writes:

When DMA window is backed by 2MB TCEs, the DMA address for the mapped
page should be the offset of the page relative to the 2MB TCE. The 
code

was incorrectly setting the DMA address to the beginning of the TCE
range.

Mellanox driver is reporting timeout trying to ENABLE_HCA for an 
SR-IOV

ethernet port, when DMA window is backed by 2MB TCEs.

I assume this is similar or related to the bug Srikar reported?

https://lore.kernel.org/linuxppc-dev/20230323095333.gi1005...@linux.vnet.ibm.com/ 



In that thread Alexey suggested a patch, have you tried his patch? He
suggested rounding up the allocation size, rather than adjusting the
dma_handle.


Fixes: 3872731187141d5d0a5c4fb30007b8b9ec36a44d
That's not the right syntax, it's described in the documentation 
how to

generate it.

It should be:

   Fixes: 387273118714 ("powerps/pseries/dma: Add support for 2M 
IOMMU page size")


cheers

diff --git a/arch/powerpc/kernel/iommu.c 
b/arch/powerpc/kernel/iommu.c

index ee95937bdaf1..ca57526ce47a 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -517,7 +517,7 @@ int ppc_iommu_map_sg(struct device *dev, 
struct iommu_table *tbl,

  /* Convert entry to a dma_addr_t */
  entry += tbl->it_offset;
  dma_addr = entry << tbl->it_page_shift;
-    dma_addr |= (s->offset & ~IOMMU_PAGE_MASK(tbl));
+    dma_addr |= (vaddr & ~IOMMU_PAGE_MASK(tbl));
    DBG("  - %lu pages, entry: %lx, dma_addr: %lx\n",
  npages, entry, dma_addr);
@@ -904,6 +904,7 @@ void *iommu_alloc_coherent(struct device *dev, 
struct iommu_table *tbl,

  unsigned int order;
  unsigned int nio_pages, io_order;
  struct page *page;
+    int tcesize = (1 << tbl->it_page_shift);
    size = PAGE_ALIGN(size);
  order = get_order(size);
@@ -930,7 +931,8 @@ void *iommu_alloc_coherent(struct device *dev, 
struct iommu_table *tbl,

  memset(ret, 0, size);
    /* Set up tces to cover the allocated range */
-    nio_pages = size >> tbl->it_page_shift;
+    nio_pages = IOMMU_PAGE_ALIGN(size, tbl) >> tbl->it_page_shift;
+
  io_order = get_iommu_order(size, tbl);
  mapping = iommu_alloc(dev, tbl, ret, nio_pages, 
DMA_BIDIRECTIONAL,

    mask >> tbl->it_page_shift, io_order, 0);
@@ -938,7 +940,8 @@ void *iommu_alloc_coherent(struct device *dev, 
struct iommu_table *tbl,

  free_pages((unsigned long)ret, order);
  return NULL;
  }
-    *dma_handle = mapping;
+
+    *dma_handle = mapping | ((u64)ret & (tcesize - 1));
  return ret;
  }
  --

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

2023-05-22 Thread Srikar Dronamraju

* Michael Ellerman  [2023-05-22 17:41:22]:

> Srikar Dronamraju  writes:
> > * Alexey Kardashevskiy  [2023-04-13 22:09:22]:
> >
> >> > > On 23.03.23 10:53, Srikar Dronamraju wrote:
> >> > > > 
> > Hi Alexey, Michael
> >
> > Sorry for the late reply, but I didnt have access to this large system.
> > This weekend, I did get access and tested with the patch. However it didn't
> > help much, system is still stuck at dracut with similar message except the
> > trace.
> >
> > However this patch
> > https://lore.kernel.org/all/20230418204401.13168-1-gba...@linux.vnet.ibm.com/
> > from Gaurav Batra does solve this issue.
> 
> Thanks.
> 
> There was a v3 of that patch:
>   
> https://lore.kernel.org/all/20230504175913.83844-1-gba...@linux.vnet.ibm.com/
> 
> Which is merged now into mainline as:
>   096339ab84f3 ("powerpc/iommu: DMA address offset is incorrectly calculated 
> with 2MB TCEs")
> 
> Presumably it also fixes the bug for you, so I'll mark this as fixed,
> but if you can test that exact commit that would be good to confirm the
> bug is fixed in mainline.
> 

Yes verified with mainline kernel and also with the v3.
This patch/commit does fix it.

-- 
Thanks and Regards
Srikar Dronamraju

Re: [PATCH v8 1/3] riscv: Introduce CONFIG_RELOCATABLE

2023-05-22 Thread Alexandre Ghiti




On 19/05/2023 23:55, Palmer Dabbelt wrote:

On Fri, 19 May 2023 14:48:59 PDT (-0700), sch...@linux-m68k.org wrote:

On Mai 19 2023, Alexandre Ghiti wrote:


I have tested the following patch successfully, can you give it a try
while I make sure this is the only place I forgot to add the -fno-pie
flag?

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index fbdccc21418a..153864e4f399 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -23,6 +23,10 @@ ifdef CONFIG_FTRACE
 CFLAGS_REMOVE_alternative.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_cpufeature.o = $(CC_FLAGS_FTRACE)
 endif
+ifdef CONFIG_RELOCATABLE
+CFLAGS_alternative.o += -fno-pie
+CFLAGS_cpufeature.o += -fno-pie
+endif
 ifdef CONFIG_KASAN
 KASAN_SANITIZE_alternative.o := n
 KASAN_SANITIZE_cpufeature.o := n


I can confirm that this fixes the crash.


Thanks.  Alex: can you send a patch?



I don't think this patch alone will work, all the code in early 
alternatives must be compiled with -fno-pie, but I'm a bit scared that's 
a "big" constraint. For now, I see 2 solutions:


- Document somewhere the fact that anything called from early 
alternatives must be compiled with -fno-pie
- Or relocate once with physical address, call early alternatives, and 
then do the final virtual relocation


Both options can be cumbersome in their own way, if anyone has an 
opinion, I'd be happy to discuss that :)

[PATCH 2/2] serial: cpm_uart: Fix a COMPILE_TEST dependency

2023-05-22 Thread Herve Codina

In a COMPILE_TEST configuration, the cpm_uart driver uses symbols from
the cpm_uart_cpm2.c file. This file is compiled only when CONFIG_CPM2 is
set.

Without this dependency, the linker fails with some missing symbols for
COMPILE_TEST configuration that needs SERIAL_CPM without enabling CPM2.

Signed-off-by: Herve Codina 
Reported-by: kernel test robot 
Link: https://lore.kernel.org/oe-kbuild-all/202305160221.9xgweobz-...@intel.com/
Fixes: e3e7b13bffae ("serial: allow COMPILE_TEST for some drivers")
---
 drivers/tty/serial/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 625358f44419..68a9d9db9144 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -769,7 +769,7 @@ config SERIAL_PMACZILOG_CONSOLE
 
 config SERIAL_CPM
tristate "CPM SCC/SMC serial port support"
-   depends on CPM2 || CPM1 || (PPC32 && COMPILE_TEST)
+   depends on CPM2 || CPM1 || (PPC32 && CPM2 && COMPILE_TEST)
select SERIAL_CORE
help
  This driver supports the SCC and SMC serial ports on Motorola 
-- 
2.40.1

[PATCH 0/2] Fix COMPILE_TEST dependencies for CPM uart, TSA and QMC

2023-05-22 Thread Herve Codina

This series fixes issues raised by the kernel test robot
  https://lore.kernel.org/oe-kbuild-all/202305160221.9xgweobz-...@intel.com/

In COMPILE_TEST configurations, TSA and QMC need CONFIG_CPM to be set in
order to compile and CPM uart needs CONFIG_CPM2.

Best regards,
Hervé

Herve Codina (2):
  soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST
  serial: cpm_uart: Fix a COMPILE_TEST dependency

 drivers/soc/fsl/qe/Kconfig | 4 ++--
 drivers/tty/serial/Kconfig | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

-- 
2.40.1

[PATCH 1/2] soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST

2023-05-22 Thread Herve Codina

In order to compile tsa.c and qmc.c, CONFIG_CPM must be set.

Without this dependency, the linker fails with some missing
symbols for COMPILE_TEST configurations that need QMC without
enabling CPM.

Signed-off-by: Herve Codina 
Reported-by: kernel test robot 
Link: https://lore.kernel.org/oe-kbuild-all/202305160221.9xgweobz-...@intel.com/
---
 drivers/soc/fsl/qe/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 7268c2fbcbc1..e0d096607fef 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -36,7 +36,7 @@ config UCC
 config CPM_TSA
tristate "CPM TSA support"
depends on OF && HAS_IOMEM
-   depends on CPM1 || COMPILE_TEST
+   depends on CPM1 || (CPM && COMPILE_TEST)
help
  Freescale CPM Time Slot Assigner (TSA)
  controller.
@@ -47,7 +47,7 @@ config CPM_TSA
 config CPM_QMC
tristate "CPM QMC support"
depends on OF && HAS_IOMEM
-   depends on CPM1 || (FSL_SOC && COMPILE_TEST)
+   depends on CPM1 || (FSL_SOC && CPM && COMPILE_TEST)
depends on CPM_TSA
help
  Freescale CPM QUICC Multichannel Controller
-- 
2.40.1

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

2023-05-22 Thread Michael Ellerman

Srikar Dronamraju  writes:
> * Alexey Kardashevskiy  [2023-04-13 22:09:22]:
>
>> > > On 23.03.23 10:53, Srikar Dronamraju wrote:
>> > > > 
>> > > > I am unable to boot upstream kernels from v5.16 to the latest upstream
>> > > > kernel on a maxconfig system. (Machine config details given below)
>> > > > 
>> > > > At boot, we see a series of messages like the below.
>> > > > 
>> > > > dracut-initqueue[13917]: Warning: dracut-initqueue: timeout, still 
>> > > > waiting for following initqueue hooks:
>> > > > dracut-initqueue[13917]: Warning: 
>> > > > /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f93dc0767-18aa-467f-afa7-5b4e9c13108a.sh:
>> > > >  "if ! grep -q After=remote-fs-pre.target 
>> > > > /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
>> > > > dracut-initqueue[13917]: [ -e 
>> > > > "/dev/disk/by-uuid/93dc0767-18aa-467f-afa7-5b4e9c13108a" ]
>> > > > dracut-initqueue[13917]: fi"
>> > > 
>> > > Alexey, did you look into this? This is apparently caused by a commit of
>> > > yours (see quoted part below) that Michael applied. Looks like it fell
>> > > through the cracks from here, but maybe I'm missing something.
>> > 
>> > Unfortunately Alexey is not working at IBM any more, so he won't have
>> > access to any hardware to debug/test this.
>> > 
>> > Srikar are you debugging this? If not we'll have to find someone else to
>> > look at it.
>> 
>> Has this been fixed and I missed cc:? Anyway, without the full log, I still
>> see it is a huge guest so chances are the guest could not map all RAM so
>> instead it uses the biggest possible DDW with 2M pages. If that's the case,
>> this might help it:
>> 
>
> Hi Alexey, Michael
>
> Sorry for the late reply, but I didnt have access to this large system.
> This weekend, I did get access and tested with the patch. However it didn't
> help much, system is still stuck at dracut with similar message except the
> trace.
>
> However this patch
> https://lore.kernel.org/all/20230418204401.13168-1-gba...@linux.vnet.ibm.com/
> from Gaurav Batra does solve this issue.

Thanks.

There was a v3 of that patch:
  https://lore.kernel.org/all/20230504175913.83844-1-gba...@linux.vnet.ibm.com/

Which is merged now into mainline as:
  096339ab84f3 ("powerpc/iommu: DMA address offset is incorrectly calculated 
with 2MB TCEs")

Presumably it also fixes the bug for you, so I'll mark this as fixed,
but if you can test that exact commit that would be good to confirm the
bug is fixed in mainline.

cheers


#regzbot fixed-by: 096339ab84f3

Re: [PATCH v3] powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs

2023-05-22 Thread Srikar Dronamraju

* Gaurav Batra  [2023-05-04 12:59:13]:

> When DMA window is backed by 2MB TCEs, the DMA address for the mapped
> page should be the offset of the page relative to the 2MB TCE. The code
> was incorrectly setting the DMA address to the beginning of the TCE
> range.
> 
> Mellanox driver is reporting timeout trying to ENABLE_HCA for an SR-IOV
> ethernet port, when DMA window is backed by 2MB TCEs.
> 
> Fixes: 387273118714 ("powerps/pseries/dma: Add support for 2M IOMMU page 
> size")
> 
> Signed-off-by: Gaurav Batra 
> 

Works with this patch.
Tested-by: Srikar Dronamraju 

> Reviewed-by: Greg Joyce 
> Reviewed-by: Brian King 
> ---

-- 
Thanks and Regards
Srikar Dronamraju

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

2023-05-22 Thread Srikar Dronamraju

* Alexey Kardashevskiy  [2023-04-13 22:09:22]:

> > > On 23.03.23 10:53, Srikar Dronamraju wrote:
> > > > 
> > > > I am unable to boot upstream kernels from v5.16 to the latest upstream
> > > > kernel on a maxconfig system. (Machine config details given below)
> > > > 
> > > > At boot, we see a series of messages like the below.
> > > > 
> > > > dracut-initqueue[13917]: Warning: dracut-initqueue: timeout, still 
> > > > waiting for following initqueue hooks:
> > > > dracut-initqueue[13917]: Warning: 
> > > > /lib/dracut/hooks/initqueue/finished/devexists-\x2fdev\x2fdisk\x2fby-uuid\x2f93dc0767-18aa-467f-afa7-5b4e9c13108a.sh:
> > > >  "if ! grep -q After=remote-fs-pre.target 
> > > > /run/systemd/generator/systemd-cryptsetup@*.service 2>/dev/null; then
> > > > dracut-initqueue[13917]: [ -e 
> > > > "/dev/disk/by-uuid/93dc0767-18aa-467f-afa7-5b4e9c13108a" ]
> > > > dracut-initqueue[13917]: fi"
> > > 
> > > Alexey, did you look into this? This is apparently caused by a commit of
> > > yours (see quoted part below) that Michael applied. Looks like it fell
> > > through the cracks from here, but maybe I'm missing something.
> > 
> > Unfortunately Alexey is not working at IBM any more, so he won't have
> > access to any hardware to debug/test this.
> > 
> > Srikar are you debugging this? If not we'll have to find someone else to
> > look at it.
> 
> Has this been fixed and I missed cc:? Anyway, without the full log, I still
> see it is a huge guest so chances are the guest could not map all RAM so
> instead it uses the biggest possible DDW with 2M pages. If that's the case,
> this might help it:
> 

Hi Alexey, Michael

Sorry for the late reply, but I didnt have access to this large system.
This weekend, I did get access and tested with the patch. However it didn't
help much, system is still stuck at dracut with similar message except the
trace.

However this patch
https://lore.kernel.org/all/20230418204401.13168-1-gba...@linux.vnet.ibm.com/
from Gaurav Batra does solve this issue.

-- 
Thanks and Regards
Srikar Dronamraju

Re: [PATCH v3 02/12] powerpc/ptrace: Add missing include

Re: [PATCH v3 01/12] powerpc/book3s: Add missing include

Re: [PATCH v2 0/3] Remove iommu_group_remove_device() from fsl

Re: [PATCH v2 03/11] asm-generic/mmiowb: Mark accesses to fix KCSAN warnings

[PATCH] soc: fsl: qe: Replace all non-returning strlcpy with strscpy

Re: [PATCH v2 03/11] asm-generic/mmiowb: Mark accesses to fix KCSAN warnings

Re: [PATCH] scsi: ibmvscsi: Replace all non-returning strlcpy with strscpy

Re: [PATCH] powerpc/iommu: limit number of TCEs to 512 for H_STUFF_TCE hcall

Re: [PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

Re: [PATCH v14 00/15] phy: Add support for Lynx 10G SerDes

Re: [PATCH v2] powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

Re: [PATCH v8 1/3] riscv: Introduce CONFIG_RELOCATABLE

[PATCH 2/2] serial: cpm_uart: Fix a COMPILE_TEST dependency

[PATCH 0/2] Fix COMPILE_TEST dependencies for CPM uart, TSA and QMC

[PATCH 1/2] soc: fsl: cpm1: Fix TSA and QMC dependencies in case of COMPILE_TEST

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

Re: [PATCH v3] powerpc/iommu: DMA address offset is incorrectly calculated with 2MB TCEs

Re: Probing nvme disks fails on Upstream kernels on powerpc Maxconfig

19 matches

Site Navigation

Mail list logo

Footer information