Re: [PATCH v4 4/5] powerpc/code-patching: introduce patch_instructions()

2023-09-25 Thread Christophe Leroy


On 26/09/2023 at 00:50, Song Liu wrote:
> On Fri, Sep 8, 2023 at 6:28 AM Hari Bathini  wrote:
>>
>> patch_instruction() entails setting up pte, patching the instruction,
>> clearing the pte and flushing the tlb. If multiple instructions need
>> to be patched, every instruction would have to go through the above
>> drill unnecessarily. Instead, introduce function patch_instructions()
>> that sets up the pte, clears the pte and flushes the tlb only once per
>> page range of instructions to be patched. This adds a slight overhead
>> to patch_instruction() call while improving the patching time for
>> scenarios where more than one instruction needs to be patched.
>>
>> Signed-off-by: Hari Bathini 
> 
> I didn't see this one when I reviewed 1/5. Please ignore that comment.

If I remember correctly, patch 1 introduces a huge performance
degradation, which then gets improved with this patch.

As I said before, I'd expect patch 4 to go first, and then
bpf_arch_text_copy() to be implemented with patch_instructions() directly.

Christophe

> 
> [...]
> 
>> @@ -307,11 +312,22 @@ static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
>>
>>     orig_mm = start_using_temp_mm(patching_mm);
>>
>> -   err = __patch_instruction(addr, instr, patch_addr);
>> +   while (len > 0) {
>> +       instr = ppc_inst_read(code);
>> +       ilen = ppc_inst_len(instr);
>> +       err = __patch_instruction(addr, instr, patch_addr);
> 
> It appears we are still repeating a lot of work here. For example, with
> fill_insn == true, we don't need to repeat ppc_inst_read().
> 
> Can we do this with a memcpy or memset like functions?
> 
>> +       /* hwsync performed by __patch_instruction (sync) if successful */
>> +       if (err) {
>> +           mb();  /* sync */
>> +           break;
>> +       }
>>
>> -   /* hwsync performed by __patch_instruction (sync) if successful */
>> -   if (err)
>> -       mb();  /* sync */
>> +       len -= ilen;
>> +       patch_addr = patch_addr + ilen;
>> +       addr = (void *)addr + ilen;
>> +       if (!fill_insn)
>> +           code = code + ilen;
> 
> It took me a while to figure out what "fill_insn" means. Maybe call it
> "repeat_input" or something?
> 
> Thanks,
> Song
> 
>> +   }
>>
>>     /* context synchronisation performed by __patch_instruction (isync or exception) */
>>     stop_using_temp_mm(patching_mm, orig_mm);
>> @@ -328,16 +344,21 @@ static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
>>     return err;
>>  }
>>


Re: [PATCH] ASoC: dt-bindings: Add missing (unevaluated|additional)Properties on child node schemas

2023-09-25 Thread Herve Codina
On Mon, 25 Sep 2023 17:09:28 -0500
Rob Herring  wrote:

> Just as unevaluatedProperties or additionalProperties are required at
> the top level of schemas, they should (and will) also be required for
> child node schemas. That ensures only documented properties are
> present for any node.
> 
> Add unevaluatedProperties or additionalProperties as appropriate.
> 
> Signed-off-by: Rob Herring 
> ---
>  Documentation/devicetree/bindings/sound/dialog,da7219.yaml | 1 +
>  Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml | 1 +
>  Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml   | 1 +
>  3 files changed, 3 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/sound/dialog,da7219.yaml b/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
> index eb7d219e2c86..19137abdba3e 100644
> --- a/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
> +++ b/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
> @@ -89,6 +89,7 @@ properties:
>
>    da7219_aad:
>      type: object
> +    additionalProperties: false
>      description:
>        Configuration of advanced accessory detection.
>      properties:
> diff --git a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
> index ff5cd9241941..b522ed7dcc51 100644
> --- a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
> +++ b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
> @@ -33,6 +33,7 @@ patternProperties:
>      description:
>        A DAI managed by this controller
>      type: object
> +    additionalProperties: false
>
>      properties:
>        reg:
> diff --git a/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml b/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
> index b6a4360ab845..0b4f003989a4 100644
> --- a/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
> +++ b/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
> @@ -60,6 +60,7 @@ properties:
>
>    ports:
>      $ref: audio-graph-port.yaml#/definitions/port-base
> +    unevaluatedProperties: false
>      properties:
>        port@0:
>          $ref: audio-graph-port.yaml#

At least for sound/fsl,qmc-audio.yaml:

Acked-by: Herve Codina 

Best regards,
Hervé


Re: [PATCH 3/8] iommu/vt-d: Use ops->blocked_domain

2023-09-25 Thread Baolu Lu

On 9/25/23 7:41 PM, Jason Gunthorpe wrote:
> On Mon, Sep 25, 2023 at 10:29:52AM +0800, Baolu Lu wrote:
>> On 9/23/23 1:07 AM, Jason Gunthorpe wrote:
>>> Trivially migrate to the ops->blocked_domain for the existing global
>>> static.
>>>
>>> Signed-off-by: Jason Gunthorpe
>>> ---
>>>   drivers/iommu/intel/iommu.c | 3 +--
>>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> Reviewed-by: Lu Baolu
>>
>> P.S. We can further do the same thing to the identity domain. I will
>> clean it up after all patches are landed.
>
> I looked at that, and it is not trivial..
>
> Both the Intel and virtio-iommu drivers create an "identity" domain
> out of a paging domain and pass that off as a true "identity"
> domain. So neither can set the global static since the determination
> is at runtime..

Emm, yes. Early hardware requires a real 1:1 mapped page table. Recent
implementations no longer need one.

> What I was thinking about doing is consolidating that code so that the
> core logic is the thing turning a paging domain into an identity
> domain.

Yes. It's not trivial. It needs a separate series with some refactoring
effort.

Best regards,
baolu


Re: [Bisected] PowerMac G4 getting "BUG: Unable to handle kernel data access on write at 0x00001ff0" at boot with CONFIG_VMAP_STACK=y on kernels 6.5.x (regression over 6.4.x)

2023-09-25 Thread Liam R. Howlett
* Erhard Furtner  [230925 19:02]:
> Greetings!
> 
> Had a chat on #gentoo-powerpc with another user whose G4 Mini fails booting 
> kernel 6.5.0 when CONFIG_VMAP_STACK=y is enabled. I was able to replicate the 
> issue on my PowerMac G4. Also I was able to bisect the issue.
> 
> Kernels 6.4.x boot ok with CONFIG_VMAP_STACK=y but on 6.5.5 I get:
> 
> [...]
> Kernel attempted to write user page (1ff0) - exploit attempt? (uid: 0)
> BUG: Unable to handle kernel data access on write at 0x1ff0
> Faulting instruction address: 0xc0008750
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.5-PMacG4 #5
> Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
> NIP:  c0008750 LR: c0041848 CTR: c0070988
> REGS: c0d3dcd0 TRAP: 0300   Not tainted (6.5.5-PMacG4)
> MSR:  1032   CR: 22d3ddc0 XER: 2000
> DAR: 1ff0 DSISR: 4200
> GPR00: c0041848 c0d3dd90 c0d06360 c0d3ddd0 c0d06360 c0d3dea8 c0d3adc0 
> GPR08:  c0d4  c0d3ddc0    0004
> GPR16: 0002  0002 00402dc2 00402dc2 2000 f1004000 
> GPR24: c0d45220 c0d06644 c0843c34 0002 c0d06360 c0d0ce00 c0d06360 
> NIP [c0008750] do_softirq_own_stack+0x18/0x3c
> LR [c0041848] irq_exit+0x98/0xc4
> Call Trace:
> [c0d3dd90] [c0d69564] 0xc0d69564 (unreliable)
> [c0d3ddb0] [c0041848] irq_exit+0x98/0xc4
> [c0d3ddc0] [c0004a98] Decrementer_virt+0x108/0x10c
> --- interrupt: 900 at __schedule+0x43c/0x4e0
> NIP:  c0843940 LR: c084398c CTR: c0070988
> REGS: c0d3ddd0 TRAP: 0900   Not tainted  (6.5.5-PMacG4)
> MSR:  9032   CR: 22024484  XER: 
> 
> GPR00: c0843574 c0d3de90 c0d06360 c0d06360 c0d06360 c0d3dea8 0001 
> GPR08:  9032 c099ce2c 0007ffbf 22024484   0004
> GPR16: 0002  0002 00402dc2 00402dc2 2000 f1004000 
> GPR24: c0d45220 c0d06644 c0843c34 0002 c0d06360 c0d0ce00 c0d06360 c0d063ac
> NIP [c0843940] __schedule+0x43c/0x4e0
> LR [c084390c] __schedule+0x408/0x4e0
> --- interrupt: 900
> [c0d3de90] [c0843574] __schedule+0x70/0x4e0 (unreliable)
> [c0d3ded0] [c0843c34] __cond_resched+0x34/0x54
> [c0d3dee0] [c0141068] __vmalloc_node_range+0x27c/0x64c
> [c0d3de60] [c0141794] __vmalloc_node+0x44/0x54
> [c8d3df80] [c0c06510] init_IRQ+0x34/0xd4
> [c8d3dfa0] [c0c03440] start_kernel+0x424/0x558
> [c8d3dff0] [3540] 0x3540
> Code: 39490999 7d4901a4 39290aaa 7d2a01a4 4c00012c 4b20 9421ffe0 
> 7c08002a6 3d20c0d4 93e1001c 90010024 83e95278 <943f1ff0> 7fe1fb78 48840c6d 
> 8021
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Attempted to kill the idle task!
> Rebooting in 48 seconds..

This looks very close to the crash a few weeks ago which bisected to the
same commit [1].

Can you try applying this fix [2] which is on its way upstream?

[1] https://lore.kernel.org/linux-mm/3f86d58e-7f36-c6b4-c43a-2a7bcffd...@linux-m68k.org/
[2] https://lore.kernel.org/lkml/2023091517.2835306-1-liam.howl...@oracle.com/

> 
> 
> The bisect revealed this commit:
>  # git bisect good
> cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b is the first bad commit
> commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b
> Author: Liam R. Howlett 
> Date:   Fri Aug 18 20:43:55 2023 -0400
> 
> maple_tree: disable mas_wr_append() when other readers are possible
> 
> The current implementation of append may cause duplicate data and/or
> incorrect ranges to be returned to a reader during an update.  Although
> this has not been reported or seen, disable the append write operation
> while the tree is in rcu mode out of an abundance of caution.
> 
> During the analysis of the mas_next_slot() the following was
> artificially created by separating the writer and reader code:
> 
> Writer:                                 reader:
> mas_wr_append
>     set end pivot
>     updates end metadata
>     Detects write to last slot
>     last slot write is to start of slot
>     store current contents in slot
>     overwrite old end pivot
>                                         mas_next_slot():
>                                                 read end metadata
>                                                 read old end pivot
>                                                 return with incorrect range
>     store new value
> 
> Alternatively:
> 
> Writer:                                 reader:
> mas_wr_append
>     set end pivot
>     updates end metadata
>     Detects write to last slot
>     last slot write to end of slot
>     store value
>                                         mas_next_slot():
>                                                 read end metadata
>                                                 read old end pivot
>                                                 read n

Re: [Bisected] PowerMac G4 getting "BUG: Unable to handle kernel data access on write at 0x00001ff0" at boot with CONFIG_VMAP_STACK=y on kernels 6.5.x (regression over 6.4.x)

2023-09-25 Thread Bagas Sanjaya
On Tue, Sep 26, 2023 at 01:01:59AM +0200, Erhard Furtner wrote:
> Greetings!
> 
> Had a chat on #gentoo-powerpc with another user whose G4 Mini fails booting 
> kernel 6.5.0 when CONFIG_VMAP_STACK=y is enabled. I was able to replicate the 
> issue on my PowerMac G4. Also I was able to bisect the issue.
> 
> Kernels 6.4.x boot ok with CONFIG_VMAP_STACK=y but on 6.5.5 I get:
> 
> [...]
> Kernel attempted to write user page (1ff0) - exploit attempt? (uid: 0)
> BUG: Unable to handle kernel data access on write at 0x1ff0
> Faulting instruction address: 0xc0008750
> Oops: Kernel access of bad area, sig: 11 [#1]
> BE PAGE_SIZE=4K MMU=Hash PowerMac
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.5-PMacG4 #5
> Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
> NIP:  c0008750 LR: c0041848 CTR: c0070988
> REGS: c0d3dcd0 TRAP: 0300   Not tainted (6.5.5-PMacG4)
> MSR:  1032   CR: 22d3ddc0 XER: 2000
> DAR: 1ff0 DSISR: 4200
> GPR00: c0041848 c0d3dd90 c0d06360 c0d3ddd0 c0d06360 c0d3dea8 c0d3adc0 
> GPR08:  c0d4  c0d3ddc0    0004
> GPR16: 0002  0002 00402dc2 00402dc2 2000 f1004000 
> GPR24: c0d45220 c0d06644 c0843c34 0002 c0d06360 c0d0ce00 c0d06360 
> NIP [c0008750] do_softirq_own_stack+0x18/0x3c
> LR [c0041848] irq_exit+0x98/0xc4
> Call Trace:
> [c0d3dd90] [c0d69564] 0xc0d69564 (unreliable)
> [c0d3ddb0] [c0041848] irq_exit+0x98/0xc4
> [c0d3ddc0] [c0004a98] Decrementer_virt+0x108/0x10c
> --- interrupt: 900 at __schedule+0x43c/0x4e0
> NIP:  c0843940 LR: c084398c CTR: c0070988
> REGS: c0d3ddd0 TRAP: 0900   Not tainted  (6.5.5-PMacG4)
> MSR:  9032   CR: 22024484  XER: 
> 
> GPR00: c0843574 c0d3de90 c0d06360 c0d06360 c0d06360 c0d3dea8 0001 
> GPR08:  9032 c099ce2c 0007ffbf 22024484   0004
> GPR16: 0002  0002 00402dc2 00402dc2 2000 f1004000 
> GPR24: c0d45220 c0d06644 c0843c34 0002 c0d06360 c0d0ce00 c0d06360 c0d063ac
> NIP [c0843940] __schedule+0x43c/0x4e0
> LR [c084390c] __schedule+0x408/0x4e0
> --- interrupt: 900
> [c0d3de90] [c0843574] __schedule+0x70/0x4e0 (unreliable)
> [c0d3ded0] [c0843c34] __cond_resched+0x34/0x54
> [c0d3dee0] [c0141068] __vmalloc_node_range+0x27c/0x64c
> [c0d3de60] [c0141794] __vmalloc_node+0x44/0x54
> [c8d3df80] [c0c06510] init_IRQ+0x34/0xd4
> [c8d3dfa0] [c0c03440] start_kernel+0x424/0x558
> [c8d3dff0] [3540] 0x3540
> Code: 39490999 7d4901a4 39290aaa 7d2a01a4 4c00012c 4b20 9421ffe0 
> 7c08002a6 3d20c0d4 93e1001c 90010024 83e95278 <943f1ff0> 7fe1fb78 48840c6d 
> 8021
> ---[ end trace  ]---
> 
> Kernel panic - not syncing: Attempted to kill the idle task!
> Rebooting in 48 seconds..
> 
> 
> The bisect revealed this commit:
>  # git bisect good
> cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b is the first bad commit
> commit cfeb6ae8bcb96ccf674724f223661bbcef7b0d0b
> Author: Liam R. Howlett 
> Date:   Fri Aug 18 20:43:55 2023 -0400
> 
> maple_tree: disable mas_wr_append() when other readers are possible
> 
> The current implementation of append may cause duplicate data and/or
> incorrect ranges to be returned to a reader during an update.  Although
> this has not been reported or seen, disable the append write operation
> while the tree is in rcu mode out of an abundance of caution.
> 
> During the analysis of the mas_next_slot() the following was
> artificially created by separating the writer and reader code:
> 
> Writer:                                 reader:
> mas_wr_append
>     set end pivot
>     updates end metadata
>     Detects write to last slot
>     last slot write is to start of slot
>     store current contents in slot
>     overwrite old end pivot
>                                         mas_next_slot():
>                                                 read end metadata
>                                                 read old end pivot
>                                                 return with incorrect range
>     store new value
> 
> Alternatively:
> 
> Writer:                                 reader:
> mas_wr_append
>     set end pivot
>     updates end metadata
>     Detects write to last slot
>     last slot write to end of slot
>     store value
>                                         mas_next_slot():
>                                                 read end metadata
>                                                 read old end pivot
>                                                 read new end pivot
>                                                 return with incorrect range
>     set old end pivot
> 
> There may be other accesses that are not safe since we are now updating
> both metadata and pointers, so disabling append if there could be rcu
> readers i

Re: [PATCH v4 4/5] powerpc/code-patching: introduce patch_instructions()

2023-09-25 Thread Song Liu
On Fri, Sep 8, 2023 at 6:28 AM Hari Bathini  wrote:
>
> patch_instruction() entails setting up pte, patching the instruction,
> clearing the pte and flushing the tlb. If multiple instructions need
> to be patched, every instruction would have to go through the above
> drill unnecessarily. Instead, introduce function patch_instructions()
> that sets up the pte, clears the pte and flushes the tlb only once per
> page range of instructions to be patched. This adds a slight overhead
> to patch_instruction() call while improving the patching time for
> scenarios where more than one instruction needs to be patched.
>
> Signed-off-by: Hari Bathini 

I didn't see this one when I reviewed 1/5. Please ignore that comment.

[...]

> @@ -307,11 +312,22 @@ static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
>
>     orig_mm = start_using_temp_mm(patching_mm);
>
> -   err = __patch_instruction(addr, instr, patch_addr);
> +   while (len > 0) {
> +       instr = ppc_inst_read(code);
> +       ilen = ppc_inst_len(instr);
> +       err = __patch_instruction(addr, instr, patch_addr);

It appears we are still repeating a lot of work here. For example, with
fill_insn == true, we don't need to repeat ppc_inst_read().

Can we do this with a memcpy or memset like functions?

> +       /* hwsync performed by __patch_instruction (sync) if successful */
> +       if (err) {
> +           mb();  /* sync */
> +           break;
> +       }
>
> -   /* hwsync performed by __patch_instruction (sync) if successful */
> -   if (err)
> -       mb();  /* sync */
> +       len -= ilen;
> +       patch_addr = patch_addr + ilen;
> +       addr = (void *)addr + ilen;
> +       if (!fill_insn)
> +           code = code + ilen;

It took me a while to figure out what "fill_insn" means. Maybe call it
"repeat_input" or something?

Thanks,
Song

> +   }
>
>     /* context synchronisation performed by __patch_instruction (isync or exception) */
>     stop_using_temp_mm(patching_mm, orig_mm);
> @@ -328,16 +344,21 @@ static int __do_patch_instruction_mm(u32 *addr, ppc_inst_t instr)
>     return err;
>  }
>

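For illustration only, the memcpy/memset idea raised in this review can be sketched in userspace C. This is a minimal sketch under stated assumptions, not the kernel implementation: sketch_patch_range() and its repeat_input flag (the name suggested above for fill_insn) are hypothetical, every instruction is assumed to be 4 bytes (no prefixed instructions), and the pte setup, synchronisation and icache handling done by the real code are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel API. Writes len bytes of 4-byte
 * instructions to dst. With repeat_input set, the single instruction
 * at src[0] is read once and replicated (memset32-like fill);
 * otherwise len bytes are copied from src (memcpy-like copy).
 * Assumes len is a multiple of 4.
 */
static void sketch_patch_range(uint32_t *dst, const uint32_t *src,
                               size_t len, int repeat_input)
{
    size_t n = len / sizeof(uint32_t);

    if (repeat_input) {
        uint32_t insn = src[0];     /* read the input only once */

        for (size_t i = 0; i < n; i++)
            dst[i] = insn;
    } else {
        memcpy(dst, src, n * sizeof(uint32_t));
    }
}
```

The point of the split is that neither branch needs a per-instruction read-and-length step; whether that is acceptable for prefixed instructions is outside what this sketch covers.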

Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-25 Thread Arnaldo Carvalho de Melo
On Wed, Sep 13, 2023, 7:40 AM Athira Rajeev wrote:

>
>
> > On 08-Sep-2023, at 7:48 PM, Athira Rajeev wrote:
> >
> >
> >
> >> On 08-Sep-2023, at 11:04 AM, Sachin Sant  wrote:
> >>
> >>
> >>
> >>> On 07-Sep-2023, at 10:29 PM, Athira Rajeev <atraj...@linux.vnet.ibm.com> wrote:
> >>>
> >>> Testcase "Parsing of all PMU events from sysfs" parses events for
> >>> all PMUs, and not just cpu. In case of powerpc, the PowerVM
> >>> environment supports events from the hv_24x7 and hv_gpci PMUs, which
> >>> have a format like the examples below:
> >>>
> >>> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
> >>> - hv_gpci/event,partition_id=?/
> >>>
> >>> The value for "?" needs to be filled in depending on system
> >>> configuration. It is better to skip these parametrized events
> >>> in this test as it is done in:
> >>> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
> >>> parametrized events")' which handled a similar instance with
> >>> "all PMU test".
> >>>
> >>> Fix parse-events test to skip parametrized events since
> >>> it needs proper setup of the parameters.
> >>>
> >>> Signed-off-by: Athira Rajeev 
> >>> ---
> >>> Changelog:
> >>> v1 -> v2:
> >>> Addressed review comments from Ian. Updated size of
> >>> pmu event name variable and changed bool name which is
> >>> used to skip the test.
> >>>
> >>
> >> The patch fixes the reported issue.
> >>
> >> 6.2: Parsing of all PMU events from sysfs  : Ok
> >> 6.3: Parsing of given PMU events from sysfs: Ok
> >>
> >> Tested-by: Sachin Sant 
> >>
> >> - Sachin
> >
> > Hi Sachin, Ian
> >
> > Thanks for testing the patch
>
> Hi Arnaldo
>
> Can you please check and pull this if it looks good to go.
>

Namhyung, can you please take a look?

I'll be back home next week, now at Kernel Recipes in Paris.

- Arnaldo

>
> Thanks
> Athira
> >
> > Athira
> >
> >
>
>


Re: [PATCH v4 3/5] powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]

2023-09-25 Thread Song Liu
On Fri, Sep 8, 2023 at 6:28 AM Hari Bathini  wrote:
>
> Use bpf_jit_binary_pack_alloc in powerpc jit. The jit engine first
> writes the program to the rw buffer. When the jit is done, the program
> is copied to the final location with bpf_jit_binary_pack_finalize.
> With multiple jit_subprogs, bpf_jit_free is called on some subprograms
> that haven't got bpf_jit_binary_pack_finalize() yet. Implement custom
> bpf_jit_free() like in commit 1d5f82d9dd47 ("bpf, x86: fix freeing of
> not-finalized bpf_prog_pack") to call bpf_jit_binary_pack_finalize(),
> if necessary. While here, correct the misnomer powerpc64_jit_data to
> powerpc_jit_data as it is meant for both ppc32 and ppc64.

I would personally prefer to put the rename in a separate patch.

>
> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/net/bpf_jit.h|  12 ++--
>  arch/powerpc/net/bpf_jit_comp.c   | 110 ++
>  arch/powerpc/net/bpf_jit_comp32.c |  13 ++--
>  arch/powerpc/net/bpf_jit_comp64.c |  10 +--
>  4 files changed, 98 insertions(+), 47 deletions(-)

[...]

> @@ -220,17 +237,19 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>
>  #ifdef CONFIG_PPC64_ELF_ABI_V1
>     /* Function descriptor nastiness: Address + TOC */
> -   ((u64 *)image)[0] = (u64)code_base;
> +   ((u64 *)image)[0] = (u64)fcode_base;
>     ((u64 *)image)[1] = local_paca->kernel_toc;
>  #endif
>
> -   fp->bpf_func = (void *)image;
> +   fp->bpf_func = (void *)fimage;
>     fp->jited = 1;
>     fp->jited_len = proglen + FUNCTION_DESCR_SIZE;
>
> -   bpf_flush_icache(bpf_hdr, (u8 *)bpf_hdr + bpf_hdr->size);

I guess we don't need bpf_flush_icache() any more? So can we remove it
from arch/powerpc/net/bpf_jit.h?

Thanks,
Song

>     if (!fp->is_func || extra_pass) {
> -       bpf_jit_binary_lock_ro(bpf_hdr);
> +       if (bpf_jit_binary_pack_finalize(fp, fhdr, hdr)) {
> +           fp = org_fp;
> +           goto out_addrs;
> +       }
>         bpf_prog_fill_jited_linfo(fp, addrs);
>  out_addrs:
>         kfree(addrs);
> @@ -240,8 +259,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
>         jit_data->addrs = addrs;
>         jit_data->ctx = cgctx;
>         jit_data->proglen = proglen;
> -       jit_data->image = image;
> -       jit_data->header = bpf_hdr;
> +       jit_data->fimage = fimage;
> +       jit_data->fhdr = fhdr;
> +       jit_data->hdr = hdr;
>     }
>
>  out:
[...]


[PATCH] ASoC: dt-bindings: Add missing (unevaluated|additional)Properties on child node schemas

2023-09-25 Thread Rob Herring
Just as unevaluatedProperties or additionalProperties are required at
the top level of schemas, they should (and will) also be required for
child node schemas. That ensures only documented properties are
present for any node.

Add unevaluatedProperties or additionalProperties as appropriate.

Signed-off-by: Rob Herring 
---
 Documentation/devicetree/bindings/sound/dialog,da7219.yaml | 1 +
 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml | 1 +
 Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml   | 1 +
 3 files changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/dialog,da7219.yaml b/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
index eb7d219e2c86..19137abdba3e 100644
--- a/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
+++ b/Documentation/devicetree/bindings/sound/dialog,da7219.yaml
@@ -89,6 +89,7 @@ properties:
 
   da7219_aad:
     type: object
+    additionalProperties: false
     description:
       Configuration of advanced accessory detection.
     properties:
diff --git a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
index ff5cd9241941..b522ed7dcc51 100644
--- a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
+++ b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
@@ -33,6 +33,7 @@ patternProperties:
     description:
       A DAI managed by this controller
     type: object
+    additionalProperties: false
 
     properties:
       reg:
diff --git a/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml b/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
index b6a4360ab845..0b4f003989a4 100644
--- a/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
+++ b/Documentation/devicetree/bindings/sound/ti,pcm3168a.yaml
@@ -60,6 +60,7 @@ properties:
 
   ports:
     $ref: audio-graph-port.yaml#/definitions/port-base
+    unevaluatedProperties: false
     properties:
       port@0:
         $ref: audio-graph-port.yaml#
-- 
2.40.1



Re: [PATCH] selftests/powerpc: Fix emit_tests to work with run_kselftest.sh

2023-09-25 Thread Kees Cook
On Thu, Sep 21, 2023 at 05:26:10PM +1000, Michael Ellerman wrote:
> In order to use run_kselftest.sh the list of tests must be emitted to
> populate kselftest-list.txt.
> 
> The powerpc Makefile is written to use EMIT_TESTS. But support for
> EMIT_TESTS was dropped in commit d4e59a536f50 ("selftests: Use runner.sh
> for emit targets"). Although prior to that commit a548de0fe8e1
> ("selftests: lib.mk: add test execute bit check to EMIT_TESTS") had
> already broken run_kselftest.sh for powerpc due to the executable check
> using the wrong path.
> 
> It can be fixed by replacing the EMIT_TESTS definitions with actual
> emit_tests rules in the powerpc Makefiles. This makes run_kselftest.sh
> able to run powerpc tests:
> 
>   $ cd linux
>   $ export ARCH=powerpc
>   $ export CROSS_COMPILE=powerpc64le-linux-gnu-
>   $ make headers
>   $ make -j -C tools/testing/selftests install
>   $ grep -c "^powerpc" 
> tools/testing/selftests/kselftest_install/kselftest-list.txt
>   182
> 
> Fixes: d4e59a536f50 ("selftests: Use runner.sh for emit targets")
> Signed-off-by: Michael Ellerman 

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v4 1/5] powerpc/bpf: implement bpf_arch_text_copy

2023-09-25 Thread Song Liu
On Fri, Sep 8, 2023 at 6:28 AM Hari Bathini  wrote:
>
> bpf_arch_text_copy is used to dump JITed binary to RX page, allowing
> multiple BPF programs to share the same page. Use patch_instruction()
> to implement it.
>
> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/net/bpf_jit_comp.c | 41 -
>  1 file changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index 37043dfc1add..4f896222c579 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -13,9 +13,12 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>
> +#include 
> +#include 
> +
>  #include "bpf_jit.h"
>
>  static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
> @@ -23,6 +26,28 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size)
>     memset32(area, BREAKPOINT_INSTRUCTION, size / 4);
>  }
>
> +/*
> + * Patch 'len' bytes of instructions from opcode to addr, one instruction
> + * at a time. Returns addr on success. ERR_PTR(-EINVAL), otherwise.
> + */
> +static void *bpf_patch_instructions(void *addr, void *opcode, size_t len, bool fill_insn)
> +{
> +   while (len > 0) {
> +       ppc_inst_t insn = ppc_inst_read(opcode);
> +       int ilen = ppc_inst_len(insn);
> +
> +       if (patch_instruction(addr, insn))
> +           return ERR_PTR(-EINVAL);

Is there any reason we have to do this one instruction at a time? I believe
Christophe Leroy pointed out the same in an earlier version?

Thanks,
Song

> +
> +       len -= ilen;
> +       addr = addr + ilen;
> +       if (!fill_insn)
> +           opcode = opcode + ilen;
> +   }
> +
> +   return addr;
> +}
> +
>  int bpf_jit_emit_exit_insn(u32 *image, struct codegen_context *ctx, int tmp_reg, long exit_addr)
>  {
>     if (!exit_addr || is_offset_in_branch_range(exit_addr - (ctx->idx * 4))) {
> @@ -274,3 +299,17 @@ int bpf_add_extable_entry(struct bpf_prog *fp, u32 *image, int pass, struct code
>     ctx->exentry_idx++;
>     return 0;
>  }
> +
> +void *bpf_arch_text_copy(void *dst, void *src, size_t len)
> +{
> +   void *ret;
> +
> +   if (WARN_ON_ONCE(core_kernel_text((unsigned long)dst)))
> +       return ERR_PTR(-EINVAL);
> +
> +   mutex_lock(&text_mutex);
> +   ret = bpf_patch_instructions(dst, src, len, false);
> +   mutex_unlock(&text_mutex);
> +
> +   return ret;
> +}

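To make the cost behind this question concrete, here is a rough userspace sketch comparing how many set-up/patch/tear-down cycles are needed when patching one instruction at a time versus once per page, which is the approach patch 4/5 of this series describes. It is an illustration under assumptions, not kernel code: the sketch_* names are hypothetical, the page size is assumed to be 4K, and instructions are assumed to be 4 bytes each.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Bytes that can be patched at addr before crossing a page boundary. */
static size_t sketch_chunk_len(uintptr_t addr, size_t len)
{
    size_t room = SKETCH_PAGE_SIZE - (addr & (SKETCH_PAGE_SIZE - 1));

    return len < room ? len : room;
}

/* Cycles needed when the pte is set up and torn down once per page. */
static size_t sketch_cycles_batched(uintptr_t addr, size_t len)
{
    size_t cycles = 0;

    while (len > 0) {
        size_t plen = sketch_chunk_len(addr, len);

        /* one pte setup and one tlb flush would cover plen bytes here */
        addr += plen;
        len -= plen;
        cycles++;
    }
    return cycles;
}

/* Cycles needed when every 4-byte instruction is patched on its own. */
static size_t sketch_cycles_per_insn(size_t len)
{
    return len / 4;
}
```

For a full page of instructions the per-instruction variant needs 1024 cycles where the batched one needs a single cycle, which is the overhead being questioned here.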

Re: [PATCH] powerpc/stacktrace: Fix arch_stack_walk_reliable()

2023-09-25 Thread Joe Lawrence
On Fri, Sep 22, 2023 at 09:24:41AM +1000, Michael Ellerman wrote:
> The changes to copy_thread() made in commit eed7c420aac7 ("powerpc:
> copy_thread differentiate kthreads and user mode threads") inadvertently
> broke arch_stack_walk_reliable() because it has knowledge of the stack
> layout.
> 
> Fix it by changing the condition to match the new logic in
> copy_thread(). The changes make the comments about the stack layout
> incorrect; rather than rephrasing them, just refer the reader to
> copy_thread().
> 
> Also the comment about the stack backchain is no longer true, since
> commit edbd0387f324 ("powerpc: copy_thread add a back chain to the
> switch stack frame"), so remove that as well.
> 
> Reported-by: Joe Lawrence 
> Signed-off-by: Michael Ellerman 
> Fixes: eed7c420aac7 ("powerpc: copy_thread differentiate kthreads and user 
> mode threads")
> ---
>  arch/powerpc/kernel/stacktrace.c | 27 +--
>  1 file changed, 5 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/stacktrace.c 
> b/arch/powerpc/kernel/stacktrace.c
> index b15f15dcacb5..e6a958a5da27 100644
> --- a/arch/powerpc/kernel/stacktrace.c
> +++ b/arch/powerpc/kernel/stacktrace.c
> @@ -73,29 +73,12 @@ int __no_sanitize_address 
> arch_stack_walk_reliable(stack_trace_consume_fn consum
>   bool firstframe;
>  
>   stack_end = stack_page + THREAD_SIZE;
> - if (!is_idle_task(task)) {
> - /*
> -  * For user tasks, this is the SP value loaded on
> -  * kernel entry, see "PACAKSAVE(r13)" in _switch() and
> -  * system_call_common().
> -  *
> -  * Likewise for non-swapper kernel threads,
> -  * this also happens to be the top of the stack
> -  * as setup by copy_thread().
> -  *
> -  * Note that stack backlinks are not properly setup by
> -  * copy_thread() and thus, a forked task() will have
> -  * an unreliable stack trace until it's been
> -  * _switch()'ed to for the first time.
> -  */
> - stack_end -= STACK_USER_INT_FRAME_SIZE;
> - } else {
> - /*
> -  * idle tasks have a custom stack layout,
> -  * c.f. cpu_idle_thread_init().
> -  */
> +
> + // See copy_thread() for details.
> + if (task->flags & PF_KTHREAD)
>   stack_end -= STACK_FRAME_MIN_SIZE;
> - }
> + else
> + stack_end -= STACK_USER_INT_FRAME_SIZE;
>  
>   if (task == current)
>   sp = current_stack_frame();
> -- 
> 2.41.0
> 
> 

Reviewed-by: Joe Lawrence 

Thanks for posting, Michael.

Livepatching kselftests are happy now.  Minimal kpatch testing is good, too
(we have not rebased our full integration tests onto the latest upstream
just yet).

--
Joe

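As a reading aid for the fix above, the new condition can be restated as a small standalone C sketch. It is a simplified illustration under assumptions, not the kernel function: sketch_stack_end() is a hypothetical name, the thread and frame sizes are placeholder values chosen for the example, and the SKETCH_PF_KTHREAD value is assumed to mirror the kernel's PF_KTHREAD bit.

```c
#include <stdint.h>

/* All values below are assumptions for illustration only. */
#define SKETCH_PF_KTHREAD           0x00200000u
#define SKETCH_THREAD_SIZE          16384u
#define SKETCH_FRAME_MIN_SIZE       32u
#define SKETCH_USER_INT_FRAME_SIZE  400u

/*
 * Pick the highest stack address the walker should accept: kernel
 * threads leave a minimal frame at the top of the stack, everything
 * else leaves a user interrupt frame there.
 */
static uintptr_t sketch_stack_end(uintptr_t stack_page, unsigned int flags)
{
    uintptr_t stack_end = stack_page + SKETCH_THREAD_SIZE;

    if (flags & SKETCH_PF_KTHREAD)
        stack_end -= SKETCH_FRAME_MIN_SIZE;
    else
        stack_end -= SKETCH_USER_INT_FRAME_SIZE;

    return stack_end;
}
```

The decision now keys on the kthread flag rather than on whether the task is the idle task, which is the change in condition the patch makes.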



[PATCH v2] cpufreq: pmac32: Use of_property_read_reg() to parse "reg"

2023-09-25 Thread Rob Herring
Use the recently added of_property_read_reg() helper to get the
untranslated "reg" address value.

Acked-by: Viresh Kumar 
Signed-off-by: Rob Herring 
---
v2:
 - Add missing include
---
 drivers/cpufreq/pmac32-cpufreq.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/cpufreq/pmac32-cpufreq.c b/drivers/cpufreq/pmac32-cpufreq.c
index ec75e79659ac..df3567c1e93b 100644
--- a/drivers/cpufreq/pmac32-cpufreq.c
+++ b/drivers/cpufreq/pmac32-cpufreq.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -378,10 +379,9 @@ static int pmac_cpufreq_cpu_init(struct cpufreq_policy *policy)
 
 static u32 read_gpio(struct device_node *np)
 {
-   const u32 *reg = of_get_property(np, "reg", NULL);
-   u32 offset;
+   u64 offset;
 
-   if (reg == NULL)
+   if (of_property_read_reg(np, 0, &offset, NULL) < 0)
        return 0;
    /* That works for all keylargos but shall be fixed properly
     * some day... The problem is that it seems we can't rely
@@ -389,7 +389,6 @@ static u32 read_gpio(struct device_node *np)
     * relative to the base of KeyLargo or to the base of the
     * GPIO space, and the device-tree doesn't help.
     */
-   offset = *reg;
    if (offset < KEYLARGO_GPIO_LEVELS0)
        offset += KEYLARGO_GPIO_LEVELS0;
    return offset;
-- 
2.40.1



[PATCH v2 37/37] powerpc: Support execute-only on all powerpc

2023-09-25 Thread Christophe Leroy
Introduce PAGE_EXECONLY_X macro which provides exec-only rights.
The _X may be seen as redundant with the EXECONLY but it helps
keep consistancy, all macros having the EXEC right have _X.

And put it next to PAGE_NONE as PAGE_EXECONLY_X is
somehow PAGE_NONE + EXEC just like all other SOMETHING_X are
just SOMETHING + EXEC.

On book3s/64 PAGE_EXECONLY becomes PAGE_READONLY_X.

On book3s/64, as PAGE_EXECONLY is only valid for Radix add
VM_READ flag in vm_get_page_prot() for non-Radix.

And update access_error() so that a non exec fault on a VM_EXEC only
mapping is always invalid, even when the underlying layer don't
always generate a fault for that.

For 8xx, set PAGE_EXECONLY_X as _PAGE_NA | _PAGE_EXEC.
For others, only set it as just _PAGE_EXEC

With that change, 8xx, e500 and 44x fully honor execute-only
protection.

On 40x that is a partial implementation of execute-only. The
implementation won't be complete because once a TLB has been loaded
via the Instruction TLB miss handler, it will be possible to read
the page. But at least it can't be read unless it is executed first.
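
The 40x limitation can be pictured with a toy model. This is only a
sketch under simplifying assumptions: a single TLB entry shared by
instruction fetches and data loads, and hypothetical names (toy_tlb,
toy_fetch, toy_load, PTE_READ, PTE_EXEC) that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical PTE permission bits for the toy model. */
#define PTE_READ	0x1u
#define PTE_EXEC	0x2u

/* One TLB entry, shared by instruction fetches and data loads. */
struct toy_tlb {
	bool loaded;
};

/* Instruction fetch: the instruction miss handler loads the entry. */
static inline bool toy_fetch(struct toy_tlb *tlb, unsigned int pte)
{
	if (!(pte & PTE_EXEC))
		return false;
	tlb->loaded = true;
	return true;
}

/* Data load: only the data miss handler checks PTE_READ. */
static inline bool toy_load(struct toy_tlb *tlb, unsigned int pte)
{
	if (tlb->loaded)
		return true;	/* hit: read permission is not checked again */
	if (!(pte & PTE_READ))
		return false;	/* miss: handler refuses to load the entry */
	tlb->loaded = true;
	return true;
}
```

In this model a load from an exec-only page fails until the page has
been executed once, after which the load succeeds.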

On 603 MMU, TLB misses are handled by SW and there are separate
DTLB and ITLB. Execute-only is therefore now supported by not loading
DTLB when read access is not permitted.

On hash (604) MMU it is trickier because the hash table is common to
load/store and execute. Nevertheless it is still possible to check
whether _PAGE_READ is set before loading hash table for a load/store
access. At least it can't be read unless it is executed first.

Signed-off-by: Christophe Leroy 
Cc: Russell Currey 
Cc: Kees Cook 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  2 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h |  4 +---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h |  1 +
 arch/powerpc/include/asm/nohash/pgtable.h|  2 +-
 arch/powerpc/include/asm/nohash/pte-e500.h   |  1 +
 arch/powerpc/include/asm/pgtable-masks.h |  2 ++
 arch/powerpc/mm/book3s64/pgtable.c   | 10 --
 arch/powerpc/mm/fault.c  |  9 +
 arch/powerpc/mm/pgtable.c|  4 ++--
 9 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 244621c88510..52971ee30717 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -425,7 +425,7 @@ static inline bool pte_access_permitted(pte_t pte, bool 
write)
 {
/*
 * A read-only access is controlled by _PAGE_READ bit.
-* We have _PAGE_READ set for WRITE and EXECUTE
+* We have _PAGE_READ set for WRITE
 */
if (!pte_present(pte) || !pte_read(pte))
return false;
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 0fd12bdc7b5e..751b01227e36 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -18,6 +18,7 @@
 #define _PAGE_WRITE 0x2 /* write access allowed */
 #define _PAGE_READ 0x4 /* read access allowed */
 #define _PAGE_NA   _PAGE_PRIVILEGED
+#define _PAGE_NAX  _PAGE_EXEC
 #define _PAGE_RO   _PAGE_READ
 #define _PAGE_ROX  (_PAGE_READ | _PAGE_EXEC)
 #define _PAGE_RW   (_PAGE_READ | _PAGE_WRITE)
@@ -141,9 +142,6 @@
 
 #include 
 
-/* Radix only, Hash uses PAGE_READONLY_X + execute-only pkey instead */
-#define PAGE_EXECONLY  __pgprot(_PAGE_BASE | _PAGE_EXEC)
-
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
 #define PAGE_KERNEL_NC __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | 
_PAGE_TOLERANT)
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 1ee38befd29a..137dc3c84e45 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -48,6 +48,7 @@
 
 #define _PAGE_HUGE 0x0800  /* Copied to L1 PS bit 29 */
 
+#define _PAGE_NAX  (_PAGE_NA | _PAGE_EXEC)
 #define _PAGE_ROX  (_PAGE_RO | _PAGE_EXEC)
 #define _PAGE_RW   0
 #define _PAGE_RWX  _PAGE_EXEC
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index f922c84b23eb..a50be1de9f83 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -203,7 +203,7 @@ static inline bool pte_access_permitted(pte_t pte, bool 
write)
 {
/*
 * A read-only access is controlled by _PAGE_READ bit.
-* We have _PAGE_READ set for WRITE and EXECUTE
+* We have _PAGE_READ set for WRITE
 */
if (!pte_present(pte) || !pte_read(pte))
return false;
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h 
b/arch/powerpc/include/asm/nohash/pte-e500.h
index 31d2c3ea7df8..f516f0b5b7a8 100644
-

[PATCH v2 27/37] powerpc/64s: Use generic permission masks

2023-09-25 Thread Christophe Leroy
book3s64 needs specific masks because it needs _PAGE_PRIVILEGED
for PAGE_NONE.

book3s64 already has _PAGE_RW and _PAGE_RWX.
So add _PAGE_NA, _PAGE_RO and _PAGE_ROX and remove specific
permission masks.

Signed-off-by: Christophe Leroy 
---
Not sure why it needs _PAGE_PRIVILEGED as it also has _PAGE_READ
and _PAGE_READ is not set on PAGE_NONE.
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 20 +---
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c3b921769ece..0fd12bdc7b5e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -17,6 +17,9 @@
 #define _PAGE_EXEC 0x1 /* execute permission */
 #define _PAGE_WRITE 0x2 /* write access allowed */
 #define _PAGE_READ 0x4 /* read access allowed */
+#define _PAGE_NA   _PAGE_PRIVILEGED
+#define _PAGE_RO   _PAGE_READ
+#define _PAGE_ROX  (_PAGE_READ | _PAGE_EXEC)
 #define _PAGE_RW   (_PAGE_READ | _PAGE_WRITE)
 #define _PAGE_RWX  (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
 #define _PAGE_PRIVILEGED   0x8 /* kernel access only */
@@ -136,21 +139,8 @@
 #define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED)
 #define _PAGE_BASE (_PAGE_BASE_NC)
 
-/* Permission masks used to generate the __P and __S table,
- *
- * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
- *
- * Write permissions imply read permissions for now (we could make write-only
- * pages on BookE but we don't bother for now). Execute permission control is
- * possible on platforms that define _PAGE_EXEC
- */
-#define PAGE_NONE  __pgprot(_PAGE_BASE | _PAGE_PRIVILEGED)
-#define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_RW | _PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_READ)
-#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_READ)
-#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
+#include 
+
 /* Radix only, Hash uses PAGE_READONLY_X + execute-only pkey instead */
 #define PAGE_EXECONLY  __pgprot(_PAGE_BASE | _PAGE_EXEC)
 
-- 
2.41.0



[PATCH v2 25/37] powerpc: Refactor permission masks used for __P/__S table and kernel memory flags

2023-09-25 Thread Christophe Leroy
Prepare a common version of the permission masks that will be based
on _PAGE_NA, _PAGE_RO, _PAGE_ROX, _PAGE_RW, _PAGE_RWX that will be
defined in platform specific headers in later patches.

Put them in a new header pgtable-masks.h which will be included by
platforms.

And prepare a common version of flags used for mapping kernel memory
that will be based on _PAGE_RO, _PAGE_ROX, _PAGE_RW, _PAGE_RWX that
will be defined in platform specific headers.

Define them only if _PAGE_KERNEL_RO is not already defined, so that
platform specific definitions can be dismantled one by one.
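
As an illustration of the fallback mechanism, here is a standalone
sketch of how a platform could override the defaults. The bit values
and the choice of _PAGE_PRIVILEGED for _PAGE_NA are assumptions made
for the example, and plain integers stand in for __pgprot().

```c
#include <assert.h>

/* Hypothetical platform bits, values chosen for illustration only. */
#define _PAGE_PRESENT		0x01u
#define _PAGE_READ		0x02u
#define _PAGE_WRITE		0x04u
#define _PAGE_EXEC		0x08u
#define _PAGE_PRIVILEGED	0x10u
#define _PAGE_BASE		_PAGE_PRESENT

/*
 * A platform needing a non-zero "no access" encoding defines the whole
 * group itself before the common definitions are pulled in.
 */
#define _PAGE_NA	_PAGE_PRIVILEGED
#define _PAGE_RO	_PAGE_READ
#define _PAGE_ROX	(_PAGE_READ | _PAGE_EXEC)
#define _PAGE_RW	(_PAGE_READ | _PAGE_WRITE)
#define _PAGE_RWX	(_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)

/* Common part: the defaults are skipped because _PAGE_NA already exists. */
#ifndef _PAGE_NA
#define _PAGE_NA	0
#define _PAGE_RO	_PAGE_READ
#define _PAGE_ROX	(_PAGE_READ | _PAGE_EXEC)
#define _PAGE_RW	(_PAGE_READ | _PAGE_WRITE)
#define _PAGE_RWX	(_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
#endif

/* Plain integers stand in for __pgprot() in this sketch. */
#define PAGE_NONE	(_PAGE_BASE | _PAGE_NA)
#define PAGE_READONLY_X	(_PAGE_BASE | _PAGE_ROX)
```

Note that the single #ifndef _PAGE_NA guards all five masks, so a
platform that defines _PAGE_NA is expected to define the other four too.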

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/pgtable-masks.h | 30 
 1 file changed, 30 insertions(+)
 create mode 100644 arch/powerpc/include/asm/pgtable-masks.h

diff --git a/arch/powerpc/include/asm/pgtable-masks.h 
b/arch/powerpc/include/asm/pgtable-masks.h
new file mode 100644
index ..808a3b9e8fc0
--- /dev/null
+++ b/arch/powerpc/include/asm/pgtable-masks.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_PGTABLE_MASKS_H
+#define _ASM_POWERPC_PGTABLE_MASKS_H
+
+#ifndef _PAGE_NA
+#define _PAGE_NA   0
+#define _PAGE_RO   _PAGE_READ
+#define _PAGE_ROX  (_PAGE_READ | _PAGE_EXEC)
+#define _PAGE_RW   (_PAGE_READ | _PAGE_WRITE)
+#define _PAGE_RWX  (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
+#endif
+
+/* Permission flags for kernel mappings */
+#ifndef _PAGE_KERNEL_RO
+#define _PAGE_KERNEL_RO _PAGE_RO
+#define _PAGE_KERNEL_ROX   _PAGE_ROX
+#define _PAGE_KERNEL_RW (_PAGE_RW | _PAGE_DIRTY)
+#define _PAGE_KERNEL_RWX   (_PAGE_RWX | _PAGE_DIRTY)
+#endif
+
+/* Permission masks used to generate the __P and __S table */
+#define PAGE_NONE  __pgprot(_PAGE_BASE | _PAGE_NA)
+#define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_RW)
+#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_RWX)
+#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_RO)
+#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_ROX)
+#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_RO)
+#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_ROX)
+
+#endif /* _ASM_POWERPC_PGTABLE_MASKS_H */
-- 
2.41.0



[PATCH v2 33/37] powerpc/32s: Add _PAGE_WRITE to supplement _PAGE_RW

2023-09-25 Thread Christophe Leroy
In several places, _PAGE_RW maps to write permission and doesn't
always imply read. To make it clearer, do as book3s/64 did in
commit c7d54842deb1 ("powerpc/mm: Use _PAGE_READ to indicate
Read access") and use _PAGE_WRITE when more relevant.

For the time being _PAGE_WRITE is equivalent to _PAGE_RW but that
will change when _PAGE_READ gets added in following patches.
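
A small sketch of why the distinction matters once _PAGE_READ exists.
The PG_* values and helper names below are hypothetical and only model
the situation after the later patches, where the RW mask would be
READ | WRITE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout once a separate read bit exists. */
#define PG_READ		0x1u
#define PG_WRITE	0x2u
#define PG_RW		(PG_READ | PG_WRITE)

typedef uint32_t pte_bits;

/* Making a page writable sets both bits: write implies read. */
static inline pte_bits mkwrite(pte_bits pte)
{
	return pte | PG_RW;
}

/* Clearing the combined mask would also drop read permission. */
static inline pte_bits wrprotect_with_rw(pte_bits pte)
{
	return pte & ~PG_RW;
}

/* Clearing only the write bit keeps the page readable. */
static inline pte_bits wrprotect_with_write(pte_bits pte)
{
	return pte & ~PG_WRITE;
}
```

This is why write-protect and write-test helpers switch to _PAGE_WRITE
while pte_mkwrite_novma() keeps using _PAGE_RW.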

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 15 ---
 arch/powerpc/kernel/head_book3s_32.S |  6 +++---
 arch/powerpc/mm/book3s32/hash_low.S  | 12 ++--
 arch/powerpc/mm/book3s32/mmu.c   |  2 +-
 4 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 80505915c77c..480ad6b4fd6f 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -31,6 +31,8 @@
 #define _PAGE_RW   0x400   /* software: user write access allowed */
 #define _PAGE_SPECIAL  0x800   /* software: Special page */
 
+#define _PAGE_WRITE _PAGE_RW
+
 #ifdef CONFIG_PTE_64BIT
 /* We never clear the high word of the pte */
 #define _PTE_NONE_MASK (0xULL | _PAGE_HASHPTE)
@@ -347,7 +349,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm, unsigned long addr,
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
 {
-   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+   pte_update(mm, addr, ptep, _PAGE_WRITE, 0, 0);
 }
 
 static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
@@ -406,7 +408,11 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
 }
 
 /* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_RW);}
+static inline bool pte_write(pte_t pte)
+{
+   return !!(pte_val(pte) & _PAGE_WRITE);
+}
+
 static inline int pte_read(pte_t pte)  { return 1; }
 static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_DIRTY); }
 static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_ACCESSED); }
@@ -469,7 +475,7 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t 
pgprot)
 /* Generic modifiers for PTE bits */
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-   return __pte(pte_val(pte) & ~_PAGE_RW);
+   return __pte(pte_val(pte) & ~_PAGE_WRITE);
 }
 
 static inline pte_t pte_exprotect(pte_t pte)
@@ -499,6 +505,9 @@ static inline pte_t pte_mkpte(pte_t pte)
 
 static inline pte_t pte_mkwrite_novma(pte_t pte)
 {
+   /*
+* write implies read, hence set both
+*/
return __pte(pte_val(pte) | _PAGE_RW);
 }
 
diff --git a/arch/powerpc/kernel/head_book3s_32.S 
b/arch/powerpc/kernel/head_book3s_32.S
index 6764b98ca360..615d429d7bd1 100644
--- a/arch/powerpc/kernel/head_book3s_32.S
+++ b/arch/powerpc/kernel/head_book3s_32.S
@@ -503,9 +503,9 @@ DataLoadTLBMiss:
andc.   r1,r1,r0/* check access & ~permission */
bne- DataAddressInvalid  /* return if access not permitted */
/* Convert linux-style PTE to low word of PPC-style PTE */
-   rlwinm  r1,r0,32-9,30,30/* _PAGE_RW -> PP msb */
+   rlwinm  r1,r0,32-9,30,30/* _PAGE_WRITE -> PP msb */
rlwimi  r0,r0,32-1,30,30/* _PAGE_USER -> PP msb */
-   rlwimi  r1,r0,32-3,24,24/* _PAGE_RW -> _PAGE_DIRTY */
+   rlwimi  r1,r0,32-3,24,24/* _PAGE_WRITE -> _PAGE_DIRTY */
rlwimi  r0,r0,32-1,31,31/* _PAGE_USER -> PP lsb */
xori r1,r1,_PAGE_DIRTY   /* clear dirty when not rw */
ori r1,r1,0xe04 /* clear out reserved bits */
@@ -689,7 +689,7 @@ hash_page_dsi:
mfdar   r4
mfsrr0  r5
mfsrr1  r9
-   rlwinm  r3, r3, 32 - 15, _PAGE_RW   /* DSISR_STORE -> _PAGE_RW */
+   rlwinm  r3, r3, 32 - 15, _PAGE_WRITE/* DSISR_STORE -> _PAGE_WRITE */
bl  hash_page
mfspr   r10, SPRN_SPRG_THREAD
restore_regs_thread r10
diff --git a/arch/powerpc/mm/book3s32/hash_low.S 
b/arch/powerpc/mm/book3s32/hash_low.S
index 8b804e1a9fa4..acb0584c174c 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -37,7 +37,7 @@
 /*
  * Load a PTE into the hash table, if possible.
  * The address is in r4, and r3 contains an access flag:
- * _PAGE_RW (0x400) if a write.
+ * _PAGE_WRITE (0x400) if a write.
  * r9 contains the SRR1 value, from which we use the MSR_PR bit.
  * SPRG_THREAD contains the physical address of the current task's thread.
  *
@@ -113,15 +113,15 @@ _GLOBAL(hash_page)
lwarx   r6,0,r8 /* get linux-style pte, flag word */
 #ifdef CONFIG_PPC_KUAP
mfsrin  r5,r4
-   rlwinm  r0,r9,28,_PAGE_RW   /* MSR[PR] => _PAGE_RW */
-   rlwinm  r5,r5,12,_PAGE_RW   /* 

[PATCH v2 30/37] powerpc/e500: Introduce _PAGE_READ and remove _PAGE_USER

2023-09-25 Thread Christophe Leroy
e500 MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user

It means that it can support X without R.

To do that, _PAGE_READ flag is needed.

With a 32-bit PTE there is no bit available for it in the PTE. On the
other hand the only real use of _PAGE_USER is to implement PAGE_NONE
by clearing _PAGE_USER. As PAGE_NONE can also be implemented by
clearing _PAGE_READ, remove _PAGE_USER and add _PAGE_READ. Move
_PAGE_PRESENT into bit 30 so that _PAGE_READ can match SR bit.

With a 64-bit PTE, _PAGE_USER is already the combination of SR and UR
so all we need to do is to rename it _PAGE_READ.
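
As a rough illustration of "X without R" with six permission bits, the
sketch below builds the masks from per-mode bits in the same way as the
64-bit PTE definitions in this patch. The BAP_* bit positions, PG_XONLY
and the helper names are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative positions for the six permission bits. */
#define BAP_SR	0x04u	/* supervisor read */
#define BAP_UR	0x08u	/* user read */
#define BAP_SW	0x10u	/* supervisor write */
#define BAP_UW	0x20u	/* user write */
#define BAP_SX	0x40u	/* supervisor execute */
#define BAP_UX	0x80u	/* user execute */

#define PG_READ		(BAP_SR | BAP_UR)
#define PG_WRITE	(BAP_SW | BAP_UW)

#define PG_NA		0u
#define PG_RO		PG_READ
#define PG_ROX		(PG_READ | BAP_UX)
#define PG_RW		(PG_READ | PG_WRITE)
#define PG_RWX		(PG_READ | PG_WRITE | BAP_UX)
#define PG_XONLY	BAP_UX	/* hypothetical name: execute without read */

static inline bool user_can_read(unsigned int prot)
{
	return (prot & BAP_UR) != 0;
}

static inline bool user_can_exec(unsigned int prot)
{
	return (prot & BAP_UX) != 0;
}
```

Because read and execute are separate bits, clearing the read bits
yields PAGE_NONE or an execute-only mapping without needing a user bit.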

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-85xx.h | 22 ---
 arch/powerpc/include/asm/nohash/pte-e500.h| 20 -
 arch/powerpc/kernel/head_85xx.S   | 10 -
 3 files changed, 18 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-85xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-85xx.h
index 462acf69e302..653a342d3b25 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-85xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-85xx.h
@@ -17,9 +17,9 @@
 */
 
 /* Definitions for FSL Book-E Cores */
-#define _PAGE_PRESENT  0x1 /* S: PTE contains a translation */
-#define _PAGE_USER 0x2 /* S: User page (maps to UR) */
-#define _PAGE_RW   0x4 /* S: Write permission (SW) */
+#define _PAGE_READ 0x1 /* H: Read permission (SR) */
+#define _PAGE_PRESENT  0x2 /* S: PTE contains a translation */
+#define _PAGE_WRITE0x4 /* S: Write permission (SW) */
 #define _PAGE_DIRTY0x8 /* S: Page dirty */
 #define _PAGE_EXEC 0x00010 /* H: SX permission */
 #define _PAGE_ACCESSED 0x00020 /* S: Page referenced */
@@ -31,13 +31,6 @@
 #define _PAGE_WRITETHRU 0x00400 /* H: W bit */
 #define _PAGE_SPECIAL  0x00800 /* S: Special page */
 
-#define _PAGE_WRITE _PAGE_RW
-
-#define _PAGE_KERNEL_RO 0
-#define _PAGE_KERNEL_ROX   _PAGE_EXEC
-#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW)
-#define _PAGE_KERNEL_RWX   (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
-
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE 0
 
@@ -63,14 +56,7 @@
 #define _PAGE_BASE (_PAGE_BASE_NC)
 #endif
 
-/* Permission masks used to generate the __P and __S table */
-#define PAGE_NONE  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | 
_PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#include 
 
 #endif /* __KERNEL__ */
 #endif /*  _ASM_POWERPC_NOHASH_32_PTE_FSL_85xx_H */
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h 
b/arch/powerpc/include/asm/nohash/pte-e500.h
index b775c7d465a4..31d2c3ea7df8 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -48,14 +48,19 @@
 
 /* "Higher level" linux bit combinations */
 #define _PAGE_EXEC (_PAGE_BAP_SX | _PAGE_BAP_UX) /* .. and was 
cache cleaned */
-#define _PAGE_RW   (_PAGE_BAP_SW | _PAGE_BAP_UW) /* User write 
permission */
+#define _PAGE_READ (_PAGE_BAP_SR | _PAGE_BAP_UR) /* User read 
permission */
+#define _PAGE_WRITE (_PAGE_BAP_SW | _PAGE_BAP_UW) /* User write 
permission */
+
 #define _PAGE_KERNEL_RW (_PAGE_BAP_SW | _PAGE_BAP_SR | 
_PAGE_DIRTY)
 #define _PAGE_KERNEL_RO (_PAGE_BAP_SR)
 #define _PAGE_KERNEL_RWX   (_PAGE_BAP_SW | _PAGE_BAP_SR | _PAGE_DIRTY | 
_PAGE_BAP_SX)
 #define _PAGE_KERNEL_ROX   (_PAGE_BAP_SR | _PAGE_BAP_SX)
-#define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
 
-#define _PAGE_WRITE _PAGE_RW
+#define _PAGE_NA   0
+#define _PAGE_RO   _PAGE_READ
+#define _PAGE_ROX  (_PAGE_READ | _PAGE_BAP_UX)
+#define _PAGE_RW   (_PAGE_READ | _PAGE_WRITE)
+#define _PAGE_RWX  (_PAGE_READ | _PAGE_WRITE | _PAGE_BAP_UX)
 
 #define _PAGE_SPECIAL  _PAGE_SW0
 
@@ -90,14 +95,7 @@
 #define _PAGE_BASE (_PAGE_BASE_NC)
 #endif
 
-/* Permission masks used to generate the __P and __S table */
-#define PAGE_NONE  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | 
_PAGE_BAP_UX)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_BAP_UX)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_BAP_UX)
+#include 
 
 #ifndef __ASSEMBLY__
 static inline pte_t pte_mkexec(pte_t pte)
diff --git a/arch/powerpc/kernel/head_85x

[PATCH v2 28/37] powerpc/nohash: Add _PAGE_WRITE to supplement _PAGE_RW

2023-09-25 Thread Christophe Leroy
In several places, _PAGE_RW maps to write permission and doesn't
always imply read. To make it clearer, do as book3s/64 did in
commit c7d54842deb1 ("powerpc/mm: Use _PAGE_READ to indicate
Read access") and use _PAGE_WRITE when more relevant.

For the time being _PAGE_WRITE is equivalent to _PAGE_RW but that
will change when _PAGE_READ gets added in following patches.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-40x.h  |  2 ++
 arch/powerpc/include/asm/nohash/32/pte-44x.h  |  2 ++
 arch/powerpc/include/asm/nohash/32/pte-85xx.h |  2 ++
 arch/powerpc/include/asm/nohash/64/pgtable.h  |  2 +-
 arch/powerpc/include/asm/nohash/pgtable.h |  9 ++---
 arch/powerpc/include/asm/nohash/pte-e500.h|  2 ++
 arch/powerpc/kernel/head_40x.S| 12 ++--
 arch/powerpc/kernel/head_44x.S|  4 ++--
 arch/powerpc/kernel/head_85xx.S   |  2 +-
 arch/powerpc/mm/nohash/e500.c |  4 ++--
 10 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h 
b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 0b4e5f8ce3e8..e28ef0f5781e 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -49,6 +49,8 @@
 #define _PAGE_EXEC 0x200   /* hardware: EX permission */
 #define _PAGE_ACCESSED 0x400   /* software: R: page referenced */
 
+#define _PAGE_WRITE _PAGE_RW
+
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE 0
 
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h 
b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index b7ed13cee137..fc0c075006ea 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -75,6 +75,8 @@
 #define _PAGE_NO_CACHE 0x0400  /* H: I bit */
 #define _PAGE_WRITETHRU 0x0800  /* H: W bit */
 
+#define _PAGE_WRITE _PAGE_RW
+
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE 0
 
diff --git a/arch/powerpc/include/asm/nohash/32/pte-85xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-85xx.h
index 16451df5ddb0..462acf69e302 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-85xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-85xx.h
@@ -31,6 +31,8 @@
 #define _PAGE_WRITETHRU 0x00400 /* H: W bit */
 #define _PAGE_SPECIAL  0x00800 /* S: Special page */
 
+#define _PAGE_WRITE _PAGE_RW
+
 #define _PAGE_KERNEL_RO 0
 #define _PAGE_KERNEL_ROX   _PAGE_EXEC
 #define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 36b9bad428cc..2202c78730e8 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -149,7 +149,7 @@ static inline void p4d_set(p4d_t *p4dp, unsigned long val)
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
 {
-   pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
+   pte_update(mm, addr, ptep, _PAGE_WRITE, 0, 1);
 }
 
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 200f2dbf48e2..ee677162f9e6 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -84,7 +84,7 @@ static inline int ptep_test_and_clear_young(struct 
vm_area_struct *vma,
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
 {
-   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+   pte_update(mm, addr, ptep, _PAGE_WRITE, 0, 0);
 }
 #endif
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -123,6 +123,9 @@ static inline void __ptep_set_access_flags(struct 
vm_area_struct *vma,
 #ifndef pte_mkwrite_novma
 static inline pte_t pte_mkwrite_novma(pte_t pte)
 {
+   /*
+* write implies read, hence set both
+*/
return __pte(pte_val(pte) | _PAGE_RW);
 }
 #endif
@@ -140,7 +143,7 @@ static inline pte_t pte_mkyoung(pte_t pte)
 #ifndef pte_wrprotect
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-   return __pte(pte_val(pte) & ~_PAGE_RW);
+   return __pte(pte_val(pte) & ~_PAGE_WRITE);
 }
 #endif
 
@@ -154,7 +157,7 @@ static inline pte_t pte_mkexec(pte_t pte)
 #ifndef pte_write
 static inline int pte_write(pte_t pte)
 {
-   return pte_val(pte) & _PAGE_RW;
+   return pte_val(pte) & _PAGE_WRITE;
 }
 #endif
 #ifndef pte_read
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h 
b/arch/powerpc/include/asm/nohash/pte-e500.h
index 9f9e3f02d414..b775c7d465a4 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -55,6 +55,8 @@
 #define _PAGE_KERNEL_ROX   (_PAGE_BAP_SR | _PAGE_BAP_SX)
 #define _PAGE_USER

[PATCH v2 31/37] powerpc/44x: Introduce _PAGE_READ and remove _PAGE_USER

2023-09-25 Thread Christophe Leroy
44x MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user

It means that it can support X without R.

To do that, _PAGE_READ flag is needed but there is no bit available
for it in PTE. On the other hand the only real use of _PAGE_USER is
to implement PAGE_NONE by clearing _PAGE_USER.

As PAGE_NONE can also be implemented by clearing _PAGE_READ,
remove _PAGE_USER and add _PAGE_READ. In order to insert bits in
one go during TLB miss, move _PAGE_ACCESSED and put _PAGE_READ
just after _PAGE_DIRTY so that _PAGE_DIRTY is copied into SW and
_PAGE_READ into SR at once.
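
The effect of placing _PAGE_READ right below _PAGE_DIRTY can be sketched
in C. The PTE values are the ones from the hunk below; the TLB-side
positions of SR and SW (TLB_SR, TLB_SW) and the helper name are
assumptions made for the example.

```c
#include <assert.h>

/* PTE bits as laid out by this patch. */
#define PTE_READ	0x0008u
#define PTE_DIRTY	0x0010u

/* Assumed positions of the supervisor read/write bits in the TLB word. */
#define TLB_SR		0x01u
#define TLB_SW		0x02u

/*
 * Because the two PTE bits are adjacent and in the same order as SW/SR,
 * a single shift plus mask moves both of them at once.
 */
static inline unsigned int pte_to_tlb_sw_sr(unsigned int pte)
{
	return (pte >> 3) & (TLB_SW | TLB_SR);
}
```

In the assembly handler the same idea would be expressed as one
rotate-and-insert covering two bits instead of one.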

With that change, 44x now also honors execute-only protection.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-44x.h | 22 +++-
 arch/powerpc/kernel/head_44x.S   | 36 ++--
 2 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h 
b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index fc0c075006ea..851813725237 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -63,28 +63,21 @@
  */
 
 #define _PAGE_PRESENT  0x0001  /* S: PTE valid */
-#define _PAGE_RW   0x0002  /* S: Write permission */
+#define _PAGE_WRITE 0x0002  /* S: Write permission */
 #define _PAGE_EXEC 0x0004  /* H: Execute permission */
-#define _PAGE_ACCESSED 0x0008  /* S: Page referenced */
+#define _PAGE_READ 0x0008  /* S: Read permission */
 #define _PAGE_DIRTY 0x0010  /* S: Page dirty */
 #define _PAGE_SPECIAL  0x0020  /* S: Special page */
-#define _PAGE_USER 0x0040  /* S: User page */
+#define _PAGE_ACCESSED 0x0040  /* S: Page referenced */
 #define _PAGE_ENDIAN   0x0080  /* H: E bit */
 #define _PAGE_GUARDED  0x0100  /* H: G bit */
 #define _PAGE_COHERENT 0x0200  /* H: M bit */
 #define _PAGE_NO_CACHE 0x0400  /* H: I bit */
 #define _PAGE_WRITETHRU 0x0800  /* H: W bit */
 
-#define _PAGE_WRITE _PAGE_RW
-
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE 0
 
-#define _PAGE_KERNEL_RO 0
-#define _PAGE_KERNEL_ROX   _PAGE_EXEC
-#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW)
-#define _PAGE_KERNEL_RWX   (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
-
 /* TODO: Add large page lowmem mapping support */
 #define _PMD_PRESENT   0
 #define _PMD_PRESENT_MASK (PAGE_MASK)
@@ -107,14 +100,7 @@
 #define _PAGE_BASE (_PAGE_BASE_NC)
 #endif
 
-/* Permission masks used to generate the __P and __S table */
-#define PAGE_NONE  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | 
_PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_COPY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#include 
 
 #endif /* __KERNEL__ */
 #endif /*  _ASM_POWERPC_NOHASH_32_PTE_44x_H */
diff --git a/arch/powerpc/kernel/head_44x.S b/arch/powerpc/kernel/head_44x.S
index 858dabf84432..25642e802ed3 100644
--- a/arch/powerpc/kernel/head_44x.S
+++ b/arch/powerpc/kernel/head_44x.S
@@ -314,8 +314,8 @@ interrupt_base:
 * kernel page tables.
 */
lis r11, PAGE_OFFSET@h
-   cmplw   r10, r11
-   blt+ 3f
+   cmplw   cr7, r10, r11
+   blt+ cr7, 3f
lis r11, swapper_pg_dir@h
ori r11, r11, swapper_pg_dir@l
 
@@ -355,7 +355,7 @@ interrupt_base:
 *   place or can we save a couple of instructions here ?
 */
mfspr   r12,SPRN_ESR
-   li  r13,_PAGE_PRESENT|_PAGE_ACCESSED
+   li  r13,_PAGE_PRESENT|_PAGE_ACCESSED|_PAGE_READ
rlwimi  r13,r12,10,30,30
 
/* Load the PTE */
@@ -428,8 +428,8 @@ interrupt_base:
 * kernel page tables.
 */
lis r11, PAGE_OFFSET@h
-   cmplw   r10, r11
-   blt+ 3f
+   cmplw   cr7, r10, r11
+   blt+ cr7, 3f
lis r11, swapper_pg_dir@h
ori r11, r11, swapper_pg_dir@l
 
@@ -515,6 +515,7 @@ interrupt_base:
  * r11 - PTE high word value
  * r12 - PTE low word value
  * r13 - TLB index
+ * cr7 - Result of comparison with PAGE_OFFSET
  * MMUCR - loaded with proper value when we get here
  * Upon exit, we reload everything and RFI.
  */
@@ -533,11 +534,10 @@ finish_tlb_load_44x:
tlbwe   r10,r13,PPC44x_TLB_PAGEID   /* Write PAGEID */
 
/* And WS 2 */
-   li  r10,0xf85   /* Mask to apply from PTE */
-   rlwimi  r10,r12,29,30,30/* DIRTY -> SW

[PATCH v2 23/37] powerpc: Remove pte_mkuser() and pte_mkpriviledged()

2023-09-25 Thread Christophe Leroy
pte_mkuser() is never used. Remove it.

pte_mkprivileged() is not used anymore. Remove it too.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 10 --
 arch/powerpc/include/asm/book3s/64/pgtable.h | 10 --
 arch/powerpc/include/asm/nohash/32/pte-8xx.h | 14 --
 arch/powerpc/include/asm/nohash/pgtable.h| 14 --
 arch/powerpc/include/asm/nohash/pte-e500.h   | 15 ---
 5 files changed, 63 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 45b69ae2631e..80505915c77c 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -522,16 +522,6 @@ static inline pte_t pte_mkhuge(pte_t pte)
return pte;
 }
 
-static inline pte_t pte_mkprivileged(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_USER);
-}
-
-static inline pte_t pte_mkuser(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_USER);
-}
-
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index dbd545e73161..c3b921769ece 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -630,16 +630,6 @@ static inline pte_t pte_mkdevmap(pte_t pte)
return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_SPECIAL | 
_PAGE_DEVMAP));
 }
 
-static inline pte_t pte_mkprivileged(pte_t pte)
-{
-   return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PRIVILEGED));
-}
-
-static inline pte_t pte_mkuser(pte_t pte)
-{
-   return __pte_raw(pte_raw(pte) & cpu_to_be64(~_PAGE_PRIVILEGED));
-}
-
 /*
  * This is potentially called with a pmd as the argument, in which case it's 
not
  * safe to check _PAGE_DEVMAP unless we also confirm that _PAGE_PTE is set.
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 52395a5ecd70..843fe0138a66 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -122,20 +122,6 @@ static inline bool pte_user(pte_t pte)
 
 #define pte_user pte_user
 
-static inline pte_t pte_mkprivileged(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_SH);
-}
-
-#define pte_mkprivileged pte_mkprivileged
-
-static inline pte_t pte_mkuser(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_SH);
-}
-
-#define pte_mkuser pte_mkuser
-
 static inline pte_t pte_mkhuge(pte_t pte)
 {
return __pte(pte_val(pte) | _PAGE_SPS | _PAGE_HUGE);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 1493f0b09ae9..9619beae4454 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -252,20 +252,6 @@ static inline pte_t pte_mkhuge(pte_t pte)
 }
 #endif
 
-#ifndef pte_mkprivileged
-static inline pte_t pte_mkprivileged(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_USER);
-}
-#endif
-
-#ifndef pte_mkuser
-static inline pte_t pte_mkuser(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_USER);
-}
-#endif
-
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h 
b/arch/powerpc/include/asm/nohash/pte-e500.h
index 99288e26b6b0..9f9e3f02d414 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -54,7 +54,6 @@
 #define _PAGE_KERNEL_RWX   (_PAGE_BAP_SW | _PAGE_BAP_SR | _PAGE_DIRTY | 
_PAGE_BAP_SX)
 #define _PAGE_KERNEL_ROX   (_PAGE_BAP_SR | _PAGE_BAP_SX)
 #define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
-#define _PAGE_PRIVILEGED   (_PAGE_BAP_SR)
 
 #define _PAGE_SPECIAL  _PAGE_SW0
 
@@ -99,20 +98,6 @@
 #define PAGE_READONLY_X __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_BAP_UX)
 
 #ifndef __ASSEMBLY__
-static inline pte_t pte_mkprivileged(pte_t pte)
-{
-   return __pte((pte_val(pte) & ~_PAGE_USER) | _PAGE_PRIVILEGED);
-}
-
-#define pte_mkprivileged pte_mkprivileged
-
-static inline pte_t pte_mkuser(pte_t pte)
-{
-   return __pte((pte_val(pte) & ~_PAGE_PRIVILEGED) | _PAGE_USER);
-}
-
-#define pte_mkuser pte_mkuser
-
 static inline pte_t pte_mkexec(pte_t pte)
 {
return __pte((pte_val(pte) & ~_PAGE_BAP_SX) | _PAGE_BAP_UX);
-- 
2.41.0



[PATCH v2 26/37] powerpc/8xx: Use generic permission masks

2023-09-25 Thread Christophe Leroy
8xx already has _PAGE_NA and _PAGE_RO. So add _PAGE_ROX, _PAGE_RW and
_PAGE_RWX and remove specific permission masks.
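
The reason _PAGE_RW can be 0 here is that the 8xx encoding is inverted:
read-only and no-access are the states that set bits. A minimal sketch,
with bit values that are assumptions for illustration (the PG_* names
and pg_writable() are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit values, for illustration only. */
#define PG_EXEC	0x0040u
#define PG_NA	0x0200u		/* no access */
#define PG_RO	0x0600u		/* read-only, includes the PG_NA bit */

/* Masks following the definitions added by this patch. */
#define PG_ROX	(PG_RO | PG_EXEC)
#define PG_RW	0u
#define PG_RWX	PG_EXEC

/* A page is writable only when none of the read-only bits is set. */
static inline bool pg_writable(unsigned int prot)
{
	return (prot & PG_RO) == 0;
}
```

With such an encoding, PAGE_SHARED built as _PAGE_BASE | _PAGE_RW is the
same value as the old PAGE_SHARED built from _PAGE_BASE alone.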

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 843fe0138a66..62c965a4511a 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -48,6 +48,10 @@
 
 #define _PAGE_HUGE 0x0800  /* Copied to L1 PS bit 29 */
 
+#define _PAGE_ROX  (_PAGE_RO | _PAGE_EXEC)
+#define _PAGE_RW   0
+#define _PAGE_RWX  _PAGE_EXEC
+
 /* cache related flags non existing on 8xx */
 #define _PAGE_COHERENT 0
 #define _PAGE_WRITETHRU 0
@@ -77,14 +81,7 @@
 #define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
 #define _PAGE_BASE (_PAGE_BASE_NC)
 
-/* Permission masks used to generate the __P and __S table */
-#define PAGE_NONE  __pgprot(_PAGE_BASE | _PAGE_NA)
-#define PAGE_SHARED  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_RO)
-#define PAGE_COPY_X  __pgprot(_PAGE_BASE | _PAGE_RO | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_RO)
-#define PAGE_READONLY_X  __pgprot(_PAGE_BASE | _PAGE_RO | _PAGE_EXEC)
+#include 
 
 #ifndef __ASSEMBLY__
 static inline pte_t pte_wrprotect(pte_t pte)
-- 
2.41.0



[PATCH v2 32/37] powerpc/40x: Introduce _PAGE_READ and remove _PAGE_USER

2023-09-25 Thread Christophe Leroy
_PAGE_USER is used to select the zone. Today zone 0 is kernel
and zone 1 is user.

To implement PAGE_NONE, _PAGE_USER is cleared, leading to no access
for user, but the kernel still has access to the page, so it's possible
for a user application to write to that page by using a kernel function
as a trampoline.

What is really wanted is to have user rights on pages below TASK_SIZE
and no user rights on pages above TASK_SIZE. Use zones for that.
There are 16 zones, so let's use the 4 upper address bits to set the
zone and declare zone rights based on TASK_SIZE.

Then drop _PAGE_USER and reuse it as _PAGE_READ, which will be checked
in the Data TLB miss handler. That will properly handle PAGE_NONE for
both kernel and user.

In addition, it partially implements execute-only rights. The
implementation won't be complete because once a TLB entry has been
loaded via the Instruction TLB miss handler, it will be possible to
read the page. But at least it can't be read unless it is executed first.

Signed-off-by: Christophe Leroy 
---
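Illustration only, not part of the patch: a minimal C sketch of what
the finish_tlb_load hunk in head_40x.S computes. The helper name
tlb_data_from_pte() is hypothetical, and it assumes r10 holds the
faulting effective address at that point, as the comment in the hunk
suggests.

```c
#include <stdint.h>

/* Software-only PTE bits cleared before loading the TLB (value from the patch). */
#define SW_BITS_MASK 0x0cf2u

/*
 * Hypothetical helper: clear the software bits, then copy the 4 upper
 * bits of the effective address into the zone field (mask 0x00f0),
 * which is what "rlwimi r11, r10, 8, 24, 27" does.
 */
static uint32_t tlb_data_from_pte(uint32_t pte, uint32_t ea)
{
	uint32_t zone = ea >> 28;	/* 16 zones, one per 256 MB of address space */

	return (pte & ~SW_BITS_MASK) | (zone << 4);
}
```

Since the zone now comes from the address alone, the zone rights can be
declared per 256 MB region according to whether it lies below or above
TASK_SIZE.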
 arch/powerpc/include/asm/nohash/32/pte-40x.h | 20 +++-
 arch/powerpc/kernel/head_40x.S   |  7 ---
 arch/powerpc/mm/nohash/40x.c | 19 ---
 3 files changed, 19 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h 
b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index e28ef0f5781e..d759cfd74754 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -42,26 +42,19 @@
 #define _PAGE_PRESENT  0x002   /* software: PTE contains a translation */
 #define _PAGE_NO_CACHE  0x004   /* I: caching is inhibited */
 #define _PAGE_WRITETHRU 0x008   /* W: caching is write-through */
-#define _PAGE_USER  0x010   /* matches one of the zone permission bits */
+#define _PAGE_READ  0x010   /* software: read permission */
 #define _PAGE_SPECIAL   0x020   /* software: Special page */
 #define _PAGE_DIRTY 0x080   /* software: dirty page */
-#define _PAGE_RW   0x100   /* hardware: WR, anded with dirty in exception */
+#define _PAGE_WRITE 0x100   /* hardware: WR, anded with dirty in exception */
 #define _PAGE_EXEC 0x200   /* hardware: EX permission */
 #define _PAGE_ACCESSED 0x400   /* software: R: page referenced */
 
-#define _PAGE_WRITE _PAGE_RW
-
 /* No page size encoding in the linux PTE */
 #define _PAGE_PSIZE 0
 
 /* cache related flags non existing on 40x */
 #define _PAGE_COHERENT 0
 
-#define _PAGE_KERNEL_RO 0
-#define _PAGE_KERNEL_ROX   _PAGE_EXEC
-#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW)
-#define _PAGE_KERNEL_RWX   (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
-
 #define _PMD_PRESENT   0x400   /* PMD points to page of PTEs */
 #define _PMD_PRESENT_MASK  _PMD_PRESENT
 #define _PMD_BAD   0x802
@@ -74,14 +67,7 @@
 #define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED)
 #define _PAGE_BASE (_PAGE_BASE_NC)
 
-/* Permission masks used to generate the __P and __S table */
-#define PAGE_NONE  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_COPY_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_READONLY_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#include 
 
 #endif /* __KERNEL__ */
 #endif /*  _ASM_POWERPC_NOHASH_32_PTE_40x_H */
diff --git a/arch/powerpc/kernel/head_40x.S b/arch/powerpc/kernel/head_40x.S
index 9f92f5c5e6aa..9fc90410b385 100644
--- a/arch/powerpc/kernel/head_40x.S
+++ b/arch/powerpc/kernel/head_40x.S
@@ -312,7 +312,7 @@ _ASM_NOKPROBE_SYMBOL(\name\()_virt)
 
rlwimi  r11, r10, 22, 20, 29/* Compute PTE address */
lwz r11, 0(r11) /* Get Linux PTE */
-   li  r9, _PAGE_PRESENT | _PAGE_ACCESSED
+   li  r9, _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_READ
andc.   r9, r9, r11 /* Check permission */
bne 5f
 
@@ -561,10 +561,11 @@ finish_tlb_load:
/*
 * Clear out the software-only bits in the PTE to generate the
 * TLB_DATA value.  These are the bottom 2 bits of the RPM, the
-* top 3 bits of the zone field, and M.
+* 4 bits of the zone field, and M.
 */
-   li  r9, 0x0ce2
+   li  r9, 0x0cf2
andc    r11, r11, r9
+   rlwimi  r11, r10, 8, 24, 27 /* Copy 4 upper address bit into zone */
 
/* load the next available TLB index. */
lwz r9, tlb_4xx_index@l(0)
diff --git a/arch/powerpc/mm/nohash/40x.c b/arch/powerpc/mm/nohash/40x.c
index 3684d6e570fb..e835e80c09db 100644
--- a/arch/powerpc/mm/nohash/40x.c
+++ b/arch/powerpc/mm/nohash/40x.c
@@ -48,20 +48,25 @@
  */
 void __init MMU_init_hw(void)

[PATCH v2 36/37] powerpc: Finally remove _PAGE_USER

2023-09-25 Thread Christophe Leroy
_PAGE_USER is now gone on all targets. Remove it completely.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pgtable.h | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index aba56fe3b1c6..f922c84b23eb 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -183,18 +183,14 @@ static inline int pte_young(pte_t pte)
 }
 
 /*
- * Don't just check for any non zero bits in __PAGE_USER, since for book3e
+ * Don't just check for any non zero bits in __PAGE_READ, since for book3e
  * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
- * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
+ * _PAGE_READ.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
  */
 #ifndef pte_read
 static inline bool pte_read(pte_t pte)
 {
-#ifdef _PAGE_READ
return (pte_val(pte) & _PAGE_READ) == _PAGE_READ;
-#else
-   return (pte_val(pte) & _PAGE_USER) == _PAGE_USER;
-#endif
 }
 #endif
 
@@ -206,7 +202,7 @@ static inline bool pte_read(pte_t pte)
 static inline bool pte_access_permitted(pte_t pte, bool write)
 {
/*
-* A read-only access is controlled by _PAGE_USER bit.
+* A read-only access is controlled by _PAGE_READ bit.
 * We have _PAGE_READ set for WRITE and EXECUTE
 */
if (!pte_present(pte) || !pte_read(pte))
-- 
2.41.0



[PATCH v2 35/37] powerpc/ptdump: Display _PAGE_READ and _PAGE_WRITE

2023-09-25 Thread Christophe Leroy
Instead of always displaying either 'rw' or 'r ' depending on
_PAGE_RW, display 'r' or ' ' for _PAGE_READ and 'w' or ' '
for _PAGE_WRITE.

Signed-off-by: Christophe Leroy 
---
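Illustration only, not part of the patch: a small sketch of how the two
new flag_array entries would be rendered. It assumes the dump code picks
'set' when (pte & mask) == val and 'clear' otherwise; dump_flags() and
the EX_ bit values are hypothetical stand-ins.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder bit values, chosen for the example only. */
#define EX_PAGE_READ  0x004ul
#define EX_PAGE_WRITE 0x400ul

struct flag_info {
	unsigned long mask;
	unsigned long val;
	const char *set;
	const char *clear;
};

static const struct flag_info flag_array[] = {
	{ .mask = EX_PAGE_READ,  .val = 0, .set = " ", .clear = "r" },
	{ .mask = EX_PAGE_WRITE, .val = 0, .set = " ", .clear = "w" },
};

/* Append one string per flag: 'set' when (pte & mask) == val, else 'clear'. */
static void dump_flags(unsigned long pte, char *out, size_t len)
{
	size_t i;

	if (len == 0)
		return;
	out[0] = '\0';
	for (i = 0; i < sizeof(flag_array) / sizeof(flag_array[0]); i++) {
		const struct flag_info *f = &flag_array[i];
		const char *s = ((pte & f->mask) == f->val) ? f->set : f->clear;

		strncat(out, s, len - strlen(out) - 1);
	}
}
```

Because val is 0, a blank is printed when the bit is absent and the
letter when it is present, so read and write are reported independently.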
 arch/powerpc/mm/ptdump/shared.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index 5ff101654c45..39c30c62b7ea 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -11,10 +11,15 @@
 
 static const struct flag_info flag_array[] = {
{
-   .mask   = _PAGE_RW,
+   .mask   = _PAGE_READ,
.val= 0,
-   .set= "r ",
-   .clear  = "rw",
+   .set= " ",
+   .clear  = "r",
+   }, {
+   .mask   = _PAGE_WRITE,
+   .val= 0,
+   .set= " ",
+   .clear  = "w",
}, {
.mask   = _PAGE_EXEC,
.val= _PAGE_EXEC,
-- 
2.41.0



[PATCH v2 29/37] powerpc/nohash: Replace pte_user() by pte_read()

2023-09-25 Thread Christophe Leroy
pte_user() is now only used in pte_access_permitted() to check
access on vmas. The user flag is cleared to make a page unreadable.

So rename it to pte_read() and remove pte_user(), which isn't used
anymore.

For the time being it checks _PAGE_USER, but in the near future
all platforms will be converted to _PAGE_READ, so let's support
both for now.

Signed-off-by: Christophe Leroy 
---
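Illustration only, not part of the patch: a self-contained sketch of the
resulting access check. The pte_t wrapper and the EX_ bit values are
placeholders for the example; real values differ per platform.

```c
#include <stdbool.h>

typedef struct { unsigned long pte; } pte_t;

/* Placeholder bit values for the example only. */
#define EX_PAGE_PRESENT 0x001ul
#define EX_PAGE_READ    0x004ul
#define EX_PAGE_WRITE   0x400ul

static inline unsigned long pte_val(pte_t pte) { return pte.pte; }
static inline bool pte_present(pte_t pte) { return pte_val(pte) & EX_PAGE_PRESENT; }
static inline bool pte_write(pte_t pte) { return pte_val(pte) & EX_PAGE_WRITE; }

/* All bits of the read mask must be set, hence "== mask" rather than "!= 0". */
static inline bool pte_read(pte_t pte)
{
	return (pte_val(pte) & EX_PAGE_READ) == EX_PAGE_READ;
}

static inline bool pte_access_permitted(pte_t pte, bool write)
{
	if (!pte_present(pte) || !pte_read(pte))
		return false;
	if (write && !pte_write(pte))
		return false;
	return true;
}
```

A PAGE_NONE style entry (present, read bit clear) is rejected for both
reads and writes, which is the behaviour pte_user() used to provide.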
 arch/powerpc/include/asm/nohash/32/pte-8xx.h |  7 ---
 arch/powerpc/include/asm/nohash/pgtable.h| 13 +++--
 arch/powerpc/mm/ioremap.c|  4 
 3 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 62c965a4511a..1ee38befd29a 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -112,13 +112,6 @@ static inline pte_t pte_mkwrite_novma(pte_t pte)
 
 #define pte_mkwrite_novma pte_mkwrite_novma
 
-static inline bool pte_user(pte_t pte)
-{
-   return !(pte_val(pte) & _PAGE_SH);
-}
-
-#define pte_user pte_user
-
 static inline pte_t pte_mkhuge(pte_t pte)
 {
return __pte(pte_val(pte) | _PAGE_SPS | _PAGE_HUGE);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index ee677162f9e6..aba56fe3b1c6 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -160,9 +160,6 @@ static inline int pte_write(pte_t pte)
return pte_val(pte) & _PAGE_WRITE;
 }
 #endif
-#ifndef pte_read
-static inline int pte_read(pte_t pte)  { return 1; }
-#endif
 static inline int pte_dirty(pte_t pte) { return pte_val(pte) & 
_PAGE_DIRTY; }
 static inline int pte_special(pte_t pte)   { return pte_val(pte) & 
_PAGE_SPECIAL; }
 static inline int pte_none(pte_t pte)  { return (pte_val(pte) & 
~_PTE_NONE_MASK) == 0; }
@@ -190,10 +187,14 @@ static inline int pte_young(pte_t pte)
  * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
  * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
  */
-#ifndef pte_user
-static inline bool pte_user(pte_t pte)
+#ifndef pte_read
+static inline bool pte_read(pte_t pte)
 {
+#ifdef _PAGE_READ
+   return (pte_val(pte) & _PAGE_READ) == _PAGE_READ;
+#else
return (pte_val(pte) & _PAGE_USER) == _PAGE_USER;
+#endif
 }
 #endif
 
@@ -208,7 +209,7 @@ static inline bool pte_access_permitted(pte_t pte, bool 
write)
 * A read-only access is controlled by _PAGE_USER bit.
 * We have _PAGE_READ set for WRITE and EXECUTE
 */
-   if (!pte_present(pte) || !pte_user(pte) || !pte_read(pte))
+   if (!pte_present(pte) || !pte_read(pte))
return false;
 
if (write && !pte_write(pte))
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 7823c38f09de..7b0afcabd89f 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -50,10 +50,6 @@ void __iomem *ioremap_prot(phys_addr_t addr, size_t size, 
unsigned long flags)
if (pte_write(pte))
pte = pte_mkdirty(pte);
 
-   /* we don't want to let _PAGE_USER leak out */
-   if (WARN_ON(pte_user(pte)))
-   return NULL;
-
if (iowa_is_active())
return iowa_ioremap(addr, size, pte_pgprot(pte), caller);
return __ioremap_caller(addr, size, pte_pgprot(pte), caller);
-- 
2.41.0



[PATCH v2 22/37] powerpc: Fail ioremap() instead of silently ignoring flags when PAGE_USER is set

2023-09-25 Thread Christophe Leroy
Calling ioremap() with _PAGE_USER set (or _PAGE_PRIVILEGED unset)
is wrong. Fail the call to ioremap() loudly instead of blindly
clearing the flags.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ioremap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index d5159f205380..7823c38f09de 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -51,7 +51,8 @@ void __iomem *ioremap_prot(phys_addr_t addr, size_t size, 
unsigned long flags)
pte = pte_mkdirty(pte);
 
/* we don't want to let _PAGE_USER leak out */
-   pte = pte_mkprivileged(pte);
+   if (WARN_ON(pte_user(pte)))
+   return NULL;
 
if (iowa_is_active())
return iowa_ioremap(addr, size, pte_pgprot(pte), caller);
-- 
2.41.0



[PATCH v2 24/37] powerpc: Rely on address instead of pte_user()

2023-09-25 Thread Christophe Leroy
pte_user() may return 'false' when a user page is PAGE_NONE.

In that case it is still a user page and needs to be handled
as such. So use is_kernel_addr() instead.

And remove "user" text from ptdump as ptdump only dumps
kernel tables.

Note: no change is made for book3s/64, which still has its
'privileged' bit.

Signed-off-by: Christophe Leroy 
---
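Illustration only, not part of the patch: a sketch of why the address
test is more reliable than the flag test. EX_TASK_SIZE and the bit
values are hypothetical, and ex_is_kernel_addr() only mimics a 32-bit
style "addr >= TASK_SIZE" check.

```c
#include <stdbool.h>

/* Hypothetical user/kernel split; the real TASK_SIZE is configurable. */
#define EX_TASK_SIZE 0xc0000000ul

/* Placeholder bit values for the example only. */
#define EX_PAGE_USER 0x004ul
#define EX_PAGE_BASE 0x101ul	/* present | accessed */

static bool ex_is_kernel_addr(unsigned long addr)
{
	return addr >= EX_TASK_SIZE;
}

static bool ex_pte_user(unsigned long pte)
{
	return (pte & EX_PAGE_USER) != 0;
}
```

A PAGE_NONE mapping at a user address has no user bit, so ex_pte_user()
reports it as a kernel page, while the address check still classifies it
as a user page.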
 arch/powerpc/include/asm/nohash/pgtable.h |  2 +-
 arch/powerpc/mm/book3s32/mmu.c|  4 ++--
 arch/powerpc/mm/nohash/e500.c |  2 +-
 arch/powerpc/mm/pgtable.c | 22 +++---
 arch/powerpc/mm/ptdump/8xx.c  |  5 -
 arch/powerpc/mm/ptdump/shared.c   |  5 -
 6 files changed, 15 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 9619beae4454..200f2dbf48e2 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -58,7 +58,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, 
unsigned long addr, p
 
*p = __pte(new);
 
-   if (IS_ENABLED(CONFIG_44x) && (old & _PAGE_USER) && (old & _PAGE_EXEC))
+   if (IS_ENABLED(CONFIG_44x) && !is_kernel_addr(addr) && (old & 
_PAGE_EXEC))
icache_44x_need_flush = 1;
 
/* huge pages use the old page table lock */
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 850783cfa9c7..d1041c946ce2 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -127,7 +127,7 @@ static void setibat(int index, unsigned long virt, 
phys_addr_t phys,
wimgxpp = (flags & _PAGE_COHERENT) | (_PAGE_EXEC ? BPP_RX : BPP_XX);
bat[0].batu = virt | (bl << 2) | 2; /* Vs=1, Vp=0 */
bat[0].batl = BAT_PHYS_ADDR(phys) | wimgxpp;
-   if (flags & _PAGE_USER)
+   if (!is_kernel_addr(virt))
bat[0].batu |= 1;   /* Vp = 1 */
 }
 
@@ -280,7 +280,7 @@ void __init setbat(int index, unsigned long virt, 
phys_addr_t phys,
wimgxpp |= (flags & _PAGE_RW)? BPP_RW: BPP_RX;
bat[1].batu = virt | (bl << 2) | 2; /* Vs=1, Vp=0 */
bat[1].batl = BAT_PHYS_ADDR(phys) | wimgxpp;
-   if (flags & _PAGE_USER)
+   if (!is_kernel_addr(virt))
bat[1].batu |= 1;   /* Vp = 1 */
if (flags & _PAGE_GUARDED) {
/* G bit must be zero in IBATs */
diff --git a/arch/powerpc/mm/nohash/e500.c b/arch/powerpc/mm/nohash/e500.c
index 40a4e69ae1a9..5b7d7a932bfd 100644
--- a/arch/powerpc/mm/nohash/e500.c
+++ b/arch/powerpc/mm/nohash/e500.c
@@ -122,7 +122,7 @@ static void settlbcam(int index, unsigned long virt, 
phys_addr_t phys,
TLBCAM[index].MAS7 = (u64)phys >> 32;
 
/* Below is unlikely -- only for large user pages or similar */
-   if (pte_user(__pte(flags))) {
+   if (!is_kernel_addr(virt)) {
TLBCAM[index].MAS3 |= MAS3_UR;
TLBCAM[index].MAS3 |= (flags & _PAGE_EXEC) ? MAS3_UX : 0;
TLBCAM[index].MAS3 |= (flags & _PAGE_RW) ? MAS3_UW : 0;
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 3f86fd217690..781a68c69c2f 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -46,13 +46,13 @@ static inline int is_exec_fault(void)
  * and we avoid _PAGE_SPECIAL and cache inhibited pte. We also only do that
  * on userspace PTEs
  */
-static inline int pte_looks_normal(pte_t pte)
+static inline int pte_looks_normal(pte_t pte, unsigned long addr)
 {
 
if (pte_present(pte) && !pte_special(pte)) {
if (pte_ci(pte))
return 0;
-   if (pte_user(pte))
+   if (!is_kernel_addr(addr))
return 1;
}
return 0;
@@ -79,11 +79,11 @@ static struct folio *maybe_pte_to_folio(pte_t pte)
  * support falls into the same category.
  */
 
-static pte_t set_pte_filter_hash(pte_t pte)
+static pte_t set_pte_filter_hash(pte_t pte, unsigned long addr)
 {
pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
-   if (pte_looks_normal(pte) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
-  cpu_has_feature(CPU_FTR_NOEXECUTE))) {
+   if (pte_looks_normal(pte, addr) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
+  cpu_has_feature(CPU_FTR_NOEXECUTE))) {
struct folio *folio = maybe_pte_to_folio(pte);
if (!folio)
return pte;
@@ -97,7 +97,7 @@ static pte_t set_pte_filter_hash(pte_t pte)
 
 #else /* CONFIG_PPC_BOOK3S */
 
-static pte_t set_pte_filter_hash(pte_t pte) { return pte; }
+static pte_t set_pte_filter_hash(pte_t pte, unsigned long addr) { return pte; }
 
 #endif /* CONFIG_PPC_BOOK3S */
 
@@ -105,7 +105,7 @@ static pte_t set_pte_filter_hash(pte_t pte) { return pte; }
  * as we don't have two bits to spare for _PAGE_EXEC and _PAGE_HWEXEC so
  * inst

[PATCH v2 34/37] powerpc/32s: Introduce _PAGE_READ and remove _PAGE_USER

2023-09-25 Thread Christophe Leroy
On the 603 MMU, TLB misses are handled by software and there are
separate DTLB and ITLB. It is therefore possible to implement
execute-only protection by not loading the DTLB when read access is
not permitted.

To do that, a _PAGE_READ flag is needed but there is no bit available
for it in the PTE. On the other hand, the only real use of _PAGE_USER
is to implement PAGE_NONE by clearing _PAGE_USER.

As PAGE_NONE can also be implemented by clearing _PAGE_READ, remove
_PAGE_USER and add _PAGE_READ. Then use the virtual address to know
whether user rights or kernel rights are to be used.

With that change, the 603 MMU now honors execute-only protection.

For the hash (604) MMU it is more tricky because the hash table is
common to load/store and execute. Nevertheless it is still possible to
check whether _PAGE_READ is set before loading the hash table for a
load/store access. At least the page can't be read unless it is
executed first.

Signed-off-by: Christophe Leroy 
---
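Illustration only, not part of the patch: a C sketch of the permission
checks that separate data and instruction TLB miss handlers make
possible. The real handlers are assembly in head_book3s_32.S; the
function names here are hypothetical, and the bit values mirror the
defines in the pgtable.h hunk of this patch.

```c
#include <stdbool.h>

#define EX_PAGE_PRESENT 0x001ul
#define EX_PAGE_READ    0x004ul
#define EX_PAGE_EXEC    0x200ul
#define EX_PAGE_WRITE   0x400ul

/* Data side: only load the DTLB if the page is readable (and writable for a store). */
static bool dtlb_miss_may_load(unsigned long pte, bool is_store)
{
	unsigned long need = EX_PAGE_PRESENT | EX_PAGE_READ;

	if (is_store)
		need |= EX_PAGE_WRITE;
	return (pte & need) == need;
}

/* Instruction side: only load the ITLB if the page is executable. */
static bool itlb_miss_may_load(unsigned long pte)
{
	unsigned long need = EX_PAGE_PRESENT | EX_PAGE_EXEC;

	return (pte & need) == need;
}
```

An execute-only PTE (present and exec, no read) passes the instruction
check and fails the data check, which is the protection described above
for the 603.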
 arch/powerpc/include/asm/book3s/32/pgtable.h | 48 ---
 arch/powerpc/kernel/head_book3s_32.S | 61 +++-
 arch/powerpc/mm/book3s32/hash_low.S  | 22 ---
 3 files changed, 60 insertions(+), 71 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 480ad6b4fd6f..244621c88510 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -20,7 +20,7 @@
 
 #define _PAGE_PRESENT  0x001   /* software: pte contains a translation */
 #define _PAGE_HASHPTE  0x002   /* hash_page has made an HPTE for this pte */
-#define _PAGE_USER 0x004   /* usermode access allowed */
+#define _PAGE_READ 0x004   /* software: read access allowed */
 #define _PAGE_GUARDED  0x008   /* G: prohibit speculative access */
 #define _PAGE_COHERENT 0x010   /* M: enforce memory coherence (SMP systems) */
 #define _PAGE_NO_CACHE 0x020   /* I: cache inhibit */
@@ -28,11 +28,9 @@
 #define _PAGE_DIRTY 0x080   /* C: page changed */
 #define _PAGE_ACCESSED 0x100   /* R: page referenced */
 #define _PAGE_EXEC 0x200   /* software: exec allowed */
-#define _PAGE_RW   0x400   /* software: user write access allowed */
+#define _PAGE_WRITE 0x400   /* software: user write access allowed */
 #define _PAGE_SPECIAL  0x800   /* software: Special page */
 
-#define _PAGE_WRITE _PAGE_RW
-
 #ifdef CONFIG_PTE_64BIT
 /* We never clear the high word of the pte */
 #define _PTE_NONE_MASK (0xULL | _PAGE_HASHPTE)
@@ -44,26 +42,13 @@
 #define _PMD_PRESENT_MASK (PAGE_MASK)
 #define _PMD_BAD   (~PAGE_MASK)
 
-/* We borrow the _PAGE_USER bit to store the exclusive marker in swap PTEs. */
-#define _PAGE_SWP_EXCLUSIVE _PAGE_USER
+/* We borrow the _PAGE_READ bit to store the exclusive marker in swap PTEs. */
+#define _PAGE_SWP_EXCLUSIVE _PAGE_READ
 
 /* And here we include common definitions */
 
-#define _PAGE_KERNEL_RO 0
-#define _PAGE_KERNEL_ROX   (_PAGE_EXEC)
-#define _PAGE_KERNEL_RW (_PAGE_DIRTY | _PAGE_RW)
-#define _PAGE_KERNEL_RWX   (_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
-
 #define _PAGE_HPTEFLAGS _PAGE_HASHPTE
 
-#ifndef __ASSEMBLY__
-
-static inline bool pte_user(pte_t pte)
-{
-   return pte_val(pte) & _PAGE_USER;
-}
-#endif /* __ASSEMBLY__ */
-
 /*
  * Location of the PFN in the PTE. Most 32-bit platforms use the same
  * as _PAGE_SHIFT here (ie, naturally aligned).
@@ -99,20 +84,7 @@ static inline bool pte_user(pte_t pte)
 #define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED)
 #define _PAGE_BASE (_PAGE_BASE_NC | _PAGE_COHERENT)
 
-/*
- * Permission masks used to generate the __P and __S table.
- *
- * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
- *
- * Write permissions imply read permissions for now.
- */
-#define PAGE_NONE  __pgprot(_PAGE_BASE)
-#define PAGE_SHARED  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | _PAGE_EXEC)
-#define PAGE_COPY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_COPY_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY  __pgprot(_PAGE_BASE | _PAGE_USER)
-#define PAGE_READONLY_X  __pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#include 
 
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
@@ -408,12 +380,16 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
 }
 
 /* Generic accessors to PTE bits */
+static inline bool pte_read(pte_t pte)
+{
+   return !!(pte_val(pte) & _PAGE_READ);
+}
+
 static inline bool pte_write(pte_t pte)
 {
return !!(pte_val(pte) & _PAGE_WRITE);
 }
 
-static inline int pte_read(pte_t pte)  { return 1; }
 static inline int pte_dirty(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_DIRTY); }
 static inline int pte_young(pte_t pte) { return !!(pte_val(pte) & 
_PAGE_ACCESSED); }
 static inline int pte_special(pte_t p

[PATCH v2 04/37] powerpc: Remove pte_ERROR()

2023-09-25 Thread Christophe Leroy
pte_ERROR() is used neither in powerpc code nor in common mm code.

Remove it.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 3 ---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 2 --
 arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 2 --
 4 files changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h 
b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 9b13eb14e21b..543c3691839b 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -224,9 +224,6 @@ void unmap_kernel_page(unsigned long va);
 /* Bits to mask out from a PGD to get to the PUD page */
 #define PGD_MASKED_BITS 0
 
-#define pte_ERROR(e) \
-   pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
-   (unsigned long long)pte_val(e))
 #define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 /*
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 5c497c862d75..7c4ad1e03a49 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1014,8 +1014,6 @@ static inline pmd_t *pud_pgtable(pud_t pud)
return (pmd_t *)__va(pud_val(pud) & ~PUD_MASKED_BITS);
 }
 
-#define pte_ERROR(e) \
-   pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
 #define pmd_ERROR(e) \
pr_err("%s:%d: bad pmd %08lx.\n", __FILE__, __LINE__, pmd_val(e))
 #define pud_ERROR(e) \
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index f99c53a5f184..868aecbec8d1 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -55,9 +55,6 @@ extern int icache_44x_need_flush;
 
 #define USER_PTRS_PER_PGD  (TASK_SIZE / PGDIR_SIZE)
 
-#define pte_ERROR(e) \
-   pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
-   (unsigned long long)pte_val(e))
 #define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index eb6891e34cbd..8083c04a1e6d 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -269,8 +269,6 @@ static inline void __ptep_set_access_flags(struct 
vm_area_struct *vma,
flush_tlb_page(vma, address);
 }
 
-#define pte_ERROR(e) \
-   pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
 #define pmd_ERROR(e) \
pr_err("%s:%d: bad pmd %08lx.\n", __FILE__, __LINE__, pmd_val(e))
 #define pgd_ERROR(e) \
-- 
2.41.0



[PATCH v2 19/37] powerpc/nohash: Refactor __ptep_set_access_flags()

2023-09-25 Thread Christophe Leroy
The nohash/32 version of __ptep_set_access_flags() does the same
as the nohash/64 version; the only difference is that the nohash/32
version is more complete and uses pte_update().

Make it common and remove the nohash/64 version.

Signed-off-by: Christophe Leroy 
---
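Illustration only, not part of the patch: a sketch of the "set only"
update that both versions perform. ex_pte_update() is a simplified,
hypothetical stand-in for pte_update() (no mm, address or huge handling,
no TLB flush), and the EX_ bit values are placeholders.

```c
/* Placeholder bit values for the example only. */
#define EX_PAGE_DIRTY    0x080ul
#define EX_PAGE_ACCESSED 0x100ul
#define EX_PAGE_EXEC     0x200ul
#define EX_PAGE_RW       0x400ul

/* Simplified stand-in for pte_update(): clear 'clr', set 'set', return the old value. */
static unsigned long ex_pte_update(unsigned long *ptep, unsigned long clr,
				   unsigned long set)
{
	unsigned long old = *ptep;

	*ptep = (old & ~clr) | set;
	return old;
}

/* Only the dirty/accessed/rw/exec bits of 'entry' are propagated; nothing is cleared. */
static void ex_set_access_flags(unsigned long *ptep, unsigned long entry)
{
	unsigned long set = entry & (EX_PAGE_DIRTY | EX_PAGE_ACCESSED |
				     EX_PAGE_RW | EX_PAGE_EXEC);

	ex_pte_update(ptep, 0, set);
}
```

The removed nohash/64 version did the same OR directly on *ptep, so
routing it through pte_update() with an empty clear mask keeps the
resulting PTE value identical.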
 arch/powerpc/include/asm/nohash/32/pgtable.h | 16 
 arch/powerpc/include/asm/nohash/64/pgtable.h | 15 ---
 arch/powerpc/include/asm/nohash/pgtable.h| 17 +
 3 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 481594097f46..9164a9e41b02 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -161,22 +161,6 @@ static inline void pmd_clear(pmd_t *pmdp)
*pmdp = __pmd(0);
 }
 
-#ifndef __ptep_set_access_flags
-static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
-  pte_t *ptep, pte_t entry,
-  unsigned long address,
-  int psize)
-{
-   unsigned long set = pte_val(entry) &
-   (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | 
_PAGE_EXEC);
-   int huge = psize > mmu_virtual_psize ? 1 : 0;
-
-   pte_update(vma->vm_mm, address, ptep, 0, set, huge);
-
-   flush_tlb_page(vma, address);
-}
-#endif
-
 /*
  * Note that on Book E processors, the pmd contains the kernel virtual
  * (lowmem) address of the pte page.  The physical address is less useful
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index b59fbf754f82..36b9bad428cc 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -159,21 +159,6 @@ static inline void huge_ptep_set_wrprotect(struct 
mm_struct *mm,
__young;\
 })
 
-/* Set the dirty and/or accessed bits atomically in a linux PTE */
-static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
-  pte_t *ptep, pte_t entry,
-  unsigned long address,
-  int psize)
-{
-   unsigned long bits = pte_val(entry) &
-   (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
-
-   unsigned long old = pte_val(*ptep);
-   *ptep = __pte(old | bits);
-
-   flush_tlb_page(vma, address);
-}
-
 #define pmd_ERROR(e) \
pr_err("%s:%d: bad pmd %08lx.\n", __FILE__, __LINE__, pmd_val(e))
 #define pgd_ERROR(e) \
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 464eb771db82..1493f0b09ae9 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -102,6 +102,23 @@ static inline void pte_clear(struct mm_struct *mm, 
unsigned long addr,
pte_update(mm, addr, ptep, ~0UL, 0, 0);
 }
 
+/* Set the dirty and/or accessed bits atomically in a linux PTE */
+#ifndef __ptep_set_access_flags
+static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
+  pte_t *ptep, pte_t entry,
+  unsigned long address,
+  int psize)
+{
+   unsigned long set = pte_val(entry) &
+   (_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | 
_PAGE_EXEC);
+   int huge = psize > mmu_virtual_psize ? 1 : 0;
+
+   pte_update(vma->vm_mm, address, ptep, 0, set, huge);
+
+   flush_tlb_page(vma, address);
+}
+#endif
+
 /* Generic accessors to PTE bits */
 #ifndef pte_mkwrite_novma
 static inline pte_t pte_mkwrite_novma(pte_t pte)
-- 
2.41.0



[PATCH v2 06/37] powerpc: Refactor update_mmu_cache_range()

2023-09-25 Thread Christophe Leroy
On nohash, this function is empty except for E500 with hugepages.

On book3s, this function is for hash MMUs only.

Combine those tests and rename E500 update_mmu_cache_range()
as __update_mmu_cache() which gets called by
update_mmu_cache_range().

Signed-off-by: Christophe Leroy 
---
v2: Fix the logic
---
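Illustration only, not part of the patch: the combined test written as a
pure predicate, so the four inputs can be checked in isolation.
needs_update_mmu_cache() is a hypothetical name; the parameters stand
for mmu_has_feature(MMU_FTR_HPTE_TABLE), radix_enabled(),
IS_ENABLED(CONFIG_PPC_E500) and IS_ENABLED(CONFIG_HUGETLB_PAGE).

```c
#include <stdbool.h>

/* True when update_mmu_cache_range() has work to hand to __update_mmu_cache(). */
static bool needs_update_mmu_cache(bool hpte_table, bool radix,
				   bool e500, bool hugetlb)
{
	return (hpte_table && !radix) || (e500 && hugetlb);
}
```

Hash MMUs without radix and E500 with hugepages are the only cases that
reach __update_mmu_cache(); every other combination is a no-op.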
 arch/powerpc/include/asm/book3s/pgtable.h | 24 ---
 arch/powerpc/include/asm/nohash/pgtable.h | 15 --
 arch/powerpc/include/asm/pgtable.h| 19 ++
 arch/powerpc/mm/nohash/e500_hugetlbpage.c |  3 +--
 4 files changed, 20 insertions(+), 41 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h 
b/arch/powerpc/include/asm/book3s/pgtable.h
index 6f4578daea6c..f42d68c6b314 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -8,28 +8,4 @@
 #include 
 #endif
 
-#ifndef __ASSEMBLY__
-void __update_mmu_cache(struct vm_area_struct *vma, unsigned long address, 
pte_t *ptep);
-
-/*
- * This gets called at the end of handling a page fault, when
- * the kernel has put a new PTE into the page table for the process.
- * We use it to ensure coherency between the i-cache and d-cache
- * for the page which has just been mapped in.
- * On machines which use an MMU hash table, we use this to put a
- * corresponding HPTE into the hash table ahead of time, instead of
- * waiting for the inevitable extra hash-table miss exception.
- */
-static inline void update_mmu_cache_range(struct vm_fault *vmf,
-   struct vm_area_struct *vma, unsigned long address,
-   pte_t *ptep, unsigned int nr)
-{
-   if (IS_ENABLED(CONFIG_PPC32) && !mmu_has_feature(MMU_FTR_HPTE_TABLE))
-   return;
-   if (radix_enabled())
-   return;
-   __update_mmu_cache(vma, address, ptep);
-}
-
-#endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 5b6647fb398b..a9056f4fad48 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -259,20 +259,5 @@ static inline int pud_huge(pud_t pud)
 #define is_hugepd(hpd) (hugepd_ok(hpd))
 #endif
 
-/*
- * This gets called at the end of handling a page fault, when
- * the kernel has put a new PTE into the page table for the process.
- * We use it to ensure coherency between the i-cache and d-cache
- * for the page which has just been mapped in.
- */
-#if defined(CONFIG_PPC_E500) && defined(CONFIG_HUGETLB_PAGE)
-void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
-   unsigned long address, pte_t *ptep, unsigned int nr);
-#else
-static inline void update_mmu_cache_range(struct vm_fault *vmf,
-   struct vm_area_struct *vma, unsigned long address,
-   pte_t *ptep, unsigned int nr) {}
-#endif
-
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index bcdbdeda65d3..966e7c5119f6 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -119,6 +119,25 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned 
long pfn,
  unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
+void __update_mmu_cache(struct vm_area_struct *vma, unsigned long address, 
pte_t *ptep);
+
+/*
+ * This gets called at the end of handling a page fault, when
+ * the kernel has put a new PTE into the page table for the process.
+ * We use it to ensure coherency between the i-cache and d-cache
+ * for the page which has just been mapped in.
+ * On machines which use an MMU hash table, we use this to put a
+ * corresponding HPTE into the hash table ahead of time, instead of
+ * waiting for the inevitable extra hash-table miss exception.
+ */
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+   struct vm_area_struct *vma, unsigned long address,
+   pte_t *ptep, unsigned int nr)
+{
+   if ((mmu_has_feature(MMU_FTR_HPTE_TABLE) && !radix_enabled()) ||
+   (IS_ENABLED(CONFIG_PPC_E500) && IS_ENABLED(CONFIG_HUGETLB_PAGE)))
+   __update_mmu_cache(vma, address, ptep);
+}
 
 /*
  * When used, PTE_FRAG_NR is defined in subarch pgtable.h
diff --git a/arch/powerpc/mm/nohash/e500_hugetlbpage.c b/arch/powerpc/mm/nohash/e500_hugetlbpage.c
index 6b30e40d4590..a134d28a0e4d 100644
--- a/arch/powerpc/mm/nohash/e500_hugetlbpage.c
+++ b/arch/powerpc/mm/nohash/e500_hugetlbpage.c
@@ -178,8 +178,7 @@ book3e_hugetlb_preload(struct vm_area_struct *vma, unsigned long ea, pte_t pte)
  *
  * This must always be called with the pte lock held.
  */
-void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
-   unsigned long address, pte_t *ptep, unsigned int nr)
+void __update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
 {
if (

[PATCH v2 10/37] powerpc/nohash: Move 8xx version of pte_update() into pte-8xx.h

2023-09-25 Thread Christophe Leroy
There is no point in keeping the 8xx-specific pte_update() in the
common header, so move it into pte-8xx.h.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 57 +---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h | 57 
 2 files changed, 58 insertions(+), 56 deletions(-)
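
As a rough illustration of the cell arithmetic described in the 8xx
comment that this patch moves, the sketch below counts how many 4k-sized
cells the pte_update() loop writes. It is a standalone model, not the
kernel code: demo_cells_written() and the constants are hypothetical
stand-ins, with page_size playing the role of PAGE_SIZE and num the
value returned by number_of_cells_per_pte().

```c
#include <assert.h>

/* Illustrative constants; the kernel takes these from <linux/sizes.h>. */
#define SZ_4K   0x1000UL
#define SZ_16K  0x4000UL
#define SZ_512K 0x80000UL

/*
 * Hypothetical model of the loop in the 8xx pte_update(): count the
 * number of 4k-sized cells written for a given PAGE_SIZE and a given
 * cell count "num".
 */
static int demo_cells_written(unsigned long page_size, int num)
{
	int step = (int)(page_size / SZ_4K);
	int written = 0;
	int i;

	for (i = 0; i < num; i += step) {
		written++;			/* *entry++ = new; */
		if (page_size == SZ_16K && num != 1)
			written += 3;		/* three more identical cells */
	}
	return written;
}

static void demo_check_cells(void)
{
	/* Standard 4k page on a 4k kernel: a single entry. */
	assert(demo_cells_written(SZ_4K, (int)(SZ_4K / SZ_4K)) == 1);
	/* Standard 16k page on a 16k kernel: 4 identical entries. */
	assert(demo_cells_written(SZ_16K, (int)(SZ_16K / SZ_4K)) == 4);
	/* 512k huge page: 128 cells, whatever the base page size. */
	assert(demo_cells_written(SZ_4K, (int)(SZ_512K / SZ_4K)) == 128);
	assert(demo_cells_written(SZ_16K, (int)(SZ_512K / SZ_4K)) == 128);
	/* hugepd case: one entry, even on a 16k kernel. */
	assert(demo_cells_written(SZ_16K, 1) == 1);
}
```

Calling demo_check_cells() from a test harness exercises the five
cases; the real function additionally advances "new" by PAGE_SIZE on
each iteration, which this model leaves out.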

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 26289e4e767c..be8bca42bdce 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -219,63 +219,8 @@ static inline void pmd_clear(pmd_t *pmdp)
  * that an executable user mapping was modified, which is needed
  * to properly flush the virtually tagged instruction cache of
  * those implementations.
- *
- * On the 8xx, the page tables are a bit special. For 16k pages, we have
- * 4 identical entries. For 512k pages, we have 128 entries as if it was
- * 4k pages, but they are flagged as 512k pages for the hardware.
- * For other page sizes, we have a single entry in the table.
  */
-#ifdef CONFIG_PPC_8xx
-static pmd_t *pmd_off(struct mm_struct *mm, unsigned long addr);
-static int hugepd_ok(hugepd_t hpd);
-
-static int number_of_cells_per_pte(pmd_t *pmd, pte_basic_t val, int huge)
-{
-   if (!huge)
-   return PAGE_SIZE / SZ_4K;
-   else if (hugepd_ok(*((hugepd_t *)pmd)))
-   return 1;
-   else if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !(val & _PAGE_HUGE))
-   return SZ_16K / SZ_4K;
-   else
-   return SZ_512K / SZ_4K;
-}
-
-static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p,
-                                    unsigned long clr, unsigned long set, int huge)
-{
-   pte_basic_t *entry = (pte_basic_t *)p;
-   pte_basic_t old = pte_val(*p);
-   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
-   int num, i;
-   pmd_t *pmd = pmd_off(mm, addr);
-
-   num = number_of_cells_per_pte(pmd, new, huge);
-
-   for (i = 0; i < num; i += PAGE_SIZE / SZ_4K, new += PAGE_SIZE) {
-   *entry++ = new;
-   if (IS_ENABLED(CONFIG_PPC_16K_PAGES) && num != 1) {
-   *entry++ = new;
-   *entry++ = new;
-   *entry++ = new;
-   }
-   }
-
-   return old;
-}
-
-#ifdef CONFIG_PPC_16K_PAGES
-#define ptep_get ptep_get
-static inline pte_t ptep_get(pte_t *ptep)
-{
-   pte_basic_t val = READ_ONCE(ptep->pte);
-   pte_t pte = {val, val, val, val};
-
-   return pte;
-}
-#endif /* CONFIG_PPC_16K_PAGES */
-
-#else
+#ifndef pte_update
 static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p,
                                     unsigned long clr, unsigned long set, int huge)
 {
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index e6fe1d5731f2..52395a5ecd70 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -187,6 +187,63 @@ static inline unsigned long pte_leaf_size(pte_t pte)
 
 #define pte_leaf_size pte_leaf_size
 
+/*
+ * On the 8xx, the page tables are a bit special. For 16k pages, we have
+ * 4 identical entries. For 512k pages, we have 128 entries as if it was
+ * 4k pages, but they are flagged as 512k pages for the hardware.
+ * For other page sizes, we have a single entry in the table.
+ */
+static pmd_t *pmd_off(struct mm_struct *mm, unsigned long addr);
+static int hugepd_ok(hugepd_t hpd);
+
+static inline int number_of_cells_per_pte(pmd_t *pmd, pte_basic_t val, int huge)
+{
+   if (!huge)
+   return PAGE_SIZE / SZ_4K;
+   else if (hugepd_ok(*((hugepd_t *)pmd)))
+   return 1;
+   else if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !(val & _PAGE_HUGE))
+   return SZ_16K / SZ_4K;
+   else
+   return SZ_512K / SZ_4K;
+}
+
+static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p,
+                                    unsigned long clr, unsigned long set, int huge)
+{
+   pte_basic_t *entry = (pte_basic_t *)p;
+   pte_basic_t old = pte_val(*p);
+   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
+   int num, i;
+   pmd_t *pmd = pmd_off(mm, addr);
+
+   num = number_of_cells_per_pte(pmd, new, huge);
+
+   for (i = 0; i < num; i += PAGE_SIZE / SZ_4K, new += PAGE_SIZE) {
+   *entry++ = new;
+   if (IS_ENABLED(CONFIG_PPC_16K_PAGES) && num != 1) {
+   *entry++ = new;
+   *entry++ = new;
+   *entry++ = new;
+   }
+   }
+
+   return old;
+}
+
+#define pte_update pte_update
+
+#ifdef CONFIG_PPC_16K_PAGES
+#define ptep_get ptep_get
+static inline pte_t ptep_get(pte_t *ptep)
+{
+   pte_basic_t val = READ_ONCE(ptep->pte);
+   pte_t pte = {val, val, val, val

[PATCH v2 20/37] powerpc/e500: Simplify pte_mkexec()

2023-09-25 Thread Christophe Leroy
Commit b6cb20fdc273 ("powerpc/book3e: Fix set_memory_x() and
set_memory_nx()") implemented a more elaborate version of
pte_mkexec() suitable for both kernel and user pages. That was
needed because set_memory_x() was using pte_mkexec(). But since
commit a4c182ecf335 ("powerpc/set_memory: Avoid spinlock recursion
in change_page_attr()") pte_mkexec() is no longer used by
set_memory_x(), so pte_mkexec() can be simplified as it is only
used for user pages.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pte-e500.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)
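
A minimal sketch of what the simplification amounts to, modelled outside
the kernel. The DEMO_BAP_* bit values and the demo_* function names are
hypothetical; only the shape of the two expressions is taken from the
diff in this patch. For a user page (user-read bit set) the old and new
versions agree; they differ only for kernel pages, which no longer go
through this helper according to the commit message.

```c
#include <assert.h>

/* Hypothetical bit positions, chosen only for the demonstration. */
#define DEMO_BAP_SX 0x1UL	/* supervisor execute */
#define DEMO_BAP_UX 0x2UL	/* user execute */
#define DEMO_BAP_UR 0x4UL	/* user read */

/* Shape of the previous helper: choose UX or SX based on UR. */
static unsigned long demo_mkexec_old(unsigned long v)
{
	if (v & DEMO_BAP_UR)
		return (v & ~DEMO_BAP_SX) | DEMO_BAP_UX;
	return (v & ~DEMO_BAP_UX) | DEMO_BAP_SX;
}

/* Shape of the simplified helper: always set UX and clear SX. */
static unsigned long demo_mkexec_new(unsigned long v)
{
	return (v & ~DEMO_BAP_SX) | DEMO_BAP_UX;
}

static void demo_check_mkexec(void)
{
	/* User page: both versions give the same result. */
	assert(demo_mkexec_old(DEMO_BAP_UR) == demo_mkexec_new(DEMO_BAP_UR));
	assert(demo_mkexec_new(DEMO_BAP_UR) == (DEMO_BAP_UR | DEMO_BAP_UX));
	/* Kernel page: results differ, hence the "user pages only" caveat. */
	assert(demo_mkexec_old(0) == DEMO_BAP_SX);
	assert(demo_mkexec_new(0) == DEMO_BAP_UX);
}
```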

diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h
index d8924cbd61e4..99288e26b6b0 100644
--- a/arch/powerpc/include/asm/nohash/pte-e500.h
+++ b/arch/powerpc/include/asm/nohash/pte-e500.h
@@ -115,10 +115,7 @@ static inline pte_t pte_mkuser(pte_t pte)
 
 static inline pte_t pte_mkexec(pte_t pte)
 {
-   if (pte_val(pte) & _PAGE_BAP_UR)
-   return __pte((pte_val(pte) & ~_PAGE_BAP_SX) | _PAGE_BAP_UX);
-   else
-   return __pte((pte_val(pte) & ~_PAGE_BAP_UX) | _PAGE_BAP_SX);
+   return __pte((pte_val(pte) & ~_PAGE_BAP_SX) | _PAGE_BAP_UX);
 }
 #define pte_mkexec pte_mkexec
 
-- 
2.41.0



[PATCH v2 21/37] powerpc: Implement and use pgprot_nx()

2023-09-25 Thread Christophe Leroy
ioremap_page_range(), vmap() and vmap_pfn() clear execute permission
by calling pgprot_nx().

When pgprot_nx() is not defined it falls back to a nop.

Implement it for powerpc then use it in early_ioremap_range().

Then the call to pte_exprotect() can be removed from ioremap_prot().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/pgtable.h | 6 ++
 arch/powerpc/mm/ioremap.c  | 5 ++---
 2 files changed, 8 insertions(+), 3 deletions(-)
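
The idea can be sketched as follows, assuming a single execute bit in
the protection value. Everything named demo_* or DEMO_* is hypothetical;
the real pgprot_nx() in this patch goes through pte_exprotect(), whose
effect depends on the platform's PTE layout.

```c
#include <assert.h>

/* Hypothetical protection type and bits, for illustration only. */
typedef struct { unsigned long pgprot; } demo_pgprot_t;

#define DEMO_PAGE_PRESENT 0x1UL
#define DEMO_PAGE_RW      0x2UL
#define DEMO_PAGE_EXEC    0x4UL

/* Return the same protection with execute permission removed. */
static demo_pgprot_t demo_pgprot_nx(demo_pgprot_t prot)
{
	prot.pgprot &= ~DEMO_PAGE_EXEC;
	return prot;
}

static void demo_check_pgprot_nx(void)
{
	demo_pgprot_t rwx = { DEMO_PAGE_PRESENT | DEMO_PAGE_RW | DEMO_PAGE_EXEC };
	demo_pgprot_t rw  = { DEMO_PAGE_PRESENT | DEMO_PAGE_RW };

	/* The execute bit is dropped, the other bits are kept. */
	assert(demo_pgprot_nx(rwx).pgprot == rw.pgprot);
	/* Applying it to an already non-executable protection changes nothing. */
	assert(demo_pgprot_nx(rw).pgprot == rw.pgprot);
}
```

Filtering the protection at the mapping call site, as
early_ioremap_range() does in this patch, means a caller cannot create
an executable I/O mapping by passing a permissive prot value.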

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 966e7c5119f6..2bfb7dd3b49e 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -71,6 +71,12 @@ static inline pgprot_t pte_pgprot(pte_t pte)
return __pgprot(pte_flags);
 }
 
+static inline pgprot_t pgprot_nx(pgprot_t prot)
+{
+   return pte_pgprot(pte_exprotect(__pte(pgprot_val(prot))));
+}
+#define pgprot_nx pgprot_nx
+
 #ifndef pmd_page_vaddr
 static inline const void *pmd_page_vaddr(pmd_t pmd)
 {
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 705e8e8ffde4..d5159f205380 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -50,8 +50,7 @@ void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long flags)
if (pte_write(pte))
pte = pte_mkdirty(pte);
 
-   /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
-   pte = pte_exprotect(pte);
+   /* we don't want to let _PAGE_USER leak out */
pte = pte_mkprivileged(pte);
 
if (iowa_is_active())
@@ -66,7 +65,7 @@ int early_ioremap_range(unsigned long ea, phys_addr_t pa,
unsigned long i;
 
for (i = 0; i < size; i += PAGE_SIZE) {
-   int err = map_kernel_page(ea + i, pa + i, prot);
+   int err = map_kernel_page(ea + i, pa + i, pgprot_nx(prot));
 
if (WARN_ON_ONCE(err))  /* Should clean up */
return err;
-- 
2.41.0



[PATCH v2 07/37] powerpc: Untangle fixmap.h and pgtable.h and mmu.h

2023-09-25 Thread Christophe Leroy
fixmap.h needs pgtable.h for [un]map_kernel_page().

pgtable.h needs fixmap.h for FIXADDR_TOP.

Untangle the two files by moving FIXADDR_TOP into pgtable.h.

Also move VIRT_IMMR_BASE to fixmap.h to avoid including fixmap.h in mmu.h.

Signed-off-by: Christophe Leroy 
---
v2: Add asm/fixmap.h to platforms/83xx/misc.c
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  9 -
 arch/powerpc/include/asm/book3s/64/pgtable.h |  1 +
 arch/powerpc/include/asm/fixmap.h| 16 
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h |  1 -
 arch/powerpc/include/asm/nohash/32/pgtable.h |  9 -
 arch/powerpc/include/asm/nohash/64/pgtable.h |  1 +
 arch/powerpc/mm/init_32.c|  1 +
 arch/powerpc/mm/mem.c|  1 +
 arch/powerpc/mm/nohash/8xx.c |  2 ++
 arch/powerpc/platforms/83xx/misc.c   |  2 ++
 arch/powerpc/platforms/8xx/cpm1.c|  1 +
 11 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 543c3691839b..45b69ae2631e 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -170,7 +170,14 @@ void unmap_kernel_page(unsigned long va);
  * value (for now) on others, from where we can start layout kernel
  * virtual space that goes below PKMAP and FIXMAP
  */
-#include 
+
+#define FIXADDR_SIZE   0
+#ifdef CONFIG_KASAN
+#include 
+#define FIXADDR_TOP    (KASAN_SHADOW_START - PAGE_SIZE)
+#else
+#define FIXADDR_TOP    ((unsigned long)(-PAGE_SIZE))
+#endif
 
 /*
  * ioremap_bot starts at that address. Early ioremaps move down from there,
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 7c4ad1e03a49..dbd545e73161 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -316,6 +316,7 @@ extern unsigned long pci_io_base;
 #define IOREMAP_START  (ioremap_bot)
 #define IOREMAP_END    (KERN_IO_END - FIXADDR_SIZE)
 #define FIXADDR_SIZE   SZ_32M
+#define FIXADDR_TOP    (IOREMAP_END + FIXADDR_SIZE)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/asm/fixmap.h b/arch/powerpc/include/asm/fixmap.h
index a832aeafe560..f9068dd8dfce 100644
--- a/arch/powerpc/include/asm/fixmap.h
+++ b/arch/powerpc/include/asm/fixmap.h
@@ -23,18 +23,6 @@
 #include 
 #endif
 
-#ifdef CONFIG_PPC64
-#define FIXADDR_TOP    (IOREMAP_END + FIXADDR_SIZE)
-#else
-#define FIXADDR_SIZE   0
-#ifdef CONFIG_KASAN
-#include 
-#define FIXADDR_TOP    (KASAN_SHADOW_START - PAGE_SIZE)
-#else
-#define FIXADDR_TOP    ((unsigned long)(-PAGE_SIZE))
-#endif
-#endif
-
 /*
  * Here we define all the compile-time 'special' virtual
  * addresses. The point is to have a constant address at
@@ -119,5 +107,9 @@ static inline void __set_fixmap(enum fixed_addresses idx,
 
 #define __early_set_fixmap __set_fixmap
 
+#ifdef CONFIG_PPC_8xx
+#define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 0e93a4728c9e..141d82e249a8 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -188,7 +188,6 @@ typedef struct {
 } mm_context_t;
 
 #define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff8)
-#define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
 
 /* Page size definitions, common between 32 and 64-bit
  *
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 868aecbec8d1..c8311ee08811 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -71,7 +71,14 @@ void unmap_kernel_page(unsigned long va);
  * value (for now) on others, from where we can start layout kernel
  * virtual space that goes below PKMAP and FIXMAP
  */
-#include 
+
+#define FIXADDR_SIZE   0
+#ifdef CONFIG_KASAN
+#include 
+#define FIXADDR_TOP    (KASAN_SHADOW_START - PAGE_SIZE)
+#else
+#define FIXADDR_TOP    ((unsigned long)(-PAGE_SIZE))
+#endif
 
 /*
  * ioremap_bot starts at that address. Early ioremaps move down from there,
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 8083c04a1e6d..dee3fc654d40 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -57,6 +57,7 @@
 #define IOREMAP_START  (ioremap_bot)
 #define IOREMAP_END    (KERN_IO_START + KERN_IO_SIZE - FIXADDR_SIZE)
 #define FIXADDR_SIZE   SZ_32M
+#define FIXADDR_TOP    (IOREMAP_END + FIXADDR_SIZE)
 
 /*
  * Defines the address of the vmemap area, in its own region on
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index d8adc452f431..4e71dfe7d026 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -39,6 +39,7 @@
 #inc

[PATCH v2 03/37] powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macro

2023-09-25 Thread Christophe Leroy
40x TLB handlers were reworked by commit 2c74e2586bb9 ("powerpc/40x:
Rework 40x PTE access and TLB miss") to not require PTE_ATOMIC_UPDATES
anymore.

Then commit 4e1df545e2fa ("powerpc/pgtable: Drop PTE_ATOMIC_UPDATES")
removed all code related to PTE_ATOMIC_UPDATES.

Remove the leftover PTE_ATOMIC_UPDATES macro.

Fixes: 2c74e2586bb9 ("powerpc/40x: Rework 40x PTE access and TLB miss")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-40x.h | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 6fe46e754556..0b4e5f8ce3e8 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -69,9 +69,6 @@
 
 #define _PTE_NONE_MASK 0
 
-/* Until my rework is finished, 40x still needs atomic PTE updates */
-#define PTE_ATOMIC_UPDATES 1
-
 #define _PAGE_BASE_NC  (_PAGE_PRESENT | _PAGE_ACCESSED)
 #define _PAGE_BASE (_PAGE_BASE_NC)
 
-- 
2.41.0



[PATCH v2 09/37] powerpc/nohash: Refactor declaration of {map/unmap}_kernel_page()

2023-09-25 Thread Christophe Leroy
map_kernel_page() and unmap_kernel_page() have the same prototypes
on nohash/32 and nohash/64, so keep only one declaration.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 8 
 arch/powerpc/include/asm/nohash/64/pgtable.h | 2 --
 arch/powerpc/include/asm/nohash/pgtable.h| 3 +++
 arch/powerpc/mm/nohash/book3e_pgtable.c  | 2 +-
 4 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index c8311ee08811..26289e4e767c 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -58,14 +58,6 @@ extern int icache_44x_need_flush;
 #define pgd_ERROR(e) \
pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 
-#ifndef __ASSEMBLY__
-
-int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
-void unmap_kernel_page(unsigned long va);
-
-#endif /* !__ASSEMBLY__ */
-
-
 /*
  * This is the bottom of the PKMAP area with HIGHMEM or an arbitrary
  * value (for now) on others, from where we can start layout kernel
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index dee3fc654d40..f5a8e8a9dba4 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -309,8 +309,6 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
 /* We borrow MSB 56 (LSB 7) to store the exclusive marker in swap PTEs. */
 #define _PAGE_SWP_EXCLUSIVE    0x80
 
-int map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot);
-void unmap_kernel_page(unsigned long va);
 extern int __meminit vmemmap_create_mapping(unsigned long start,
unsigned long page_size,
unsigned long phys);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index ab26af2b421a..3d684b500fe6 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -242,5 +242,8 @@ static inline int pud_huge(pud_t pud)
 #define is_hugepd(hpd) (hugepd_ok(hpd))
 #endif
 
+int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot);
+void unmap_kernel_page(unsigned long va);
+
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c
index b80fc4a91a53..1c5e4ecbebeb 100644
--- a/arch/powerpc/mm/nohash/book3e_pgtable.c
+++ b/arch/powerpc/mm/nohash/book3e_pgtable.c
@@ -71,7 +71,7 @@ static void __init *early_alloc_pgtable(unsigned long size)
  * map_kernel_page adds an entry to the ioremap page table
  * and adds an entry to the HPT, possibly bolting it
  */
-int __ref map_kernel_page(unsigned long ea, unsigned long pa, pgprot_t prot)
+int __ref map_kernel_page(unsigned long ea, phys_addr_t pa, pgprot_t prot)
 {
pgd_t *pgdp;
p4d_t *p4dp;
-- 
2.41.0



[PATCH v2 08/37] powerpc/nohash: Remove {pte/pmd}_protnone()

2023-09-25 Thread Christophe Leroy
Only book3s/64 selects ARCH_SUPPORTS_NUMA_BALANCING so
CONFIG_NUMA_BALANCING can't be selected on nohash targets.

Remove pte_protnone() and pmd_protnone().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/pgtable.h | 17 -
 1 file changed, 17 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index a9056f4fad48..ab26af2b421a 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -35,23 +35,6 @@ static inline bool pte_hashpte(pte_t pte)    { return false; }
 static inline bool pte_ci(pte_t pte)   { return pte_val(pte) & _PAGE_NO_CACHE; }
 static inline bool pte_exec(pte_t pte) { return pte_val(pte) & _PAGE_EXEC; }
 
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/linux/pgtable.h . On powerpc, this will only
- * work for user pages and always return true for kernel pages.
- */
-static inline int pte_protnone(pte_t pte)
-{
-   return pte_present(pte) && !pte_user(pte);
-}
-
-static inline int pmd_protnone(pmd_t pmd)
-{
-   return pte_protnone(pmd_pte(pmd));
-}
-#endif /* CONFIG_NUMA_BALANCING */
-
 static inline int pte_present(pte_t pte)
 {
return pte_val(pte) & _PAGE_PRESENT;
-- 
2.41.0



[PATCH v2 13/37] powerpc/nohash: Refactor checking of no-change in pte_update()

2023-09-25 Thread Christophe Leroy
On nohash/64, a few callers of pte_update() check if there is
really a change in order to avoid an unnecessary write.

Refactor that inside pte_update().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 9 -
 arch/powerpc/include/asm/nohash/pgtable.h| 3 +++
 2 files changed, 3 insertions(+), 9 deletions(-)
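
A standalone sketch of the refactoring, assuming a PTE that is a plain
unsigned long. The demo_* names, the DEMO_* bits and the write counter
are hypothetical and exist only to make the skipped store observable;
the early return mirrors the "new == old" test added by this patch.

```c
#include <assert.h>

#define DEMO_PAGE_PRESENT 0x1UL
#define DEMO_PAGE_RW      0x2UL

/*
 * Hypothetical model of pte_update(): clear "clr", set "set", and skip
 * the store when nothing changes. *writes counts the stores performed.
 */
static unsigned long demo_pte_update(unsigned long *p, unsigned long clr,
				     unsigned long set, int *writes)
{
	unsigned long old = *p;
	unsigned long new = (old & ~clr) | set;

	if (new == old)
		return old;	/* no change, no write */

	*p = new;
	(*writes)++;
	return old;
}

/* The caller no longer needs its own "is _PAGE_RW set?" test. */
static void demo_set_wrprotect(unsigned long *p, int *writes)
{
	demo_pte_update(p, DEMO_PAGE_RW, 0, writes);
}

static void demo_check_no_change(void)
{
	unsigned long pte = DEMO_PAGE_PRESENT | DEMO_PAGE_RW;
	int writes = 0;

	demo_set_wrprotect(&pte, &writes);	/* clears RW: one store */
	assert(pte == DEMO_PAGE_PRESENT && writes == 1);

	demo_set_wrprotect(&pte, &writes);	/* already read-only: no store */
	assert(pte == DEMO_PAGE_PRESENT && writes == 1);
}
```

Putting the check in one place keeps the callers short and extends the
same saving to every other user of the helper.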

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index b149a39f2685..cba08a62c52c 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -181,8 +181,6 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
 {
unsigned long old;
 
-   if (!pte_young(*ptep))
-   return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
 }
@@ -198,10 +196,6 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
 {
-
-   if ((pte_val(*ptep) & _PAGE_RW) == 0)
-   return;
-
pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
 }
 
@@ -209,9 +203,6 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
 {
-   if ((pte_val(*ptep) & _PAGE_RW) == 0)
-   return;
-
pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
 }
 
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index bd5c3a4baabd..8adaacbbdd1d 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -47,6 +47,9 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
pte_basic_t old = pte_val(*p);
pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
 
+   if (new == old)
+   return old;
+
*p = __pte(new);
 
if (IS_ENABLED(CONFIG_44x) && (old & _PAGE_USER) && (old & _PAGE_EXEC))
-- 
2.41.0



[PATCH v2 18/37] powerpc/nohash: Refactor pte_clear()

2023-09-25 Thread Christophe Leroy
pte_clear() does the same thing on nohash/32 and nohash/64.

Keep the static inline version of nohash/64, make it common and
remove the macro version of nohash/32.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 3 ---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 7 ---
 arch/powerpc/include/asm/nohash/pgtable.h| 6 ++
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 0be464af4cb1..481594097f46 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -153,9 +153,6 @@
 
 #ifndef __ASSEMBLY__
 
-#define pte_clear(mm, addr, ptep) \
-   do { pte_update(mm, addr, ptep, ~0, 0, 0); } while (0)
-
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #define pmd_bad(pmd)   (pmd_val(pmd) & _PMD_BAD)
 #define pmd_present(pmd)       (pmd_val(pmd) & _PMD_PRESENT_MASK)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index dc6e35c3a53f..b59fbf754f82 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -159,13 +159,6 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
__young;\
 })
 
-static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
-pte_t * ptep)
-{
-   pte_update(mm, addr, ptep, ~0UL, 0, 0);
-}
-
-
 /* Set the dirty and/or accessed bits atomically in a linux PTE */
 static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
   pte_t *ptep, pte_t entry,
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 7e810a84ac15..464eb771db82 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -96,6 +96,12 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 }
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
+pte_t * ptep)
+{
+   pte_update(mm, addr, ptep, ~0UL, 0, 0);
+}
+
 /* Generic accessors to PTE bits */
 #ifndef pte_mkwrite_novma
 static inline pte_t pte_mkwrite_novma(pte_t pte)
-- 
2.41.0



[PATCH v2 16/37] powerpc/nohash: Refactor ptep_test_and_clear_young()

2023-09-25 Thread Christophe Leroy
Remove the ptep_test_and_clear_young() macro, make
__ptep_test_and_clear_young() common to nohash/32 and nohash/64,
and rename it to ptep_test_and_clear_young().

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 11 ---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 19 +--
 arch/powerpc/include/asm/nohash/pgtable.h| 11 +++
 3 files changed, 12 insertions(+), 29 deletions(-)
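
The resulting helper can be modelled as below: a function that takes the
VMA directly instead of a macro that unwraps (__vma)->vm_mm for a
double-underscore worker. The structures, the DEMO_PAGE_ACCESSED value
and the demo_* names are hypothetical stand-ins for the kernel types.

```c
#include <assert.h>

#define DEMO_PAGE_PRESENT  0x01UL
#define DEMO_PAGE_ACCESSED 0x20UL

/* Hypothetical stand-ins for struct mm_struct and struct vm_area_struct. */
struct demo_mm { int unused; };
struct demo_vma { struct demo_mm *vm_mm; };

/* Simplified pte_update(): returns the old value, applies clr/set. */
static unsigned long demo_pte_update(struct demo_mm *mm, unsigned long *ptep,
				     unsigned long clr, unsigned long set)
{
	unsigned long old = *ptep;

	(void)mm;	/* the model keeps no per-mm state */
	*ptep = (old & ~clr) | set;
	return old;
}

/* Clear the accessed bit and report whether it was set. */
static int demo_ptep_test_and_clear_young(struct demo_vma *vma,
					  unsigned long *ptep)
{
	unsigned long old;

	old = demo_pte_update(vma->vm_mm, ptep, DEMO_PAGE_ACCESSED, 0);

	return (old & DEMO_PAGE_ACCESSED) != 0;
}

static void demo_check_young(void)
{
	struct demo_mm mm = { 0 };
	struct demo_vma vma = { &mm };
	unsigned long pte = DEMO_PAGE_PRESENT | DEMO_PAGE_ACCESSED;

	assert(demo_ptep_test_and_clear_young(&vma, &pte) == 1);
	assert(pte == DEMO_PAGE_PRESENT);
	assert(demo_ptep_test_and_clear_young(&vma, &pte) == 0);
}
```

A typed inline function also gives callers argument checking that the
previous macro wrapper did not.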

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index de51f78449a0..b7605000bd91 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -164,17 +164,6 @@ static inline void pmd_clear(pmd_t *pmdp)
*pmdp = __pmd(0);
 }
 
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
- unsigned long addr, pte_t *ptep)
-{
-   unsigned long old;
-   old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
-   return (old & _PAGE_ACCESSED) != 0;
-}
-#define ptep_test_and_clear_young(__vma, __addr, __ptep) \
-   __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep)
-
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
   pte_t *ptep)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index e8bbc6ec1084..56041036fa34 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -145,22 +145,6 @@ static inline void p4d_set(p4d_t *p4dp, unsigned long val)
*p4dp = __p4d(val);
 }
 
-static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
- unsigned long addr, pte_t *ptep)
-{
-   unsigned long old;
-
-   old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
-   return (old & _PAGE_ACCESSED) != 0;
-}
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-#define ptep_test_and_clear_young(__vma, __addr, __ptep)  \
-({\
-   int __r;   \
-   __r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
-   __r;   \
-})
-
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
@@ -178,8 +162,7 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
 #define ptep_clear_flush_young(__vma, __address, __ptep)   \
 ({ \
-   int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
- __ptep);  \
+   int __young = ptep_test_and_clear_young(__vma, __address, __ptep);\
__young;\
 })
 
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 21f232d2e34f..2b043b72f642 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -69,6 +69,17 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
 }
 #endif
 
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
+   unsigned long addr, pte_t *ptep)
+{
+   unsigned long old;
+
+   old = pte_update(vma->vm_mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
+
+   return (old & _PAGE_ACCESSED) != 0;
+}
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+
 /* Generic accessors to PTE bits */
 #ifndef pte_mkwrite_novma
 static inline pte_t pte_mkwrite_novma(pte_t pte)
-- 
2.41.0



[PATCH v2 15/37] powerpc/nohash: Deduplicate pte helpers

2023-09-25 Thread Christophe Leroy
Deduplicate the following helpers, which are identical on
nohash/32 and nohash/64:
  pte_mkwrite_novma()
  pte_mkdirty()
  pte_mkyoung()
  pte_wrprotect()
  pte_mkexec()
  pte_young()

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 36 
 arch/powerpc/include/asm/nohash/64/pgtable.h | 25 --
 arch/powerpc/include/asm/nohash/pgtable.h| 36 
 3 files changed, 36 insertions(+), 61 deletions(-)
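
The "#ifndef helper" guards kept around some of the moved helpers rely
on a common kernel idiom: a platform header defines its own version and
then defines a macro of the same name, so the common header skips its
generic fallback. The sketch below shows that idiom in a single file;
the demo_* names and DEMO_* bits are hypothetical.

```c
#include <assert.h>

#define DEMO_PAGE_RW   0x2UL
#define DEMO_PAGE_EXEC 0x4UL
#define DEMO_BAP_UX    0x8UL

/* --- what a platform header would provide (hypothetical) --- */
static inline unsigned long demo_pte_mkexec(unsigned long v)
{
	return v | DEMO_BAP_UX;		/* platform-specific execute bit */
}
#define demo_pte_mkexec demo_pte_mkexec	/* advertise the override */

/* --- what the common header would provide --- */
#ifndef demo_pte_mkexec
static inline unsigned long demo_pte_mkexec(unsigned long v)
{
	return v | DEMO_PAGE_EXEC;	/* generic fallback, skipped here */
}
#endif

#ifndef demo_pte_wrprotect
static inline unsigned long demo_pte_wrprotect(unsigned long v)
{
	return v & ~DEMO_PAGE_RW;	/* generic fallback, used here */
}
#endif

static void demo_check_override(void)
{
	/* The platform version wins for mkexec... */
	assert(demo_pte_mkexec(0) == DEMO_BAP_UX);
	/* ...while wrprotect falls back to the generic version. */
	assert(demo_pte_wrprotect(DEMO_PAGE_RW | DEMO_PAGE_EXEC) == DEMO_PAGE_EXEC);
}
```

This only works if the platform header is included before the common
definitions, which is why include order matters in headers built this
way.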

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index a39ecd498084..de51f78449a0 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -156,37 +156,6 @@
 #define pte_clear(mm, addr, ptep) \
do { pte_update(mm, addr, ptep, ~0, 0, 0); } while (0)
 
-#ifndef pte_mkwrite_novma
-static inline pte_t pte_mkwrite_novma(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_RW);
-}
-#endif
-
-static inline pte_t pte_mkdirty(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkyoung(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_ACCESSED);
-}
-
-#ifndef pte_wrprotect
-static inline pte_t pte_wrprotect(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_RW);
-}
-#endif
-
-#ifndef pte_mkexec
-static inline pte_t pte_mkexec(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_EXEC);
-}
-#endif
-
 #define pmd_none(pmd)  (!pmd_val(pmd))
 #define pmd_bad(pmd)   (pmd_val(pmd) & _PMD_BAD)
 #define pmd_present(pmd)       (pmd_val(pmd) & _PMD_PRESENT_MASK)
@@ -238,11 +207,6 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
 }
 #endif
 
-static inline int pte_young(pte_t pte)
-{
-   return pte_val(pte) & _PAGE_ACCESSED;
-}
-
 /*
  * Note that on Book E processors, the pmd contains the kernel virtual
  * (lowmem) address of the pte page.  The physical address is less useful
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 34a518a1c04d..e8bbc6ec1084 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -80,26 +80,6 @@
 #ifndef __ASSEMBLY__
 /* pte_clear moved to later in this file */
 
-static inline pte_t pte_mkwrite_novma(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_RW);
-}
-
-static inline pte_t pte_mkdirty(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkyoung(pte_t pte)
-{
-   return __pte(pte_val(pte) | _PAGE_ACCESSED);
-}
-
-static inline pte_t pte_wrprotect(pte_t pte)
-{
-   return __pte(pte_val(pte) & ~_PAGE_RW);
-}
-
 #define PMD_BAD_BITS   (PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS   (PMD_TABLE_SIZE-1)
 
@@ -165,11 +145,6 @@ static inline void p4d_set(p4d_t *p4dp, unsigned long val)
*p4dp = __p4d(val);
 }
 
-static inline int pte_young(pte_t pte)
-{
-   return pte_val(pte) & _PAGE_ACCESSED;
-}
-
 static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
 {
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index c64a040f4a6a..21f232d2e34f 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -70,6 +70,37 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p
 #endif
 
 /* Generic accessors to PTE bits */
+#ifndef pte_mkwrite_novma
+static inline pte_t pte_mkwrite_novma(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_RW);
+}
+#endif
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+#ifndef pte_wrprotect
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+   return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+#endif
+
+#ifndef pte_mkexec
+static inline pte_t pte_mkexec(pte_t pte)
+{
+   return __pte(pte_val(pte) | _PAGE_EXEC);
+}
+#endif
+
 #ifndef pte_write
 static inline int pte_write(pte_t pte)
 {
@@ -96,6 +127,11 @@ static inline bool pte_hw_valid(pte_t pte)
return pte_val(pte) & _PAGE_PRESENT;
 }
 
+static inline int pte_young(pte_t pte)
+{
+   return pte_val(pte) & _PAGE_ACCESSED;
+}
+
 /*
  * Don't just check for any non zero bits in __PAGE_USER, since for book3e
  * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
-- 
2.41.0



[PATCH v2 14/37] powerpc/nohash: Deduplicate _PAGE_CHG_MASK

2023-09-25 Thread Christophe Leroy
_PAGE_CHG_MASK is identical between nohash/32 and nohash/64,
deduplicate it.

While at it, clean up the #ifdef for PTE_RPN_MASK in nohash/32, as
that file is only used with CONFIG_PPC32.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 8 +---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 6 --
 arch/powerpc/include/asm/nohash/pgtable.h| 6 ++
 3 files changed, 7 insertions(+), 13 deletions(-)
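
For readers unfamiliar with the mask, here is a small sketch of how such
a "preserve across pgprot changes" mask is typically applied, assuming a
pte_modify()-style helper that keeps the masked bits and takes
everything else from the new protection. The bit values, the shift and
the demo_* names are hypothetical.

```c
#include <assert.h>

/* Hypothetical layout: flags in the low 12 bits, page number above. */
#define DEMO_RPN_SHIFT     12
#define DEMO_RPN_MASK      (~((1UL << DEMO_RPN_SHIFT) - 1))
#define DEMO_PAGE_PRESENT  0x01UL
#define DEMO_PAGE_RW       0x02UL
#define DEMO_PAGE_DIRTY    0x10UL
#define DEMO_PAGE_ACCESSED 0x20UL
#define DEMO_PAGE_SPECIAL  0x40UL

/* Same composition as _PAGE_CHG_MASK in this patch. */
#define DEMO_CHG_MASK (DEMO_RPN_MASK | DEMO_PAGE_DIRTY | \
		       DEMO_PAGE_ACCESSED | DEMO_PAGE_SPECIAL)

/* Keep the page number and state bits, replace the permission bits. */
static unsigned long demo_pte_modify(unsigned long pte, unsigned long newprot)
{
	return (pte & DEMO_CHG_MASK) | newprot;
}

static void demo_check_chg_mask(void)
{
	unsigned long pte = (5UL << DEMO_RPN_SHIFT) | DEMO_PAGE_DIRTY |
			    DEMO_PAGE_RW | DEMO_PAGE_PRESENT;
	unsigned long ro = DEMO_PAGE_PRESENT;	/* new, read-only protection */

	/* RW is dropped; the page number and the dirty bit survive. */
	assert(demo_pte_modify(pte, ro) ==
	       ((5UL << DEMO_RPN_SHIFT) | DEMO_PAGE_DIRTY | DEMO_PAGE_PRESENT));
}
```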

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index ae7f3c8afd4f..a39ecd498084 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -143,7 +143,7 @@
  * The mask covered by the RPN must be a ULL on 32-bit platforms with
  * 64-bit PTEs.
  */
-#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+#ifdef CONFIG_PTE_64BIT
 #define PTE_RPN_MASK   (~((1ULL << PTE_RPN_SHIFT) - 1))
 #define MAX_POSSIBLE_PHYSMEM_BITS 36
 #else
@@ -151,12 +151,6 @@
 #define MAX_POSSIBLE_PHYSMEM_BITS 32
 #endif
 
-/*
- * _PAGE_CHG_MASK masks of bits that are to be preserved across
- * pgprot changes.
- */
-#define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_DIRTY | _PAGE_ACCESSED | 
_PAGE_SPECIAL)
-
 #ifndef __ASSEMBLY__
 
 #define pte_clear(mm, addr, ptep) \
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index cba08a62c52c..34a518a1c04d 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -75,12 +75,6 @@
 
 #define PTE_RPN_MASK   (~((1UL << PTE_RPN_SHIFT) - 1))
 
-/*
- * _PAGE_CHG_MASK masks of bits that are to be preserved across
- * pgprot changes.
- */
-#define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_DIRTY | _PAGE_ACCESSED | 
_PAGE_SPECIAL)
-
 #define H_PAGE_4K_PFN 0
 
 #ifndef __ASSEMBLY__
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 8adaacbbdd1d..c64a040f4a6a 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -13,6 +13,12 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, 
unsigned long addr, p
 #include 
 #endif
 
+/*
+ * _PAGE_CHG_MASK masks of bits that are to be preserved across
+ * pgprot changes.
+ */
+#define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_DIRTY | _PAGE_ACCESSED | 
_PAGE_SPECIAL)
+
 /* Permission masks used for kernel mappings */
 #define PAGE_KERNEL __pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
 #define PAGE_KERNEL_NC __pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | 
_PAGE_NO_CACHE)
-- 
2.41.0
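
For readers of the archive, here is a minimal user-space sketch of what "preserved across pgprot changes" means for _PAGE_CHG_MASK. It is not part of the patch. All DEMO_* bit values and the helper demo_pte_modify() are hypothetical and chosen only for illustration; the helper is modelled on the usual pte_modify() pattern of masking the old PTE and OR-ing in the new protection bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only (not the real nohash values). */
#define DEMO_PAGE_RW        0x001u
#define DEMO_PAGE_DIRTY     0x002u
#define DEMO_PAGE_ACCESSED  0x004u
#define DEMO_PAGE_SPECIAL   0x008u
#define DEMO_PAGE_EXEC      0x010u
#define DEMO_RPN_SHIFT      12
#define DEMO_RPN_MASK       (~((1u << DEMO_RPN_SHIFT) - 1))

/* Same shape as _PAGE_CHG_MASK: page number plus dirty/accessed/special. */
#define DEMO_CHG_MASK (DEMO_RPN_MASK | DEMO_PAGE_DIRTY | \
                       DEMO_PAGE_ACCESSED | DEMO_PAGE_SPECIAL)

/* Keep only the bits in the mask, then apply the new protection bits. */
static inline uint32_t demo_pte_modify(uint32_t pte, uint32_t newprot)
{
    return (pte & DEMO_CHG_MASK) | newprot;
}

/* Usage: RW is dropped, the page number and dirty/accessed survive. */
static inline void demo_chg_mask_check(void)
{
    uint32_t pte = 0x12345000u | DEMO_PAGE_RW | DEMO_PAGE_DIRTY |
                   DEMO_PAGE_ACCESSED;
    uint32_t out = demo_pte_modify(pte, DEMO_PAGE_EXEC);

    assert(out == (0x12345000u | DEMO_PAGE_DIRTY | DEMO_PAGE_ACCESSED |
                   DEMO_PAGE_EXEC));
}
```

Because the mask is built the same way on nohash/32 and nohash/64 (only PTE_RPN_MASK differs in width), a single definition in the common header is enough.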



[PATCH v2 17/37] powerpc/nohash: Deduplicate ptep_set_wrprotect() and ptep_get_and_clear()

2023-09-25 Thread Christophe Leroy
ptep_set_wrprotect() and ptep_get_and_clear() are identical for
nohash/32 and nohash/64.

Make them common.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 16 
 arch/powerpc/include/asm/nohash/64/pgtable.h | 15 ---
 arch/powerpc/include/asm/nohash/pgtable.h| 16 
 3 files changed, 16 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index b7605000bd91..0be464af4cb1 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -164,22 +164,6 @@ static inline void pmd_clear(pmd_t *pmdp)
*pmdp = __pmd(0);
 }
 
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
-  pte_t *ptep)
-{
-   return __pte(pte_update(mm, addr, ptep, ~0, 0, 0));
-}
-
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-#ifndef ptep_set_wrprotect
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep)
-{
-   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
-}
-#endif
-
 #ifndef __ptep_set_access_flags
 static inline void __ptep_set_access_flags(struct vm_area_struct *vma,
   pte_t *ptep, pte_t entry,
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 56041036fa34..dc6e35c3a53f 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -145,13 +145,6 @@ static inline void p4d_set(p4d_t *p4dp, unsigned long val)
*p4dp = __p4d(val);
 }
 
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep)
-{
-   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
-}
-
 #define __HAVE_ARCH_HUGE_PTEP_SET_WRPROTECT
 static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
   unsigned long addr, pte_t *ptep)
@@ -166,14 +159,6 @@ static inline void huge_ptep_set_wrprotect(struct 
mm_struct *mm,
__young;\
 })
 
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
-  unsigned long addr, pte_t *ptep)
-{
-   unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
-   return __pte(old);
-}
-
 static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
 pte_t * ptep)
 {
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 2b043b72f642..7e810a84ac15 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -80,6 +80,22 @@ static inline int ptep_test_and_clear_young(struct 
vm_area_struct *vma,
 }
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 
+#ifndef ptep_set_wrprotect
+static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep)
+{
+   pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+}
+#endif
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long 
addr,
+  pte_t *ptep)
+{
+   return __pte(pte_update(mm, addr, ptep, ~0UL, 0, 0));
+}
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+
 /* Generic accessors to PTE bits */
 #ifndef pte_mkwrite_novma
 static inline pte_t pte_mkwrite_novma(pte_t pte)
-- 
2.41.0
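
As an illustration for archive readers (not part of the patch), the sketch below shows why both helpers reduce to one call into pte_update(): the second argument names the bits to clear and the third the bits to set. The demo_* names, the DEMO_PAGE_RW value and the simplified signature (no mm, addr or huge arguments, and a 64-bit clr parameter) are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t demo_pte_t;        /* hypothetical stand-in for pte_basic_t */
#define DEMO_PAGE_RW 0x1ull         /* hypothetical bit value */

/* Clear the bits in clr, set the bits in set, return the previous value. */
static inline demo_pte_t demo_pte_update(demo_pte_t *p, demo_pte_t clr,
                                         demo_pte_t set)
{
    demo_pte_t old = *p;

    *p = (old & ~clr) | set;
    return old;
}

/* Write-protect: clear only the RW bit. */
static inline void demo_set_wrprotect(demo_pte_t *p)
{
    demo_pte_update(p, DEMO_PAGE_RW, 0);
}

/* Get-and-clear: clear every bit and hand back what was there. */
static inline demo_pte_t demo_get_and_clear(demo_pte_t *p)
{
    return demo_pte_update(p, ~(demo_pte_t)0, 0);
}
```

Since neither helper depends on the PTE width beyond what pte_update() already hides, nothing in them is specific to 32-bit or 64-bit.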



[PATCH v2 12/37] powerpc/nohash: Refactor pte_update()

2023-09-25 Thread Christophe Leroy
pte_update() is similar on nohash/32 and nohash/64.

Take the nohash/32 version which works on nohash/64 and add the debug
call to assert_pte_locked() which is only on nohash/64.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 33 ---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 17 
 arch/powerpc/include/asm/nohash/pgtable.h| 42 
 3 files changed, 42 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index a74476de1ef6..ae7f3c8afd4f 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -9,8 +9,6 @@
 #include 
 #include/* For sub-arch specific PPC_PIN_SIZE */
 
-extern int icache_44x_need_flush;
-
 #endif /* __ASSEMBLY__ */
 
 #define PTE_INDEX_SIZE PTE_SHIFT
@@ -203,37 +201,6 @@ static inline void pmd_clear(pmd_t *pmdp)
*pmdp = __pmd(0);
 }
 
-/*
- * PTE updates. This function is called whenever an existing
- * valid PTE is updated. This does -not- include set_pte_at()
- * which nowadays only sets a new PTE.
- *
- * Depending on the type of MMU, we may need to use atomic updates
- * and the PTE may be either 32 or 64 bit wide. In the later case,
- * when using atomic updates, only the low part of the PTE is
- * accessed atomically.
- *
- * In addition, on 44x, we also maintain a global flag indicating
- * that an executable user mapping was modified, which is needed
- * to properly flush the virtually tagged instruction cache of
- * those implementations.
- */
-#ifndef pte_update
-static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
-unsigned long clr, unsigned long set, int 
huge)
-{
-   pte_basic_t old = pte_val(*p);
-   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
-
-   *p = __pte(new);
-
-   if (IS_ENABLED(CONFIG_44x) && (old & _PAGE_USER) && (old & _PAGE_EXEC))
-   icache_44x_need_flush = 1;
-
-   return old;
-}
-#endif
-
 #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
  unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index f5a8e8a9dba4..b149a39f2685 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -171,23 +171,6 @@ static inline void p4d_set(p4d_t *p4dp, unsigned long val)
*p4dp = __p4d(val);
 }
 
-/* Atomic PTE updates */
-static inline unsigned long pte_update(struct mm_struct *mm,
-  unsigned long addr,
-  pte_t *ptep, unsigned long clr,
-  unsigned long set,
-  int huge)
-{
-   unsigned long old = pte_val(*ptep);
-   *ptep = __pte((old & ~clr) | set);
-
-   /* huge pages use the old page table lock */
-   if (!huge)
-   assert_pte_locked(mm, addr);
-
-   return old;
-}
-
 static inline int pte_young(pte_t pte)
 {
return pte_val(pte) & _PAGE_ACCESSED;
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 3d684b500fe6..bd5c3a4baabd 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -2,6 +2,11 @@
 #ifndef _ASM_POWERPC_NOHASH_PGTABLE_H
 #define _ASM_POWERPC_NOHASH_PGTABLE_H
 
+#ifndef __ASSEMBLY__
+static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
+unsigned long clr, unsigned long set, int 
huge);
+#endif
+
 #if defined(CONFIG_PPC64)
 #include 
 #else
@@ -18,6 +23,43 @@
 
 #ifndef __ASSEMBLY__
 
+extern int icache_44x_need_flush;
+
+/*
+ * PTE updates. This function is called whenever an existing
+ * valid PTE is updated. This does -not- include set_pte_at()
+ * which nowadays only sets a new PTE.
+ *
+ * Depending on the type of MMU, we may need to use atomic updates
+ * and the PTE may be either 32 or 64 bit wide. In the later case,
+ * when using atomic updates, only the low part of the PTE is
+ * accessed atomically.
+ *
+ * In addition, on 44x, we also maintain a global flag indicating
+ * that an executable user mapping was modified, which is needed
+ * to properly flush the virtually tagged instruction cache of
+ * those implementations.
+ */
+#ifndef pte_update
+static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, 
pte_t *p,
+unsigned long clr, unsigned long set, int 
huge)
+{
+   pte_basic_t old = pte_val(*p);
+   pte_basic_t new = (old & ~(pte_basic_t)clr) | set;
+
+   *p = __pte(new);
+
+   if (IS_ENABLED(CONFIG_44x) && (old & _PAGE_USER) && (old & _

[PATCH v2 05/37] powerpc: Deduplicate prototypes of ptep_set_access_flags() and phys_mem_access_prot()

2023-09-25 Thread Christophe Leroy
Prototypes of ptep_set_access_flags() and phys_mem_access_prot() are identical
for book3s and nohash.

Deduplicate them.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/book3s/pgtable.h |  9 -
 arch/powerpc/include/asm/nohash/pgtable.h | 10 --
 arch/powerpc/include/asm/pgtable.h| 10 ++
 3 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h 
b/arch/powerpc/include/asm/book3s/pgtable.h
index 3b7bd36a2321..6f4578daea6c 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -9,15 +9,6 @@
 #endif
 
 #ifndef __ASSEMBLY__
-#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long 
address,
-pte_t *ptep, pte_t entry, int dirty);
-
-struct file;
-extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
-unsigned long size, pgprot_t vma_prot);
-#define __HAVE_PHYS_MEM_ACCESS_PROT
-
 void __update_mmu_cache(struct vm_area_struct *vma, unsigned long address, 
pte_t *ptep);
 
 /*
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index c721478c5934..5b6647fb398b 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -207,11 +207,6 @@ static inline void __set_pte_at(struct mm_struct *mm, 
unsigned long addr,
mb();
 }
 
-
-#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long 
address,
-pte_t *ptep, pte_t entry, int dirty);
-
 /*
  * Macro to mark a page protection value as "uncacheable".
  */
@@ -240,11 +235,6 @@ extern int ptep_set_access_flags(struct vm_area_struct 
*vma, unsigned long addre
 
 #define pgprot_writecombine pgprot_noncached_wc
 
-struct file;
-extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
-unsigned long size, pgprot_t vma_prot);
-#define __HAVE_PHYS_MEM_ACCESS_PROT
-
 #ifdef CONFIG_HUGETLB_PAGE
 static inline int hugepd_ok(hugepd_t hpd)
 {
diff --git a/arch/powerpc/include/asm/pgtable.h 
b/arch/powerpc/include/asm/pgtable.h
index d0ee46de248e..bcdbdeda65d3 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -110,6 +110,16 @@ void mark_initmem_nx(void);
 static inline void mark_initmem_nx(void) { }
 #endif
 
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, pte_t entry, int dirty);
+
+struct file;
+pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+ unsigned long size, pgprot_t vma_prot);
+#define __HAVE_PHYS_MEM_ACCESS_PROT
+
+
 /*
  * When used, PTE_FRAG_NR is defined in subarch pgtable.h
  * so we are sure it is included when arriving here.
-- 
2.41.0



[PATCH v2 11/37] powerpc/nohash: Replace #ifdef CONFIG_44x by IS_ENABLED(CONFIG_44x) in pgtable.h

2023-09-25 Thread Christophe Leroy
There is no need for an #ifdef; use IS_ENABLED(CONFIG_44x) instead.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index be8bca42bdce..a74476de1ef6 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -9,9 +9,7 @@
 #include 
 #include/* For sub-arch specific PPC_PIN_SIZE */
 
-#ifdef CONFIG_44x
 extern int icache_44x_need_flush;
-#endif
 
 #endif /* __ASSEMBLY__ */
 
@@ -229,10 +227,9 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, 
unsigned long addr, p
 
*p = __pte(new);
 
-#ifdef CONFIG_44x
-   if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
+   if (IS_ENABLED(CONFIG_44x) && (old & _PAGE_USER) && (old & _PAGE_EXEC))
icache_44x_need_flush = 1;
-#endif
+
return old;
 }
 #endif
-- 
2.41.0
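
The following user-space sketch, which is not part of the patch, illustrates the pattern this change relies on: when the condition is a compile-time constant, the compiler still parses and type-checks the guarded code on every configuration and is free to drop it when the constant is 0. DEMO_CONFIG_44X is a hypothetical stand-in for IS_ENABLED(CONFIG_44x) (the kernel macro is implemented differently), and the other demo_* names and bit values are likewise assumptions.

```c
#include <assert.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_44x): a constant 0 or 1. */
#ifndef DEMO_CONFIG_44X
#define DEMO_CONFIG_44X 0
#endif

#define DEMO_PAGE_USER 0x1u         /* hypothetical bit values */
#define DEMO_PAGE_EXEC 0x2u

/* Declared unconditionally, mirroring the now-unguarded extern in the patch. */
static int demo_icache_need_flush;

static inline unsigned int demo_pte_update(unsigned int *p, unsigned int clr,
                                           unsigned int set)
{
    unsigned int old = *p;

    *p = (old & ~clr) | set;

    /* Built on every config; dead code when DEMO_CONFIG_44X is 0. */
    if (DEMO_CONFIG_44X && (old & DEMO_PAGE_USER) && (old & DEMO_PAGE_EXEC))
        demo_icache_need_flush = 1;

    return old;
}
```

One consequence, visible in the patch itself, is that the variable's declaration has to be available on all configurations, which is why the #ifdef around the extern is removed as well.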



[PATCH v2 00/37] Implement execute-only protection on powerpc

2023-09-25 Thread Christophe Leroy
This series reworks _PAGE_FLAGS on all platforms in order
to implement execute-only protection on all powerpc.

For all targets except 40x and 604 it will be a real execute-only
protection as the hardware and/or software allows a distinct protection.

For 40x and 604 it is a poor man's execute-only protection, in that
once the page is in the TLB it can be executed. But it is better than
nothing and allows a similar implementation on all sorts of powerpc.

Patches 1 and 2 are fixes that should also be backported to stable
versions.

Patches 3 to 7 are generic trivial cleanups.

Patches 8 to 19 are a cleanup of pgtable.h for nohash. Main purpose
is to refactor a lot of common code between nohash/32 and nohash/64.

Patches 20 to 37 do the real work on PAGE flags in order to
switch all platforms to _PAGE_READ and _PAGE_WRITE like book3s/64
today. Once that is done it is easy to implement execute-only
protection.

Patches 1 to 19 were already sent out as v1 of the series
named "cleanup/refactor pgtable.h". Problems reported by robots
are fixed here.

Christophe Leroy (37):
  powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE
  powerpc/64e: Fix wrong test in __ptep_test_and_clear_young()
  powerpc/40x: Remove stale PTE_ATOMIC_UPDATES macro
  powerpc: Remove pte_ERROR()
  powerpc: Deduplicate prototypes of ptep_set_access_flags() and
phys_mem_access_prot()
  powerpc: Refactor update_mmu_cache_range()
  powerpc: Untangle fixmap.h and pgtable.h and mmu.h
  powerpc/nohash: Remove {pte/pmd}_protnone()
  powerpc/nohash: Refactor declaration of {map/unmap}_kernel_page()
  powerpc/nohash: Move 8xx version of pte_update() into pte-8xx.h
  powerpc/nohash: Replace #ifdef CONFIG_44x by IS_ENABLED(CONFIG_44x) in
pgtable.h
  powerpc/nohash: Refactor pte_update()
  powerpc/nohash: Refactor checking of no-change in pte_update()
  powerpc/nohash: Deduplicate _PAGE_CHG_MASK
  powerpc/nohash: Deduplicate pte helpers
  powerpc/nohash: Refactor ptep_test_and_clear_young()
  powerpc/nohash: Deduplicate ptep_set_wrprotect() and
ptep_get_and_clear()
  powerpc/nohash: Refactor pte_clear()
  powerpc/nohash: Refactor __ptep_set_access_flags()
  powerpc/e500: Simplify pte_mkexec()
  powerpc: Implement and use pgprot_nx()
  powerpc: Fail ioremap() instead of silently ignoring flags when
PAGE_USER is set
  powerpc: Remove pte_mkuser() and pte_mkpriviledged()
  powerpc: Rely on address instead of pte_user()
  powerpc: Refactor permission masks used for __P/__S table and kernel
memory flags
  powerpc/8xx: Use generic permission masks
  powerpc/64s: Use generic permission masks
  powerpc/nohash: Add _PAGE_WRITE to supplement _PAGE_RW
  powerpc/nohash: Replace pte_user() by pte_read()
  powerpc/e500: Introduce _PAGE_READ and remove _PAGE_USER
  powerpc/44x: Introduce _PAGE_READ and remove _PAGE_USER
  powerpc/40x: Introduce _PAGE_READ and remove _PAGE_USER
  powerpc/32s: Add _PAGE_WRITE to supplement _PAGE_RW
  powerpc/32s: Introduce _PAGE_READ and remove _PAGE_USER
  powerpc/ptdump: Display _PAGE_READ and _PAGE_WRITE
  powerpc: Finally remove _PAGE_USER
  powerpc: Support execute-only on all powerpc

 arch/powerpc/include/asm/book3s/32/pgtable.h  |  83 +++
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  35 +--
 arch/powerpc/include/asm/book3s/pgtable.h |  33 ---
 arch/powerpc/include/asm/fixmap.h |  16 +-
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h  |   1 -
 arch/powerpc/include/asm/nohash/32/pgtable.h  | 201 +---
 arch/powerpc/include/asm/nohash/32/pte-40x.h  |  21 +-
 arch/powerpc/include/asm/nohash/32/pte-44x.h  |  20 +-
 arch/powerpc/include/asm/nohash/32/pte-85xx.h |  20 +-
 arch/powerpc/include/asm/nohash/32/pte-8xx.h  |  99 +---
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 120 +-
 arch/powerpc/include/asm/nohash/pgtable.h | 216 --
 arch/powerpc/include/asm/nohash/pte-e500.h|  41 +---
 arch/powerpc/include/asm/pgtable-masks.h  |  32 +++
 arch/powerpc/include/asm/pgtable.h|  35 +++
 arch/powerpc/kernel/head_40x.S|  19 +-
 arch/powerpc/kernel/head_44x.S|  40 ++--
 arch/powerpc/kernel/head_85xx.S   |  12 +-
 arch/powerpc/kernel/head_book3s_32.S  |  63 ++---
 arch/powerpc/mm/book3s32/hash_low.S   |  32 ++-
 arch/powerpc/mm/book3s32/mmu.c|   6 +-
 arch/powerpc/mm/book3s64/pgtable.c|  10 +-
 arch/powerpc/mm/fault.c   |   9 +-
 arch/powerpc/mm/init_32.c |   1 +
 arch/powerpc/mm/ioremap.c |   6 +-
 arch/powerpc/mm/mem.c |   1 +
 arch/powerpc/mm/nohash/40x.c  |  19 +-
 arch/powerpc/mm/nohash/8xx.c  |   2 +
 arch/powerpc/mm/nohash/book3e_pgtable.c   |   2 +-
 arch/powerpc/mm/nohash/e500.c |   6 +-
 arch/powerpc/mm/nohash/e500_hugetlbpage.c |   3 +-
 arch/powerpc/mm/pgtable.c |  26 +

[PATCH v2 02/37] powerpc/64e: Fix wrong test in __ptep_test_and_clear_young()

2023-09-25 Thread Christophe Leroy
Commit 45201c879469 ("powerpc/nohash: Remove hash related code from
nohash headers.") replaced:

  if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
return 0;

By:

  if (pte_young(*ptep))
return 0;

But it should be:

  if (!pte_young(*ptep))
return 0;

Fix it.

Fixes: 45201c879469 ("powerpc/nohash: Remove hash related code from nohash 
headers.")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/64/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 5cd9acf58a7d..eb6891e34cbd 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -197,7 +197,7 @@ static inline int __ptep_test_and_clear_young(struct 
mm_struct *mm,
 {
unsigned long old;
 
-   if (pte_young(*ptep))
+   if (!pte_young(*ptep))
return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
-- 
2.41.0
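
For archive readers, a small user-space sketch (not part of the patch) of the corrected logic. With the inverted test, a young PTE returned 0 early and a non-young PTE fell through to a no-op update that also returned 0, so the function could never report a young page. The demo_* names and the DEMO_PAGE_ACCESSED value are hypothetical.

```c
#include <assert.h>

#define DEMO_PAGE_ACCESSED 0x4ul    /* hypothetical bit value */

static inline int demo_pte_young(unsigned long pte)
{
    return (pte & DEMO_PAGE_ACCESSED) != 0;
}

/* Returns 1 if the PTE was young, clearing the accessed bit as it goes. */
static inline int demo_test_and_clear_young(unsigned long *ptep)
{
    unsigned long old;

    /* Nothing to clear: report "not young" without touching the PTE. */
    if (!demo_pte_young(*ptep))
        return 0;

    old = *ptep;
    *ptep = old & ~DEMO_PAGE_ACCESSED;
    return (old & DEMO_PAGE_ACCESSED) != 0;
}
```

Calling it twice on the same young PTE should report young the first time and not young the second.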



[PATCH v2 01/37] powerpc/8xx: Fix pte_access_permitted() for PAGE_NONE

2023-09-25 Thread Christophe Leroy
On 8xx, PAGE_NONE is handled by setting _PAGE_NA instead of clearing
_PAGE_USER.

But then pte_user() returns 1 also for PAGE_NONE.

As _PAGE_NA prevents reads, add a specific version of pte_read()
that returns 0 when _PAGE_NA is set instead of always returning 1.

Fixes: 351750331fc1 ("powerpc/mm: Introduce _PAGE_NA")
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h | 7 +++
 arch/powerpc/include/asm/nohash/pgtable.h| 2 ++
 2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h 
b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index 21f681ee535a..e6fe1d5731f2 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -94,6 +94,13 @@ static inline pte_t pte_wrprotect(pte_t pte)
 
 #define pte_wrprotect pte_wrprotect
 
+static inline int pte_read(pte_t pte)
+{
+   return (pte_val(pte) & _PAGE_RO) != _PAGE_NA;
+}
+
+#define pte_read pte_read
+
 static inline int pte_write(pte_t pte)
 {
return !(pte_val(pte) & _PAGE_RO);
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h 
b/arch/powerpc/include/asm/nohash/pgtable.h
index 56ea48276356..c721478c5934 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -25,7 +25,9 @@ static inline int pte_write(pte_t pte)
return pte_val(pte) & _PAGE_RW;
 }
 #endif
+#ifndef pte_read
 static inline int pte_read(pte_t pte)  { return 1; }
+#endif
 static inline int pte_dirty(pte_t pte) { return pte_val(pte) & 
_PAGE_DIRTY; }
 static inline int pte_special(pte_t pte)   { return pte_val(pte) & 
_PAGE_SPECIAL; }
 static inline int pte_none(pte_t pte)  { return (pte_val(pte) & 
~_PTE_NONE_MASK) == 0; }
-- 
2.41.0
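
A user-space sketch of the comparison used in the new pte_read(), added here for illustration and not part of the patch. The DEMO_PAGE_NA and DEMO_PAGE_RO values are assumptions, chosen so that the NA bits are a subset of the RO bits, which is the property the test (pte & _PAGE_RO) != _PAGE_NA relies on.

```c
#include <assert.h>

/* Assumed encodings for illustration: NA is a subset of the RO bit field. */
#define DEMO_PAGE_NA 0x0200u
#define DEMO_PAGE_RO 0x0600u

/* Readable unless the protection field encodes exactly "no access". */
static inline int demo_pte_read(unsigned int pte)
{
    return (pte & DEMO_PAGE_RO) != DEMO_PAGE_NA;
}

/* Writable only when no bit of the RO field is set. */
static inline int demo_pte_write(unsigned int pte)
{
    return !(pte & DEMO_PAGE_RO);
}
```

Under these assumptions a read-write PTE and a read-only PTE both pass demo_pte_read(), while a PTE carrying only the NA encoding fails both checks, which is the behaviour wanted for PAGE_NONE.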



Re: [RFC PATCH v12 11/33] KVM: Introduce per-page memory attributes

2023-09-25 Thread Sean Christopherson
On Thu, Sep 21, 2023, Yan Zhao wrote:
> On Wed, Sep 20, 2023 at 02:00:22PM -0700, Sean Christopherson wrote:
> > On Fri, Sep 15, 2023, Yan Zhao wrote:
> > > On Wed, Sep 13, 2023 at 06:55:09PM -0700, Sean Christopherson wrote:
> > > > +/* Set @attributes for the gfn range [@start, @end). */
> > > > +static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, 
> > > > gfn_t end,
> > > > +unsigned long attributes)
> > > > +{
> > > > +   struct kvm_mmu_notifier_range pre_set_range = {
> > > > +   .start = start,
> > > > +   .end = end,
> > > > +   .handler = kvm_arch_pre_set_memory_attributes,
> > > > +   .on_lock = kvm_mmu_invalidate_begin,
> > > > +   .flush_on_ret = true,
> > > > +   .may_block = true,
> > > > +   };
> > > > +   struct kvm_mmu_notifier_range post_set_range = {
> > > > +   .start = start,
> > > > +   .end = end,
> > > > +   .arg.attributes = attributes,
> > > > +   .handler = kvm_arch_post_set_memory_attributes,
> > > > +   .on_lock = kvm_mmu_invalidate_end,
> > > > +   .may_block = true,
> > > > +   };
> > > > +   unsigned long i;
> > > > +   void *entry;
> > > > +   int r = 0;
> > > > +
> > > > +   entry = attributes ? xa_mk_value(attributes) : NULL;
> > > Also here, do we need to get existing attributes of a GFN first ?
> > 
> > No?  @entry is the new value that will be set for all entries.  This line 
> > doesn't
> > touch the xarray in any way.  Maybe I'm just not understanding your 
> > question.
> Hmm, I thought this interface was to allow users to add/remove an attribute 
> to a GFN
> rather than overwrite all attributes of a GFN. Now I think I misunderstood 
> the intention.
> 
> But I wonder if there is a way for users to just add one attribute, as I 
> don't find
> ioctl like KVM_GET_MEMORY_ATTRIBUTES for users to get current attributes and 
> then to
> add/remove one based on that. e.g. maybe in future, KVM wants to add one 
> attribute in
> kernel without being told by userspace ?

The plan is that memory attributes will be 100% userspace driven, i.e. that KVM
will never add its own attributes.  That's why there is (currently) no
KVM_GET_MEMORY_ATTRIBUTES, the intended usage model is that userspace is fully
responsible for managing attributes, and so should never need to query
information that it already knows.  If there's a compelling case for getting
attributes then we could certainly add such an ioctl(), but I hope we never
need to add a GET because that likely means we've made mistakes along the way.

Giving userspace full control of attributes allows for a simpler uAPI, e.g. if
userspace doesn't have full control, then setting or clearing bits requires a
RMW operation, which means creating a more complex ioctl().  That's why it's a
straight SET operation and not an OR type operation.
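
To make the SET-versus-RMW distinction concrete for archive readers, here is a minimal user-space sketch. It is not KVM code: the array-backed store, the demo_* helpers and the DEMO_ATTR_* bits are all hypothetical. The point it tries to show is that with SET semantics the "kernel" side just overwrites, and any read-modify-write happens in userspace's own bookkeeping, so no GET is needed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_GFNS      8
#define DEMO_ATTR_PRIVATE (1ul << 3)    /* hypothetical attribute bits */
#define DEMO_ATTR_OTHER   (1ul << 0)

/* Stand-in for the kernel-side store (an xarray in the quoted patch). */
static unsigned long demo_attrs[DEMO_NR_GFNS];

/* Userspace's own record of what it has asked for. */
static unsigned long demo_user_view[DEMO_NR_GFNS];

/* SET semantics: overwrite the attributes of every gfn in [start, end). */
static inline void demo_set_attrs(size_t start, size_t end,
                                  unsigned long attributes)
{
    for (size_t i = start; i < end && i < DEMO_NR_GFNS; i++)
        demo_attrs[i] = attributes;
}

/* "Add one attribute": the RMW is done on userspace's copy, then SET. */
static inline void demo_user_add_attr(size_t gfn, unsigned long attr)
{
    demo_user_view[gfn] |= attr;
    demo_set_attrs(gfn, gfn + 1, demo_user_view[gfn]);
}
```

In this model the setter never has to read the existing entry, which matches the observation in the thread that computing @entry does not touch the xarray.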


[PATCH] powerpc/85xx: Fix math emulation exception

2023-09-25 Thread Christophe Leroy
Booting mpc85xx_defconfig kernel on QEMU leads to:

Bad trap at PC: fe9bab0, SR: 2d000, vector=800
awk[82]: unhandled trap (5) at 0 nip fe9bab0 lr fe9e01c code 5 in 
libc-2.27.so[fe5a000+17a000]
awk[82]: code: 3aa0 3a800010 4bffe03c 9421fff0 7ca62b78 38a0 93c10008 
83c10008
awk[82]: code: 38210010 4bffdec8 9421ffc0 7c0802a6  d8010008 4815190d 
93810030
Trace/breakpoint trap
WARNING: no useful console

This is because although CONFIG_MATH_EMULATION is selected,
exception 800 calls unknown_exception().

Call emulation_assist_interrupt() instead.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_85xx.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/head_85xx.S b/arch/powerpc/kernel/head_85xx.S
index 97e9ea0c7297..0f1641a31250 100644
--- a/arch/powerpc/kernel/head_85xx.S
+++ b/arch/powerpc/kernel/head_85xx.S
@@ -395,7 +395,7 @@ interrupt_base:
 #ifdef CONFIG_PPC_FPU
FP_UNAVAILABLE_EXCEPTION
 #else
-   EXCEPTION(0x0800, FP_UNAVAIL, FloatingPointUnavailable, 
unknown_exception)
+   EXCEPTION(0x0800, FP_UNAVAIL, FloatingPointUnavailable, 
emulation_assist_interrupt)
 #endif
 
/* System Call Interrupt */
-- 
2.41.0



Re: [PATCH v3 3/3] doc: trusted-encrypted: add DCP as new trust source

2023-09-25 Thread Jarkko Sakkinen
On Mon Sep 18, 2023 at 5:18 PM EEST, David Gstir wrote:
> Update the documentation for trusted and encrypted KEYS with DCP as new
> trust source:
>
> - Describe security properties of DCP trust source
> - Describe key usage
> - Document blob format
>
> Co-developed-by: Richard Weinberger 
> Signed-off-by: Richard Weinberger 
> Co-developed-by: David Oberhollenzer 
> Signed-off-by: David Oberhollenzer 
> Signed-off-by: David Gstir 
> ---
>  .../security/keys/trusted-encrypted.rst   | 85 +++
>  1 file changed, 85 insertions(+)
>
> diff --git a/Documentation/security/keys/trusted-encrypted.rst 
> b/Documentation/security/keys/trusted-encrypted.rst
> index 9bc9db8ec651..4452070afbe9 100644
> --- a/Documentation/security/keys/trusted-encrypted.rst
> +++ b/Documentation/security/keys/trusted-encrypted.rst
> @@ -42,6 +42,14 @@ safe.
>   randomly generated and fused into each SoC at manufacturing time.
>   Otherwise, a common fixed test key is used instead.
>  
> + (4) DCP (Data Co-Processor: crypto accelerator of various i.MX SoCs)
> +
> + Rooted to a one-time programmable key (OTP) that is generally burnt
> + in the on-chip fuses and is accessible to the DCP encryption engine 
> only.
> + DCP provides two keys that can be used as root of trust: the OTP key
> + and the UNIQUE key. Default is to use the UNIQUE key, but selecting
> + the OTP key can be done via a module parameter (dcp_use_otp_key).
> +
>*  Execution isolation
>  
>   (1) TPM
> @@ -57,6 +65,12 @@ safe.
>  
>   Fixed set of operations running in isolated execution environment.
>  
> + (4) DCP
> +
> + Fixed set of cryptographic operations running in isolated execution
> + environment. Only basic blob key encryption is executed there.
> + The actual key sealing/unsealing is done on main processor/kernel 
> space.
> +
>* Optional binding to platform integrity state
>  
>   (1) TPM
> @@ -79,6 +93,11 @@ safe.
>   Relies on the High Assurance Boot (HAB) mechanism of NXP SoCs
>   for platform integrity.
>  
> + (4) DCP
> +
> + Relies on Secure/Trusted boot process (called HAB by vendor) for
> + platform integrity.
> +
>*  Interfaces and APIs
>  
>   (1) TPM
> @@ -94,6 +113,11 @@ safe.
>  
>   Interface is specific to silicon vendor.
>  
> + (4) DCP
> +
> + Vendor-specific API that is implemented as part of the DCP crypto 
> driver in
> + ``drivers/crypto/mxs-dcp.c``.
> +
>*  Threat model
>  
>   The strength and appropriateness of a particular trust source for a 
> given
> @@ -129,6 +153,13 @@ selected trust source:
>   CAAM HWRNG, enable CRYPTO_DEV_FSL_CAAM_RNG_API and ensure the device
>   is probed.
>  
> +  *  DCP (Data Co-Processor: crypto accelerator of various i.MX SoCs)
> +
> + The DCP hardware device itself does not provide a dedicated RNG 
> interface,
> + so the kernel default RNG is used. SoCs with DCP like the i.MX6ULL do 
> have
> + a dedicated hardware RNG that is independent from DCP which can be 
> enabled
> + to back the kernel RNG.
> +
>  Users may override this by specifying ``trusted.rng=kernel`` on the kernel
>  command-line to override the used RNG with the kernel's random number pool.
>  
> @@ -231,6 +262,19 @@ Usage::
>  CAAM-specific format.  The key length for new keys is always in bytes.
>  Trusted Keys can be 32 - 128 bytes (256 - 1024 bits).
>  
> +Trusted Keys usage: DCP
> +---
> +
> +Usage::
> +
> +keyctl add trusted name "new keylen" ring
> +keyctl add trusted name "load hex_blob" ring
> +keyctl print keyid
> +
> +"keyctl print" returns an ASCII hex copy of the sealed key, which is in 
> format
> +specific to this DCP key-blob implementation.  The key length for new keys is
> +always in bytes. Trusted Keys can be 32 - 128 bytes (256 - 1024 bits).
> +
>  Encrypted Keys usage
>  
>  
> @@ -426,3 +470,44 @@ string length.
>  privkey is the binary representation of TPM2B_PUBLIC excluding the
>  initial TPM2B header which can be reconstructed from the ASN.1 octed
>  string length.
> +
> +DCP Blob Format
> +---
> +
> +The Data Co-Processor (DCP) provides hardware-bound AES keys using its
> +AES encryption engine only. It does not provide direct key sealing/unsealing.
> +To make DCP hardware encryption keys usable as trust source, we define
> +our own custom format that uses a hardware-bound key to secure the sealing
> +key stored in the key blob.
> +
> +Whenever a new trusted key using DCP is generated, we generate a random 
> 128-bit
> +blob encryption key (BEK) and 128-bit nonce. The BEK and nonce are used to
> +encrypt the trusted key payload using AES-128-GCM.

"When a new trusted key using DCP is created, a random 128-bit
blob encryption key (BEK) and 128-bit nonce are generated."

... or something along those lines.

BR, Jarkko


Re: [PATCH v3 2/3] KEYS: trusted: Introduce support for NXP DCP-based trusted keys

2023-09-25 Thread Jarkko Sakkinen
On Mon Sep 18, 2023 at 5:18 PM EEST, David Gstir wrote:
> DCP (Data Co-Processor) is the little brother of NXP's CAAM IP.
>
> Beside of accelerated crypto operations, it also offers support for
> hardware-bound keys. Using this feature it is possible to implement a blob
> mechanism just like CAAM offers. Unlike on CAAM, constructing and
> parsing the blob has to happen in software.
>
> We chose the following format for the blob:

Who is we?

And there is no choosing anything if the below structure is hardware
defined (not software defined):

> /*
>  * struct dcp_blob_fmt - DCP BLOB format.
>  *
>  * @fmt_version: Format version, currently being %1
>  * @blob_key: Random AES 128 key which is used to encrypt @payload,
>  *@blob_key itself is encrypted with OTP or UNIQUE device key in
>  *AES-128-ECB mode by DCP.
>  * @nonce: Random nonce used for @payload encryption.
>  * @payload_len: Length of the plain text @payload.
>  * @payload: The payload itself, encrypted using AES-128-GCM and @blob_key,
>  *   GCM auth tag of size AES_BLOCK_SIZE is attached at the end of it.
>  *
>  * The total size of a DCP BLOB is sizeof(struct dcp_blob_fmt) + @payload_len +
>  * AES_BLOCK_SIZE.
>  */
> struct dcp_blob_fmt {
>   __u8 fmt_version;
>   __u8 blob_key[AES_KEYSIZE_128];
>   __u8 nonce[AES_KEYSIZE_128];
>   __le32 payload_len;
>   __u8 payload[];
> } __packed;
>
> @payload is the key provided by trusted_key_ops->seal().
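
To make the size rule in the quoted comment concrete, here is a minimal
userspace sketch of the arithmetic. It only mirrors the quoted layout:
uint32_t stands in for __le32, the two size macros are redefined locally,
and dcp_blob_total_len() is a hypothetical helper name, not something taken
from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel constants used in the quoted struct. */
#define AES_KEYSIZE_128 16
#define AES_BLOCK_SIZE  16

/* Userspace mirror of the quoted layout; uint32_t replaces __le32. */
struct dcp_blob_fmt {
	uint8_t  fmt_version;
	uint8_t  blob_key[AES_KEYSIZE_128];
	uint8_t  nonce[AES_KEYSIZE_128];
	uint32_t payload_len;
	uint8_t  payload[];	/* ciphertext followed by the GCM auth tag */
} __attribute__((packed));

/*
 * Hypothetical helper: total blob size for a given plain text payload
 * length, i.e. header + encrypted payload + one AES block for the tag.
 */
static size_t dcp_blob_total_len(size_t payload_len)
{
	return sizeof(struct dcp_blob_fmt) + payload_len + AES_BLOCK_SIZE;
}
```

With the packed attribute the header is 1 + 16 + 16 + 4 = 37 bytes, so a
32-byte payload would give a blob of 37 + 32 + 16 = 85 bytes.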
>
> By default the UNIQUE device key is used, it is also possible to use
> the OTP key. While the UNIQUE device key should be unique it is not
> entirely clear whether this is the case due to unclear documentation.
> If someone wants to be sure they can burn their own unique key
> into the OTP fuse and set the use_otp_key module parameter.
>
> Co-developed-by: Richard Weinberger 
> Signed-off-by: Richard Weinberger 
> Co-developed-by: David Oberhollenzer 
> Signed-off-by: David Oberhollenzer 
> Signed-off-by: David Gstir 
> ---
>  .../admin-guide/kernel-parameters.txt |  13 +

Separate commit for this.

>  MAINTAINERS   |   9 +

Ditto (i.e. total two additional patches).

>  include/keys/trusted_dcp.h|  11 +
>  security/keys/trusted-keys/Kconfig|   9 +-
>  security/keys/trusted-keys/Makefile   |   2 +
>  security/keys/trusted-keys/trusted_core.c |   6 +-
>  security/keys/trusted-keys/trusted_dcp.c  | 311 ++
>  7 files changed, 359 insertions(+), 2 deletions(-)
>  create mode 100644 include/keys/trusted_dcp.h
>  create mode 100644 security/keys/trusted-keys/trusted_dcp.c
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 0a1731a0f0ef..c11eda8b38e0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6566,6 +6566,7 @@
>   - "tpm"
>   - "tee"
>   - "caam"
> + - "dcp"
>   If not specified then it defaults to iterating through
>   the trust source list starting with TPM and assigns the
>   first trust source as a backend which is initialized
> @@ -6581,6 +6582,18 @@
>   If not specified, "default" is used. In this case,
>   the RNG's choice is left to each individual trust 
> source.
>  
> + trusted.dcp_use_otp_key
> + This is intended to be used in combination with
> + trusted.source=dcp and will select the DCP OTP key
> + instead of the DCP UNIQUE key blob encryption.
> +
> + trusted.dcp_skip_zk_test
> + This is intended to be used in combination with
> + trusted.source=dcp and will disable the check if all
> + the blob key is zero'ed. This is helpful for situations where
> + having this key zero'ed is acceptable. E.g. in testing
> + scenarios.
> +
>   tsc=Disable clocksource stability checks for TSC.
>   Format: 
>   [x86] reliable: mark tsc clocksource as reliable, this
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 90f13281d297..988d01226131 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -11647,6 +11647,15 @@ S:   Maintained
>  F:   include/keys/trusted_caam.h
>  F:   security/keys/trusted-keys/trusted_caam.c
>  
> +KEYS-TRUSTED-DCP
> +M:   David Gstir 
> +R:   sigma star Kernel Team 
> +L:   linux-integr...@vger.kernel.org
> +L:   keyri...@vger.kernel.org
> +S:   Supported
> +F:   include/keys/trusted_dcp.h
> +F:   security/keys/trusted-keys/trusted_dcp.c
> +
>  KEYS-TRUSTED-TEE
>  M:   Sumit Garg 
>  L:   linux-integr...@vger.kernel.org
> diff --git a/include/keys/trusted_dcp.h b/include/keys/tr

Re: [PATCH v3 1/3] crypto: mxs-dcp: Add support for hardware provided keys

2023-09-25 Thread Jarkko Sakkinen
On Mon Sep 18, 2023 at 5:18 PM EEST, David Gstir wrote:
> DCP is capable to performing AES with hardware-bound keys.
> These keys are not stored in main memory and are therefore not directly
> accessible by the operating system.
>
> So instead of feeding the key into DCP, we need to place a
> reference to such a key before initiating the crypto operation.
> Keys are referenced by a one byte identifiers.

Not sure what the action of feeding a key into DCP even means if such an
action does not exist.

What you probably want to describe here is how keys get created
and how they are referenced by the kernel.

For the "use" part, please try to avoid academic-paper-style long
expressions starting with the pronoun "we".

So the above paragraph would normalize into "The keys inside DCP
are referenced by a one-byte identifier". Here it would of course be
nice to know, for context, what this set of DCP keys is. E.g.
are there 256 keys in total, or some subset?

When using too much prose there can be surprisingly little digestible
information, thus this nitpicking.

> DCP supports 6 different keys: 4 slots in the secure memory area,
> a one time programmable key which can be burnt via on-chip fuses
> and an unique device key.
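
A small sketch of what "six keys behind a one-byte reference" could look
like. The enumerator names and, in particular, the numeric values below are
assumptions made for illustration; the quoted text only states that there
are four slots, one OTP key and one unique device key.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-byte key references; the values are assumed. */
enum example_dcp_key_ref {
	EXAMPLE_DCP_KEY_SLOT0  = 0x00,	/* four slots in secure memory */
	EXAMPLE_DCP_KEY_SLOT1  = 0x01,
	EXAMPLE_DCP_KEY_SLOT2  = 0x02,
	EXAMPLE_DCP_KEY_SLOT3  = 0x03,
	EXAMPLE_DCP_KEY_UNIQUE = 0xfe,	/* unique device key */
	EXAMPLE_DCP_KEY_OTP    = 0xff,	/* key burnt into on-chip fuses */
};

/* True if the byte names one of the six keys, false for anything else. */
static bool example_dcp_key_ref_is_valid(uint8_t ref)
{
	return ref <= EXAMPLE_DCP_KEY_SLOT3 ||
	       ref == EXAMPLE_DCP_KEY_UNIQUE ||
	       ref == EXAMPLE_DCP_KEY_OTP;
}

/* Count how many of the 256 possible byte values are valid references. */
static unsigned int example_dcp_count_valid_refs(void)
{
	unsigned int ref, count = 0;

	for (ref = 0; ref <= 0xff; ref++)
		if (example_dcp_key_ref_is_valid((uint8_t)ref))
			count++;
	return count;
}
```

Under these assumptions only a small subset of the 256 possible one-byte
values refers to a key, so a setkey-style entry point would need to reject
the rest.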
>
> Using these keys is restricted to in-kernel users that use them as building
> block for other crypto tools such as trusted keys. Allowing userspace
> (e.g. via AF_ALG) to use these keys to crypt or decrypt data is a security
> risk, because there is no access control mechanism.

Unless this patch is about anything other than trusted keys, this should not
be an open-ended sentence. You want to say roughly that DCP hardware
keys are implemented for the sake of implementing trusted keys support,
and exactly and only that.

This description also lacks the actions taken by the code changes below,
which is really the beef of any commit description.

BR, Jarkko


Re: [PATCH v6 08/30] dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC

2023-09-25 Thread Herve Codina
On Mon, 25 Sep 2023 12:44:35 +0200
Krzysztof Kozlowski  wrote:

> On 25/09/2023 12:27, Herve Codina wrote:
> > On Mon, 25 Sep 2023 10:21:15 +0200
> > Krzysztof Kozlowski  wrote:
> >   
> >> On 25/09/2023 10:17, Herve Codina wrote:  
> >>> Hi Krzysztof,
> >>>
> >>> On Sat, 23 Sep 2023 19:39:49 +0200
> >>> Krzysztof Kozlowski  wrote:
> >>> 
> >>>> On 22/09/2023 09:58, Herve Codina wrote:
> >>>>> The QMC (QUICC mutichannel controller) is a controller present in some
> >>>>> PowerQUICC SoC such as MPC885.
> >>>>> The QMC HDLC uses the QMC controller to transfer HDLC data.
> >>>>>
> >>>>> Additionally, a framer can be connected to the QMC HDLC.
> >>>>> If present, this framer is the interface between the TDM bus used by the
> >>>>> QMC HDLC and the E1/T1 line.
> >>>>> The QMC HDLC can use this framer to get information about the E1/T1 line
> >>>>> and configure the E1/T1 line.
> >>>>>
> >>>>> Signed-off-by: Herve Codina 
> >>>>> ---
> >>>>>  .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 24 +++
> >>>>>  1 file changed, 24 insertions(+)
> >>>>>
> >>>>> diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>>>> index 82d9beb48e00..61dfd5ef7407 100644
> >>>>> --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>>>> +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>>>> @@ -101,6 +101,27 @@ patternProperties:
> >>>>>Channel assigned Rx time-slots within the Rx time-slots routed by the
> >>>>>TSA to this cell.
> >>>>>  
> >>>>> +  compatible:
> >>>>> +const: fsl,qmc-hdlc
> >>>>
> >>>> Why this is not a device/SoC specific compatible?
> >>>
> >>> This compatible is present in a QMC channel.
> >>> The parent node (the QMC itself) contains a compatible with device/SoC:
> >>> --- 8< ---
> >>>   compatible:
> >>> items:
> >>>   - enum:
> >>>   - fsl,mpc885-scc-qmc
> >>>   - fsl,mpc866-scc-qmc
> >>>   - const: fsl,cpm1-scc-qmc
> >>> --- 8< ---
> >>>
> >>> At the child level (ie QMC channel), I am not sure that adding device/SoC
> >>> makes sense. This compatible indicates that the QMC channel is handled by
> >>> the QMC HDLC driver.
> >>> At this level, whatever the device/SoC, we have to be QMC compliant.
> >>>
> >>> With these details, do you still think I need to change the child (channel)
> >>> compatible ?
> >>
> >> From OS point of view, you have a driver binding to this child-level
> >> compatible. How do you enforce Linux driver binding based on parent
> >> compatible? I looked at your next patch and I did not see it.  
> > 
> > We do not need to have the child driver binding based on parent.  
> 
> Exactly, that's what I said.
> 
> > We have to ensure that the child handles a QMC channel and the parent provides
> > a QMC channel.
> > 
> > A QMC controller (parent) has to implement the QMC API (include/soc/fsl/qe/qmc.h)
> > and a QMC channel driver (child) has to use the QMC API.  
> 
> How does this solve my concerns? Sorry, I do not understand. Your driver
> is a platform driver and binds to the generic compatible. How do you
> solve regular compatibility issues (need for quirks) if parent
> compatible is not used?
> 
> How does being QMC compliant affects driver binding and
> compatibility/quirks?
> 
> We are back to my original question and I don't think you answered to
> any of the concerns.

Well, to be sure that I understand correctly, do you mean that I should
provide a compatible for the child (HDLC) with something like this:
--- 8< ---
  compatible:
items:
  - enum:
  - fsl,mpc885-qmc-hdlc
  - fsl,mpc866-qmc-hdlc
  - const: fsl,cpm1-qmc-hdlc
  - const: fsl,qmc-hdlc
--- 8< ---

If so, I didn't do that because a QMC channel consumer (the driver matching
fsl,qmc-hdlc) doesn't contain any SoC-specific part.
It uses the channel as a communication channel to send/receive HDLC frames.
All the SoC-specific handling is done by the QMC controller (parent) itself and
not by any consumer (child).
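
To make the trade-off concrete, here is a simplified model of compatible
matching. It is a sketch only, not the kernel's of_match_node(): the helper
first_compatible_match() and the three example tables are hypothetical, and
the node's compatible list is assumed to be ordered most specific first.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the first entry of the node's compatible list (most specific
 * first) that also appears in the driver's match table, or NULL.
 */
static const char *first_compatible_match(const char *const *node_compat,
					   size_t n_node,
					   const char *const *drv_table,
					   size_t n_drv)
{
	size_t i, j;

	for (i = 0; i < n_node; i++)
		for (j = 0; j < n_drv; j++)
			if (strcmp(node_compat[i], drv_table[j]) == 0)
				return node_compat[i];
	return NULL;
}

/* Example node carrying SoC-specific entries plus the generic fallback. */
static const char *const example_node[] = {
	"fsl,mpc885-qmc-hdlc", "fsl,cpm1-qmc-hdlc", "fsl,qmc-hdlc",
};

/* A driver that only knows the generic compatible. */
static const char *const example_generic_drv[] = { "fsl,qmc-hdlc" };

/* A later driver revision that needs an MPC885-specific quirk. */
static const char *const example_quirk_drv[] = {
	"fsl,mpc885-qmc-hdlc", "fsl,qmc-hdlc",
};
```

With the generic-only table the node still matches through the fallback
"fsl,qmc-hdlc". If a quirk turns up later, the driver table can gain the
SoC-specific entry and match on it without any change to existing device
trees.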

Best regards,
Hervé


Re: [PATCH 00/40] soc: Convert to platform remove callback returning void

2023-09-25 Thread Konrad Dybcio
On 25.09.2023 11:54, Uwe Kleine-König wrote:
> Hello,
> 
> this series converts all platform drivers below drivers/soc to use
> .remove_new(). The motivation is to get rid of an integer return code
> that is (mostly) ignored by the platform driver core and error prone on
> the driver side.
> 
> See commit 5c5a7680e67b ("platform: Provide a remove callback that
> returns no value") for an extended explanation and the eventual goal.
>
Acked-by: Konrad Dybcio  # qcom

Konrad


Re: [PATCH v2 1/2] ASoC: dt-bindings: fsl_rpmsg: List DAPM endpoints ignoring system suspend

2023-09-25 Thread Mark Brown
On Mon, Sep 25, 2023 at 07:09:45PM +0800, Chancel Liu wrote:

> +  fsl,lpa-widgets:
> +$ref: /schemas/types.yaml#/definitions/non-unique-string-array
> +description: |
> +  A list of DAPM endpoints which mark paths between these endpoints should
> +  not be disabled when system enters in suspend state. LPA means low power
> +  audio case. On asymmetric multiprocessor, there are Cortex-A core and
> +  Cortex-M core, Linux is running on Cortex-A core, RTOS or other OS is
> +  running on Cortex-M core. The audio hardware devices can be controlled by
> +  Cortex-M. LPA can be explained as a mechanism that Cortex-A allocates a
> +  large buffer and fill audio data, then Cortex-A can enter into suspend
> +  for the purpose of power saving. Cortex-M continues to play the sound
> +  during suspend phase of Cortex-A. When the data in buffer is consumed,
> +  Cortex-M will trigger the Cortex-A to wakeup to fill data. LPA requires
> +  some audio paths still enabled when Cortex-A enters into suspend.

This is a fairly standard DSP playback case as far as I can see, so it
should work with DAPM without needing this obviously use-case-specific
stuff peering into the Linux implementation.  Generally this is done by
tagging endpoint widgets and DAIs as ignore_suspend; DAPM will then
figure out the rest of the widgets in the path.




Re: [PATCH 3/8] iommu/vt-d: Use ops->blocked_domain

2023-09-25 Thread Jason Gunthorpe
On Mon, Sep 25, 2023 at 10:29:52AM +0800, Baolu Lu wrote:
> On 9/23/23 1:07 AM, Jason Gunthorpe wrote:
> > Trivially migrate to the ops->blocked_domain for the existing global
> > static.
> > 
> > Signed-off-by: Jason Gunthorpe
> > ---
> >   drivers/iommu/intel/iommu.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> Reviewed-by: Lu Baolu 
> 
> P.S. We can further do the same thing to the identity domain. I will
> clean it up after all patches are landed.

I looked at that, and it is not trivial.

Both the Intel and virtio-iommu drivers create an "identity" domain
out of a paging domain and pass that off as a true "identity"
domain. So neither can set the global static since the determination
is made at runtime.

What I was thinking about doing is consolidating that code so that the
core logic is the thing turning a paging domain into an identity
domain.

Jason


[PATCH v2 1/2] ASoC: dt-bindings: fsl_rpmsg: List DAPM endpoints ignoring system suspend

2023-09-25 Thread Chancel Liu
Add a property listing DAPM endpoints. Paths between these endpoints
must not be disabled when the system enters suspend.

LPA stands for low power audio. On an asymmetric multiprocessor with
Cortex-A and Cortex-M cores, Linux runs on the Cortex-A core and an RTOS
or other OS runs on the Cortex-M core. The audio hardware devices can be
controlled by the Cortex-M core. In LPA, the Cortex-A core allocates a
large buffer, fills it with audio data, and can then suspend to save
power. The Cortex-M core continues playback while the Cortex-A core is
suspended. When the data in the buffer has been consumed, the Cortex-M
core wakes the Cortex-A core to refill it. LPA therefore requires some
audio paths to stay enabled while the Cortex-A core is suspended.

Signed-off-by: Chancel Liu 
---
 .../devicetree/bindings/sound/fsl,rpmsg.yaml  | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml 
b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
index 188f38baddec..d8fd17615bf2 100644
--- a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
+++ b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
@@ -91,6 +91,21 @@ properties:
   - rpmsg-audio-channel
   - rpmsg-micfil-channel
 
+  fsl,lpa-widgets:
+$ref: /schemas/types.yaml#/definitions/non-unique-string-array
+description: |
+  A list of DAPM endpoints which mark paths between these endpoints should
+  not be disabled when system enters in suspend state. LPA means low power
+  audio case. On asymmetric multiprocessor, there are Cortex-A core and
+  Cortex-M core, Linux is running on Cortex-A core, RTOS or other OS is
+  running on Cortex-M core. The audio hardware devices can be controlled by
+  Cortex-M. LPA can be explained as a mechanism that Cortex-A allocates a
+  large buffer and fill audio data, then Cortex-A can enter into suspend
+  for the purpose of power saving. Cortex-M continues to play the sound
+  during suspend phase of Cortex-A. When the data in buffer is consumed,
+  Cortex-M will trigger the Cortex-A to wakeup to fill data. LPA requires
+  some audio paths still enabled when Cortex-A enters into suspend.
+
 required:
   - compatible
 
-- 
2.25.1



[PATCH v2 2/2] ASoC: imx-rpmsg: Force codec power on in low power audio mode

2023-09-25 Thread Chancel Liu
Low power audio mode requires the bound codec to stay powered on while the
Cortex-A core is suspended, so that the Cortex-M core can continue playback.

The ASoC machine driver obtains the DAPM endpoints by reading the
"fsl,lpa-widgets" property from the DT and then marks the paths between
these endpoints to ignore suspend.

If the rpmsg sound card is in low power audio mode, the suspend/resume
callbacks of the bound codec are overridden to disable suspend/resume.

Signed-off-by: Chancel Liu 
---
 sound/soc/fsl/imx-rpmsg.c | 58 +++
 1 file changed, 58 insertions(+)

diff --git a/sound/soc/fsl/imx-rpmsg.c b/sound/soc/fsl/imx-rpmsg.c
index b578f9a32d7f..0568a3420aae 100644
--- a/sound/soc/fsl/imx-rpmsg.c
+++ b/sound/soc/fsl/imx-rpmsg.c
@@ -20,8 +20,11 @@ struct imx_rpmsg {
struct snd_soc_dai_link dai;
struct snd_soc_card card;
unsigned long sysclk;
+   bool lpa;
 };
 
+static struct dev_pm_ops lpa_pm;
+
 static const struct snd_soc_dapm_widget imx_rpmsg_dapm_widgets[] = {
SND_SOC_DAPM_HP("Headphone Jack", NULL),
SND_SOC_DAPM_SPK("Ext Spk", NULL),
@@ -38,6 +41,58 @@ static int imx_rpmsg_late_probe(struct snd_soc_card *card)
struct device *dev = card->dev;
int ret;
 
+   if (data->lpa) {
+   struct snd_soc_component *codec_comp;
+   struct device_node *codec_np;
+   struct device_driver *codec_drv;
+   struct device *codec_dev = NULL;
+
+   codec_np = data->dai.codecs->of_node;
+   if (codec_np) {
+   struct platform_device *codec_pdev;
+   struct i2c_client *codec_i2c;
+
+   codec_i2c = of_find_i2c_device_by_node(codec_np);
+   if (codec_i2c)
+   codec_dev = &codec_i2c->dev;
+   if (!codec_dev) {
+   codec_pdev = of_find_device_by_node(codec_np);
+   if (codec_pdev)
+   codec_dev = &codec_pdev->dev;
+   }
+   }
+   if (codec_dev) {
+   codec_comp = snd_soc_lookup_component_nolocked(codec_dev, NULL);
+   if (codec_comp) {
+   int i, num_widgets;
+   const char *widgets;
+   struct snd_soc_dapm_context *dapm;
+
+   num_widgets = of_property_count_strings(data->card.dev->of_node,
+   "fsl,lpa-widgets");
+   for (i = 0; i < num_widgets; i++) {
+   of_property_read_string_index(data->card.dev->of_node,
+ "fsl,lpa-widgets",
+ i, &widgets);
+   dapm = snd_soc_component_get_dapm(codec_comp);
+   snd_soc_dapm_ignore_suspend(dapm, widgets);
+   }
+   }
+   codec_drv = codec_dev->driver;
+   if (codec_drv->pm) {
+   memcpy(&lpa_pm, codec_drv->pm, sizeof(lpa_pm));
+   lpa_pm.suspend = NULL;
+   lpa_pm.resume = NULL;
+   lpa_pm.freeze = NULL;
+   lpa_pm.thaw = NULL;
+   lpa_pm.poweroff = NULL;
+   lpa_pm.restore = NULL;
+   codec_drv->pm = &lpa_pm;
+   }
+   put_device(codec_dev);
+   }
+   }
+
if (!data->sysclk)
return 0;
 
@@ -137,6 +192,9 @@ static int imx_rpmsg_probe(struct platform_device *pdev)
goto fail;
}
 
+   if (of_property_read_bool(np, "fsl,enable-lpa"))
+   data->lpa = true;
+
data->card.dev = &pdev->dev;
data->card.owner = THIS_MODULE;
data->card.dapm_widgets = imx_rpmsg_dapm_widgets;
-- 
2.25.1



Re: [PATCH v6 08/30] dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC

2023-09-25 Thread Krzysztof Kozlowski
On 25/09/2023 12:27, Herve Codina wrote:
> On Mon, 25 Sep 2023 10:21:15 +0200
> Krzysztof Kozlowski  wrote:
> 
>> On 25/09/2023 10:17, Herve Codina wrote:
>>> Hi Krzysztof,
>>>
>>> On Sat, 23 Sep 2023 19:39:49 +0200
>>> Krzysztof Kozlowski  wrote:
>>>   
>>>> On 22/09/2023 09:58, Herve Codina wrote:
>>>>> The QMC (QUICC mutichannel controller) is a controller present in some
>>>>> PowerQUICC SoC such as MPC885.
>>>>> The QMC HDLC uses the QMC controller to transfer HDLC data.
>>>>>
>>>>> Additionally, a framer can be connected to the QMC HDLC.
>>>>> If present, this framer is the interface between the TDM bus used by the
>>>>> QMC HDLC and the E1/T1 line.
>>>>> The QMC HDLC can use this framer to get information about the E1/T1 line
>>>>> and configure the E1/T1 line.
>>>>>
>>>>> Signed-off-by: Herve Codina 
>>>>> ---
>>>>>  .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 24 +++
>>>>>  1 file changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>>>> index 82d9beb48e00..61dfd5ef7407 100644
>>>>> --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>>>> +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>>>> @@ -101,6 +101,27 @@ patternProperties:
>>>>>Channel assigned Rx time-slots within the Rx time-slots routed by the
>>>>>TSA to this cell.
>>>>>  
>>>>> +  compatible:
>>>>> +const: fsl,qmc-hdlc
>>>>
>>>> Why this is not a device/SoC specific compatible?
>>>
>>> This compatible is present in a QMC channel.
>>> The parent node (the QMC itself) contains a compatible with device/SoC:
>>> --- 8< ---
>>>   compatible:
>>> items:
>>>   - enum:
>>>   - fsl,mpc885-scc-qmc
>>>   - fsl,mpc866-scc-qmc
>>>   - const: fsl,cpm1-scc-qmc
>>> --- 8< ---
>>>
>>> At the child level (ie QMC channel), I am not sure that adding device/SoC
>>> makes sense. This compatible indicates that the QMC channel is handled by
>>> the QMC HDLC driver.
>>> At this level, whatever the device/SoC, we have to be QMC compliant.
>>>
>>> With these details, do you still think I need to change the child (channel)
>>> compatible ?  
>>
>> From OS point of view, you have a driver binding to this child-level
>> compatible. How do you enforce Linux driver binding based on parent
>> compatible? I looked at your next patch and I did not see it.
> 
> We do not need to have the child driver binding based on parent.

Exactly, that's what I said.

> We have to ensure that the child handles a QMC channel and the parent provides
> a QMC channel.
> 
> A QMC controller (parent) has to implement the QMC API (include/soc/fsl/qe/qmc.h)
> and a QMC channel driver (child) has to use the QMC API.

How does this solve my concerns? Sorry, I do not understand. Your driver
is a platform driver and binds to the generic compatible. How do you
solve regular compatibility issues (need for quirks) if parent
compatible is not used?

How does being QMC compliant affect driver binding and
compatibility/quirks?

We are back to my original question and I don't think you answered
any of the concerns.

Best regards,
Krzysztof



Re: [PATCH V4 2/2] tools/perf/tests: Fix object code reading to skip address that falls out of text section

2023-09-25 Thread kajoljain
Patch looks good to me.

Reviewed-by: Kajol Jain 

Thanks,
Kajol Jain

On 9/15/23 11:07, Athira Rajeev wrote:
> The testcase "Object code reading" fails in somecases
> for "fs_something" sub test as below:
> 
> Reading object code for memory address: 0xc00807f0142c
> File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
> On file address is: 0x1114cc
> Objdump command is: objdump -z -d --start-address=0x11142c 
> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
> objdump read too few bytes: 128
> test child finished with -1
> 
> This can alo be reproduced when running perf record with
> workload that exercises fs_something() code. In the test
> setup, this is exercising xfs code since root is xfs.
> 
> # perf record ./a.out
> # perf report -v |grep "xfs.ko"
>   0.76% a.out /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
> 0xc00807de5efc B [k] xlog_cil_commit
>   0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
> 0xc00807d5ae18 B [k] xfs_btree_key_offset
>   0.74% a.out  /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko  
> 0xc00807e11fd4 B [k] 0x00112074
> 
> Here addr "0xc00807e11fd4" is not resolved. since this is a
> kernel module, its offset is from the DSO. Xfs module is loaded
> at 0xc00807d0
> 
># cat /proc/modules | grep xfs
> xfs 2228224 3 - Live 0xc00807d0
> 
> And size is 0x22. So its loaded between  0xc00807d0
> and 0xc00807f2. From objdump, text section is:
> text 0010f7bc    00a0 2**4
> 
> Hence perf captured ip maps to 0x112074 which is:
> ( ip - start of module ) + a0
> 
> This offset 0x112074 falls out .text section which is up to 0x10f7bc
> In this case for module, the address 0xc00807e11fd4 is pointing
> to stub instructions. This address range represents the module stubs
> which is allocated on module load and hence is not part of DSO offset.
> 
> To address this issue in "object code reading", skip the sample if
> address falls out of text section and is within the module end.
> Use the "text_end" member of "struct dso" to do this check.
> 
> To address this issue in "perf report", exploring an option of
> having stubs range as part of the /proc/kallsyms, so that perf
> report can resolve addresses in stubs range
> 
> However this patch uses text_end to skip the stub range for
> Object code reading testcase.
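
The check described above can be worked through with the numbers quoted in
this message. This is a minimal userspace sketch, assuming that 0xa0 is the
file offset of .text and 0x10f7bc its size as shown in the objdump line; the
two helper names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* End of .text as a file offset: sh_offset + sh_size, as in patch 1/2. */
static uint64_t example_text_end(uint64_t sh_offset, uint64_t sh_size)
{
	return sh_offset + sh_size;
}

/*
 * True if a DSO file offset lies beyond the end of .text. For a kernel
 * module this is where the sample would be skipped as a stub address.
 */
static bool example_past_text_end(uint64_t file_off, uint64_t text_end)
{
	return file_off > text_end;
}
```

With sh_offset = 0xa0 and sh_size = 0x10f7bc, text_end is 0x10f85c. The
unresolved sample at file offset 0x112074 lies beyond that, so the test
would skip it rather than hand it to objdump.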
> 
> Reported-by: Disha Goel 
> Signed-off-by: Athira Rajeev 
> Tested-by: Disha Goel
> Reviewed-by: Adrian Hunter 
> ---
> Changelog:
>  v3 -> v4:
>  Fixed indent in V3
> 
>  v2 -> v3:
>  Used strtailcmp in comparison for module check and added Reviewed-by
>  from Adrian, Tested-by from Disha.
> 
>  v1 -> v2:
>  Updated comment to add description on which arch has stub and
>  reason for skipping as suggested by Adrian
> 
>  tools/perf/tests/code-reading.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
> index ed3815163d1b..9e6e6c985840 100644
> --- a/tools/perf/tests/code-reading.c
> +++ b/tools/perf/tests/code-reading.c
> @@ -269,6 +269,16 @@ static int read_object_code(u64 addr, size_t len, u8 
> cpumode,
>   if (addr + len > map__end(al.map))
>   len = map__end(al.map) - addr;
>  
> + /*
> +  * Some architectures (ex: powerpc) have stubs (trampolines) in kernel
> +  * modules to manage long jumps. Check if the ip offset falls in stubs
> +  * sections for kernel modules. And skip module address after text end
> +  */
> + if (!strtailcmp(dso->long_name, ".ko") && al.addr > dso->text_end) {
> + pr_debug("skipping the module address %#"PRIx64" after text 
> end\n", al.addr);
> + goto out;
> + }
> +
>   /* Read the object code using perf */
>   ret_len = dso__data_read_offset(dso, 
> maps__machine(thread__maps(thread)),
>   al.addr, buf1, len);


Re: [PATCH V4 1/2] tools/perf: Add text_end to "struct dso" to save .text section size

2023-09-25 Thread kajoljain
Patch looks good to me.

Reviewed-by: Kajol Jain 

Thanks,
Kajol Jain

On 9/15/23 11:07, Athira Rajeev wrote:
> Update "struct dso" to include new member "text_end".
> This new field will represent the offset for end of text
> section for a dso. For elf, this value is derived as:
> sh_size (Size of section in byes) + sh_offset (Section file
> offst) of the elf header for text.
> 
> For bfd, this value is derived as:
> 1. For PE file,
> section->size + ( section->vma - dso->text_offset)
> 2. Other cases:
> section->filepos (file position) + section->size (size of
> section)
> 
> To resolve the address from a sample, perf looks at the
> DSO maps. In case of address from a kernel module, there
> were some address found to be not resolved. This was
> observed while running perf test for "Object code reading".
> Though the ip falls beteen the start address of the loaded
> module (perf map->start ) and end address ( perf map->end),
> it was unresolved.
> 
> Example:
> 
> Reading object code for memory address: 0xc00807f0142c
> File is: /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
> On file address is: 0x1114cc
> Objdump command is: objdump -z -d --start-address=0x11142c 
> --stop-address=0x1114ac /lib/modules/6.5.0-rc3+/kernel/fs/xfs/xfs.ko
> objdump read too few bytes: 128
> test child finished with -1
> 
> Here, module is loaded at:
> # cat /proc/modules | grep xfs
> xfs 2228224 3 - Live 0xc00807d0
> 
> From objdump for xfs module, text section is:
> text 0010f7bc    00a0 2**4
> 
> Here the offset for 0xc00807f0142c ie  0x112074 falls out
> .text section which is up to 0x10f7bc.
> 
> In this case for module, the address 0xc00807e11fd4 is pointing
> to stub instructions. This address range represents the module stubs
> which is allocated on module load and hence is not part of DSO offset.
> 
> To identify such  address, which falls out of text
> section and within module end, added the new field "text_end" to
> "struct dso".
> 
> Reported-by: Disha Goel 
> Signed-off-by: Athira Rajeev 
> Reviewed-by: Adrian Hunter 
> ---
> Changelog:
> v2 -> v3:
>  Added Reviewed-by from Adrian
> 
>  v1 -> v2:
>  Added text_end for bfd also by updating dso__load_bfd_symbols
>  as suggested by Adrian.
> 
>  tools/perf/util/dso.h| 1 +
>  tools/perf/util/symbol-elf.c | 4 +++-
>  tools/perf/util/symbol.c | 2 ++
>  3 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> index b41c9782c754..70fe0fe69bef 100644
> --- a/tools/perf/util/dso.h
> +++ b/tools/perf/util/dso.h
> @@ -181,6 +181,7 @@ struct dso {
>   u8   rel;
>   struct build_id  bid;
>   u64  text_offset;
> + u64  text_end;
>   const char   *short_name;
>   const char   *long_name;
>   u16  long_name_len;
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 95e99c332d7e..9e7eeaf616b8 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1514,8 +1514,10 @@ dso__load_sym_internal(struct dso *dso, struct map 
> *map, struct symsrc *syms_ss,
>   }
>  
>   if (elf_section_by_name(runtime_ss->elf, &runtime_ss->ehdr, &tshdr,
> - ".text", NULL))
> + ".text", NULL)) {
>   dso->text_offset = tshdr.sh_addr - tshdr.sh_offset;
> + dso->text_end = tshdr.sh_offset + tshdr.sh_size;
> + }
>  
>   if (runtime_ss->opdsec)
>   opddata = elf_rawdata(runtime_ss->opdsec, NULL);
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 3f36675b7c8f..f25e4e62cf25 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -1733,8 +1733,10 @@ int dso__load_bfd_symbols(struct dso *dso, const char 
> *debugfile)
>   /* PE symbols can only have 4 bytes, so use .text high 
> bits */
>   dso->text_offset = section->vma - (u32)section->vma;
>   dso->text_offset += (u32)bfd_asymbol_value(symbols[i]);
> + dso->text_end = (section->vma - dso->text_offset) + 
> section->size;
>   } else {
>   dso->text_offset = section->vma - section->filepos;
> + dso->text_end = section->filepos + section->size;
>   }
>   }
>  


Re: [PATCH v6 08/30] dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC

2023-09-25 Thread Herve Codina
On Mon, 25 Sep 2023 10:21:15 +0200
Krzysztof Kozlowski  wrote:

> On 25/09/2023 10:17, Herve Codina wrote:
> > Hi Krzysztof,
> > 
> > On Sat, 23 Sep 2023 19:39:49 +0200
> > Krzysztof Kozlowski  wrote:
> >   
> >> On 22/09/2023 09:58, Herve Codina wrote:  
> >>> The QMC (QUICC mutichannel controller) is a controller present in some
> >>> PowerQUICC SoC such as MPC885.
> >>> The QMC HDLC uses the QMC controller to transfer HDLC data.
> >>>
> >>> Additionally, a framer can be connected to the QMC HDLC.
> >>> If present, this framer is the interface between the TDM bus used by the
> >>> QMC HDLC and the E1/T1 line.
> >>> The QMC HDLC can use this framer to get information about the E1/T1 line
> >>> and configure the E1/T1 line.
> >>>
> >>> Signed-off-by: Herve Codina 
> >>> ---
> >>>  .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 24 +++
> >>>  1 file changed, 24 insertions(+)
> >>>
> >>> diff --git 
> >>> a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml 
> >>> b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>> index 82d9beb48e00..61dfd5ef7407 100644
> >>> --- 
> >>> a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>> +++ 
> >>> b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> >>> @@ -101,6 +101,27 @@ patternProperties:
> >>>Channel assigned Rx time-slots within the Rx time-slots routed 
> >>> by the
> >>>TSA to this cell.
> >>>  
> >>> +  compatible:
> >>> +const: fsl,qmc-hdlc
> >>
> >> Why this is not a device/SoC specific compatible?  
> > 
> > This compatible is present in a QMC channel.
> > The parent node (the QMC itself) contains a compatible with device/SoC:
> > --- 8< ---
> >   compatible:
> > items:
> >   - enum:
> >   - fsl,mpc885-scc-qmc
> >   - fsl,mpc866-scc-qmc
> >   - const: fsl,cpm1-scc-qmc
> > --- 8< ---
> > 
> > At the child level (ie QMC channel), I am not sure that adding device/SoC
> > makes sense. This compatible indicates that the QMC channel is handled by
> > the QMC HDLC driver.
> > At this level, whatever the device/SoC, we have to be QMC compliant.
> > 
> > With these details, do you still think I need to change the child (channel)
> > compatible ?  
> 
> From OS point of view, you have a driver binding to this child-level
> compatible. How do you enforce Linux driver binding based on parent
> compatible? I looked at your next patch and I did not see it.

We do not need to have the child driver binding based on parent.
We have to ensure that the child handles a QMC channel and the parent provides
a QMC channel.

A QMC controller (parent) has to implement the QMC API (include/soc/fsl/qe/qmc.h)
and a QMC channel driver (child) has to use the QMC API.

Best regards,
Hervé

> 
> Best regards,
> Krzysztof
> 


[PATCH 00/40] soc: Convert to platform remove callback returning void

2023-09-25 Thread Uwe Kleine-König
Hello,

this series converts all platform drivers below drivers/soc to use
.remove_new(). The motivation is to get rid of an integer return code
that is (mostly) ignored by the platform driver core and error prone on
the driver side.

See commit 5c5a7680e67b ("platform: Provide a remove callback that
returns no value") for an extended explanation and the eventual goal.
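
To illustrate the point about the ignored return code, here is a minimal
userspace sketch in C. It is not kernel code: struct mock_device, struct
mock_driver and mock_core_unbind() are hypothetical stand-ins that only
mimic the behaviour described above (a non-zero return value produces a
warning and the device is unbound anyway).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_device {
	int resources_held;
};

struct mock_driver {
	int  (*remove)(struct mock_device *dev);     /* legacy, returns int */
	void (*remove_new)(struct mock_device *dev); /* converted, returns void */
};

/*
 * Models the core side: the device goes away whatever the callback says.
 * Returns 1 if a warning was emitted, 0 otherwise.
 */
static int mock_core_unbind(const struct mock_driver *drv,
			    struct mock_device *dev)
{
	int warned = 0;

	if (drv->remove_new) {
		drv->remove_new(dev);
	} else if (drv->remove) {
		int ret = drv->remove(dev);

		if (ret) {
			fprintf(stderr, "remove returned %d (ignored)\n", ret);
			warned = 1;
		}
	}
	return warned;
}

/* Error-prone legacy style: bails out early and leaks its resources. */
static int legacy_remove(struct mock_device *dev)
{
	if (dev->resources_held > 1)
		return -16; /* meant as -EBUSY, but nobody acts on it */
	dev->resources_held = 0;
	return 0;
}

/* Converted style: no return value, so cleanup has to be unconditional. */
static void void_remove(struct mock_device *dev)
{
	dev->resources_held = 0;
}

static void demo_remove(void)
{
	struct mock_driver legacy = { .remove = legacy_remove };
	struct mock_driver converted = { .remove_new = void_remove };
	struct mock_device a = { .resources_held = 2 };
	struct mock_device b = { .resources_held = 2 };

	/* The legacy callback "fails", yet the unbind still happens: leak. */
	assert(mock_core_unbind(&legacy, &a) == 1);
	assert(a.resources_held == 2);

	/* The void variant cannot signal an error and simply cleans up. */
	assert(mock_core_unbind(&converted, &b) == 0);
	assert(b.resources_held == 0);
}
```

The sketch only shows why an early "return error" in a remove callback
tends to end in a resource leak; the real conversion is the mechanical
change made in the individual patches.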

As there is no single maintainer team for drivers/soc, I suggest that the
individual maintainers pick up "their" patches. There are no
interdependencies between the patches, so that should work fine. As
there are still quite a few drivers to convert in areas other than
drivers/soc, I'm happy about every patch that makes it in and there is
no need for further coordination.  So even if there is a merge conflict
with one patch by the time you apply it, or a subject prefix is
suboptimal, please don't let negative feedback for other patches stop
you (unless it applies to "your" patches, too, of course).

Best regards and thanks for considering,
Uwe

Uwe Kleine-König (40):
  soc/aspeed: aspeed-lpc-ctrl: Convert to platform remove callback
returning void
  soc/aspeed: aspeed-lpc-snoop: Convert to platform remove callback
returning void
  soc/aspeed: aspeed-p2a-ctrl: Convert to platform remove callback
returning void
  soc/aspeed: aspeed-uart-routing: Convert to platform remove callback
returning void
  soc/fsl: dpaa2-console: Convert to platform remove callback returning
void
  soc/fsl: cpm: qmc: Convert to platform remove callback returning void
  soc/fsl: cpm: tsa: Convert to platform remove callback returning void
  soc/fujitsu: a64fx-diag: Convert to platform remove callback returning
void
  soc/hisilicon: kunpeng_hccs: Convert to platform remove callback
returning void
  soc/ixp4xx: ixp4xx-npe: Convert to platform remove callback returning
void
  soc/ixp4xx: ixp4xx-qmgr: Convert to platform remove callback returning
void
  soc/litex: litex_soc_ctrl: Convert to platform remove callback
returning void
  soc/loongson: loongson2_guts: Convert to platform remove callback
returning void
  soc/mediatek: mtk-devapc: Convert to platform remove callback
returning void
  soc/mediatek: mtk-mmsys: Convert to platform remove callback returning
void
  soc/microchip: mpfs-sys-controller: Convert to platform remove
callback returning void
  soc/pxa: ssp: Convert to platform remove callback returning void
  soc/qcom: icc-bwmon: Convert to platform remove callback returning
void
  soc/qcom: llcc-qcom: Convert to platform remove callback returning
void
  soc/qcom: ocmem: Convert to platform remove callback returning void
  soc/qcom: pmic_glink: Convert to platform remove callback returning
void
  soc/qcom: qcom_aoss: Convert to platform remove callback returning
void
  soc/qcom: qcom_gsbi: Convert to platform remove callback returning
void
  soc/qcom: qcom_stats: Convert to platform remove callback returning
void
  soc/qcom: rmtfs_mem: Convert to platform remove callback returning
void
  soc/qcom: smem: Convert to platform remove callback returning void
  soc/qcom: smp2p: Convert to platform remove callback returning void
  soc/qcom: smsm: Convert to platform remove callback returning void
  soc/qcom: socinfo: Convert to platform remove callback returning void
  soc/rockchip: io-domain: Convert to platform remove callback returning
void
  soc/samsung: exynos-chipid: Convert to platform remove callback
returning void
  soc/tegra: cbb: tegra194-cbb: Convert to platform remove callback
returning void
  soc/ti: k3-ringacc: Convert to platform remove callback returning void
  soc/ti: knav_dma: Convert to platform remove callback returning void
  soc/ti: knav_qmss_queue: Convert to platform remove callback returning
void
  soc/ti: pm33xx: Convert to platform remove callback returning void
  soc/ti: pruss: Convert to platform remove callback returning void
  soc/ti: smartreflex: Convert to platform remove callback returning
void
  soc/ti: wkup_m3_ipc: Convert to platform remove callback returning
void
  soc/xilinx: zynqmp_power: Convert to platform remove callback
returning void

 drivers/soc/aspeed/aspeed-lpc-ctrl.c| 6 ++
 drivers/soc/aspeed/aspeed-lpc-snoop.c   | 6 ++
 drivers/soc/aspeed/aspeed-p2a-ctrl.c| 6 ++
 drivers/soc/aspeed/aspeed-uart-routing.c| 6 ++
 drivers/soc/fsl/dpaa2-console.c | 6 ++
 drivers/soc/fsl/qe/qmc.c| 6 ++
 drivers/soc/fsl/qe/tsa.c| 5 ++---
 drivers/soc/fujitsu/a64fx-diag.c| 6 ++
 drivers/soc/hisilicon/kunpeng_hccs.c| 6 ++
 drivers/soc/ixp4xx/ixp4xx-npe.c | 6 ++
 drivers/soc/ixp4xx/ixp4xx-qmgr.c| 5 ++---
 drivers/soc/litex/litex_soc_ctrl.c  | 5 ++---
 drivers/soc/loongson/loongson2_guts.c   | 6 ++
 drivers/soc/mediatek/mtk-devapc.c   | 6 ++
 drivers/soc/mediatek/

[PATCH 06/40] soc/fsl: cpm: qmc: Convert to platform remove callback returning void

2023-09-25 Thread Uwe Kleine-König
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König 
---
 drivers/soc/fsl/qe/qmc.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c
index b3c292c9a14e..92ec76c03965 100644
--- a/drivers/soc/fsl/qe/qmc.c
+++ b/drivers/soc/fsl/qe/qmc.c
@@ -1415,7 +1415,7 @@ static int qmc_probe(struct platform_device *pdev)
return ret;
 }
 
-static int qmc_remove(struct platform_device *pdev)
+static void qmc_remove(struct platform_device *pdev)
 {
struct qmc *qmc = platform_get_drvdata(pdev);
 
@@ -1427,8 +1427,6 @@ static int qmc_remove(struct platform_device *pdev)
 
/* Disconnect the serial from TSA */
tsa_serial_disconnect(qmc->tsa_serial);
-
-   return 0;
 }
 
 static const struct of_device_id qmc_id_table[] = {
@@ -1443,7 +1441,7 @@ static struct platform_driver qmc_driver = {
.of_match_table = of_match_ptr(qmc_id_table),
},
.probe = qmc_probe,
-   .remove = qmc_remove,
+   .remove_new = qmc_remove,
 };
 module_platform_driver(qmc_driver);
 
-- 
2.40.1



[PATCH 05/40] soc/fsl: dpaa2-console: Convert to platform remove callback returning void

2023-09-25 Thread Uwe Kleine-König
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König 
---
 drivers/soc/fsl/dpaa2-console.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/soc/fsl/dpaa2-console.c b/drivers/soc/fsl/dpaa2-console.c
index 1dca693b6b38..6dbc77db7718 100644
--- a/drivers/soc/fsl/dpaa2-console.c
+++ b/drivers/soc/fsl/dpaa2-console.c
@@ -300,12 +300,10 @@ static int dpaa2_console_probe(struct platform_device *pdev)
return error;
 }
 
-static int dpaa2_console_remove(struct platform_device *pdev)
+static void dpaa2_console_remove(struct platform_device *pdev)
 {
misc_deregister(&dpaa2_mc_console_dev);
misc_deregister(&dpaa2_aiop_console_dev);
-
-   return 0;
 }
 
 static const struct of_device_id dpaa2_console_match_table[] = {
@@ -322,7 +320,7 @@ static struct platform_driver dpaa2_console_driver = {
   .of_match_table = dpaa2_console_match_table,
   },
.probe = dpaa2_console_probe,
-   .remove = dpaa2_console_remove,
+   .remove_new = dpaa2_console_remove,
 };
 module_platform_driver(dpaa2_console_driver);
 
-- 
2.40.1



[PATCH 07/40] soc/fsl: cpm: tsa: Convert to platform remove callback returning void

2023-09-25 Thread Uwe Kleine-König
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new() which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König 
---
 drivers/soc/fsl/qe/tsa.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/soc/fsl/qe/tsa.c b/drivers/soc/fsl/qe/tsa.c
index 3646153117b3..3f9981335590 100644
--- a/drivers/soc/fsl/qe/tsa.c
+++ b/drivers/soc/fsl/qe/tsa.c
@@ -706,7 +706,7 @@ static int tsa_probe(struct platform_device *pdev)
return 0;
 }
 
-static int tsa_remove(struct platform_device *pdev)
+static void tsa_remove(struct platform_device *pdev)
 {
struct tsa *tsa = platform_get_drvdata(pdev);
int i;
@@ -729,7 +729,6 @@ static int tsa_remove(struct platform_device *pdev)
clk_put(tsa->tdm[i].l1rclk_clk);
}
}
-   return 0;
 }
 
 static const struct of_device_id tsa_id_table[] = {
@@ -744,7 +743,7 @@ static struct platform_driver tsa_driver = {
.of_match_table = of_match_ptr(tsa_id_table),
},
.probe = tsa_probe,
-   .remove = tsa_remove,
+   .remove_new = tsa_remove,
 };
 module_platform_driver(tsa_driver);
 
-- 
2.40.1



Re: [PATCH v8 00/24] iommu: Make default_domain's mandatory

2023-09-25 Thread Joerg Roedel
On Wed, Sep 13, 2023 at 10:43:33AM -0300, Jason Gunthorpe wrote:
> Jason Gunthorpe (24):
>   iommu: Add iommu_ops->identity_domain
>   iommu: Add IOMMU_DOMAIN_PLATFORM
>   powerpc/iommu: Setup a default domain and remove set_platform_dma_ops
>   iommu: Add IOMMU_DOMAIN_PLATFORM for S390
>   iommu/fsl_pamu: Implement a PLATFORM domain
>   iommu/tegra-gart: Remove tegra-gart
>   iommu/mtk_iommu_v1: Implement an IDENTITY domain
>   iommu: Reorganize iommu_get_default_domain_type() to respect
> def_domain_type()
>   iommu: Allow an IDENTITY domain as the default_domain in ARM32
>   iommu/exynos: Implement an IDENTITY domain
>   iommu/tegra-smmu: Implement an IDENTITY domain
>   iommu/tegra-smmu: Support DMA domains in tegra
>   iommu/omap: Implement an IDENTITY domain
>   iommu/msm: Implement an IDENTITY domain
>   iommu: Remove ops->set_platform_dma_ops()
>   iommu/qcom_iommu: Add an IOMMU_IDENTITIY_DOMAIN
>   iommu/ipmmu: Add an IOMMU_IDENTITIY_DOMAIN
>   iommu/mtk_iommu: Add an IOMMU_IDENTITIY_DOMAIN
>   iommu/sun50i: Add an IOMMU_IDENTITIY_DOMAIN
>   iommu: Require a default_domain for all iommu drivers
>   iommu: Add __iommu_group_domain_alloc()
>   iommu: Add ops->domain_alloc_paging()
>   iommu: Convert simple drivers with DOMAIN_DMA to domain_alloc_paging()
>   iommu: Convert remaining simple drivers to domain_alloc_paging()

Applied, thanks.


Re: [PATCH 1/2] ASoC: dt-bindings: fsl_rpmsg: List DAPM endpoints ignoring suspend

2023-09-25 Thread Krzysztof Kozlowski
On 25/09/2023 10:20, Chancel Liu wrote:
>>> Add a property to list DAPM endpoints which mark paths between these
> >>> endpoints ignoring suspend. These DAPM paths can still be powered on
>>> when system enters into suspend.
>>>
>>> Signed-off-by: Chancel Liu 
>>> ---
>>>  Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
>>> b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
>>> index 188f38baddec..ec6e09eab427 100644
>>> --- a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
>>> +++ b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
>>> @@ -91,6 +91,12 @@ properties:
>>>- rpmsg-audio-channel
>>>- rpmsg-micfil-channel
>>>
>>> +  fsl,lpa-widgets:
>>
>> What is LPA? It's not explained in property description.
>>
> 
> On an asymmetric multiprocessor system there are a Cortex-A core and a
> Cortex-M core. Linux runs on the Cortex-A core, and an RTOS or other OS runs
> on the Cortex-M core. The audio hardware devices can be controlled by the
> Cortex-M core. LPA means the low power audio case. The mechanism works as
> follows: the Cortex-A core allocates a large buffer and fills it with audio
> data, and can then enter suspend to save power. The Cortex-M core continues
> to play the sound while the Cortex-A core is suspended. When the data in the
> buffer has been consumed, the Cortex-M core triggers the Cortex-A core to
> wake up and refill the buffer.
> 
> I can add above explanation to LPA in patch v2.
> 
>>> +$ref: /schemas/types.yaml#/definitions/non-unique-string-array
>>> +description: |
>>> +  A list of DAPM endpoints which mark paths between these endpoints
>>> +  ignoring suspend.
>>
>> And how does it differ from audio-routing? Also, you need to explain what is
>> "suspend" in this context. Bindings are independent of Linux.
>>
> 
> Normally audio paths will be disabled by ASoC dynamic audio power management
> if Linux enters into suspend. LPA requires some audio paths enabled when
> Cortex-A enters into suspend. We can read DAPM endpoints from the
> "fsl,lpa-widgets" property and keep the paths between these endpoints enabled
> during suspend phase of Cortex-A. Property "audio-routing" just declares the
> connection between widgets and doesn't have such feature.
> 
> I will modify the description as following:
> "A list of DAPM endpoints which mark paths between these endpoints still
> enabled when system enters into suspend."

Yes, that's better, but even better would be to say not how the OS
should behave, but how the actual entire system works. Basically these
widgets remain in use by your co-processor, thus OS should not disable
them when entering in system suspend state.

Best regards,
Krzysztof



Re: [PATCH v6 08/30] dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC

2023-09-25 Thread Krzysztof Kozlowski
On 25/09/2023 10:17, Herve Codina wrote:
> Hi Krzysztof,
> 
> On Sat, 23 Sep 2023 19:39:49 +0200
> Krzysztof Kozlowski  wrote:
> 
>> On 22/09/2023 09:58, Herve Codina wrote:
>>> The QMC (QUICC multichannel controller) is a controller present in some
>>> PowerQUICC SoC such as MPC885.
>>> The QMC HDLC uses the QMC controller to transfer HDLC data.
>>>
>>> Additionally, a framer can be connected to the QMC HDLC.
>>> If present, this framer is the interface between the TDM bus used by the
>>> QMC HDLC and the E1/T1 line.
>>> The QMC HDLC can use this framer to get information about the E1/T1 line
>>> and configure the E1/T1 line.
>>>
>>> Signed-off-by: Herve Codina 
>>> ---
>>>  .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 24 +++
>>>  1 file changed, 24 insertions(+)
>>>
>>> diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>> index 82d9beb48e00..61dfd5ef7407 100644
>>> --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>> +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
>>> @@ -101,6 +101,27 @@ patternProperties:
>>>Channel assigned Rx time-slots within the Rx time-slots routed by the
>>>TSA to this cell.
>>>  
>>> +  compatible:
>>> +const: fsl,qmc-hdlc  
>>
>> Why this is not a device/SoC specific compatible?
> 
> This compatible is present in a QMC channel.
> The parent node (the QMC itself) contains a compatible with device/SoC:
> --- 8< ---
>   compatible:
> items:
>   - enum:
>   - fsl,mpc885-scc-qmc
>   - fsl,mpc866-scc-qmc
>   - const: fsl,cpm1-scc-qmc
> --- 8< ---
> 
> At the child level (ie QMC channel), I am not sure that adding device/SoC
> makes sense. This compatible indicates that the QMC channel is handled by
> the QMC HDLC driver.
> At this level, whatever the device/SoC, we have to be QMC compliant.
> 
> With these details, do you still think I need to change the child (channel)
> compatible ?

From an OS point of view, you have a driver binding to this child-level
compatible. How do you enforce Linux driver binding based on parent
compatible? I looked at your next patch and I did not see it.

Best regards,
Krzysztof



RE: Re: [PATCH 1/2] ASoC: dt-bindings: fsl_rpmsg: List DAPM endpoints ignoring suspend

2023-09-25 Thread Chancel Liu
> > Add a property to list DAPM endpoints which mark paths between these
> > endpoints ignoring suspend. These DAPM paths can still be powered on
> > when system enters into suspend.
> >
> > Signed-off-by: Chancel Liu 
> > ---
> >  Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
> > b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
> > index 188f38baddec..ec6e09eab427 100644
> > --- a/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
> > +++ b/Documentation/devicetree/bindings/sound/fsl,rpmsg.yaml
> > @@ -91,6 +91,12 @@ properties:
> >- rpmsg-audio-channel
> >- rpmsg-micfil-channel
> >
> > +  fsl,lpa-widgets:
> 
> What is LPA? It's not explained in property description.
> 

On an asymmetric multiprocessor system there are a Cortex-A core and a
Cortex-M core. Linux runs on the Cortex-A core, and an RTOS or other OS runs
on the Cortex-M core. The audio hardware devices can be controlled by the
Cortex-M core. LPA means the low power audio case. The mechanism works as
follows: the Cortex-A core allocates a large buffer and fills it with audio
data, and can then enter suspend to save power. The Cortex-M core continues
to play the sound while the Cortex-A core is suspended. When the data in the
buffer has been consumed, the Cortex-M core triggers the Cortex-A core to
wake up and refill the buffer.
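
The fill/suspend/wake cycle described above can be modelled with a small
userspace sketch in C. This is only an illustration under stated
assumptions: struct lpa_sim and the lpa_*() helpers are hypothetical
names, not part of the fsl_rpmsg driver or of any firmware interface, and
buffer sizes are counted in abstract "frames".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared buffer and the two cores. */
struct lpa_sim {
	unsigned int buf_frames;  /* capacity of the shared buffer */
	unsigned int filled;      /* frames currently queued for playback */
	unsigned int wakeups;     /* times Cortex-M had to wake Cortex-A */
	bool a_suspended;
};

/* Cortex-A side: fill as much as fits, then go to suspend. */
static void lpa_a_fill_and_suspend(struct lpa_sim *s, unsigned int *remaining)
{
	unsigned int n = *remaining < s->buf_frames ? *remaining : s->buf_frames;

	s->filled = n;
	*remaining -= n;
	s->a_suspended = true;
}

/* Cortex-M side: play the whole buffer, then wake Cortex-A for a refill. */
static void lpa_m_play_and_wake(struct lpa_sim *s)
{
	s->filled = 0;
	s->a_suspended = false;
	s->wakeups++;
}

/* Returns how often Cortex-A is woken while playing total_frames. */
static unsigned int lpa_count_wakeups(unsigned int total_frames,
				      unsigned int buf_frames)
{
	struct lpa_sim s = { .buf_frames = buf_frames };
	unsigned int remaining = total_frames;

	if (buf_frames == 0)
		return 0; /* nothing can be queued, avoid looping forever */

	while (remaining > 0) {
		lpa_a_fill_and_suspend(&s, &remaining);
		lpa_m_play_and_wake(&s);
	}
	return s.wakeups;
}

static void demo_lpa(void)
{
	/* 10 frames through a 4-frame buffer: fills of 4, 4 and 2 frames. */
	assert(lpa_count_wakeups(10, 4) == 3);
	/* A larger buffer means fewer wakeups of the Cortex-A core. */
	assert(lpa_count_wakeups(10, 16) == 1);
}
```

The larger the buffer, the fewer times the Cortex-A core has to leave
suspend, which is the power saving the LPA case is after.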

I can add above explanation to LPA in patch v2.

> > +$ref: /schemas/types.yaml#/definitions/non-unique-string-array
> > +description: |
> > +  A list of DAPM endpoints which mark paths between these endpoints
> > +  ignoring suspend.
> 
> And how does it differ from audio-routing? Also, you need to explain what is
> "suspend" in this context. Bindings are independent of Linux.
> 

Normally audio paths will be disabled by ASoC dynamic audio power management if
Linux enters into suspend. LPA requires some audio paths enabled when Cortex-A
enters into suspend. We can read DAPM endpoints from the "fsl,lpa-widgets"
property and keep the paths between these endpoints enabled during suspend
phase of Cortex-A. Property "audio-routing" just declares the connection
between widgets and doesn't have such feature.

I will modify the description as following:
"A list of DAPM endpoints which mark paths between these endpoints still enabled
when system enters into suspend."

> Best regards,
> Krzysztof

Regards, 
Chancel Liu



Re: [PATCH v6 08/30] dt-bindings: soc: fsl: cpm_qe: cpm1-scc-qmc: Add support for QMC HDLC

2023-09-25 Thread Herve Codina
Hi Krzysztof,

On Sat, 23 Sep 2023 19:39:49 +0200
Krzysztof Kozlowski  wrote:

> On 22/09/2023 09:58, Herve Codina wrote:
> > The QMC (QUICC multichannel controller) is a controller present in some
> > PowerQUICC SoC such as MPC885.
> > The QMC HDLC uses the QMC controller to transfer HDLC data.
> > 
> > Additionally, a framer can be connected to the QMC HDLC.
> > If present, this framer is the interface between the TDM bus used by the
> > QMC HDLC and the E1/T1 line.
> > The QMC HDLC can use this framer to get information about the E1/T1 line
> > and configure the E1/T1 line.
> > 
> > Signed-off-by: Herve Codina 
> > ---
> >  .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 24 +++
> >  1 file changed, 24 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> > index 82d9beb48e00..61dfd5ef7407 100644
> > --- a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> > +++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
> > @@ -101,6 +101,27 @@ patternProperties:
> >Channel assigned Rx time-slots within the Rx time-slots routed by the
> >TSA to this cell.
> >  
> > +  compatible:
> > +const: fsl,qmc-hdlc  
> 
> Why this is not a device/SoC specific compatible?

This compatible is present in a QMC channel.
The parent node (the QMC itself) contains a compatible with device/SoC:
--- 8< ---
  compatible:
items:
  - enum:
  - fsl,mpc885-scc-qmc
  - fsl,mpc866-scc-qmc
  - const: fsl,cpm1-scc-qmc
--- 8< ---

At the child level (ie QMC channel), I am not sure that adding device/SoC
makes sense. This compatible indicates that the QMC channel is handled by
the QMC HDLC driver.
At this level, whatever the device/SoC, we have to be QMC compliant.

With these details, do you still think I need to change the child (channel)
compatible ?

> 
> > +
> > +  fsl,framer:
> > +$ref: /schemas/types.yaml#/definitions/phandle
> > +description:
> > +  phandle to the framer node. The framer is in charge of an E1/T1 line
> > +  interface connected to the TDM bus. It can be used to get the E1/T1 line
> > +  status such as link up/down.
> > +
> > +allOf:
> > +  - if:
> > +  properties:
> > +compatible:
> > +  not:
> > +contains:
> > +  const: fsl,qmc-hdlc
> > +then:
> > +  properties:
> > +fsl,framer: false
> > +
> >  required:
> >- reg
> >- fsl,tx-ts-mask
> > @@ -159,5 +180,8 @@ examples:
> >  fsl,operational-mode = "hdlc";
> >  fsl,tx-ts-mask = <0x 0xff00>;
> >  fsl,rx-ts-mask = <0x 0xff00>;
> > +
> > +compatible = "fsl,qmc-hdlc";  
> 
> compatible is always the first property.

Will be moved to the first property in the next iteration.

Best regards,
Hervé

> 
> > +fsl,framer = <&framer>;
> >  };
> >  };  
> 
> Best regards,
> Krzysztof
> 


RE: Questions: Should kernel panic when PCIe fatal error occurs?

2023-09-25 Thread David Laight
From: Shuai Xue
> Sent: 25 September 2023 02:44
> 
> On 2023/9/21 21:20, David Laight wrote:
> > ...
> > I've got a target to generate AER errors by generating read cycles
> > that are inside the address range that the bridge forwards but
> > outside of any BAR because there are 2 different sized BARs.
> > (Pretty easy to setup.)
> > On the system I was using they didn't get propagated all the way
> > to the root bridge - but were visible in the lower bridge.
> 
> So how did you observe it? If the error message does not propagate
> to the root bridge, I think no AER interrupt will be trigger.

I looked at the internal registers (IIRC in PCIe config space)
of the intermediate bridge.
I don't think the root bridge on that system supported AER.
(I was testing the generation of AER indications by our fpga.)

> 
> > It would be nice for a driver to be able to detect/clear such
> > a flag if it gets an unexpected ~0u read value.
> > (I'm not sure an error callback helps.)
> 
> IMHO, a general model is that error detected at endpoint should be
> routed to upstream port for example: RCiEP route error message to RCEC,
> so that the AER port service could handle the error, the device driver
> only have to implement error handler callback.

The problem is that an error callback is too late for something
triggered by a PCIe read.
The driver has to detect that the value is 'dubious' and wants
a method of detecting whether there was an associated AER (or other)
error.
If the AER indication is routed through some external entity (like
board management hardware) there will be additional latency that
means that the associated interrupt (even if an NMI) may not have
been processed when the driver code is trying to determine what
happened.
This can only be made worse by the interrupt coming in on a
different cpu.
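
As a sketch of the idea (a userspace model in C, not a real kernel or PCI
API; struct mock_bridge and all function names are hypothetical): the
reader checks a latched error flag itself, synchronously, whenever a read
returns ~0u, instead of waiting for an interrupt or callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device behind a bridge. */
struct mock_bridge {
	bool     link_up;
	bool     error_latched; /* stands in for an error status bit */
	uint32_t reg_value;     /* what the device register really holds */
};

/* Assumption: a failed read completes with all-ones data and latches a flag. */
static uint32_t mock_readl(struct mock_bridge *br)
{
	if (!br->link_up) {
		br->error_latched = true;
		return ~0u;
	}
	return br->reg_value;
}

/* Synchronous test-and-clear, done by the code that issued the read. */
static bool mock_test_and_clear_error(struct mock_bridge *br)
{
	bool was_set = br->error_latched;

	br->error_latched = false;
	return was_set;
}

/* Returns 0 and stores the value on success, -1 if the read itself failed. */
static int checked_read(struct mock_bridge *br, uint32_t *val)
{
	uint32_t v = mock_readl(br);

	/* Only a 'dubious' value is worth the extra status check. */
	if (v == ~0u && mock_test_and_clear_error(br))
		return -1;

	*val = v;
	return 0;
}

static void demo_checked_read(void)
{
	struct mock_bridge br = { .link_up = true, .reg_value = 0xffffffffu };
	uint32_t val = 0;

	/* A register that legitimately holds all-ones is not an error. */
	assert(checked_read(&br, &val) == 0);
	assert(val == 0xffffffffu);

	/* With the link down the same bit pattern is reported as a failure. */
	br.link_up = false;
	assert(checked_read(&br, &val) == -1);
	assert(!br.error_latched);
}
```

The point of the model is only the ordering: the check happens in the
context of the access, so it does not depend on when (or on which cpu) an
asynchronous error indication is eventually processed.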

> > OTOH a 'nebs compliant' server routed any kind of PCIe link error
> > through to some 'system management' logic that then raised an NMI.
> > I'm not sure who thought an NMI was a good idea - they are pretty
> > impossible to handle in the kernel and too late to be of use to
> > the code performing the access.
> 
> I think it is the responsibility of the device to prevent the spread of
> errors while reporting that errors have been detected. For example, drop
> the current, (drain submit queue) and report error in completion record.

Eh?
I can generate two types of PCIe error:
- Read/write requests for addresses that aren't inside a BAR.
- Link failures that cause retraining and might need config
  space reconfiguring.

> Both NMI and MSI are asynchronous interrupts.

Indeed, which makes neither of them suitable for any indication
relating to a bus cycle failure.

> > In any case we were getting one after 'echo 1 >xxx/remove' and
> > then taking the PCIe link down by reprogramming the fpga.
> > So the link going down was entirely expected, but there seemed
> > to be nothing we could do to stop the kernel crashing.
> >
> > I'm sure 'nebs compliant' ought to contain some requirements for
> > resilience to hardware failures!
> 
> How the kernel crash after a link down? Did the system detect a surprise
> down error?

It was a couple of years ago..
IIRC the 'link down' caused the hub to generate an AER error.
The root hub forwarded it to some 'board management hardware/software'
that then raised an NMI.
The kernel crashed because of an unexpected NMI.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH 0/3] Fix for shellcheck issues with version "0.6"

2023-09-25 Thread kajoljain



On 9/7/23 22:45, Athira Rajeev wrote:
> From: root 
> 
> shellcheck was run on perf tool shell scripts as a prerequisite
> to include a build option for shellcheck discussed here:
> https://www.spinics.net/lists/linux-perf-users/msg25553.html
> 
> And fixes were added for the coding/formatting issues in
> two patchsets:
> https://lore.kernel.org/linux-perf-users/20230613164145.50488-1-atraj...@linux.vnet.ibm.com/
> https://lore.kernel.org/linux-perf-users/20230709182800.53002-1-atraj...@linux.vnet.ibm.com/
> 
> Three additional issues are observed with shellcheck "0.6" and
> this patchset covers those. With this patchset,
> 
> # for F in $(find tests/shell/ -perm -o=x -name '*.sh'); do shellcheck -S warning $F; done
> # echo $?
> 0
> 

Patchset looks good to me.

Reviewed-by: Kajol Jain 

Thanks,
Kajol Jain

> Athira Rajeev (3):
>   tests/shell: Fix shellcheck SC1090 to handle the location of sourced
> files
>   tests/shell: Fix shellcheck issues in tests/shell/stat+shadow_stat.sh
> tetscase
>   tests/shell: Fix shellcheck warnings for SC2153 in multiple scripts
> 
>  tools/perf/tests/shell/coresight/asm_pure_loop.sh| 4 
>  tools/perf/tests/shell/coresight/memcpy_thread_16k_10.sh | 4 
>  tools/perf/tests/shell/coresight/thread_loop_check_tid_10.sh | 4 
>  tools/perf/tests/shell/coresight/thread_loop_check_tid_2.sh  | 4 
>  tools/perf/tests/shell/coresight/unroll_loop_thread_10.sh| 4 
>  tools/perf/tests/shell/probe_vfs_getname.sh  | 2 ++
>  tools/perf/tests/shell/record+probe_libc_inet_pton.sh| 2 ++
>  tools/perf/tests/shell/record+script_probe_vfs_getname.sh| 2 ++
>  tools/perf/tests/shell/record.sh | 1 +
>  tools/perf/tests/shell/stat+csv_output.sh| 1 +
>  tools/perf/tests/shell/stat+csv_summary.sh   | 4 ++--
>  tools/perf/tests/shell/stat+shadow_stat.sh   | 4 ++--
>  tools/perf/tests/shell/stat+std_output.sh| 1 +
>  tools/perf/tests/shell/test_intel_pt.sh  | 1 +
>  tools/perf/tests/shell/trace+probe_vfs_getname.sh| 1 +
>  15 files changed, 35 insertions(+), 4 deletions(-)
> 


Re: [PATCH V2] perf test: Fix parse-events tests to skip parametrized events

2023-09-25 Thread kajoljain



On 9/7/23 22:29, Athira Rajeev wrote:
> Testcase "Parsing of all PMU events from sysfs" parse events for
> all PMUs, and not just cpu. In case of powerpc, the PowerVM
> environment supports events from hv_24x7 and hv_gpci PMU which
> is of example format like below:
> 
> - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
> - hv_gpci/event,partition_id=?/
> 
> The value for "?" needs to be filled in depending on system
> configuration. It is better to skip these parametrized events
> in this test as it is done in:
> 'commit b50d691e50e6 ("perf test: Fix "all PMU test" to skip
> parametrized events")' which handled a similar instance with
> "all PMU test".
> 
> Fix parse-events test to skip parametrized events since
> it needs proper setup of the parameters.

Patch looks good to me.

Reviewed-by: Kajol Jain 

Thanks,
Kajol Jain

> 
> Signed-off-by: Athira Rajeev 
> ---
> Changelog:
> v1 -> v2:
>  Addressed review comments from Ian. Updated size of
>  pmu event name variable and changed bool name which is
>  used to skip the test.
> 
>  tools/perf/tests/parse-events.c | 38 +
>  1 file changed, 38 insertions(+)
> 
> diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
> index 658fb9599d95..1ecaeceb69f8 100644
> --- a/tools/perf/tests/parse-events.c
> +++ b/tools/perf/tests/parse-events.c
> @@ -2514,9 +2514,14 @@ static int test__pmu_events(struct test_suite *test __maybe_unused, int subtest
>   while ((pmu = perf_pmus__scan(pmu)) != NULL) {
>   struct stat st;
>   char path[PATH_MAX];
> + char pmu_event[PATH_MAX];
> + char *buf = NULL;
> + FILE *file;
>   struct dirent *ent;
> + size_t len = 0;
>   DIR *dir;
>   int err;
> + int n;
>  
>   snprintf(path, PATH_MAX, 
> "%s/bus/event_source/devices/%s/events/",
>   sysfs__mountpoint(), pmu->name);
> @@ -2538,11 +2543,44 @@ static int test__pmu_events(struct test_suite *test __maybe_unused, int subtest
>   struct evlist_test e = { .name = NULL, };
>   char name[2 * NAME_MAX + 1 + 12 + 3];
>   int test_ret;
> + bool is_event_parameterized = 0;
>  
>   /* Names containing . are special and cannot be used 
> directly */
>   if (strchr(ent->d_name, '.'))
>   continue;
>  
> + /* exclude parametrized ones (name contains '?') */
> + n = snprintf(pmu_event, sizeof(pmu_event), "%s%s", 
> path, ent->d_name);
> + if (n >= PATH_MAX) {
> + pr_err("pmu event name crossed PATH_MAX(%d) 
> size\n", PATH_MAX);
> + continue;
> + }
> +
> + file = fopen(pmu_event, "r");
> + if (!file) {
> + pr_debug("can't open pmu event file for 
> '%s'\n", ent->d_name);
> + ret = combine_test_results(ret, TEST_FAIL);
> + continue;
> + }
> +
> + if (getline(&buf, &len, file) < 0) {
> + pr_debug(" pmu event: %s is a null event\n", 
> ent->d_name);
> + ret = combine_test_results(ret, TEST_FAIL);
> + continue;
> + }
> +
> + if (strchr(buf, '?'))
> + is_event_parameterized = 1;
> +
> + free(buf);
> + buf = NULL;
> + fclose(file);
> +
> + if (is_event_parameterized == 1) {
> + pr_debug("skipping parametrized PMU event: %s 
> which contains ?\n", pmu_event);
> + continue;
> + }
> +
>   snprintf(name, sizeof(name), "%s/event=%s/u", 
> pmu->name, ent->d_name);
>  
>   e.name  = name;


Re: [PATCH v5 0/5] ppc, fbdev: Clean up fbdev mmap helper

2023-09-25 Thread Thomas Zimmermann
FYI, I intend to merge patches 1 and 2 of this patchset into 
drm-misc-next. The updates for PowerPC can be merged through PPC trees 
later. Let me know if this does not work for you.


Best regards
Thomas

Am 22.09.23 um 10:04 schrieb Thomas Zimmermann:

Clean up and rename fb_pgprotect() to work without struct file. Then
refactor the implementation for PowerPC. This change has been discussed
at [1] in the context of refactoring fbdev's mmap code.

The first two patches update fbdev and replace fbdev's fb_pgprotect()
with pgprot_framebuffer() on all architectures. The new helper's
streamlined interface enables more refactoring within fbdev's mmap
implementation.

Patches 3 to 5 adapt PowerPC's internal interfaces to provide
phys_mem_access_prot() that works without struct file. Neither the
architecture code nor the fbdev helpers need the parameter.

v5:
* improve commit descriptions (Javier)
* add missing tags (Geert)
v4:
* fix commit message (Christophe)
v3:
* rename fb_pgprotect() to pgprot_framebuffer() (Arnd)
v2:
* reorder patches to simplify merging (Michael)

[1] 
https://lore.kernel.org/linuxppc-dev/5501ba80-bdb0-6344-16b0-0466a950f...@suse.com/

Thomas Zimmermann (5):
   fbdev: Avoid file argument in fb_pgprotect()
   fbdev: Replace fb_pgprotect() with pgprot_framebuffer()
   arch/powerpc: Remove trailing whitespaces
   arch/powerpc: Remove file parameter from phys_mem_access_prot code
   arch/powerpc: Call internal __phys_mem_access_prot() in fbdev code

  arch/ia64/include/asm/fb.h| 15 +++
  arch/m68k/include/asm/fb.h| 19 ++-
  arch/mips/include/asm/fb.h| 11 +--
  arch/powerpc/include/asm/book3s/pgtable.h | 10 --
  arch/powerpc/include/asm/fb.h | 13 +
  arch/powerpc/include/asm/machdep.h| 13 ++---
  arch/powerpc/include/asm/nohash/pgtable.h | 10 --
  arch/powerpc/include/asm/pci.h|  4 +---
  arch/powerpc/kernel/pci-common.c  |  3 +--
  arch/powerpc/mm/mem.c |  8 
  arch/sparc/include/asm/fb.h   | 15 +--
  arch/x86/include/asm/fb.h | 10 ++
  arch/x86/video/fbdev.c| 15 ---
  drivers/video/fbdev/core/fb_chrdev.c  |  3 ++-
  include/asm-generic/fb.h  | 12 ++--
  15 files changed, 86 insertions(+), 75 deletions(-)


base-commit: f8d21cb17a99b75862196036bb4bb93ee9637b74


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




[PATCHv7 4/4] powerpc/setup: alloc extra paca_ptrs to hold boot_cpuid

2023-09-25 Thread Pingfan Liu
paca_ptrs must be large enough to hold the entry for boot_cpuid, hence
its lower bound is set to the larger of boot_cpuid + 1 and nr_cpus.

On the other hand, some kernel components assume that cpu0 is usable:
1. The timer code assumes cpu0 is online, since the 'TIMER_CPUMASK'
   subfield of timer_list->flags is zero unless it is initialized to a
   proper present cpu.
2. power9_idle_stop() assumes the primary thread's paca is allocated.

Hence lift nr_cpu_ids from one to two to ensure cpu0 is onlined if the
boot cpu is not cpu0.
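
The two rules above can be sketched in standalone C. This is only a
userspace model, not the kernel code: paca_slots() and
effective_nr_cpu_ids() are hypothetical names, and void * stands in for
struct paca_struct *.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: number of paca_ptrs[] slots. The array is indexed
 * by logical cpu id, so the slot for boot_cpuid must exist even when
 * nr_cpu_ids is smaller than boot_cpuid + 1.
 */
int paca_slots(int boot_cpuid, int nr_cpu_ids)
{
	assert(boot_cpuid >= 0 && nr_cpu_ids > 0);
	return (boot_cpuid + 1) > nr_cpu_ids ? (boot_cpuid + 1) : nr_cpu_ids;
}

/* Bytes for the pointer array; void * stands in for struct paca_struct *. */
size_t paca_ptrs_bytes(int boot_cpuid, int nr_cpu_ids)
{
	return sizeof(void *) * (size_t)paca_slots(boot_cpuid, nr_cpu_ids);
}

/*
 * Hypothetical model of the nr_cpu_ids adjustment: if the boot cpu is
 * not logical cpu0, allow at least two cpus so cpu0 can be onlined too.
 */
int effective_nr_cpu_ids(int boot_cpuid, int nr_cpu_ids)
{
	assert(boot_cpuid >= 0 && nr_cpu_ids > 0);
	if (boot_cpuid != 0 && nr_cpu_ids < 2)
		return 2;
	return nr_cpu_ids;
}
```

If one assumes 4 threads per core (the SMT mode is not stated here),
crashing on cpu 14 gives boot_cpuid 2 and crashing on cpu 4 gives
boot_cpuid 0, so with nr_cpus=1 the model yields two cpus and one cpu
respectively.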

Result:
With nr_cpus=1, after triggering the crash with
  taskset -c 14 bash -c 'echo c > /proc/sysrq-trigger'
the kdump kernel brings up two cpus, whereas after
  taskset -c 4 bash -c 'echo c > /proc/sysrq-trigger'
the kdump kernel brings up one cpu.

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Mahesh Salgaonkar 
Cc: Wen Xiong 
Cc: Baoquan He 
Cc: Ming Lei 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/paca.c | 10 ++
 arch/powerpc/kernel/prom.c |  9 ++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index cda4e00b67c1..91e2401de1bd 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -242,9 +242,10 @@ static int __initdata paca_struct_size;
 
 void __init allocate_paca_ptrs(void)
 {
-   paca_nr_cpu_ids = nr_cpu_ids;
+   int n = (boot_cpuid + 1) > nr_cpu_ids ? (boot_cpuid + 1) : nr_cpu_ids;
 
-   paca_ptrs_size = sizeof(struct paca_struct *) * nr_cpu_ids;
+   paca_nr_cpu_ids = n;
+   paca_ptrs_size = sizeof(struct paca_struct *) * n;
paca_ptrs = memblock_alloc_raw(paca_ptrs_size, SMP_CACHE_BYTES);
if (!paca_ptrs)
panic("Failed to allocate %d bytes for paca pointers\n",
@@ -287,13 +288,14 @@ void __init allocate_paca(int cpu)
 void __init free_unused_pacas(void)
 {
int new_ptrs_size;
+   int n = (boot_cpuid + 1) > nr_cpu_ids ? (boot_cpuid + 1) : nr_cpu_ids;
 
-   new_ptrs_size = sizeof(struct paca_struct *) * nr_cpu_ids;
+   new_ptrs_size = sizeof(struct paca_struct *) * n;
if (new_ptrs_size < paca_ptrs_size)
memblock_phys_free(__pa(paca_ptrs) + new_ptrs_size,
   paca_ptrs_size - new_ptrs_size);
 
-   paca_nr_cpu_ids = nr_cpu_ids;
+   paca_nr_cpu_ids = n;
paca_ptrs_size = new_ptrs_size;
 
 #ifdef CONFIG_PPC_64S_HASH_MMU
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 87272a2d8c10..15c994f54bf9 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -362,9 +362,12 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
 */
boot_cpuid = i;
found = true;
-   /* This works around the hole in paca_ptrs[]. */
-   if (nr_cpu_ids < nthreads)
-   set_nr_cpu_ids(nthreads);
+   /*
+* Ideally, nr_cpus=1 can be achieved if each kernel
+* component does not assume cpu0 is onlined.
+*/
+   if (boot_cpuid != 0 && nr_cpu_ids < 2)
+   set_nr_cpu_ids(2);
}
 #ifdef CONFIG_SMP
/* logical cpu id is always 0 on UP kernels */
-- 
2.31.1



[PATCHv7 3/4] powerpc/setup: Handle the case when boot_cpuid greater than nr_cpus

2023-09-25 Thread Pingfan Liu
If boot_cpuid is not smaller than nr_cpus, extra effort is required to
ensure the boot cpu is in cpu_present_mask. This can be achieved by
reserving the last slot of the nr_cpus quota for the boot cpu.

Note: the restriction on nr_cpus will be lifted with more effort in the
next patch.
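
A minimal userspace sketch of the slot reservation, assuming a single
core for simplicity. mark_present() and MAX_CPUS are hypothetical and
only model which logical cpus end up present; the kernel code walks a
list of cores instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CPUS 64	/* hypothetical bound for this model only */

/*
 * Mark the threads of one core as present, honouring nr_cpu_ids.
 * If the boot thread would fall outside the quota, stop one slot early
 * and give that last slot to the boot thread. Returns the number of
 * cpus marked present.
 */
int mark_present(bool present[MAX_CPUS], int nthreads, int nr_cpu_ids,
		 int bt_thread)
{
	int terminate = nr_cpu_ids;
	int cpu, count = 0;

	assert(nthreads > 0 && nthreads <= MAX_CPUS);
	assert(nr_cpu_ids > 0 && bt_thread >= 0 && bt_thread < nthreads);
	memset(present, 0, sizeof(bool) * MAX_CPUS);

	/* Reserve the last slot of the quota for the boot cpu. */
	if (bt_thread >= nr_cpu_ids)
		terminate = nr_cpu_ids - 1;

	for (cpu = 0; cpu < nthreads && cpu < terminate; cpu++) {
		present[cpu] = true;
		count++;
	}

	/* The boot cpu takes the reserved slot. */
	if (bt_thread >= nr_cpu_ids) {
		present[bt_thread] = true;
		count++;
	}
	return count;
}
```

With 8 threads, nr_cpu_ids = 2 and the boot thread at index 6, the model
marks cpu 0 and cpu 6 present, so the quota of two is respected while
the boot cpu is still covered.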

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Mahesh Salgaonkar 
Cc: Wen Xiong 
Cc: Baoquan He 
Cc: Ming Lei 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/setup-common.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index f6d32324b5a5..a72d00a6cff2 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -454,8 +454,8 @@ struct interrupt_server_node {
 void __init smp_setup_cpu_maps(void)
 {
struct device_node *dn;
-   int shift = 0, cpu = 0;
-   int j, nthreads = 1;
+   int terminate, shift = 0, cpu = 0;
+   int j, bt_thread = 0, nthreads = 1;
int len;
struct interrupt_server_node *intserv_node, *n;
struct list_head *bt_node, head;
@@ -518,6 +518,7 @@ void __init smp_setup_cpu_maps(void)
for (j = 0 ; j < nthreads; j++) {
if (be32_to_cpu(intserv[j]) == boot_cpu_hwid) {
bt_node = &intserv_node->node;
+   bt_thread = j;
found_boot_cpu = true;
/*
 * Record the round-shift between dt
@@ -537,11 +538,21 @@ void __init smp_setup_cpu_maps(void)
/* Select the primary thread, the boot cpu's slibing, as the logic 0 */
list_add_tail(&head, bt_node);
pr_info("the round shift between dt seq and the cpu logic number: 
%d\n", shift);
+   terminate = nr_cpu_ids;
list_for_each_entry(intserv_node, &head, node) {
 
+   j = 0;
+   /* Choose a start point to cover the boot cpu */
+   if (nr_cpu_ids - 1 < bt_thread) {
+   /*
+* The processor core puts assumption on the thread id,
+* not to breach the assumption.
+*/
+   terminate = nr_cpu_ids - 1;
+   }
avail = intserv_node->avail;
nthreads = intserv_node->len / sizeof(int);
-   for (j = 0; j < nthreads && cpu < nr_cpu_ids; j++) {
+   for (; j < nthreads && cpu < terminate; j++) {
set_cpu_present(cpu, avail);
set_cpu_possible(cpu, true);
cpu_to_phys_id[cpu] = 
be32_to_cpu(intserv_node->intserv[j]);
@@ -549,6 +560,14 @@ void __init smp_setup_cpu_maps(void)
j, cpu, be32_to_cpu(intserv[j]));
cpu++;
}
+   /* Online the boot cpu */
+   if (nr_cpu_ids - 1 < bt_thread) {
+   set_cpu_present(bt_thread, avail);
+   set_cpu_possible(bt_thread, true);
+   cpu_to_phys_id[bt_thread] = 
be32_to_cpu(intserv_node->intserv[bt_thread]);
+   DBG("thread %d -> cpu %d (hard id %d)\n",
+   bt_thread, bt_thread, 
be32_to_cpu(intserv[bt_thread]));
+   }
}
 
list_for_each_entry_safe(intserv_node, n, &head, node) {
-- 
2.31.1



[PATCHv7 2/4] powerpc/setup: Loosen the mapping between cpu logical id and its seq in dt

2023-09-25 Thread Pingfan Liu
*** Idea ***
For kexec -p, the boot cpu may not be cpu0, which causes a problem when
allocating memory for paca_ptrs[]. In theory, however, nothing requires
a cpu's logical id to match its sequence in the device tree. Helpers
such as cpu_first_thread_sibling() do make assumptions about the mapping
inside a core, though. Hence the mapping is only partially loosened:
the mapping of cores is unbound while the mapping inside a core is kept.

*** Implement ***
At this early stage, there is plenty of memory to utilize. Hence, this
patch allocates interim memory to link the cpu info on a list, then
reorders the cpus by changing the list head. As a result, there is a
rotational shift between the sequence number in dt and the cpu logical
number.

*** Result ***
After this patch, a boot-cpu's logical id will always be mapped into the
range [0,threads_per_core).

Besides this, at this phase, all threads in the boot core are forced to
be onlined. This restriction will be lifted in a later patch with
extra effort.
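
The rotation can be illustrated with a small standalone model. It
assumes every core has the same number of threads; logical_cpu() and
its parameters are hypothetical and do not appear in the patch.

```c
#include <assert.h>

/*
 * Hypothetical model: logical cpu id of a thread, given the sequence
 * number of its core in the device tree. Cores are rotated so that the
 * boot core comes first; the thread index inside a core is unchanged.
 */
int logical_cpu(int core_seq, int thread, int boot_core_seq,
		int ncores, int tpc)
{
	int rotated_core;

	assert(ncores > 0 && tpc > 0);
	assert(core_seq >= 0 && core_seq < ncores);
	assert(boot_core_seq >= 0 && boot_core_seq < ncores);
	assert(thread >= 0 && thread < tpc);

	/* Rotational shift: the boot core becomes logical core 0. */
	rotated_core = (core_seq - boot_core_seq + ncores) % ncores;
	return rotated_core * tpc + thread;
}
```

For example, with 4 cores of 4 threads and the boot cpu being thread 2
of core 3, the boot cpu gets logical id 2, which lies in
[0, threads_per_core), while thread 0 of core 0 moves to logical id 4.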

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Mahesh Salgaonkar 
Cc: Wen Xiong 
Cc: Baoquan He 
Cc: Ming Lei 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/prom.c | 25 +
 arch/powerpc/kernel/setup-common.c | 87 +++---
 2 files changed, 85 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index ec82f5bda908..87272a2d8c10 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -76,7 +76,9 @@ u64 ppc64_rma_size;
 unsigned int boot_cpu_node_count __ro_after_init;
 #endif
 static phys_addr_t first_memblock_size;
+#ifdef CONFIG_SMP
 static int __initdata boot_cpu_count;
+#endif
 
 static int __init early_parse_mem(char *p)
 {
@@ -331,8 +333,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
const __be32 *intserv;
int i, nthreads;
int len;
-   int found = -1;
-   int found_thread = 0;
+   bool found = false;
 
/* We are scanning "cpu" nodes only */
if (type == NULL || strcmp(type, "cpu") != 0)
@@ -355,8 +356,15 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
for (i = 0; i < nthreads; i++) {
if (be32_to_cpu(intserv[i]) ==
fdt_boot_cpuid_phys(initial_boot_params)) {
-   found = boot_cpu_count;
-   found_thread = i;
+   /*
+* always map the boot-cpu logical id into the
+* range of [0, thread_per_core)
+*/
+   boot_cpuid = i;
+   found = true;
+   /* This works around the hole in paca_ptrs[]. */
+   if (nr_cpu_ids < nthreads)
+   set_nr_cpu_ids(nthreads);
}
 #ifdef CONFIG_SMP
/* logical cpu id is always 0 on UP kernels */
@@ -365,14 +373,13 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
}
 
/* Not the boot CPU */
-   if (found < 0)
+   if (!found)
return 0;
 
-   DBG("boot cpu: logical %d physical %d\n", found,
-   be32_to_cpu(intserv[found_thread]));
-   boot_cpuid = found;
+   DBG("boot cpu: logical %d physical %d\n", boot_cpuid,
+   be32_to_cpu(intserv[boot_cpuid]));
 
-   boot_cpu_hwid = be32_to_cpu(intserv[found_thread]);
+   boot_cpu_hwid = be32_to_cpu(intserv[boot_cpuid]);
 
/*
 * PAPR defines "logical" PVR values for cpus that
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 1b19a9815672..f6d32324b5a5 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -425,6 +426,13 @@ static void __init cpu_init_thread_core_maps(int tpc)
 
 u32 *cpu_to_phys_id = NULL;
 
+struct interrupt_server_node {
+   struct list_head node;
+   boolavail;
+   int len;
+   __be32 *intserv;
+};
+
 /**
  * setup_cpu_maps - initialize the following cpu maps:
  *  cpu_possible_mask
@@ -446,11 +454,16 @@ u32 *cpu_to_phys_id = NULL;
 void __init smp_setup_cpu_maps(void)
 {
struct device_node *dn;
-   int cpu = 0;
-   int nthreads = 1;
+   int shift = 0, cpu = 0;
+   int j, nthreads = 1;
+   int len;
+   struct interrupt_server_node *intserv_node, *n;
+   struct list_head *bt_node, head;
+   bool avail, found_boot_cpu = false;
 
DBG("smp_setup_cpu_maps()\n");
 
+   INIT_LIST_HEAD(&head);
cpu_to_phys_id = memblock_alloc(nr_cpu_ids * sizeof(u32),
__alignof__(u32));
if (!cpu_to_phys_id)
@@ -460,7 +473,6 @@ void __

[PATCHv7 1/4] powerpc/setup : Enable boot_cpu_hwid for PPC32

2023-09-25 Thread Pingfan Liu
In order to identify the boot cpu, its intserv[] should be recorded and
checked in smp_setup_cpu_maps().

smp_setup_cpu_maps() is shared between PPC64 and PPC32. Since PPC64
already uses boot_cpu_hwid to carry that information, enable this
variable on PPC32 as well so that the coming patch can use it there
too.

Signed-off-by: Pingfan Liu 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Mahesh Salgaonkar 
Cc: Wen Xiong 
Cc: Baoquan He 
Cc: Ming Lei 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org
Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202309130232.n2rewhbv-...@intel.com/
---
 arch/powerpc/include/asm/smp.h | 2 +-
 arch/powerpc/kernel/prom.c | 3 +--
 arch/powerpc/kernel/setup-common.c | 2 --
 3 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 576d0e15..5db9178cc800 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -26,7 +26,7 @@
 #include 
 
 extern int boot_cpuid;
-extern int boot_cpu_hwid; /* PPC64 only */
+extern int boot_cpu_hwid;
 extern int spinning_secondaries;
 extern u32 *cpu_to_phys_id;
 extern bool coregroup_enabled;
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 0b5878c3125b..ec82f5bda908 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -372,8 +372,7 @@ static int __init early_init_dt_scan_cpus(unsigned long 
node,
be32_to_cpu(intserv[found_thread]));
boot_cpuid = found;
 
-   if (IS_ENABLED(CONFIG_PPC64))
-   boot_cpu_hwid = be32_to_cpu(intserv[found_thread]);
+   boot_cpu_hwid = be32_to_cpu(intserv[found_thread]);
 
/*
 * PAPR defines "logical" PVR values for cpus that
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index d2a446216444..1b19a9815672 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -87,9 +87,7 @@ EXPORT_SYMBOL(machine_id);
 int boot_cpuid = -1;
 EXPORT_SYMBOL_GPL(boot_cpuid);
 
-#ifdef CONFIG_PPC64
 int boot_cpu_hwid = -1;
-#endif
 
 /*
  * These are used in binfmt_elf.c to put aux entries on the stack
-- 
2.31.1



[PATCHv7 0/4] enable nr_cpus for powerpc

2023-09-25 Thread Pingfan Liu
Since my last v4 [1], the code has undergone great changes. The paca[]
array has been reorganized and indexed by paca_ptrs[], which
dramatically decreases the memory consumption even if there are many
non-present cpus in the middle.

However, reordering the logical cpu numbers can further decrease the
size of paca_ptrs[] in the kdump case. So I keep [2/4], which
rotate-shifts the cpu's sequence number in the device tree to obtain the
logical cpu id.

Patches [3-4/4] work on allowing nr_cpus to be reduced to less than or
equal to two.

[1]: 
https://lore.kernel.org/linuxppc-dev/1520829790-14029-1-git-send-email-kernelf...@gmail.com/
---
v6 -> v7
  Add [1/4], which fixes compilation error on PPC32

Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: Christophe Leroy 
Cc: Mahesh Salgaonkar 
Cc: Wen Xiong 
Cc: Baoquan He 
Cc: Ming Lei 
Cc: ke...@lists.infradead.org
To: linuxppc-dev@lists.ozlabs.org


Pingfan Liu (4):
  powerpc/setup : Enable boot_cpu_hwid for PPC32
  powerpc/setup: Loosen the mapping between cpu logical id and its seq
in dt
  powerpc/setup: Handle the case when boot_cpuid greater than nr_cpus
  powerpc/setup: alloc extra paca_ptrs to hold boot_cpuid

 arch/powerpc/include/asm/smp.h |   2 +-
 arch/powerpc/kernel/paca.c |  10 +--
 arch/powerpc/kernel/prom.c |  29 +---
 arch/powerpc/kernel/setup-common.c | 108 +++--
 4 files changed, 114 insertions(+), 35 deletions(-)

-- 
2.31.1