[PATCH v3] riscv: Only consider swbp/ss handlers for correct privileged mode

2023-09-11 Thread Björn Töpel
From: Björn Töpel 

RISC-V software breakpoint trap handlers are used for {k,u}probes.

When trapping from kernelmode, only the kernelmode handlers should be
considered. Vice versa, only usermode handlers for usermode
traps. This is not the case on RISC-V, which can trigger a bug if a
userspace process uses uprobes, and a WARN() is triggered from
kernelmode (which is implemented via {c.,}ebreak).

The kernel will trap on the kernelmode {c.,}ebreak, look for uprobes
handlers, realize incorrectly that uprobes need to be handled, and
exit the trap handler early. The trap returns to re-executing the
{c.,}ebreak, and enter an infinite trap-loop.

The issue was found running the BPF selftest [1].

Fix this issue by only considering the swbp/ss handlers for
kernel/usermode respectively. Also, move CONFIG ifdeffery from traps.c
to the asm/{k,u}probes.h headers.

Note that linux/uprobes.h only include asm/uprobes.h if CONFIG_UPROBES
is defined, which is why asm/uprobes.h needs to be unconditionally
included in traps.c

Link: 
https://lore.kernel.org/linux-riscv/87v8d19aun@all.your.base.are.belong.to.us/
 # [1]
Fixes: 74784081aac8 ("riscv: Add uprobes supported")
Reviewed-by: Guo Ren 
Reviewed-by: Nam Cao 
Tested-by: Puranjay Mohan 
Signed-off-by: Björn Töpel 
---
v2->v3: Remove incorrect tags (Conor)
Collect review/test tags
v1->v2: Fix Clang build warning (kernel test robot)
---
 arch/riscv/include/asm/kprobes.h | 11 ++-
 arch/riscv/include/asm/uprobes.h | 13 -
 arch/riscv/kernel/traps.c| 28 ++--
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index e7882ccb0fd4..78ea44f76718 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -40,6 +40,15 @@ void arch_remove_kprobe(struct kprobe *p);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 bool kprobe_breakpoint_handler(struct pt_regs *regs);
 bool kprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool kprobe_breakpoint_handler(struct pt_regs *regs)
+{
+   return false;
+}
+
+static inline bool kprobe_single_step_handler(struct pt_regs *regs)
+{
+   return false;
+}
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_RISCV_KPROBES_H */
diff --git a/arch/riscv/include/asm/uprobes.h b/arch/riscv/include/asm/uprobes.h
index f2183e00fdd2..3fc7deda9190 100644
--- a/arch/riscv/include/asm/uprobes.h
+++ b/arch/riscv/include/asm/uprobes.h
@@ -34,7 +34,18 @@ struct arch_uprobe {
bool simulate;
 };
 
+#ifdef CONFIG_UPROBES
 bool uprobe_breakpoint_handler(struct pt_regs *regs);
 bool uprobe_single_step_handler(struct pt_regs *regs);
-
+#else
+static inline bool uprobe_breakpoint_handler(struct pt_regs *regs)
+{
+   return false;
+}
+
+static inline bool uprobe_single_step_handler(struct pt_regs *regs)
+{
+   return false;
+}
+#endif /* CONFIG_UPROBES */
 #endif /* _ASM_RISCV_UPROBES_H */
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index 19807c4d3805..fae8f610d867 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -13,6 +13,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -247,22 +249,28 @@ static inline unsigned long 
get_break_insn_length(unsigned long pc)
return GET_INSN_LENGTH(insn);
 }
 
+static bool probe_single_step_handler(struct pt_regs *regs)
+{
+   bool user = user_mode(regs);
+
+   return user ? uprobe_single_step_handler(regs) : 
kprobe_single_step_handler(regs);
+}
+
+static bool probe_breakpoint_handler(struct pt_regs *regs)
+{
+   bool user = user_mode(regs);
+
+   return user ? uprobe_breakpoint_handler(regs) : 
kprobe_breakpoint_handler(regs);
+}
+
 void handle_break(struct pt_regs *regs)
 {
-#ifdef CONFIG_KPROBES
-   if (kprobe_single_step_handler(regs))
+   if (probe_single_step_handler(regs))
return;
 
-   if (kprobe_breakpoint_handler(regs))
-   return;
-#endif
-#ifdef CONFIG_UPROBES
-   if (uprobe_single_step_handler(regs))
+   if (probe_breakpoint_handler(regs))
return;
 
-   if (uprobe_breakpoint_handler(regs))
-   return;
-#endif
current->thread.bad_cause = regs->cause;
 
if (user_mode(regs))

base-commit: 0bb80ecc33a8fb5a682236443c1e740d5c917d1d
-- 
2.39.2



Re: [PATCH] integrity: powerpc: Do not select CA_MACHINE_KEYRING

2023-09-11 Thread Nayna



On 9/7/23 13:32, Michal Suchánek wrote:

Adding more CC's from the original patch, looks like get_maintainers is
not that great for this file.

On Thu, Sep 07, 2023 at 06:52:19PM +0200, Michal Suchanek wrote:

No other platform needs CA_MACHINE_KEYRING, either.

This is policy that should be decided by the administrator, not Kconfig
dependencies.


We certainly agree that flexibility is important. However, in this case, 
this also implies that we are expecting system admins to be security 
experts. As per our understanding, CA based infrastructure(PKI) is the 
standard to be followed and not the policy decision. And we can only 
speak for Power.


INTEGRITY_CA_MACHINE_KEYRING ensures that we always have CA signed leaf 
certs.


INTEGRITY_CA_MACHINE_KEYRING_MAX ensures that CA is only allowed to do 
key signing and not code signing.


Having CA signed certs also permits easy revocation of all leaf certs.

Loading certificates is completely new for Power Systems. We would like 
to make it as clean as possible from the start. We want to enforce CA 
signed leaf certificates(INTEGRITY_CA_MACHINE_KEYRING). As per 
keyUsage(INTEGRITY_CA_MACHINE_KEYRING_MAX), if we want more flexibility, 
probably a boot time override can be considered.


Thanks & Regards,

    - Nayna




cc: joeyli 
Signed-off-by: Michal Suchanek 
---
  security/integrity/Kconfig | 2 --
  1 file changed, 2 deletions(-)

diff --git a/security/integrity/Kconfig b/security/integrity/Kconfig
index 232191ee09e3..b6e074ac0227 100644
--- a/security/integrity/Kconfig
+++ b/security/integrity/Kconfig
@@ -68,8 +68,6 @@ config INTEGRITY_MACHINE_KEYRING
depends on INTEGRITY_ASYMMETRIC_KEYS
depends on SYSTEM_BLACKLIST_KEYRING
depends on LOAD_UEFI_KEYS || LOAD_PPC_KEYS
-   select INTEGRITY_CA_MACHINE_KEYRING if LOAD_PPC_KEYS
-   select INTEGRITY_CA_MACHINE_KEYRING_MAX if LOAD_PPC_KEYS
help
 If set, provide a keyring to which Machine Owner Keys (MOK) may
 be added. This keyring shall contain just MOK keys.  Unlike keys
--
2.41.0



Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Jarkko Sakkinen
On Sat Sep 9, 2023 at 12:34 AM EEST, Eric Snowberg wrote:
> Currently root can dynamically update the blacklist keyring if the hash
> being added is signed and vouched for by the builtin trusted keyring.
> Currently keys in the secondary trusted keyring can not be used.
>
> Keys within the secondary trusted keyring carry the same capabilities as
> the builtin trusted keyring.  Relax the current restriction for updating
> the .blacklist keyring and allow the secondary to also be referenced as
> a trust source.  Since the machine keyring is linked to the secondary
> trusted keyring, any key within it may also be used.
>
> An example use case for this is IMA appraisal.  Now that IMA both
> references the blacklist keyring and allows the machine owner to add
> custom IMA CA certs via the machine keyring, this adds the additional
> capability for the machine owner to also do revocations on a running
> system.
>
> IMA appraisal usage example to add a revocation for /usr/foo:
>
> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
>
> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
>-signer machine-certificate.pem -noattr -binary -outform DER \
>-out hash.p7s
>
> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
>
> Signed-off-by: Eric Snowberg 
> ---
>  certs/Kconfig | 2 +-
>  certs/blacklist.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/certs/Kconfig b/certs/Kconfig
> index 1f109b070877..23dc87c52aff 100644
> --- a/certs/Kconfig
> +++ b/certs/Kconfig
> @@ -134,7 +134,7 @@ config SYSTEM_BLACKLIST_AUTH_UPDATE
>   depends on SYSTEM_DATA_VERIFICATION
>   help
> If set, provide the ability to load new blacklist keys at run time if
> -   they are signed and vouched by a certificate from the builtin trusted
> +   they are signed and vouched by a certificate from the secondary 
> trusted
> keyring.  The PKCS#7 signature of the description is set in the key
> payload.  Blacklist keys cannot be removed.
>  
> diff --git a/certs/blacklist.c b/certs/blacklist.c
> index 675dd7a8f07a..0b346048ae2d 100644
> --- a/certs/blacklist.c
> +++ b/certs/blacklist.c
> @@ -102,12 +102,12 @@ static int blacklist_key_instantiate(struct key *key,
>  
>  #ifdef CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE
>   /*
> -  * Verifies the description's PKCS#7 signature against the builtin
> +  * Verifies the description's PKCS#7 signature against the secondary
>* trusted keyring.
>*/
>   err = verify_pkcs7_signature(key->description,
>   strlen(key->description), prep->data, prep->datalen,
> - NULL, VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
> + VERIFY_USE_SECONDARY_KEYRING, 
> VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
>   if (err)
>   return err;
>  #else
> -- 
> 2.39.3

What if a live system in the wild assumes the old policy? I feel that
this is "sort of" breaking backwards compatibility but please prove me
wrong.

BR, Jarkko


Re: [RESEND][PATCH] tracing/synthetic: Fix order of struct trace_dynamic_info

2023-09-11 Thread Steven Rostedt
On Mon, 11 Sep 2023 09:26:12 +0900
Masami Hiramatsu (Google)  wrote:

> > Fix this and just to be safe also add "__packed".
> > 
> > Link: 
> > https://lore.kernel.org/all/20230908154417.5172e...@gandalf.local.home/  
> 
> Good catch! I'm not sure why this worked. Maybe we don't have any testcase
> for this feature?

We have a test case, it was broken by a change in the README :-p

I noticed issues with it, and then looked at the tests, fixed the test and
it failed. Before that, it was just reporting "unsupported", which is why I
included that fix with this queue.

  
https://lore.kernel.org/linux-trace-kernel/20230614091046.2178539-1-nav...@kernel.org/

> 
> Acked-by: Masami Hiramatsu (Google) 

Thanks!

-- Steve


Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Jarkko Sakkinen
On Mon Sep 11, 2023 at 4:29 PM EEST, Mimi Zohar wrote:
> Hi Eric,
>
> On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
> > Currently root can dynamically update the blacklist keyring if the hash
> > being added is signed and vouched for by the builtin trusted keyring.
> > Currently keys in the secondary trusted keyring can not be used.
> > 
> > Keys within the secondary trusted keyring carry the same capabilities as
> > the builtin trusted keyring.  Relax the current restriction for updating
> > the .blacklist keyring and allow the secondary to also be referenced as
> > a trust source.  Since the machine keyring is linked to the secondary
> > trusted keyring, any key within it may also be used.
> > 
> > An example use case for this is IMA appraisal.  Now that IMA both
> > references the blacklist keyring and allows the machine owner to add
> > custom IMA CA certs via the machine keyring, this adds the additional
> > capability for the machine owner to also do revocations on a running
> > system.
> > 
> > IMA appraisal usage example to add a revocation for /usr/foo:
> > 
> > sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
> > 
> > openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
> >-signer machine-certificate.pem -noattr -binary -outform DER \
> >-out hash.p7s
> > 
> > keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
> > 
> > Signed-off-by: Eric Snowberg 
>
> The secondary keyring may include both CA and code signing keys.  With
> this change any key loaded onto the secondary keyring may blacklist a
> hash.  Wouldn't it make more sense to limit blacklisting
> certificates/hashes to at least CA keys? 

I think a bigger issue is that if a kernel is updated with this patch
it will change the behavior. It is nothing to do whether the "old" or
"new" is better but more like kind of backwards compatibility issue.

BR, Jarkko


[PATCH][next] RDMA/core: Use size_{add,mul}() in calls to struct_size()

2023-09-11 Thread Gustavo A. R. Silva
Harden calls to struct_size() with size_add() and size_mul().

This results in no differences in binary output.

Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs 
attributes")
Fixes: a4676388e2e2 ("RDMA/core: Simplify how the gid_attrs sysfs is created")
Signed-off-by: Gustavo A. R. Silva 
---
 drivers/infiniband/core/sysfs.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index ee59d7391568..ec5efdc16660 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -903,7 +903,7 @@ alloc_hw_stats_device(struct ib_device *ibdev)
 * Two extra attribue elements here, one for the lifespan entry and
 * one to NULL terminate the list for the sysfs core code
 */
-   data = kzalloc(struct_size(data, attrs, stats->num_counters + 1),
+   data = kzalloc(struct_size(data, attrs, size_add(stats->num_counters, 
1)),
   GFP_KERNEL);
if (!data)
goto err_free_stats;
@@ -1009,7 +1009,7 @@ alloc_hw_stats_port(struct ib_port *port, struct 
attribute_group *group)
 * Two extra attribue elements here, one for the lifespan entry and
 * one to NULL terminate the list for the sysfs core code
 */
-   data = kzalloc(struct_size(data, attrs, stats->num_counters + 1),
+   data = kzalloc(struct_size(data, attrs, size_add(stats->num_counters, 
1)),
   GFP_KERNEL);
if (!data)
goto err_free_stats;
@@ -1140,7 +1140,7 @@ static int setup_gid_attrs(struct ib_port *port,
int ret;
 
gid_attr_group = kzalloc(struct_size(gid_attr_group, attrs_list,
-attr->gid_tbl_len * 2),
+size_mul(attr->gid_tbl_len, 2)),
 GFP_KERNEL);
if (!gid_attr_group)
return -ENOMEM;
@@ -1205,8 +1205,8 @@ static struct ib_port *setup_port(struct ib_core_device 
*coredev, int port_num,
int ret;
 
p = kvzalloc(struct_size(p, attrs_list,
-   attr->gid_tbl_len + attr->pkey_tbl_len),
-   GFP_KERNEL);
+   size_add(attr->gid_tbl_len, 
attr->pkey_tbl_len)),
+GFP_KERNEL);
if (!p)
return ERR_PTR(-ENOMEM);
p->ibdev = device;
-- 
2.34.1



Re: [PATCH v5] Randomized slab caches for kmalloc()

2023-09-11 Thread Kees Cook
On Mon, Sep 11, 2023 at 11:18:15PM +0200, jvoisin wrote:
> I wrote a small blogpost[1] about this series, and was told[2] that it
> would be interesting to share it on this thread, so here it is, copied
> verbatim:

Thanks for posting!

> Ruiqi Gong and Xiu Jianfeng got their
> [Randomized slab caches for
> kmalloc()](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c6152940584290668b35fa0800026f6a1ae05fe)
> patch series merged upstream, and I've and enough discussions about it to
> warrant summarising them into a small blogpost.
> 
> The main idea is to have multiple slab caches, and pick one at random
> based on
> the address of code calling `kmalloc()` and a per-boot seed, to make
> heap-spraying harder.
> It's a great idea, but comes with some shortcomings for now:
> 
> - Objects being allocated via wrappers around `kmalloc()`, like
> `sock_kmalloc`,
>   `f2fs_kmalloc`, `aligned_kmalloc`, … will end up in the same slab cache.

I'd love to see some way to "unwrap" these kinds of allocators. Right
now we try to manually mark them so the debugging options can figure out
what did the allocation, but it's not complete by any means.

I'd kind of like to see a common front end that specified some set of
"do stuff" routines. e.g. to replace devm_kmalloc(), we could have:

void *alloc(size_t usable, gfp_t flags,
size_t (*prepare)(size_t, gfp_t, void *ctx),
void * (*finish)(size_t, gfp_t, void *ctx, void *allocated)
void * ctx)

ssize_t devm_prep(size_t usable, gfp_t *flags, void *ctx)
{
ssize_t tot_size;

if (unlikely(check_add_overflow(sizeof(struct devres),
size, &tot_size)))
return -ENOMEM;

tot_size = kmalloc_size_roundup(tot_size);
*flags |= __GFP_ZERO;

return tot_size;
}

void *devm_finish(size_t usable, gfp_t flags, void *ctx, void 
*allocated)
{
struct devres *dr = allocated;
struct device *dev = ctx;

INIT_LIST_HEAD(&dr->node.entry);
dr->node.release = devm_kmalloc_release;

set_node_dbginfo(&dr->node, "devm_kzalloc_release", usable);
devres_add(dev, dr->data);
return dr->data;
}

#define devm_kmalloc(dev, size, gfp)\
alloc(size, gfp, devm_prep, devm_finish, dev)

And now there's no wrapper any more, just a routine to get the actual
size, and a routine to set up the memory and return the "usable"
pointer.

> - The slabs needs to be pinned, otherwise an attacker could
> [feng-shui](https://en.wikipedia.org/wiki/Heap_feng_shui) their way
>   into having the whole slab free'ed, garbage-collected, and have a slab for
>   another type allocated at the same VA. [Jann Horn](https://thejh.net/)
> and [Matteo Rizzo](https://infosec.exchange/@nspace) have a [nice
>   set of
> patches](https://github.com/torvalds/linux/compare/master...thejh:linux:slub-virtual-upstream),
>   discussed a bit in [this Project Zero
> blogpost](https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html),
>   for a feature called [`SLAB_VIRTUAL`](
> https://github.com/torvalds/linux/commit/f3afd3a2152353be355b90f5fd4367adbf6a955e),
>   implementing precisely this.

I'm hoping this will get posted to LKML soon.

> - There are 16 slabs by default, so one chance out of 16 to end up in
> the same
>   slab cache as the target.

Future work can make this more deterministic.

> - There are no guard pages between caches, so inter-caches overflows are
>   possible.

This may be addressed by SLAB_VIRTUAL.

> - As pointed by
> [andreyknvl](https://twitter.com/andreyknvl/status/1700267669336080678)
>   and [minipli](https://infosec.exchange/@minipli/111045336853055793),
>   the fewer allocations hitting a given cache means less noise,
>   so it might even help with some heap feng-shui.

That may be true, but I suspect it'll be mitigated by the overall
reduction shared caches.

> - minipli also pointed that "randomized caches still freely
>   mix kernel allocations with user controlled ones (`xattr`, `keyctl`,
> `msg_msg`, …).
>   So even though merging is disabled for these caches, i.e. no direct
> overlap
>   with `cred_jar` etc., other object types can still be targeted (`struct
>   pipe_buffer`, BPF maps, its verifier state objects,…). It’s just a
> matter of
>   probing which allocation index the targeted object falls into.",
>   but I considered this out of scope, since it's much more involved;
>   albeit something like
> [`CONFIG_KMALLOC_SPLIT_VARSIZE`](https://github.com/thejh/linux/blob/slub-virtual/MITIGATION_README)
>   wouldn't significantly increase complexity.

Now that we have a mechanism to easily deal with "many kmalloc buckets",
I think we can easily start carving out specific variab

[PATCH 1/2] remoteproc: imx_dsp_rproc: add mandatory find_loaded_rsc_table op

2023-09-11 Thread Iuliana Prodan (OSS)
From: Iuliana Prodan 

Add the .find_loaded_rsc_table operation for i.MX DSP.
We need it for inter-process communication between DSP
and main core.

This callback is used to find the resource table (defined
in remote processor linker script) where the address of the
vrings along with the other allocated resources (carveouts etc)
are stored.
If this is not found, the vrings are not allocated and
the IPC between cores will not work.

Signed-off-by: Iuliana Prodan 
Reviewed-by: Daniel Baluta 
---
 drivers/remoteproc/imx_dsp_rproc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/remoteproc/imx_dsp_rproc.c 
b/drivers/remoteproc/imx_dsp_rproc.c
index 8fcda9b74545..a1c62d15f16c 100644
--- a/drivers/remoteproc/imx_dsp_rproc.c
+++ b/drivers/remoteproc/imx_dsp_rproc.c
@@ -940,6 +940,7 @@ static const struct rproc_ops imx_dsp_rproc_ops = {
.kick   = imx_dsp_rproc_kick,
.load   = imx_dsp_rproc_elf_load_segments,
.parse_fw   = imx_dsp_rproc_parse_fw,
+   .find_loaded_rsc_table = rproc_elf_find_loaded_rsc_table,
.sanity_check   = rproc_elf_sanity_check,
.get_boot_addr  = rproc_elf_get_boot_addr,
 };
-- 
2.17.1



Re: suspicious RCU usage warning on tracing/urgent

2023-09-11 Thread Steven Rostedt
On Mon, 11 Sep 2023 12:00:53 +0900
Masami Hiramatsu (Google)  wrote:

> But it seems correctly taking srcu_read_lock().
> 
> 452 
> 453 ei = ti->private;
> 454 idx = srcu_read_lock(&eventfs_srcu);
> 455 list_for_each_entry_rcu(ef, &ei->e_top_files, list) {
> 456 create_dentry(ef, dentry, false);
> 457 }
> 458 srcu_read_unlock(&eventfs_srcu, idx);
> 459 return dcache_dir_open(inode, file);
> 460 }
> 461 
> 
> This may false-positive warning, or srcu_read_lock() is not enough for
> list_for_each_entry_rcu(). In latter case, maybe we need to use a
> mutex instead of srcu for update the ef.

Oops, that should be list_for_each_entry_srcu().

Thanks!

> 
> BTW, the ftracetest itself passed without any problem.

Thanks for testing as well!

-- Steve


Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Eric Snowberg


> On Sep 11, 2023, at 5:08 PM, Mimi Zohar  wrote:
> 
> On Mon, 2023-09-11 at 22:17 +, Eric Snowberg wrote:
>> 
>>> On Sep 11, 2023, at 10:51 AM, Mickaël Salaün  wrote:
>>> 
>>> On Mon, Sep 11, 2023 at 09:29:07AM -0400, Mimi Zohar wrote:
 Hi Eric,
 
 On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
> Currently root can dynamically update the blacklist keyring if the hash
> being added is signed and vouched for by the builtin trusted keyring.
> Currently keys in the secondary trusted keyring can not be used.
> 
> Keys within the secondary trusted keyring carry the same capabilities as
> the builtin trusted keyring.  Relax the current restriction for updating
> the .blacklist keyring and allow the secondary to also be referenced as
> a trust source.  Since the machine keyring is linked to the secondary
> trusted keyring, any key within it may also be used.
> 
> An example use case for this is IMA appraisal.  Now that IMA both
> references the blacklist keyring and allows the machine owner to add
> custom IMA CA certs via the machine keyring, this adds the additional
> capability for the machine owner to also do revocations on a running
> system.
> 
> IMA appraisal usage example to add a revocation for /usr/foo:
> 
> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
> 
> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
>  -signer machine-certificate.pem -noattr -binary -outform DER \
>  -out hash.p7s
> 
> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
> 
> Signed-off-by: Eric Snowberg 
 
 The secondary keyring may include both CA and code signing keys.  With
 this change any key loaded onto the secondary keyring may blacklist a
 hash.  Wouldn't it make more sense to limit blacklisting
 certificates/hashes to at least CA keys? 
>>> 
>>> Some operational constraints may limit what a CA can sign.
>> 
>> Agreed.  
>> 
>> Is there precedents for requiring this S/MIME to be signed by a CA? 
>> 
>>> This change is critical and should be tied to a dedicated kernel config
>>> (disabled by default), otherwise existing systems using this feature
>>> will have their threat model automatically changed without notice.
>> 
>> Today we have INTEGRITY_CA_MACHINE_KEYRING_MAX.  This can 
>> be enabled to enforce CA restrictions on the machine keyring.  Mimi, would 
>> this be a suitable solution for what you are after?
> 
> There needs to be some correlation between the file hashes being added
> to the blacklist and the certificate that signed them.  Without that
> correlation, any key on the secondary trusted keyring could add any
> file hashes it wants to the blacklist.

Today any key in the secondary trusted keyring can be used to validate a 
signed kernel module.  At a later time, if a new hash is added to the blacklist 
keyring to revoke loading a signed kernel module,  the ability to do the 
revocation with this additional change would be more restrictive than loading 
the original module.

But, if you think it would be appropriate, I could add a new Kconfig (disabled 
by default) that validates the key being used to vouch the S/MIME encoded 
hash is a CA.  That would certainly make this more complicated.   With this 
addition, would  the key usage field need to be referenced too?

Another idea I had was changing this patch to reference only the builtin and 
the machine keyring (if configured), not the secondary keyring.   Then with
INTEGRITY_CA_MACHINE_KEYRING_MAX, only CA keys could be 
used. Let me know your thoughts on this approach.  Thanks.



Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread Steven Rostedt
On Mon, 11 Sep 2023 20:36:42 +
SeongJae Park  wrote:

> > Then tracing is fully enabled here, and now we enter:
> > 
> > if (trace_damos_before_apply_enabled()) {
> > trace_damos_before_apply(cidx, sidx, tidx, r,
> > damon_nr_regions(t));
> > }
> > 
> > Now the trace event is hit with sidx and tidx zero when they should not be.
> > This could confuse you when looking at the report.  
> 
> Thank you so much for enlightening me with this kind explanation, Steve!  And
> this all make sense.  I will follow your suggestion in the next spin.
> 
> > 
> > What I suggested was to initialize sidx to zero,  
> 
> Nit.  Initialize to not zero but -1, right?

Yeah, but I was also thinking of the reset of it too :-p

sidx = -1;

if (trace_damos_before_apply_enabled()) {
sidx = 0;

-- Steve


> 
> > set it in the first trace_*_enabled() check, and ignore calling the
> > tracepoint if it's not >= 0.
> > 


Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread Steven Rostedt
On Tue, 12 Sep 2023 01:43:08 +
SeongJae Park  wrote:

> Nevertheless, since the variable is unsigned int, I would need to use UINT_MAX
> instead.  To make the code easier to understand, I'd prefer to add a third
> parameter, as you suggested as another option at the original reply, like
> below:

That works too.

-- Steve


[PATCH 0/2] Rpmsg support for i.MX DSP with resource table

2023-09-11 Thread Iuliana Prodan (OSS)
From: Iuliana Prodan 

These patches are needed in order to support rpmsg on DSP when a
resource table is available.

Iuliana Prodan (2):
  remoteproc: imx_dsp_rproc: add mandatory find_loaded_rsc_table op
  arm64: dts: imx8mp: add reserve-memory nodes for DSP

 arch/arm64/boot/dts/freescale/imx8mp.dtsi | 12 
 drivers/remoteproc/imx_dsp_rproc.c|  1 +
 2 files changed, 13 insertions(+)

-- 
2.17.1



Re: [PATCH] Fix typo in tpmrm class definition

2023-09-11 Thread Jarkko Sakkinen
On Fri Sep 8, 2023 at 5:06 PM EEST, Justin M. Forbes wrote:
> Commit d2e8071bed0be ("tpm: make all 'class' structures const")
> unfortunately had a typo for the name on tpmrm.
>
> Fixes: d2e8071bed0b ("tpm: make all 'class' structures const")
> Signed-off-by: Justin M. Forbes 
> ---
>  drivers/char/tpm/tpm-chip.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> index 23f6f2eda84c..42b1062e33cd 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -33,7 +33,7 @@ const struct class tpm_class = {
>   .shutdown_pre = tpm_class_shutdown,
>  };
>  const struct class tpmrm_class = {
> - .name = "tmprm",
> + .name = "tpmrm",
>  };
>  dev_t tpm_devt;
>
> -- 
> 2.41.0

Reviewed-by: Jarkko Sakkinen 

Thanks, I'll queue this up for rc2.

BR, Jarkko


[PATCH] tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()

2023-09-11 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs files list is protected by SRCU. In earlier iterations it was
protected with just RCU, but because it needed to also call sleepable
code, it had to be switch to SRCU. The dcache_dir_open_wrapper()
list_for_each_rcu() was missed and did not get converted over to
list_for_each_srcu(). That needs to be fixed.

Link: 
https://lore.kernel.org/linux-trace-kernel/20230911120053.ca82f545e7f46ea753ded...@kernel.org/

Reported-by: Masami Hiramatsu (Google) 
Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index f168aca45458..9f64e7332796 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -452,7 +452,8 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file)
 
ei = ti->private;
idx = srcu_read_lock(&eventfs_srcu);
-   list_for_each_entry_rcu(ef, &ei->e_top_files, list) {
+   list_for_each_entry_srcu(ef, &ei->e_top_files, list,
+srcu_read_lock_held(&eventfs_srcu)) {
create_dentry(ef, dentry, false);
}
srcu_read_unlock(&eventfs_srcu, idx);
-- 
2.40.1



Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread SeongJae Park
On Mon, 11 Sep 2023 16:51:44 -0400 Steven Rostedt  wrote:

> On Mon, 11 Sep 2023 20:36:42 +
> SeongJae Park  wrote:
> 
> > > Then tracing is fully enabled here, and now we enter:
> > > 
> > >   if (trace_damos_before_apply_enabled()) {
> > >   trace_damos_before_apply(cidx, sidx, tidx, r,
> > >   damon_nr_regions(t));
> > >   }
> > > 
> > > Now the trace event is hit with sidx and tidx zero when they should not 
> > > be.
> > > This could confuse you when looking at the report.  
> > 
> > Thank you so much for enlightening me with this kind explanation, Steve!  
> > And
> > this all make sense.  I will follow your suggestion in the next spin.
> > 
> > > 
> > > What I suggested was to initialize sidx to zero,  
> > 
> > Nit.  Initialize to not zero but -1, right?
> 
> Yeah, but I was also thinking of the reset of it too :-p
> 
>   sidx = -1;
> 
>   if (trace_damos_before_apply_enabled()) {
>   sidx = 0;

Thank you for clarifying, Steve :)

Nevertheless, since the variable is unsigned int, I would need to use UINT_MAX
instead.  To make the code easier to understand, I'd prefer to add a third
parameter, as you suggested as another option at the original reply, like
below:

--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -997,6 +997,7 @@ static void damos_apply_scheme(struct damon_ctx *c, struct 
damon_target *t,
unsigned int sidx = 0;
struct damon_target *titer; /* targets iterator */
unsigned int tidx = 0;
+   bool do_trace = false;

/* get indices for trace_damos_before_apply() */
if (trace_damos_before_apply_enabled()) {
@@ -1010,6 +1011,7 @@ static void damos_apply_scheme(struct damon_ctx *c, 
struct damon_target *t,
break;
tidx++;
}
+   do_trace = true;
}

if (c->ops.apply_scheme) {
@@ -1036,7 +1038,7 @@ static void damos_apply_scheme(struct damon_ctx *c, 
struct damon_target *t,
err = c->callback.before_damos_apply(c, t, r, s);
if (!err) {
trace_damos_before_apply(cidx, sidx, tidx, r,
-   damon_nr_regions(t));
+   damon_nr_regions(t), do_trace);
sz_applied = c->ops.apply_scheme(c, t, r, s);
}
ktime_get_coarse_ts64(&end);


Thanks,
SJ

> 
> -- Steve
> 
> 
> > 
> > > set it in the first trace_*_enabled() check, and ignore calling the
> > > tracepoint if it's not >= 0.
> > > 


Re: [PATCH] x86/tdx: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
On Mon, Sep 11, 2023 at 11:51 AM Dave Hansen  wrote:
>
> On 9/11/23 11:27, Justin Stitt wrote:
> > `strncpy` is deprecated and we should prefer more robust string apis.
>
> I dunno.  It actually seems like a pretty good fit here.
>
> > In this case, `message.str` is not expected to be NUL-terminated as it
> > is simply a buffer of characters residing in a union which allows for
> > named fields representing 8 bytes each. There is only one caller of
> > `tdx_panic()` and they use a 59-length string for `msg`:
> > | const char *msg = "TD misconfiguration: SEPT_VE_DISABLE attribute 
> > must be set.";
>
> I'm not really following this logic.
>
> We need to do the following:
>
> 1. Make sure not to over write past the end of 'message'
> 2. Make sure not to over read past the end of 'msg'
> 3. Make sure not to leak stack data into the hypercall registers
>in the case of short strings.
>
> strncpy() does #1, #2 and #3 just fine.

Right, to be clear, I do not suspect a bug in the current
implementation. Rather, let's move towards a less ambiguous interface
for maintainability's sake

>
> The resulting string does *NOT* need to be NULL-terminated.  See the
> comment:
>
> /* VMM assumes '\0' in byte 65, if the message took all 64 bytes */
>
> Are there cases where strncpy() doesn't NULL-terminate the string other
> than when the buffer is full?
>
> I actually didn't realize that strncpy() pads its output up to the full
> size.  I wonder if Kirill used it intentionally or whether he got lucky
> here. :)

Big reason to use strtomem_pad as it is more obvious about what it does.

I'd love more thoughts/testing here.


Re: [PATCH] integrity: powerpc: Do not select CA_MACHINE_KEYRING

2023-09-11 Thread Jarkko Sakkinen
On Thu Sep 7, 2023 at 7:52 PM EEST, Michal Suchanek wrote:
> No other platform needs CA_MACHINE_KEYRING, either.
>
> This is policy that should be decided by the administrator, not Kconfig

s/administrator/distributor/ ?

> dependencies.
>
> cc: joeyli 
> Signed-off-by: Michal Suchanek 
> ---
>  security/integrity/Kconfig | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/security/integrity/Kconfig b/security/integrity/Kconfig
> index 232191ee09e3..b6e074ac0227 100644
> --- a/security/integrity/Kconfig
> +++ b/security/integrity/Kconfig
> @@ -68,8 +68,6 @@ config INTEGRITY_MACHINE_KEYRING
>   depends on INTEGRITY_ASYMMETRIC_KEYS
>   depends on SYSTEM_BLACKLIST_KEYRING
>   depends on LOAD_UEFI_KEYS || LOAD_PPC_KEYS
> - select INTEGRITY_CA_MACHINE_KEYRING if LOAD_PPC_KEYS
> - select INTEGRITY_CA_MACHINE_KEYRING_MAX if LOAD_PPC_KEYS
>   help
>If set, provide a keyring to which Machine Owner Keys (MOK) may
>be added. This keyring shall contain just MOK keys.  Unlike keys
> -- 
> 2.41.0

I'd suggest to add even fixes tag.

BR, Jarkko


[PATCH] auxdisplay: panel: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
`strncpy` is deprecated and as such we should prefer more robust and
less ambiguous interfaces.

In this case, all of `press_str`, `repeat_str` and `release_str` are
explicitly marked as nonstring:
|   struct {/* valid when type == INPUT_TYPE_KBD */
|   char press_str[sizeof(void *) + sizeof(int)] __nonstring;
|   char repeat_str[sizeof(void *) + sizeof(int)] __nonstring;
|   char release_str[sizeof(void *) + sizeof(int)] __nonstring;
|   } kbd;

... which makes `strtomem_pad` a suitable replacement as it is
functionally the same whilst being more obvious about its behavior.

Link: 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
 [1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 
---
Note: build-tested
---
 drivers/auxdisplay/panel.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/auxdisplay/panel.c b/drivers/auxdisplay/panel.c
index eba04c0de7eb..e20d35bdf5fe 100644
--- a/drivers/auxdisplay/panel.c
+++ b/drivers/auxdisplay/panel.c
@@ -1449,10 +1449,9 @@ static struct logical_input *panel_bind_key(const char 
*name, const char *press,
key->rise_time = 1;
key->fall_time = 1;
 
-   strncpy(key->u.kbd.press_str, press, sizeof(key->u.kbd.press_str));
-   strncpy(key->u.kbd.repeat_str, repeat, sizeof(key->u.kbd.repeat_str));
-   strncpy(key->u.kbd.release_str, release,
-   sizeof(key->u.kbd.release_str));
+   strtomem_pad(key->u.kbd.press_str, press, '\0');
+   strtomem_pad(key->u.kbd.repeat_str, repeat, '\0');
+   strtomem_pad(key->u.kbd.release_str, release, '\0');
list_add(&key->list, &logical_inputs);
return key;
 }

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230911-strncpy-drivers-auxdisplay-panel-c-83bce51f32cb

Best regards,
--
Justin Stitt 



Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Mimi Zohar
On Mon, 2023-09-11 at 22:17 +, Eric Snowberg wrote:
> 
> > On Sep 11, 2023, at 10:51 AM, Mickaël Salaün  wrote:
> > 
> > On Mon, Sep 11, 2023 at 09:29:07AM -0400, Mimi Zohar wrote:
> >> Hi Eric,
> >> 
> >> On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
> >>> Currently root can dynamically update the blacklist keyring if the hash
> >>> being added is signed and vouched for by the builtin trusted keyring.
> >>> Currently keys in the secondary trusted keyring can not be used.
> >>> 
> >>> Keys within the secondary trusted keyring carry the same capabilities as
> >>> the builtin trusted keyring.  Relax the current restriction for updating
> >>> the .blacklist keyring and allow the secondary to also be referenced as
> >>> a trust source.  Since the machine keyring is linked to the secondary
> >>> trusted keyring, any key within it may also be used.
> >>> 
> >>> An example use case for this is IMA appraisal.  Now that IMA both
> >>> references the blacklist keyring and allows the machine owner to add
> >>> custom IMA CA certs via the machine keyring, this adds the additional
> >>> capability for the machine owner to also do revocations on a running
> >>> system.
> >>> 
> >>> IMA appraisal usage example to add a revocation for /usr/foo:
> >>> 
> >>> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
> >>> 
> >>> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
> >>>   -signer machine-certificate.pem -noattr -binary -outform DER \
> >>>   -out hash.p7s
> >>> 
> >>> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
> >>> 
> >>> Signed-off-by: Eric Snowberg 
> >> 
> >> The secondary keyring may include both CA and code signing keys.  With
> >> this change any key loaded onto the secondary keyring may blacklist a
> >> hash.  Wouldn't it make more sense to limit blacklisting
> >> certificates/hashes to at least CA keys? 
> > 
> > Some operational constraints may limit what a CA can sign.
> 
> Agreed.  
> 
> Is there precedents for requiring this S/MIME to be signed by a CA? 
> 
> > This change is critical and should be tied to a dedicated kernel config
> > (disabled by default), otherwise existing systems using this feature
> > will have their threat model automatically changed without notice.
> 
> Today we have INTEGRITY_CA_MACHINE_KEYRING_MAX.  This can 
> be enabled to enforce CA restrictions on the machine keyring.  Mimi, would 
> this be a suitable solution for what you are after?

There needs to be some correlation between the file hashes being added
to the blacklist and the certificate that signed them.  Without that
correlation, any key on the secondary trusted keyring could add any
file hashes it wants to the blacklist.

Mimi

> 
> I suppose root could add another key to the secondary keyring if it was 
> signed by a key in the machine keyring.  But then we are getting into an 
> area of key usage enforcement which really only exists for things added 
> to the .ima keyring.
> 
> >>> ---
> >>> certs/Kconfig | 2 +-
> >>> certs/blacklist.c | 4 ++--
> >>> 2 files changed, 3 insertions(+), 3 deletions(-)
> >>> 
> >>> diff --git a/certs/Kconfig b/certs/Kconfig
> >>> index 1f109b070877..23dc87c52aff 100644
> >>> --- a/certs/Kconfig
> >>> +++ b/certs/Kconfig
> >>> @@ -134,7 +134,7 @@ config SYSTEM_BLACKLIST_AUTH_UPDATE
> >>>   depends on SYSTEM_DATA_VERIFICATION
> >>>   help
> >>> If set, provide the ability to load new blacklist keys at run time if
> >>> -   they are signed and vouched by a certificate from the builtin trusted
> >>> +   they are signed and vouched by a certificate from the secondary 
> >>> trusted
> >> 
> >> If CONFIG_SECONDARY_TRUSTED_KEYRING is not enabled, it falls back to
> >> the builtin keyring.  Please update the comment accordingly.
> 
> I’ll fix these in the next round, thanks.
> 




[PATCH v2][next] RDMA/core: Use size_{add,mul}() in calls to struct_size()

2023-09-11 Thread Gustavo A. R. Silva
Harden calls to struct_size() with size_add() and size_mul().

Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs 
attributes")
Fixes: a4676388e2e2 ("RDMA/core: Simplify how the gid_attrs sysfs is created")
Signed-off-by: Gustavo A. R. Silva 
---
Changes in v2:
 - Update changelog text: remove the part about binary differences (it
   was added by mistake).

 drivers/infiniband/core/sysfs.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index ee59d7391568..ec5efdc16660 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -903,7 +903,7 @@ alloc_hw_stats_device(struct ib_device *ibdev)
 * Two extra attribue elements here, one for the lifespan entry and
 * one to NULL terminate the list for the sysfs core code
 */
-   data = kzalloc(struct_size(data, attrs, stats->num_counters + 1),
+   data = kzalloc(struct_size(data, attrs, size_add(stats->num_counters, 
1)),
   GFP_KERNEL);
if (!data)
goto err_free_stats;
@@ -1009,7 +1009,7 @@ alloc_hw_stats_port(struct ib_port *port, struct 
attribute_group *group)
 * Two extra attribue elements here, one for the lifespan entry and
 * one to NULL terminate the list for the sysfs core code
 */
-   data = kzalloc(struct_size(data, attrs, stats->num_counters + 1),
+   data = kzalloc(struct_size(data, attrs, size_add(stats->num_counters, 
1)),
   GFP_KERNEL);
if (!data)
goto err_free_stats;
@@ -1140,7 +1140,7 @@ static int setup_gid_attrs(struct ib_port *port,
int ret;
 
gid_attr_group = kzalloc(struct_size(gid_attr_group, attrs_list,
-attr->gid_tbl_len * 2),
+size_mul(attr->gid_tbl_len, 2)),
 GFP_KERNEL);
if (!gid_attr_group)
return -ENOMEM;
@@ -1205,8 +1205,8 @@ static struct ib_port *setup_port(struct ib_core_device 
*coredev, int port_num,
int ret;
 
p = kvzalloc(struct_size(p, attrs_list,
-   attr->gid_tbl_len + attr->pkey_tbl_len),
-   GFP_KERNEL);
+   size_add(attr->gid_tbl_len, 
attr->pkey_tbl_len)),
+GFP_KERNEL);
if (!p)
return ERR_PTR(-ENOMEM);
p->ibdev = device;
-- 
2.34.1



[PATCH 2/2] arm64: dts: imx8mp: add reserve-memory nodes for DSP

2023-09-11 Thread Iuliana Prodan (OSS)
From: Iuliana Prodan 

Add the reserve-memory nodes used by DSP when the rpmsg
feature is enabled.
These can be later used in a dsp node, like:
dsp: dsp@3b6e8000 {
compatible = "fsl,imx8mp-dsp";
reg = <0x3b6e8000 0x88000>;
mbox-names = "tx0", "rx0", "rxdb0";
mboxes = <&mu2 2 0>, <&mu2 2 1>,
<&mu2 3 0>, <&mu2 3 1>;
memory-region = <&dsp_vdev0buffer>, <&dsp_vdev0vring0>,
<&dsp_vdev0vring1>, <&dsp_reserved>;
status = "okay";
};

Signed-off-by: Iuliana Prodan 
---
 arch/arm64/boot/dts/freescale/imx8mp.dtsi | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/imx8mp.dtsi 
b/arch/arm64/boot/dts/freescale/imx8mp.dtsi
index cc406bb338fe..eedc1921af62 100644
--- a/arch/arm64/boot/dts/freescale/imx8mp.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mp.dtsi
@@ -210,6 +210,18 @@
dsp_reserved: dsp@9240 {
reg = <0 0x9240 0 0x200>;
no-map;
+   dsp_vdev0vring0: vdev0vring0@942f {
+   reg = <0 0x942f 0 0x8000>;
+   no-map;
+   };
+   dsp_vdev0vring1: vdev0vring1@942f8000 {
+   reg = <0 0x942f8000 0 0x8000>;
+   no-map;
+   };
+   dsp_vdev0buffer: vdev0buffer@9430 {
+   compatible = "shared-dma-pool";
+   reg = <0 0x9430 0 0x10>;
+   no-map;
};
};
 
-- 
2.17.1



Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Eric Snowberg



> On Sep 11, 2023, at 4:04 PM, Jarkko Sakkinen  wrote:
> 
> On Mon Sep 11, 2023 at 4:29 PM EEST, Mimi Zohar wrote:
>> Hi Eric,
>> 
>> On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
>>> Currently root can dynamically update the blacklist keyring if the hash
>>> being added is signed and vouched for by the builtin trusted keyring.
>>> Currently keys in the secondary trusted keyring can not be used.
>>> 
>>> Keys within the secondary trusted keyring carry the same capabilities as
>>> the builtin trusted keyring.  Relax the current restriction for updating
>>> the .blacklist keyring and allow the secondary to also be referenced as
>>> a trust source.  Since the machine keyring is linked to the secondary
>>> trusted keyring, any key within it may also be used.
>>> 
>>> An example use case for this is IMA appraisal.  Now that IMA both
>>> references the blacklist keyring and allows the machine owner to add
>>> custom IMA CA certs via the machine keyring, this adds the additional
>>> capability for the machine owner to also do revocations on a running
>>> system.
>>> 
>>> IMA appraisal usage example to add a revocation for /usr/foo:
>>> 
>>> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
>>> 
>>> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
>>>   -signer machine-certificate.pem -noattr -binary -outform DER \
>>>   -out hash.p7s
>>> 
>>> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
>>> 
>>> Signed-off-by: Eric Snowberg 
>> 
>> The secondary keyring may include both CA and code signing keys.  With
>> this change any key loaded onto the secondary keyring may blacklist a
>> hash.  Wouldn't it make more sense to limit blacklisting
>> certificates/hashes to at least CA keys? 
> 
> I think a bigger issue is that if a kernel is updated with this patch
> it will change the behavior. It is nothing to do whether the "old" or
> "new" is better but more like kind of backwards compatibility issue.

For a kernel built without the secondary trusted keyring defined, there is 
no change to their security posture. For a kernel built with the secondary 
trusted keyring defined,  I would view the system as being more secure 
with this patch.   For any system using the secondary trusted keyring, 
root can add trusted keys.  However without this patch, root can not 
mitigate a security problem on a live system and do any type of revocation
for keys it owns.  Without the ability to do a revocation, we really only have 
authenticity, not integrity.



Re: [PATCH] Fix typo in tpmrm class definition

2023-09-11 Thread Jarkko Sakkinen
On Fri Sep 8, 2023 at 5:06 PM EEST, Justin M. Forbes wrote:
> Commit d2e8071bed0be ("tpm: make all 'class' structures const")
> unfortunately had a typo for the name on tpmrm.
>
> Fixes: d2e8071bed0b ("tpm: make all 'class' structures const")
> Signed-off-by: Justin M. Forbes 
> ---
>  drivers/char/tpm/tpm-chip.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> index 23f6f2eda84c..42b1062e33cd 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -33,7 +33,7 @@ const struct class tpm_class = {
>   .shutdown_pre = tpm_class_shutdown,
>  };
>  const struct class tpmrm_class = {
> - .name = "tmprm",
> + .name = "tpmrm",
>  };
>  dev_t tpm_devt;
>
> -- 
> 2.41.0

I have issues applying the patch:

$ git am -3 20230908_jforbes_fix_typo_in_tpmrm_class_definition.mbx
Applying: Fix typo in tpmrm class definition
error: corrupt patch at line 18
error: could not build fake ancestor
Patch failed at 0001 Fix typo in tpmrm class definition
hint: Use 'git am --show-current-patch=diff' to see the failed patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

$ git log -2
commit ba46245183940de39e42c8456b85ceaf3519b764 (HEAD -> master, origin/master, 
origin/HEAD)
Author: Sumit Garg 
Date:   Tue Aug 22 16:59:33 2023 +0530

KEYS: trusted: tee: Refactor register SHM usage

The OP-TEE driver using the old SMC based ABI permits overlapping shared
buffers, but with the new FF-A based ABI each physical page may only
be registered once.

As the key and blob buffer are allocated adjancently, there is no need
for redundant register shared memory invocation. Also, it is incompatibile
with FF-A based ABI limitation. So refactor register shared memory
implementation to use only single invocation to register both key and blob
buffers.

[jarkko: Added cc to stable.]
Cc: sta...@vger.kernel.org # v5.16+
Fixes: 4615e5a34b95 ("optee: add FF-A support")
Reported-by: Jens Wiklander 
Signed-off-by: Sumit Garg 
Tested-by: Jens Wiklander 
Reviewed-by: Jens Wiklander 
Signed-off-by: Jarkko Sakkinen 

commit 0bb80ecc33a8fb5a682236443c1e740d5c917d1d (tag: v6.6-rc1, 
upstream/master, origin/next, next)
Author: Linus Torvalds 
Date:   Sun Sep 10 16:28:41 2023 -0700

Linux 6.6-rc1

BR, Jarkko


[PATCH] tpm: Fix typo in tpmrm class definition

2023-09-11 Thread Justin M. Forbes
Commit d2e8071bed0be ("tpm: make all 'class' structures const")
unfortunately had a typo for the name on tpmrm.

Fixes: d2e8071bed0b ("tpm: make all 'class' structures const")
Signed-off-by: Justin M. Forbes 
---
 drivers/char/tpm/tpm-chip.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 23f6f2eda84c..42b1062e33cd 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -33,7 +33,7 @@ const struct class tpm_class = {
.shutdown_pre = tpm_class_shutdown,
 };
 const struct class tpmrm_class = {
-   .name = "tmprm",
+   .name = "tpmrm",
 };
 dev_t tpm_devt;

-- 
2.41.0



Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Eric Snowberg


> On Sep 11, 2023, at 10:51 AM, Mickaël Salaün  wrote:
> 
> On Mon, Sep 11, 2023 at 09:29:07AM -0400, Mimi Zohar wrote:
>> Hi Eric,
>> 
>> On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
>>> Currently root can dynamically update the blacklist keyring if the hash
>>> being added is signed and vouched for by the builtin trusted keyring.
>>> Currently keys in the secondary trusted keyring can not be used.
>>> 
>>> Keys within the secondary trusted keyring carry the same capabilities as
>>> the builtin trusted keyring.  Relax the current restriction for updating
>>> the .blacklist keyring and allow the secondary to also be referenced as
>>> a trust source.  Since the machine keyring is linked to the secondary
>>> trusted keyring, any key within it may also be used.
>>> 
>>> An example use case for this is IMA appraisal.  Now that IMA both
>>> references the blacklist keyring and allows the machine owner to add
>>> custom IMA CA certs via the machine keyring, this adds the additional
>>> capability for the machine owner to also do revocations on a running
>>> system.
>>> 
>>> IMA appraisal usage example to add a revocation for /usr/foo:
>>> 
>>> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
>>> 
>>> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
>>>   -signer machine-certificate.pem -noattr -binary -outform DER \
>>>   -out hash.p7s
>>> 
>>> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
>>> 
>>> Signed-off-by: Eric Snowberg 
>> 
>> The secondary keyring may include both CA and code signing keys.  With
>> this change any key loaded onto the secondary keyring may blacklist a
>> hash.  Wouldn't it make more sense to limit blacklisting
>> certificates/hashes to at least CA keys? 
> 
> Some operational constraints may limit what a CA can sign.

Agreed.  

Is there precedents for requiring this S/MIME to be signed by a CA? 

> This change is critical and should be tied to a dedicated kernel config
> (disabled by default), otherwise existing systems using this feature
> will have their threat model automatically changed without notice.

Today we have INTEGRITY_CA_MACHINE_KEYRING_MAX.  This can 
be enabled to enforce CA restrictions on the machine keyring.  Mimi, would 
this be a suitable solution for what you are after?

I suppose root could add another key to the secondary keyring if it was 
signed by a key in the machine keyring.  But then we are getting into an 
area of key usage enforcement which really only exists for things added 
to the .ima keyring. 

>>> ---
>>> certs/Kconfig | 2 +-
>>> certs/blacklist.c | 4 ++--
>>> 2 files changed, 3 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/certs/Kconfig b/certs/Kconfig
>>> index 1f109b070877..23dc87c52aff 100644
>>> --- a/certs/Kconfig
>>> +++ b/certs/Kconfig
>>> @@ -134,7 +134,7 @@ config SYSTEM_BLACKLIST_AUTH_UPDATE
>>> depends on SYSTEM_DATA_VERIFICATION
>>> help
>>>   If set, provide the ability to load new blacklist keys at run time if
>>> - they are signed and vouched by a certificate from the builtin trusted
>>> + they are signed and vouched by a certificate from the secondary 
>>> trusted
>> 
>> If CONFIG_SECONDARY_TRUSTED_KEYRING is not enabled, it falls back to
>> the builtin keyring.  Please update the comment accordingly.

I’ll fix these in the next round, thanks.



[PATCH] tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()

2023-09-11 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs files list is protected by SRCU. In earlier iterations it was
protected with just RCU, but because it needed to also call sleepable
code, it had to be switch to SRCU. The dcache_dir_open_wrapper()
list_for_each_rcu() was missed and did not get converted over to
list_for_each_srcu(). That needs to be fixed.

Link: 
https://lore.kernel.org/linux-trace-kernel/20230911120053.ca82f545e7f46ea753ded...@kernel.org/

Reported-by: Masami Hiramatsu (Google) 
Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
Signed-off-by: Steven Rostedt (Google) 
---

[ Resend to see if it gets through the mailing list backlog! ]

 fs/tracefs/event_inode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index f168aca45458..9f64e7332796 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -452,7 +452,8 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file)
 
ei = ti->private;
idx = srcu_read_lock(&eventfs_srcu);
-   list_for_each_entry_rcu(ef, &ei->e_top_files, list) {
+   list_for_each_entry_srcu(ef, &ei->e_top_files, list,
+srcu_read_lock_held(&eventfs_srcu)) {
create_dentry(ef, dentry, false);
}
srcu_read_unlock(&eventfs_srcu, idx);
-- 
2.40.1



Re: [PATCH] Fix typo in tpmrm class definition

2023-09-11 Thread Justin Forbes
On Mon, Sep 11, 2023 at 5:09 PM Jarkko Sakkinen  wrote:
>
> On Fri Sep 8, 2023 at 5:06 PM EEST, Justin M. Forbes wrote:
> > Commit d2e8071bed0be ("tpm: make all 'class' structures const")
> > unfortunately had a typo for the name on tpmrm.
> >
> > Fixes: d2e8071bed0b ("tpm: make all 'class' structures const")
> > Signed-off-by: Justin M. Forbes 
> > ---
> >  drivers/char/tpm/tpm-chip.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > index 23f6f2eda84c..42b1062e33cd 100644
> > --- a/drivers/char/tpm/tpm-chip.c
> > +++ b/drivers/char/tpm/tpm-chip.c
> > @@ -33,7 +33,7 @@ const struct class tpm_class = {
> >   .shutdown_pre = tpm_class_shutdown,
> >  };
> >  const struct class tpmrm_class = {
> > - .name = "tmprm",
> > + .name = "tpmrm",
> >  };
> >  dev_t tpm_devt;
> >
> > --
> > 2.41.0
>
> I have issues applying the patch:

Sorry, not sure what the issue is, but I did a git am of it myself to
a fresh checkout of linus' tree and just recreated and sent it. So,
new thread, but hopefully the patch will apply

Justin

>
> $ git am -3 20230908_jforbes_fix_typo_in_tpmrm_class_definition.mbx
> Applying: Fix typo in tpmrm class definition
> error: corrupt patch at line 18
> error: could not build fake ancestor
> Patch failed at 0001 Fix typo in tpmrm class definition
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".
>
> $ git log -2
> commit ba46245183940de39e42c8456b85ceaf3519b764 (HEAD -> master, 
> origin/master, origin/HEAD)
> Author: Sumit Garg 
> Date:   Tue Aug 22 16:59:33 2023 +0530
>
> KEYS: trusted: tee: Refactor register SHM usage
>
> The OP-TEE driver using the old SMC based ABI permits overlapping shared
> buffers, but with the new FF-A based ABI each physical page may only
> be registered once.
>
> As the key and blob buffer are allocated adjancently, there is no need
> for redundant register shared memory invocation. Also, it is incompatibile
> with FF-A based ABI limitation. So refactor register shared memory
> implementation to use only single invocation to register both key and blob
> buffers.
>
> [jarkko: Added cc to stable.]
> Cc: sta...@vger.kernel.org # v5.16+
> Fixes: 4615e5a34b95 ("optee: add FF-A support")
> Reported-by: Jens Wiklander 
> Signed-off-by: Sumit Garg 
> Tested-by: Jens Wiklander 
> Reviewed-by: Jens Wiklander 
> Signed-off-by: Jarkko Sakkinen 
>
> commit 0bb80ecc33a8fb5a682236443c1e740d5c917d1d (tag: v6.6-rc1, 
> upstream/master, origin/next, next)
> Author: Linus Torvalds 
> Date:   Sun Sep 10 16:28:41 2023 -0700
>
> Linux 6.6-rc1
>
> BR, Jarkko
>


Re: [PATCH v5] Randomized slab caches for kmalloc()

2023-09-11 Thread jvoisin
I wrote a small blogpost[1] about this series, and was told[2] that it
would be interesting to share it on this thread, so here it is, copied
verbatim:

Ruiqi Gong and Xiu Jianfeng got their
[Randomized slab caches for
kmalloc()](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c6152940584290668b35fa0800026f6a1ae05fe)
patch series merged upstream, and I've and enough discussions about it to
warrant summarising them into a small blogpost.

The main idea is to have multiple slab caches, and pick one at random
based on
the address of code calling `kmalloc()` and a per-boot seed, to make
heap-spraying harder.
It's a great idea, but comes with some shortcomings for now:

- Objects being allocated via wrappers around `kmalloc()`, like
`sock_kmalloc`,
  `f2fs_kmalloc`, `aligned_kmalloc`, … will end up in the same slab cache.
- The slabs needs to be pinned, otherwise an attacker could
[feng-shui](https://en.wikipedia.org/wiki/Heap_feng_shui) their way
  into having the whole slab free'ed, garbage-collected, and have a slab for
  another type allocated at the same VA. [Jann Horn](https://thejh.net/)
and [Matteo Rizzo](https://infosec.exchange/@nspace) have a [nice
  set of
patches](https://github.com/torvalds/linux/compare/master...thejh:linux:slub-virtual-upstream),
  discussed a bit in [this Project Zero
blogpost](https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html),
  for a feature called [`SLAB_VIRTUAL`](
https://github.com/torvalds/linux/commit/f3afd3a2152353be355b90f5fd4367adbf6a955e),
  implementing precisely this.
- There are 16 slabs by default, so one chance out of 16 to end up in
the same
  slab cache as the target.
- There are no guard pages between caches, so inter-caches overflows are
  possible.
- As pointed by
[andreyknvl](https://twitter.com/andreyknvl/status/1700267669336080678)
  and [minipli](https://infosec.exchange/@minipli/111045336853055793),
  the fewer allocations hitting a given cache means less noise,
  so it might even help with some heap feng-shui.
- minipli also pointed that "randomized caches still freely
  mix kernel allocations with user controlled ones (`xattr`, `keyctl`,
`msg_msg`, …).
  So even though merging is disabled for these caches, i.e. no direct
overlap
  with `cred_jar` etc., other object types can still be targeted (`struct
  pipe_buffer`, BPF maps, its verifier state objects,…). It’s just a
matter of
  probing which allocation index the targeted object falls into.",
  but I considered this out of scope, since it's much more involved;
  albeit something like
[`CONFIG_KMALLOC_SPLIT_VARSIZE`](https://github.com/thejh/linux/blob/slub-virtual/MITIGATION_README)
  wouldn't significantly increase complexity.

Also, while code addresses as a source of entropy has historically be a
great
way to provide [KASLR](https://lwn.net/Articles/569635/) bypasses,
`hash_64(caller ^
random_kmalloc_seed, ilog2(RANDOM_KMALLOC_CACHES_NR + 1))` shouldn't
trivially
leak offsets.

The segregation technique is a bit like a weaker version of grsecurity's
[AUTOSLAB](https://grsecurity.net/how_autoslab_changes_the_memory_unsafety_game),
or a weaker kernel-land version of
[PartitionAlloc](https://chromium.googlesource.com/chromium/src/+/master/base/allocator/partition_allocator/PartitionAlloc.md),
but to be fair, making use-after-free exploitation harder, and significantly
harder once pinning lands, with only ~150 lines of code and negligible
performance impact is amazing and should be praised. Moreover, I wouldn't be
surprised if this was backported in [Google's
KernelCTF](https://google.github.io/security-research/kernelctf/rules.html)
soon, so we should see if my analysis is correct.

1.
https://dustri.org/b/some-notes-on-randomized-slab-caches-for-kmalloc.html
2. https://infosec.exchange/@vba...@social.kernel.org/111046740392510260

-- 
Julien (jvoisin) Voisin
GPG: 04D041E8171901CC
dustri.org



Re: [PATCH kmod v5 0/5] kmod /usr support

2023-09-11 Thread Michal Suchánek
On Sat, Aug 19, 2023 at 08:25:52PM +0900, Masahiro Yamada wrote:
> On Fri, Aug 18, 2023 at 12:15 PM Michal Suchánek  wrote:
> >
> > Hello,
> >
> > On Tue, Jul 18, 2023 at 02:01:51PM +0200, Michal Suchanek wrote:
> > > Hello,
> > >
> > > with these patches it is possible to install kernel modules in an 
> > > arbitrary
> > > directory - eg. moving the /lib/modules to /usr/lib/modules or /opt/linux.
> > >
> > > While the modprobe.d and depmod.d search which already includes multiple
> > > paths is expanded to also include $(prefix) the module directory still
> > > supports only one location, only a different one under 
> > > $(module_directory).
> > >
> > > Having kmod search multiple module locations while only one is supported 
> > > now
> > > might break some assumption about relative module path corresponding to a
> > > specific file, would require more invasive changes to implement, and is 
> > > not
> > > supportive of the goal of moving the modules away from /lib.
> > >
> > > Both kmod and the kernel need to be patched to make use of this feature.
> > > Patched kernel is backwards compatible with older kmod.  Patched kmod
> > > with $(module_directory) set to /lib/modules is equivalent to unpatched 
> > > kmod.
> >
> > The patch to kernel to support autodetection of module directory is
> > rejected. However, a workaround like
> >
> > make MODLIB='$(INSTALL_MOD_PATH)/usr/lib/modules/$(KERNELRELEASE)'
> >
> > is suggested.
> >
> > Can you consider inluding the kmod changes?
> 
> 
> Hi.
> 
> I have a question about your original patch
> for the Kbuild change.
> 
> In your patch, Kbuild runs 'kmod config' or
> 'pkg-config --variable=module_directory kmod',
> then sets the returned string to MODLIB.
> 
> 
> If kmod is configured to use /usr/lib/modules,
> /opt/modules, or whatever,
> should we change the installation path of the debug
> vdso accordingly?
> 
> Currently, the debug vdso is always installed
> to /lib/modules/$(KERNELRELEASE)/vdso/.
> 
> However, modules and vdso are unrelated to each other.
> kmod does not care about vdso.
> 
> 
> The following commits started to install debug vdso.
> 
> Commit 8150caad0226 ("[POWERPC] powerpc vDSO: install unstripped
> copies on disk")
> Commit f79eb83b3af4 ("x86: Install unstripped copy of 64bit vdso to disk")
> 
> 
> I do not know why they chose $(MODLIB)/vdso as the install destination.
> 
> 
> I am thinking of split the variable into two:
> MODLIB -  installation path for modules
> VDSOLIB - installation path for debug vdso (not affected by kmod config)
> 
> I think that is the way to do this correctly.

The source and build symlinks that point at the kernel tree are not
modules either yet they are in the module directory as well.

Thanks

Michal


Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread SeongJae Park
Hi Steven,

On Mon, 11 Sep 2023 14:19:55 -0400 Steven Rostedt  wrote:

> On Mon, 11 Sep 2023 04:59:07 +
> SeongJae Park  wrote:
> 
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> > @@ -950,6 +950,28 @@ static void damos_apply_scheme(struct damon_ctx *c, 
> > struct damon_target *t,
> > struct timespec64 begin, end;
> > unsigned long sz_applied = 0;
> > int err = 0;
> > +   /*
> > +* We plan to support multiple context per kdamond, as DAMON sysfs
> > +* implies with 'nr_contexts' file.  Nevertheless, only single context
> > +* per kdamond is supported for now.  So, we can simply use '0' context
> > +* index here.
> > +*/
> > +   unsigned int cidx = 0;
> > +   struct damos *siter;/* schemes iterator */
> > +   unsigned int sidx = 0;
> > +   struct damon_target *titer; /* targets iterator */
> > +   unsigned int tidx = 0;
> > +
> 
> If this loop is only for passing sidx and tidx to the trace point,

You're correct.

> you can add around it:
> 
>   if (trace_damos_before_apply_enabled()) {
> 
> > +   damon_for_each_scheme(siter, c) {
> > +   if (siter == s)
> > +   break;
> > +   sidx++;
> > +   }
> > +   damon_for_each_target(titer, c) {
> > +   if (titer == t)
> > +   break;
> > +   tidx++;
> > +   }
> 
>   }
> 
> 
> And then this loop will only be done if that trace event is enabled.

Today I learned yet another great feature of the tracing framework.  Thank you
Steven, I will add that to the next spin of this patchset!

> 
> To prevent races, you may also want to add a third parameter, or initialize
> them to -1:
> 
>   sidx = -1;
> 
>   if (trace_damo_before_apply_enabled()) {
>   sidx = 0;
>   [..]
>   }
> 
> And you can change the TRACE_EVENT() TO TRACE_EVENT_CONDITION():
> 
> TRACE_EVENT_CONDITION(damos_before_apply,
> 
>   TP_PROTO(...),
> 
>   TP_ARGS(...),
> 
>   TP_CONDITION(sidx >= 0),
> 
> and the trace event will not be called if sidx is less than zero.
> 
> Also, this if statement is only done when the trace event is enabled, so
> it's equivalent to:
> 
>   if (trace_damos_before_apply_enabled()) {
>   if (sdx >= 0)
>   trace_damos_before_apply(cidx, sidx, tidx, r,
>   damon_nr_regions(t));
>   }

Again, thank you very much for letting me know this awesome feature.  However,
sidx is supposed to be always >=0 here, since kdamond is running in single
thread and hence no race is expected.  If it exists, it's a bug.  So, I
wouldn't make this change.  Appreciate again for letting me know this very
useful feature, and please let me know if I'm missing something, though!


Thanks,
SJ

> 
> -- Steve
> 
> 
> 
> >  
> > if (c->ops.apply_scheme) {
> > if (quota->esz && quota->charged_sz + sz > quota->esz) {
> > @@ -964,8 +986,11 @@ static void damos_apply_scheme(struct damon_ctx *c, 
> > struct damon_target *t,
> > ktime_get_coarse_ts64(&begin);
> > if (c->callback.before_damos_apply)
> > err = c->callback.before_damos_apply(c, t, r, s);
> > -   if (!err)
> > +   if (!err) {
> > +   trace_damos_before_apply(cidx, sidx, tidx, r,
> > +   damon_nr_regions(t));
> > sz_applied = c->ops.apply_scheme(c, t, r, s);
> > +   }
> > ktime_get_coarse_ts64(&end);
> > quota->total_charged_ns += timespec64_to_ns(&end) -
> > timespec64_to_ns(&begin);
> > -- 
> 


[PATCH] kbuild: rpm-pkg: Fix build with non-default MODLIB

2023-09-11 Thread Michal Suchanek
The default MODLIB value is composed of two variables and the hardcoded
string '/lib/modules/'.

MODLIB = $(INSTALL_MOD_PATH)/lib/modules/$(KERNELRELEASE)

Defining this middle part as a variable was rejected on the basis that
users can pass the whole MODLIB to make, such as

make 'MODLIB=$(INSTALL_MOD_PATH)/usr/lib/modules/$(KERNELRELEASE)'

However, this middle part of MODLIB is independently hardcoded by
rpm-pkg, and when the user alters MODLIB this is not reflected when
building the package.

Given that $(INSTALL_MOD_PATH) is overridden during the rpm package build
it is likely going to be empty. Then MODLIB can be passed to the rpm
package, and used in place of the whole
/usr/lib/modules/$(KERNELRELEASE) part.

Signed-off-by: Michal Suchanek 
---
 scripts/package/kernel.spec | 14 +++---
 scripts/package/mkspec  |  1 +
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/scripts/package/kernel.spec b/scripts/package/kernel.spec
index ac3f2ee6d7a0..1a727f636f67 100644
--- a/scripts/package/kernel.spec
+++ b/scripts/package/kernel.spec
@@ -67,8 +67,8 @@ cp $(%{make} %{makeflags} -s image_name) 
%{buildroot}/boot/vmlinuz-%{KERNELRELEA
 %{make} %{makeflags} INSTALL_HDR_PATH=%{buildroot}/usr headers_install
 cp System.map %{buildroot}/boot/System.map-%{KERNELRELEASE}
 cp .config %{buildroot}/boot/config-%{KERNELRELEASE}
-ln -fns /usr/src/kernels/%{KERNELRELEASE} 
%{buildroot}/lib/modules/%{KERNELRELEASE}/build
-ln -fns /usr/src/kernels/%{KERNELRELEASE} 
%{buildroot}/lib/modules/%{KERNELRELEASE}/source
+ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}%{MODLIB}/build
+ln -fns /usr/src/kernels/%{KERNELRELEASE} %{buildroot}%{MODLIB}/source
 %if %{with_devel}
 %{make} %{makeflags} run-command 
KBUILD_RUN_COMMAND='${srctree}/scripts/package/install-extmod-build 
%{buildroot}/usr/src/kernels/%{KERNELRELEASE}'
 %endif
@@ -99,9 +99,9 @@ fi
 
 %files
 %defattr (-, root, root)
-/lib/modules/%{KERNELRELEASE}
-%exclude /lib/modules/%{KERNELRELEASE}/build
-%exclude /lib/modules/%{KERNELRELEASE}/source
+%{MODLIB}
+%exclude %{MODLIB}/build
+%exclude %{MODLIB}/source
 /boot/*
 
 %files headers
@@ -112,6 +112,6 @@ fi
 %files devel
 %defattr (-, root, root)
 /usr/src/kernels/%{KERNELRELEASE}
-/lib/modules/%{KERNELRELEASE}/build
-/lib/modules/%{KERNELRELEASE}/source
+%{MODLIB}/build
+%{MODLIB}/source
 %endif
diff --git a/scripts/package/mkspec b/scripts/package/mkspec
index d41608efb747..d41b2e5304ac 100755
--- a/scripts/package/mkspec
+++ b/scripts/package/mkspec
@@ -18,6 +18,7 @@ fi
 cat<

Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Mimi Zohar
Hi Eric,

On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
> Currently root can dynamically update the blacklist keyring if the hash
> being added is signed and vouched for by the builtin trusted keyring.
> Currently keys in the secondary trusted keyring can not be used.
> 
> Keys within the secondary trusted keyring carry the same capabilities as
> the builtin trusted keyring.  Relax the current restriction for updating
> the .blacklist keyring and allow the secondary to also be referenced as
> a trust source.  Since the machine keyring is linked to the secondary
> trusted keyring, any key within it may also be used.
> 
> An example use case for this is IMA appraisal.  Now that IMA both
> references the blacklist keyring and allows the machine owner to add
> custom IMA CA certs via the machine keyring, this adds the additional
> capability for the machine owner to also do revocations on a running
> system.
> 
> IMA appraisal usage example to add a revocation for /usr/foo:
> 
> sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
> 
> openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
>-signer machine-certificate.pem -noattr -binary -outform DER \
>-out hash.p7s
> 
> keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
> 
> Signed-off-by: Eric Snowberg 

The secondary keyring may include both CA and code signing keys.  With
this change any key loaded onto the secondary keyring may blacklist a
hash.  Wouldn't it make more sense to limit blacklisting
certificates/hashes to at least CA keys? 

> ---
>  certs/Kconfig | 2 +-
>  certs/blacklist.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/certs/Kconfig b/certs/Kconfig
> index 1f109b070877..23dc87c52aff 100644
> --- a/certs/Kconfig
> +++ b/certs/Kconfig
> @@ -134,7 +134,7 @@ config SYSTEM_BLACKLIST_AUTH_UPDATE
>   depends on SYSTEM_DATA_VERIFICATION
>   help
> If set, provide the ability to load new blacklist keys at run time if
> -   they are signed and vouched by a certificate from the builtin trusted
> +   they are signed and vouched by a certificate from the secondary 
> trusted

If CONFIG_SECONDARY_TRUSTED_KEYRING is not enabled, it falls back to
the builtin keyring.  Please update the comment accordingly.

> keyring.  The PKCS#7 signature of the description is set in the key
> payload.  Blacklist keys cannot be removed.
>  
> diff --git a/certs/blacklist.c b/certs/blacklist.c
> index 675dd7a8f07a..0b346048ae2d 100644
> --- a/certs/blacklist.c
> +++ b/certs/blacklist.c
> @@ -102,12 +102,12 @@ static int blacklist_key_instantiate(struct key *key,
>  
>  #ifdef CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE
>   /*
> -  * Verifies the description's PKCS#7 signature against the builtin
> +  * Verifies the description's PKCS#7 signature against the secondary
>* trusted keyring.
>*/

And similarly here ...

>   err = verify_pkcs7_signature(key->description,
>   strlen(key->description), prep->data, prep->datalen,
> - NULL, VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
> + VERIFY_USE_SECONDARY_KEYRING, 
> VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
>   if (err)
>   return err;
>  #else

-- 
thanks,

Mimi



Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook

2023-09-11 Thread Google
On Mon, 11 Sep 2023 09:55:09 +0200
Sven Schnelle  wrote:

> Masami Hiramatsu (Google)  writes:
> 
> >> > IOW, it is ftrace save regs/restore regs code issue. I need to check how 
> >> > the
> >> > function_graph implements it.
> >> 
> >> gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
> >> regardless of the FTRACE_WITH_REGS flags. The only difference is that
> >> without the FTRACE_WITH_REGS flag the program status word (psw) is not
> >> saved because collecting that is a rather expensive operation.
> >
> > Thanks for checking that! So s390 will recover those saved registers
> > even if FTRACE_WITH_REGS flag is not set? (I wonder what is the requirement
> > of the ftrace_regs when returning from ftrace_call() without
> > FTRACE_WITH_REGS?)
> 
> Yes, it will recover these in all cases.

Thanks for the confirmation!

> 
> >> 
> >> I used the following commands to test rethook (is that the correct
> >> testcase?)
> >> 
> >> #!/bin/bash
> >> cd /sys/kernel/tracing
> >> 
> >> echo 'r:icmp_rcv icmp_rcv' >kprobe_events
> >> echo 1 >events/kprobes/icmp_rcv/enable
> >> ping -c 1 127.0.0.1
> >> cat trace
> >
> > No, the kprobe will path pt_regs to rethook.
> > Cna you run
> >
> > echo "f:icmp_rcv%return icmp_rcv" >> dynamic_events
> 
> Ah, ok. Seems to work as well:
> 
>   ping-481 [001] ..s2.53.918480: icmp_rcv: 
> (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)
>   ping-481 [001] ..s2.53.918491: icmp_rcv: 
> (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)

Nice!
OK, then s390 is safe to use ftrace_regs :) 

Thanks!


-- 
Masami Hiramatsu (Google) 


Re: [PATCH v3] tpm: Enable hwrng only for Pluton on AMD CPUs

2023-09-11 Thread Jarkko Sakkinen
On Tue Sep 5, 2023 at 3:01 PM EEST, Thorsten Leemhuis wrote:
> On 05.09.23 00:32, Jarkko Sakkinen wrote:
> > On Fri Sep 1, 2023 at 11:49 AM EEST, Thorsten Leemhuis wrote:
> >> On 29.08.23 10:38, Linux regression tracking (Thorsten Leemhuis) wrote:
> >>> On 28.08.23 02:35, Mario Limonciello wrote:
>  On 8/27/2023 13:12, Jarkko Sakkinen wrote:
> > On Wed Aug 23, 2023 at 9:58 PM EEST, Mario Limonciello wrote:
> >> On 8/23/2023 12:40, Jarkko Sakkinen wrote:
> >>> On Wed Aug 23, 2023 at 11:23 AM EEST, Paul Menzel wrote:
>  Am 23.08.23 um 01:15 schrieb Jarkko Sakkinen:
> > The vendor check introduced by commit 554b841d4703 ("tpm: Disable
> > RNG for
> > all AMD fTPMs") doesn't work properly on a number of Intel fTPMs. 
> > On the
> > reported systems the TPM doesn't reply at bootup and returns back 
> > the
> > command code. This makes the TPM fail probe.
> > [...]
> >> Hmmm. Quite a bit progress to fix the issue was made in the first week
> >> after Todd's report; Jarkko apparently even applied the earlier patch
> >> from Mario to his master branch:
> >> https://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd.git/commit/?id=b1a62d41bdc1d15b0641759717e8c3651f0a810c
> >> But since then (aka in the past week) there was not much progress.
>
> Jarkko, many thx for picking this up and submitting it to Linus, much
> appreciated. Sorry again for prodding things, but I felt I had to. Hope
> you didn't mind too much.
>
> > Could it be possible to extend the actual kernel documentation
> > to give at least some guidelines how a maintainer should deal
> > with the bugzilla?
>
> I guess it's best if that is done by somebody that cares about bugzilla
> (I don't fall into that group[1]) and knows the official status.
>
> But FWIW, I wonder what you actually want to see documented. From
> https://lore.kernel.org/all/CVAC8VQPD3PK.1CBS5QTWDSS2C@suppilovahvero/
> it sounds like you had trouble with Link:/Closes: tag and Reported-by.
> From what I can see I don't think bugzilla.kernel.org needs special
> documentation in that area:
>
>  * just use Link:/Closes: to reports to public reports that might be
> helpful later in case somebody wants to look at the backstory of a
> commit, wherever those reports may be (lore, bugzilla.kernel.org,
> https://gitlab.freedesktop.org/drm/intel/-/issues,
> https://github.com/thesofproject/linux/issues, ...)
>
>  * use Reported-by: to give credit to anyone that deserves it, as it is
> a nice way to say thx while motivate people to help again in the future.
> That usually will include the initial reporter, but might also include
> people that replied to a report from somebody else and helped
> perceptible with debugging or fixing.

I *kind of* agree with this but checkpatch.pl disagrees with this :-/

And it is an actual issue that bugzilla is hosted in kernel.org domain,
and at the same time it is undocumented process.

AFAIK anything that is not part of the process can be ignored in the
process so *theoretically* anything put to kernel bugzilla ca be
ignored. This is how it is with e.g. patchwork - some people use it,
some people don't.

Personally I think bugzilla, being user approachable system, should
be better defined but *theoretically*, at least by the process, it
can be fully ignored.

This is where the main source of confusion inherits from, when you
put your maintainer hat on.

> Ciao, Thorsten
>
> [1] I only sometimes help people that report regressions to
> bugzilla.kernel.org that otherwise would likely would fall through the
> cracks (among others because many reports are never forwarded to the
> proper developers otherwise).

BR, Jarkko


Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread Steven Rostedt
On Mon, 11 Sep 2023 04:59:07 +
SeongJae Park  wrote:

> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -950,6 +950,28 @@ static void damos_apply_scheme(struct damon_ctx *c, 
> struct damon_target *t,
>   struct timespec64 begin, end;
>   unsigned long sz_applied = 0;
>   int err = 0;
> + /*
> +  * We plan to support multiple context per kdamond, as DAMON sysfs
> +  * implies with 'nr_contexts' file.  Nevertheless, only single context
> +  * per kdamond is supported for now.  So, we can simply use '0' context
> +  * index here.
> +  */
> + unsigned int cidx = 0;
> + struct damos *siter;/* schemes iterator */
> + unsigned int sidx = 0;
> + struct damon_target *titer; /* targets iterator */
> + unsigned int tidx = 0;
> +

If this loop is only for passing sidx and tidx to the trace point, you can
add around it:

if (trace_damos_before_apply_enabled()) {

> + damon_for_each_scheme(siter, c) {
> + if (siter == s)
> + break;
> + sidx++;
> + }
> + damon_for_each_target(titer, c) {
> + if (titer == t)
> + break;
> + tidx++;
> + }

}


And then this loop will only be done if that trace event is enabled.

To prevent races, you may also want to add a third parameter, or initialize
them to -1:

sidx = -1;

if (trace_damo_before_apply_enabled()) {
sidx = 0;
[..]
}

And you can change the TRACE_EVENT() TO TRACE_EVENT_CONDITION():

TRACE_EVENT_CONDITION(damos_before_apply,

TP_PROTO(...),

TP_ARGS(...),

TP_CONDITION(sidx >= 0),

and the trace event will not be called if sidx is less than zero.

Also, this if statement is only done when the trace event is enabled, so
it's equivalent to:

if (trace_damos_before_apply_enabled()) {
if (sdx >= 0)
trace_damos_before_apply(cidx, sidx, tidx, r,
damon_nr_regions(t));
}

-- Steve



>  
>   if (c->ops.apply_scheme) {
>   if (quota->esz && quota->charged_sz + sz > quota->esz) {
> @@ -964,8 +986,11 @@ static void damos_apply_scheme(struct damon_ctx *c, 
> struct damon_target *t,
>   ktime_get_coarse_ts64(&begin);
>   if (c->callback.before_damos_apply)
>   err = c->callback.before_damos_apply(c, t, r, s);
> - if (!err)
> + if (!err) {
> + trace_damos_before_apply(cidx, sidx, tidx, r,
> + damon_nr_regions(t));
>   sz_applied = c->ops.apply_scheme(c, t, r, s);
> + }
>   ktime_get_coarse_ts64(&end);
>   quota->total_charged_ns += timespec64_to_ns(&end) -
>   timespec64_to_ns(&begin);
> -- 


Re: [PATCH v3] scripts/link-vmlinux.sh: Add alias to duplicate symbols for kallsyms

2023-09-11 Thread Alexander Lobakin
From: Alessandro Carminati (Red Hat) 
Date: Mon, 28 Aug 2023 08:04:23 +

> From: Alessandro Carminati 
> 
> It is not uncommon for drivers or modules related to similar peripherals
> to have symbols with the exact same name.

[...]

> Changes from v2:
> - Alias tags are created by querying DWARF information from the vmlinux.
> - The filename + line number is normalized and appended to the original name.
> - The tag begins with '@' to indicate the symbol source.
> - Not a change, but worth mentioning, since the alias is added to the existing
>   list, the old duplicated name is preserved, and the livepatch way of dealing
>   with duplicates is maintained.
> - Acknowledging the existence of scenarios where inlined functions declared in
>   header files may result in multiple copies due to compiler behavior, though
>it is not actionable as it does not pose an operational issue.
> - Highlighting a single exception where the same name refers to different
>   functions: the case of "compat_binfmt_elf.c," which directly includes
>   "binfmt_elf.c" producing identical function copies in two separate
>   modules.

Oh, I thought you managed to handle this in v3 since you didn't reply in
the previous thread...

> 
> sample from new v3
> 
>  ~ # cat /proc/kallsyms | grep gic_mask_irq
>  d0b03c04dae4 t gic_mask_irq
>  d0b03c04dae4 t gic_mask_irq@_drivers_irqchip_irq-gic_c_167
>  d0b03c050960 t gic_mask_irq
>  d0b03c050960 t gic_mask_irq@_drivers_irqchip_irq-gic-v3_c_404

BTW, why normalize them? Why not just

gic_mask_irq@drivers/irqchip/...

And why line number? Line numbers break reproducible builds and also
would make it harder to refer to a particular symbol by its path and
name since we also have to pass its line number which may change once
you add a debug print there, for example.
OTOH there can't be 2 symbols with the same name within one file, so
just path + name would be enough. Or not?

(sorry if some of this was already discussed previously)

[...]

Thanks,
Olek


Re: Slow boot and shutdown/reboot problems with 6.5.0+

2023-09-11 Thread Thorsten Leemhuis
On 11.09.23 16:00, Bagas Sanjaya wrote:
> On 07/09/2023 20:56, Marcus Seyfarth wrote:
>> As to bisecting: Unfortunately I cannot afford the time right now to bisect 
>> this further as the system is used in production and already did invest a 
>> lot of time without success into it. Hopefully someone else can find the 
>> root cause of the problem. My systemd version is: 254.1, and I also use dbus 
>> 1.14.10 and dbus-broker 33.r35.g2220a84 which was configured with -D 
>> linux-4-17=true.
>>
> 
> [also Cc: Thorsten]
> 
> To Thorsten: It looks like the reporter can neither bisect nor
> test the mainline since he only has production machine instead.
> What can I (and the reporter do) in this case?

Well, unless someone[1] can find the root cause (either by guessing,
looking at the code-flow, the kernel warning, bisection, and others
things like that) we are out of luck; the only thing left is then to
hope that someone else will run into this and find the cause or that it
is accidentally fixed. That's definitely not ideal, but that's how it is
sometimes.

Ciao, Thorsten

[1] either the reporter or someone that sees this conversation or the
ticket; but things are quite confusing from the outside perspective and
a lot of private conversation seem to happen, so I doubt anyone will
take a closer look; a bisection is likely the best way forward here


Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread SeongJae Park
On Mon, 11 Sep 2023 16:31:27 -0400 Steven Rostedt  wrote:

> On Mon, 11 Sep 2023 19:05:04 +
> SeongJae Park  wrote:
> 
> > > Also, this if statement is only done when the trace event is enabled, so
> > > it's equivalent to:
> > > 
> > >   if (trace_damos_before_apply_enabled()) {
> > >   if (sdx >= 0)
> > >   trace_damos_before_apply(cidx, sidx, tidx, r,
> > >   damon_nr_regions(t));
> > >   }  
> > 
> > Again, thank you very much for letting me know this awesome feature.  
> > However,
> > sidx is supposed to be always >=0 here, since kdamond is running in single
> > thread and hence no race is expected.  If it exists, it's a bug.  So, I
> > wouldn't make this change.  Appreciate again for letting me know this very
> > useful feature, and please let me know if I'm missing something, though!
> 
> The race isn't with your code, but the enabling of tracing.
> 
> Let's say you enable tracing just ass it passed the first:
> 
>   if (trace_damos_before_apply_enabled()) {
>
>   damon_for_each_scheme(siter, c) {
>   if (siter == s)
>   break;
>   sidx++;
>   }
>   damon_for_each_target(titer, c) {
>   if (titer == t)
>   break;
>   tidx++;
>   }  
> 
> Now, sidx and tidx is zero (when they were not computed, thus, they
> shouldn't be zero.
> 
> Then tracing is fully enabled here, and now we enter:
> 
>   if (trace_damos_before_apply_enabled()) {
>   trace_damos_before_apply(cidx, sidx, tidx, r,
>   damon_nr_regions(t));
>   }
> 
> Now the trace event is hit with sidx and tidx zero when they should not be.
> This could confuse you when looking at the report.

Thank you so much for enlightening me with this kind explanation, Steve!  And
this all make sense.  I will follow your suggestion in the next spin.

> 
> What I suggested was to initialize sidx to zero,

Nit.  Initialize to not zero but -1, right?

> set it in the first trace_*_enabled() check, and ignore calling the
> tracepoint if it's not >= 0.
> 
> -- Steve
> 


Thanks,
SJ


Re: [PATCH RFC 00/37] Add support for arm64 MTE dynamic tag storage reuse

2023-09-11 Thread Catalin Marinas
On Wed, Sep 06, 2023 at 12:23:21PM +0100, Alexandru Elisei wrote:
> On Thu, Aug 24, 2023 at 04:24:30PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 24, 2023 at 01:25:41PM +0200, David Hildenbrand wrote:
> > > On 24.08.23 13:06, David Hildenbrand wrote:
> > > > Regarding one complication: "The kernel needs to know where to allocate
> > > > a PROT_MTE page from or migrate a current page if it becomes PROT_MTE
> > > > (mprotect()) and the range it is in does not support tagging.",
> > > > simplified handling would be if it's in a MIGRATE_CMA pageblock, it
> > > > doesn't support tagging. You have to migrate to a !CMA page (for
> > > > example, not specifying GFP_MOVABLE as a quick way to achieve that).
> > > 
> > > Okay, I now realize that this patch set effectively duplicates some CMA
> > > behavior using a new migrate-type.
[...]
> I considered mixing the tag storage memory memory with normal memory and
> adding it to MIGRATE_CMA. But since tag storage memory cannot be tagged,
> this means that it's not enough anymore to have a __GFP_MOVABLE allocation
> request to use MIGRATE_CMA.
> 
> I considered two solutions to this problem:
> 
> 1. Only allocate from MIGRATE_CMA is the requested memory is not tagged =>
> this effectively means transforming all memory from MIGRATE_CMA into the
> MIGRATE_METADATA migratetype that the series introduces. Not very
> appealing, because that means treating normal memory that is also on the
> MIGRATE_CMA lists as tagged memory.

That's indeed not ideal. We could try this if it makes the patches
significantly simpler, though I'm not so sure.

Allocating metadata is the easier part as we know the correspondence
from the tagged pages (32 PROT_MTE page) to the metadata page (1 tag
storage page), so alloc_contig_range() does this for us. Just adding it
to the CMA range is sufficient.

However, making sure that we don't allocate PROT_MTE pages from the
metadata range is what led us to another migrate type. I guess we could
achieve something similar with a new zone or a CPU-less NUMA node,
though the latter is not guaranteed not to allocate memory from the
range, only make it less likely. Both these options are less flexible in
terms of size/alignment/placement.

Maybe as a quick hack - only allow PROT_MTE from ZONE_NORMAL and
configure the metadata range in ZONE_MOVABLE but at some point I'd
expect some CXL-attached memory to support MTE with additional carveout
reserved.

To recap, in this series, a PROT_MTE page allocation starts with a
typical allocation from anywhere other than MIGRATE_METADATA followed by
the hooks to reserve the corresponding metadata range at (pfn * 128 +
offset) for a 4K page. The whole metadata page is reserved, so the
adjacent 31 pages around the original allocation can also be mapped as
PROT_MTE.

(Peter and Evgenii @ Google had a slightly different approach in their
prototype: separate free_area[] array for PROT_MTE pages; while it has
some advantages, I found it more intrusive since the same page can be on
a free_area/free_list or another)

> 2. Keep track of which pages are tag storage at page granularity (either by
> a page flag, or by checking that the pfn falls in one of the tag storage
> region, or by some other mechanism). When the page allocator takes free
> pages from the MIGRATE_METADATA list to satisfy an allocation, compare the
> gfp mask with the page type, and if the allocation is tagged and the page
> is a tag storage page, put it back at the tail of the free list and choose
> the next page. Repeat until the page allocator finds a normal memory page
> that can be tagged (some refinements obviously needed to need to avoid
> infinite loops).

With large enough CMA areas, there's a real risk of latency spikes, RCU
stalls etc. Not really keen on such heuristics.

-- 
Catalin


Re: [PATCH v3] tpm: Enable hwrng only for Pluton on AMD CPUs

2023-09-11 Thread Jarkko Sakkinen
On Mon Sep 11, 2023 at 1:40 PM EEST, Jarkko Sakkinen wrote:
> Personally I think bugzilla, being user approachable system, should
> be better defined but *theoretically*, at least by the process, it
> can be fully ignored.

I.e. I don't think it should be ignored :-) 

BR, Jarkko


Re: [PATCH] KEYS: trusted: tee: Refactor register SHM usage

2023-09-11 Thread Jarkko Sakkinen
On Tue Sep 5, 2023 at 2:04 PM EEST, Sumit Garg wrote:
> Hi Jarkko,
>
> On Wed, 23 Aug 2023 at 19:58, Jens Wiklander  
> wrote:
> >
> > On Wed, Aug 23, 2023 at 3:04 PM Sumit Garg  wrote:
> > >
> > > On Wed, 23 Aug 2023 at 13:32, Jens Wiklander  
> > > wrote:
> > > >
> > > > On Wed, Aug 23, 2023 at 8:55 AM Sumit Garg  
> > > > wrote:
> > > > >
> > > > > On Tue, 22 Aug 2023 at 18:25, Jens Wiklander 
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Aug 22, 2023 at 04:59:33PM +0530, Sumit Garg wrote:
> > > > > > > The OP-TEE driver using the old SMC based ABI permits overlapping 
> > > > > > > shared
> > > > > > > buffers, but with the new FF-A based ABI each physical page may 
> > > > > > > only
> > > > > > > be registered once.
> > > > > > >
> > > > > > > As the key and blob buffer are allocated adjancently, there is no 
> > > > > > > need
> > > > > > > for redundant register shared memory invocation. Also, it is 
> > > > > > > incompatibile
> > > > > > > with FF-A based ABI limitation. So refactor register shared memory
> > > > > > > implementation to use only single invocation to register both key 
> > > > > > > and blob
> > > > > > > buffers.
> > > > > > >
> > > > > > > Fixes: 4615e5a34b95 ("optee: add FF-A support")
> > > > > > > Reported-by: Jens Wiklander 
> > > > > > > Signed-off-by: Sumit Garg 
> > > > > > > ---
> > > > > > >  security/keys/trusted-keys/trusted_tee.c | 64 
> > > > > > > 
> > > > > > >  1 file changed, 20 insertions(+), 44 deletions(-)
> > > > > > >
> > > > > > > diff --git a/security/keys/trusted-keys/trusted_tee.c 
> > > > > > > b/security/keys/trusted-keys/trusted_tee.c
> > > > > > > index ac3e270ade69..aa3d477de6db 100644
> > > > > > > --- a/security/keys/trusted-keys/trusted_tee.c
> > > > > > > +++ b/security/keys/trusted-keys/trusted_tee.c
> > > > > > > @@ -65,24 +65,16 @@ static int trusted_tee_seal(struct 
> > > > > > > trusted_key_payload *p, char *datablob)
> > > > > > >   int ret;
> > > > > > >   struct tee_ioctl_invoke_arg inv_arg;
> > > > > > >   struct tee_param param[4];
> > > > > > > - struct tee_shm *reg_shm_in = NULL, *reg_shm_out = NULL;
> > > > > > > + struct tee_shm *reg_shm = NULL;
> > > > > > >
> > > > > > >   memset(&inv_arg, 0, sizeof(inv_arg));
> > > > > > >   memset(¶m, 0, sizeof(param));
> > > > > > >
> > > > > > > - reg_shm_in = tee_shm_register_kernel_buf(pvt_data.ctx, 
> > > > > > > p->key,
> > > > > > > -  p->key_len);
> > > > > > > - if (IS_ERR(reg_shm_in)) {
> > > > > > > - dev_err(pvt_data.dev, "key shm register failed\n");
> > > > > > > - return PTR_ERR(reg_shm_in);
> > > > > > > - }
> > > > > > > -
> > > > > > > - reg_shm_out = tee_shm_register_kernel_buf(pvt_data.ctx, 
> > > > > > > p->blob,
> > > > > > > -   sizeof(p->blob));
> > > > > > > - if (IS_ERR(reg_shm_out)) {
> > > > > > > - dev_err(pvt_data.dev, "blob shm register failed\n");
> > > > > > > - ret = PTR_ERR(reg_shm_out);
> > > > > > > - goto out;
> > > > > > > + reg_shm = tee_shm_register_kernel_buf(pvt_data.ctx, p->key,
> > > > > > > +   sizeof(p->key) + 
> > > > > > > sizeof(p->blob));
> > > > > >
> > > > > > This is somewhat fragile. What if struct trusted_key_payload has a 
> > > > > > small
> > > > > > unexpected change in layout?
> > > > >
> > > > > key and blob buffers are just two adjacent fixed sized byte arrays. So
> > > > > I am not worried here as long as they stay adjacent (which has been
> > > > > the case since trusted keys were introduced in the kernel).
> > > >
> > > > Yeah, that was my point, but fine if you don't believe it's an issue.
> > > >
> > >
> > > Does it resolve the issue with FFA ABI for you? It would be good to
> > > have your Tested-by tag.
> >
> > It does:
> > Tested-by: Jens Wiklander 
> > Reviewed-by: Jens Wiklander 
> >
>
> Can you help pick up this fix for v6.6 kernel release?

I pushed it and also added the missing stable tag:

commit 1037d6ec29cdfaaec5277c194b0278eb0a30c3f8 (HEAD -> master, origin/master, 
origin/HEAD)
Author: Sumit Garg 
Date:   Tue Aug 22 16:59:33 2023 +0530

KEYS: trusted: tee: Refactor register SHM usage

The OP-TEE driver using the old SMC based ABI permits overlapping shared
buffers, but with the new FF-A based ABI each physical page may only
be registered once.

As the key and blob buffer are allocated adjancently, there is no need
for redundant register shared memory invocation. Also, it is incompatibile
with FF-A based ABI limitation. So refactor register shared memory
implementation to use only single invocation to register both key and blob
buffers.

[jarkko: Added cc to stable.]
Cc: sta...@vger.kernel.org # v5.16+
Fixes: 4615e5a34b95 ("optee: add FF-A support")
Reported-by: Jens Wiklander 
Signed-off-by: Sumi

Re: Slow boot and shutdown/reboot problems with 6.5.0+

2023-09-11 Thread Bagas Sanjaya
On 07/09/2023 20:56, Marcus Seyfarth wrote:
> As to bisecting: Unfortunately I cannot afford the time right now to bisect 
> this further as the system is used in production and already did invest a lot 
> of time without success into it. Hopefully someone else can find the root 
> cause of the problem. My systemd version is: 254.1, and I also use dbus 
> 1.14.10 and dbus-broker 33.r35.g2220a84 which was configured with -D 
> linux-4-17=true.
> 

[also Cc: Thorsten]

To Thorsten: It looks like the reporter can neither bisect nor
test the mainline since he only has production machine instead.
What can I (and the reporter do) in this case?

Thanks.

-- 
An old man doll... just what I always wanted! - Clara



Re: [PATCH v1 1/1] bitops: Share BYTES_TO_BITS() for everyone

2023-09-11 Thread Alexander Lobakin
From: Yury Norov 
Date: Sun, 10 Sep 2023 07:07:16 -0700

> On Wed, Sep 06, 2023 at 05:54:26PM +0300, Andy Shevchenko wrote:
>> On Wed, Sep 06, 2023 at 04:40:39PM +0200, Alexander Lobakin wrote:
>>> From: Andy Shevchenko 
>>> Date: Thu, 31 Aug 2023 16:21:30 +0300
 On Fri, Aug 25, 2023 at 04:49:07PM +0200, Alexander Lobakin wrote:
> From: Andy Shevchenko 
> Date: Thu, 24 Aug 2023 15:37:28 +0300
>
>> It may be new callers for the same macro, share it.
>>
>> Note, it's unknown why it's represented in the current form instead of
>> simple multiplication and commit 1ff511e35ed8 ("tracing/kprobes: Add
>> bitfield type") doesn't explain that neither. Let leave it as is and
>> we may improve it in the future.
>
> Maybe symmetrical change in tools/ like I did[0] an aeon ago?

 Hmm... Why can't you simply upstream your version? It seems better than 
 mine.
>>>
>>> It was a part of the Netlink bigint API which is a bit on hold for now
>>> (I needed this macro available treewide).
>>> But I can send it as standalone if you're fine with that.
>>
>> I'm fine. Yury?
> 
> Do we have opencoded BYTES_TO_BITS() somewhere else? If so, it should be
> fixed in the same series.

Treewide -- a ton.
We could add it so that devs could start using it and stop open-coding :D

> 
> Regarding implementation, the current:
> 
> #define BYTES_TO_BITS(nb)  ((BITS_PER_LONG * (nb)) / sizeof(long))
> 
> looks weird. Maybe there are some special considerations in a tracing
> subsystem to make it like this, but as per Masami's email - there's
> not. 
> 
> For a general purpose I'd suggest a simpler:
> #define BYTES_TO_BITS(nb)  ((nb) * BITS_PER_BYTE)

I also didn't notice anything that would require using logic more
complex than this one. It would probably make more sense to define
it that way when moving.

> 
> Thanks,
> Yury

Thanks,
Olek


[PATCH] tracing/synthetic: Print out u64 values properly

2023-09-11 Thread Tero Kristo
The synth traces incorrectly print pointer to the synthetic event values
instead of the actual value when using u64 type. Fix by addressing the
contents of the union properly.

Fixes: ddeea494a16f ("tracing/synthetic: Use union instead of casts")
Cc: sta...@vger.kernel.org
Signed-off-by: Tero Kristo 
---
 kernel/trace/trace_events_synth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_synth.c 
b/kernel/trace/trace_events_synth.c
index 7fff8235075f..070365959c0a 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -337,7 +337,7 @@ static void print_synth_event_num_val(struct trace_seq *s,
break;
 
default:
-   trace_seq_printf(s, print_fmt, name, val, space);
+   trace_seq_printf(s, print_fmt, name, val->as_u64, space);
break;
}
 }
-- 
2.40.1



Re: [PATCH 1/2] mm/damon/core: add a tracepoint for damos apply target regions

2023-09-11 Thread Steven Rostedt
On Mon, 11 Sep 2023 19:05:04 +
SeongJae Park  wrote:

> > Also, this if statement is only done when the trace event is enabled, so
> > it's equivalent to:
> > 
> > if (trace_damos_before_apply_enabled()) {
> > if (sdx >= 0)
> > trace_damos_before_apply(cidx, sidx, tidx, r,
> > damon_nr_regions(t));
> > }  
> 
> Again, thank you very much for letting me know this awesome feature.  However,
> sidx is supposed to be always >=0 here, since kdamond is running in single
> thread and hence no race is expected.  If it exists, it's a bug.  So, I
> wouldn't make this change.  Appreciate again for letting me know this very
> useful feature, and please let me know if I'm missing something, though!

The race isn't with your code, but the enabling of tracing.

Let's say you enable tracing just ass it passed the first:

if (trace_damos_before_apply_enabled()) {
   
damon_for_each_scheme(siter, c) {
if (siter == s)
break;
sidx++;
}
damon_for_each_target(titer, c) {
if (titer == t)
break;
tidx++;
}  

Now, sidx and tidx is zero (when they were not computed, thus, they
shouldn't be zero.

Then tracing is fully enabled here, and now we enter:

if (trace_damos_before_apply_enabled()) {
trace_damos_before_apply(cidx, sidx, tidx, r,
damon_nr_regions(t));
}

Now the trace event is hit with sidx and tidx zero when they should not be.
This could confuse you when looking at the report.

What I suggested was to initialize sidx to zero, set it in the first
trace_*_enabled() check, and ignore calling the tracepoint if it's not >= 0.

-- Steve


Re: [PATCH v2 11/25] security: Align inode_setattr hook definition with EVM

2023-09-11 Thread Jarkko Sakkinen
On Tue Sep 5, 2023 at 6:56 PM EEST, Casey Schaufler wrote:
> On 9/4/2023 2:08 PM, Jarkko Sakkinen wrote:
> > On Thu Aug 31, 2023 at 1:41 PM EEST, Roberto Sassu wrote:
> >> From: Roberto Sassu 
> >>
> >> Add the idmap parameter to the definition, so that evm_inode_setattr() can
> >> be registered as this hook implementation.
> >>
> >> Signed-off-by: Roberto Sassu 
> >> Reviewed-by: Stefan Berger 
> >> Acked-by: Casey Schaufler 
> >> ---
> >>  include/linux/lsm_hook_defs.h | 3 ++-
> >>  security/security.c   | 2 +-
> >>  security/selinux/hooks.c  | 3 ++-
> >>  security/smack/smack_lsm.c| 4 +++-
> >>  4 files changed, 8 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> >> index 4bdddb52a8fe..fdf075a6b1bb 100644
> >> --- a/include/linux/lsm_hook_defs.h
> >> +++ b/include/linux/lsm_hook_defs.h
> >> @@ -134,7 +134,8 @@ LSM_HOOK(int, 0, inode_readlink, struct dentry *dentry)
> >>  LSM_HOOK(int, 0, inode_follow_link, struct dentry *dentry, struct inode 
> >> *inode,
> >> bool rcu)
> >>  LSM_HOOK(int, 0, inode_permission, struct inode *inode, int mask)
> >> -LSM_HOOK(int, 0, inode_setattr, struct dentry *dentry, struct iattr *attr)
> >> +LSM_HOOK(int, 0, inode_setattr, struct mnt_idmap *idmap, struct dentry 
> >> *dentry,
> >> +   struct iattr *attr)
> > LSM_HOOK(int, 0, inode_setattr, struct mnt_idmap *idmap, struct dentry 
> > *dentry, struct iattr *attr)
> >
> > Only 99 characters, i.e. breaking into two lines is not necessary.
>
> We're keeping the LSM code in the ancient 80 character format.
> Until we get some fresh, young maintainers involved who can convince
> us that line wrapped 80 character terminals are kewl we're sticking
> with what we know.
>
>   https://lwn.net/Articles/822168/

Pretty artificial counter-example tbh :-) Even with Rust people tend to
stick one character variable names for trivial integer indices.

BR, Jarkko


Re: [PATCH RFC 00/37] Add support for arm64 MTE dynamic tag storage reuse

2023-09-11 Thread David Hildenbrand

On 11.09.23 13:52, Catalin Marinas wrote:

On Wed, Sep 06, 2023 at 12:23:21PM +0100, Alexandru Elisei wrote:

On Thu, Aug 24, 2023 at 04:24:30PM +0100, Catalin Marinas wrote:

On Thu, Aug 24, 2023 at 01:25:41PM +0200, David Hildenbrand wrote:

On 24.08.23 13:06, David Hildenbrand wrote:

Regarding one complication: "The kernel needs to know where to allocate
a PROT_MTE page from or migrate a current page if it becomes PROT_MTE
(mprotect()) and the range it is in does not support tagging.",
simplified handling would be if it's in a MIGRATE_CMA pageblock, it
doesn't support tagging. You have to migrate to a !CMA page (for
example, not specifying GFP_MOVABLE as a quick way to achieve that).


Okay, I now realize that this patch set effectively duplicates some CMA
behavior using a new migrate-type.

[...]

I considered mixing the tag storage memory memory with normal memory and
adding it to MIGRATE_CMA. But since tag storage memory cannot be tagged,
this means that it's not enough anymore to have a __GFP_MOVABLE allocation
request to use MIGRATE_CMA.

I considered two solutions to this problem:

1. Only allocate from MIGRATE_CMA is the requested memory is not tagged =>
this effectively means transforming all memory from MIGRATE_CMA into the
MIGRATE_METADATA migratetype that the series introduces. Not very
appealing, because that means treating normal memory that is also on the
MIGRATE_CMA lists as tagged memory.


That's indeed not ideal. We could try this if it makes the patches
significantly simpler, though I'm not so sure.

Allocating metadata is the easier part as we know the correspondence
from the tagged pages (32 PROT_MTE page) to the metadata page (1 tag
storage page), so alloc_contig_range() does this for us. Just adding it
to the CMA range is sufficient.

However, making sure that we don't allocate PROT_MTE pages from the
metadata range is what led us to another migrate type. I guess we could
achieve something similar with a new zone or a CPU-less NUMA node,


Ideally, no significant core-mm changes to optimize for an architecture 
oddity. That implies, no new zones and no new migratetypes -- unless it 
is unavoidable and you are confident that you can convince core-MM 
people that the use case (giving back 3% of system RAM at max in some 
setups) is worth the trouble.


I also had CPU-less NUMA nodes in mind when thinking about that, but not 
sure how easy it would be to integrate it. If the tag memory has 
actually different performance characteristics as well, a NUMA node 
would be the right choice.


If we could find some way to easily support this either via CMA or 
CPU-less NUMA nodes, that would be much preferable; even if we cannot 
cover each and every future use case right now. I expect some issues 
with CXL+MTE either way , but are happy to be taught otherwise :)



Another thought I had was adding something like CMA memory 
characteristics. Like, asking if a given CMA area/page supports tagging 
(i.e., flag for the CMA area set?)?


When you need memory that supports tagging and have a page that does not 
support tagging (CMA && taggable), simply migrate to !MOVABLE memory 
(eventually we could also try adding !CMA).


Was that discussed and what would be the challenges with that? Page 
migration due to compaction comes to mind, but it might also be easy to 
handle if we can just avoid CMA memory for that.



though the latter is not guaranteed not to allocate memory from the
range, only make it less likely. Both these options are less flexible in
terms of size/alignment/placement.

Maybe as a quick hack - only allow PROT_MTE from ZONE_NORMAL and
configure the metadata range in ZONE_MOVABLE but at some point I'd
expect some CXL-attached memory to support MTE with additional carveout
reserved.


I have no idea how we could possibly cleanly support memory hotplug in 
virtual environments (virtual DIMMs, virtio-mem) with MTE. In contrast 
to s390x storage keys, the approach that arm64 with MTE took here 
(exposing tag memory to the VM) makes it rather hard and complicated.


--
Cheers,

David / dhildenb



Re: [PATCH] certs: Restrict blacklist updates to the secondary trusted keyring

2023-09-11 Thread Mickaël Salaün
On Mon, Sep 11, 2023 at 09:29:07AM -0400, Mimi Zohar wrote:
> Hi Eric,
> 
> On Fri, 2023-09-08 at 17:34 -0400, Eric Snowberg wrote:
> > Currently root can dynamically update the blacklist keyring if the hash
> > being added is signed and vouched for by the builtin trusted keyring.
> > Currently keys in the secondary trusted keyring can not be used.
> > 
> > Keys within the secondary trusted keyring carry the same capabilities as
> > the builtin trusted keyring.  Relax the current restriction for updating
> > the .blacklist keyring and allow the secondary to also be referenced as
> > a trust source.  Since the machine keyring is linked to the secondary
> > trusted keyring, any key within it may also be used.
> > 
> > An example use case for this is IMA appraisal.  Now that IMA both
> > references the blacklist keyring and allows the machine owner to add
> > custom IMA CA certs via the machine keyring, this adds the additional
> > capability for the machine owner to also do revocations on a running
> > system.
> > 
> > IMA appraisal usage example to add a revocation for /usr/foo:
> > 
> > sha256sum /bin/foo | awk '{printf "bin:" $1}' > hash.txt
> > 
> > openssl smime -sign -in hash.txt -inkey machine-private-key.pem \
> >-signer machine-certificate.pem -noattr -binary -outform DER \
> >-out hash.p7s
> > 
> > keyctl padd blacklist "$(< hash.txt)" %:.blacklist < hash.p7s
> > 
> > Signed-off-by: Eric Snowberg 
> 
> The secondary keyring may include both CA and code signing keys.  With
> this change any key loaded onto the secondary keyring may blacklist a
> hash.  Wouldn't it make more sense to limit blacklisting
> certificates/hashes to at least CA keys? 

Some operational constraints may limit what a CA can sign.

This change is critical and should be tied to a dedicated kernel config
(disabled by default), otherwise existing systems using this feature
will have their threat model automatically changed without notice.

> 
> > ---
> >  certs/Kconfig | 2 +-
> >  certs/blacklist.c | 4 ++--
> >  2 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/certs/Kconfig b/certs/Kconfig
> > index 1f109b070877..23dc87c52aff 100644
> > --- a/certs/Kconfig
> > +++ b/certs/Kconfig
> > @@ -134,7 +134,7 @@ config SYSTEM_BLACKLIST_AUTH_UPDATE
> > depends on SYSTEM_DATA_VERIFICATION
> > help
> >   If set, provide the ability to load new blacklist keys at run time if
> > - they are signed and vouched by a certificate from the builtin trusted
> > + they are signed and vouched by a certificate from the secondary 
> > trusted
> 
> If CONFIG_SECONDARY_TRUSTED_KEYRING is not enabled, it falls back to
> the builtin keyring.  Please update the comment accordingly.
> 
> >   keyring.  The PKCS#7 signature of the description is set in the key
> >   payload.  Blacklist keys cannot be removed.
> >  
> > diff --git a/certs/blacklist.c b/certs/blacklist.c
> > index 675dd7a8f07a..0b346048ae2d 100644
> > --- a/certs/blacklist.c
> > +++ b/certs/blacklist.c
> > @@ -102,12 +102,12 @@ static int blacklist_key_instantiate(struct key *key,
> >  
> >  #ifdef CONFIG_SYSTEM_BLACKLIST_AUTH_UPDATE
> > /*
> > -* Verifies the description's PKCS#7 signature against the builtin
> > +* Verifies the description's PKCS#7 signature against the secondary
> >  * trusted keyring.
> >  */
> 
> And similarly here ...
> 
> > err = verify_pkcs7_signature(key->description,
> > strlen(key->description), prep->data, prep->datalen,
> > -   NULL, VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
> > +   VERIFY_USE_SECONDARY_KEYRING, 
> > VERIFYING_UNSPECIFIED_SIGNATURE, NULL, NULL);
> > if (err)
> > return err;
> >  #else
> 
> -- 
> thanks,
> 
> Mimi
> 


[PATCH] ACPI: OSI: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

We know `osi->string` is a NUL-terminated string due to its eventual use
in `acpi_install_interface()` and `acpi_remove_interface()` which expect
a `acpi_string` which has been specifically typedef'd as:
|  typedef char *acpi_string;   /* Null terminated ASCII string */

... and which also has other string functions used on it like `strlen`.
Furthermore, padding is not needed in this instance either.

Due to the reasoning above a suitable replacement is `strscpy` [2] since
it guarantees NUL-termination on the destination buffer and doesn't
unnecessarily NUL-pad.

While there is unlikely to be a buffer overread (or other related bug)
in this case, we should still favor a more robust and less ambiguous
interface.

Link: 
https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings
 [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 
---
Note: build-tested
---
 drivers/acpi/osi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/osi.c b/drivers/acpi/osi.c
index d4405e1ca9b9..df9328c850bd 100644
--- a/drivers/acpi/osi.c
+++ b/drivers/acpi/osi.c
@@ -110,7 +110,7 @@ void __init acpi_osi_setup(char *str)
break;
} else if (osi->string[0] == '\0') {
osi->enable = enable;
-   strncpy(osi->string, str, OSI_STRING_LENGTH_MAX);
+   strscpy(osi->string, str, OSI_STRING_LENGTH_MAX);
break;
}
}

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230911-strncpy-drivers-acpi-osi-c-c801b7427987

Best regards,
--
Justin Stitt 



Re: [PATCH] slub: Introduce CONFIG_SLUB_RCU_DEBUG

2023-09-11 Thread Dmitry Vyukov
On Mon, 28 Aug 2023 at 16:40, Jann Horn  wrote:
>
> On Sat, Aug 26, 2023 at 5:32 AM Dmitry Vyukov  wrote:
> > On Fri, 25 Aug 2023 at 23:15, Jann Horn  wrote:
> > > Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> > > slabs because use-after-free is allowed within the RCU grace period by
> > > design.
> > >
> > > Add a SLUB debugging feature which RCU-delays every individual
> > > kmem_cache_free() before either actually freeing the object or handing it
> > > off to KASAN, and change KASAN to poison freed objects as normal when this
> > > option is enabled.
> > >
> > > Note that this creates a 16-byte unpoisoned area in the middle of the
> > > slab metadata area, which kinda sucks but seems to be necessary in order
> > > to be able to store an rcu_head in there without triggering an ASAN
> > > splat during RCU callback processing.
> >
> > Nice!
> >
> > Can't we unpoision this rcu_head right before call_rcu() and repoison
> > after receiving the callback?
>
> Yeah, I think that should work. It looks like currently
> kasan_unpoison() is exposed in include/linux/kasan.h but
> kasan_poison() is not, and its inline definition probably means I
> can't just move it out of mm/kasan/kasan.h into include/linux/kasan.h;
> do you have a preference for how I should handle this? Hmm, and it
> also looks like code outside of mm/kasan/ anyway wouldn't know what
> are valid values for the "value" argument to kasan_poison().
> I also have another feature idea that would also benefit from having
> something like kasan_poison() available in include/linux/kasan.h, so I
> would prefer that over adding another special-case function inside
> KASAN for poisoning this piece of slab metadata...
>
> I guess I could define a wrapper around kasan_poison() in
> mm/kasan/generic.c that uses a new poison value for "some other part
> of the kernel told us to poison this area", and then expose that
> wrapper with a declaration in include/mm/kasan.h? Something like:
>
> void kasan_poison_outline(const void *addr, size_t size, bool init)
> {
>   kasan_poison(addr, size, KASAN_CUSTOM, init);
> }

Looks reasonable.

> > What happens on cache destruction?
> > Currently we purge quarantine on cache destruction to be able to
> > safely destroy the cache. I suspect we may need to somehow purge rcu
> > callbacks as well, or do something else.
>
> Ooh, good point, I hadn't thought about that... currently
> shutdown_cache() assumes that all the objects have already been freed,
> then puts the kmem_cache on a list for
> slab_caches_to_rcu_destroy_workfn(), which then waits with an
> rcu_barrier() until the slab's pages are all gone.

I guess this is what the test robot found as well.

> Luckily kmem_cache_destroy() is already a sleepable operation, so
> maybe I should just slap another rcu_barrier() in there for builds
> with this config option enabled... I think that should be fine for an
> option mostly intended for debugging.

This is definitely the simplest option.
I am a bit concerned about performance if massive cache destruction
happens (e.g. maybe during destruction of a set of namespaces for a
container). Net namespace is slow to destroy for this reason IIRC,
there were some optimizations to batch rcu synchronization. And now we
are adding more.
But I don't see any reasonable faster option as well.
So I guess let's do this now and optimize later (or not).


Re: [PATCH] x86/tdx: refactor deprecated strncpy

2023-09-11 Thread Dave Hansen
On 9/11/23 11:27, Justin Stitt wrote:
> `strncpy` is deprecated and we should prefer more robust string apis.

I dunno.  It actually seems like a pretty good fit here.

> In this case, `message.str` is not expected to be NUL-terminated as it
> is simply a buffer of characters residing in a union which allows for
> named fields representing 8 bytes each. There is only one caller of
> `tdx_panic()` and they use a 59-length string for `msg`:
> | const char *msg = "TD misconfiguration: SEPT_VE_DISABLE attribute must 
> be set.";

I'm not really following this logic.

We need to do the following:

1. Make sure not to over write past the end of 'message'
2. Make sure not to over read past the end of 'msg'
3. Make sure not to leak stack data into the hypercall registers
   in the case of short strings.

strncpy() does #1, #2 and #3 just fine.

The resulting string does *NOT* need to be NULL-terminated.  See the
comment:

/* VMM assumes '\0' in byte 65, if the message took all 64 bytes */

Are there cases where strncpy() doesn't NULL-terminate the string other
than when the buffer is full?

I actually didn't realize that strncpy() pads its output up to the full
size.  I wonder if Kirill used it intentionally or whether he got lucky
here. :)


RE: [PATCH V2] ACPI: APEI: Use ERST timeout for slow devices

2023-09-11 Thread Jeshua Smith
Any further questions? Anything else holding up this patch?

-Original Message-
From: Jeshua Smith  
Sent: Friday, August 4, 2023 7:05 PM
To: Luck, Tony ; keesc...@chromium.org; 
gpicc...@igalia.com; raf...@kernel.org; l...@kernel.org; james.mo...@arm.com; 
b...@alien8.de
Cc: linux-a...@vger.kernel.org; linux-kernel@vger.kernel.org; 
linux-harden...@vger.kernel.org; linux-te...@vger.kernel.org; Thierry Reding 
; Jonathan Hunter 
Subject: RE: [PATCH V2] ACPI: APEI: Use ERST timeout for slow devices

Thanks for the reply.

It's not very easy to see. It's just a bit down from the link you sent. It's 
the last possible action in the Serialization Actions table:
https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#serialization-actions

18.5.1.1. Serialization Actions

GET_EXECUTE-_OPERATION_TIMINGS

Returns an encoded QWORD:
[63:32] value in microseconds that the platform expects would be the maximum 
amount of time it will take to process and complete an EXECUTE_OPERATION.
[31:0] value in microseconds that the platform expects would be the nominal 
amount of time it will take to process and complete an EXECUTE_OPERATION.

-Original Message-
From: Luck, Tony  
Sent: Friday, August 4, 2023 10:31 AM
To: Jeshua Smith ; keesc...@chromium.org; 
gpicc...@igalia.com; raf...@kernel.org; l...@kernel.org; james.mo...@arm.com; 
b...@alien8.de
Cc: linux-a...@vger.kernel.org; linux-kernel@vger.kernel.org; 
linux-harden...@vger.kernel.org; linux-te...@vger.kernel.org; Thierry Reding 
; Jonathan Hunter 
Subject: RE: [PATCH V2] ACPI: APEI: Use ERST timeout for slow devices

External email: Use caution opening links or attachments


> Can the maintainers please respond to my patch?

Can you give a reference to the ACPI spec where this timing information is 
documented? I'm looking at ACPI 6.5 and don't see anything about this.

https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html#error-serialization

-Tony


Re: [PATCH] slub: Introduce CONFIG_SLUB_RCU_DEBUG

2023-09-11 Thread Marco Elver
On Fri, 25 Aug 2023 at 23:15, 'Jann Horn' via kasan-dev
 wrote:
>
> Currently, KASAN is unable to catch use-after-free in SLAB_TYPESAFE_BY_RCU
> slabs because use-after-free is allowed within the RCU grace period by
> design.
>
> Add a SLUB debugging feature which RCU-delays every individual
> kmem_cache_free() before either actually freeing the object or handing it
> off to KASAN, and change KASAN to poison freed objects as normal when this
> option is enabled.
>
> Note that this creates a 16-byte unpoisoned area in the middle of the
> slab metadata area, which kinda sucks but seems to be necessary in order
> to be able to store an rcu_head in there without triggering an ASAN
> splat during RCU callback processing.
>
> For now I've configured Kconfig.kasan to always enable this feature in the
> GENERIC and SW_TAGS modes; I'm not forcibly enabling it in HW_TAGS mode
> because I'm not sure if it might have unwanted performance degradation
> effects there.
>
> Signed-off-by: Jann Horn 
> ---
> can I get a review from the KASAN folks of this?
> I have been running it on my laptop for a bit and it seems to be working
> fine.
>
> Notes:
> With this patch, a UAF on a TYPESAFE_BY_RCU will splat with an error
> like this (tested by reverting a security bugfix).
> Note that, in the ASAN memory state dump, we can see the little
> unpoisoned 16-byte areas storing the rcu_head.
>
> BUG: KASAN: slab-use-after-free in folio_lock_anon_vma_read+0x129/0x4c0
> Read of size 8 at addr 888004e85b00 by task forkforkfork/592
>
> CPU: 0 PID: 592 Comm: forkforkfork Not tainted 
> 6.5.0-rc7-00105-gae70c1e1f6f5-dirty #334
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> 1.16.2-debian-1.16.2-1 04/01/2014
> Call Trace:
>  
>  dump_stack_lvl+0x4a/0x80
>  print_report+0xcf/0x660
>  kasan_report+0xd4/0x110
>  folio_lock_anon_vma_read+0x129/0x4c0
>  rmap_walk_anon+0x1cc/0x290
>  folio_referenced+0x277/0x2a0
>  shrink_folio_list+0xb8c/0x1680
>  reclaim_folio_list+0xdc/0x1f0
>  reclaim_pages+0x211/0x280
>  madvise_cold_or_pageout_pte_range+0x812/0xb70
>  walk_pgd_range+0x70b/0xce0
>  __walk_page_range+0x343/0x360
>  walk_page_range+0x227/0x280
>  madvise_pageout+0x1cd/0x2d0
>  do_madvise+0x552/0x15a0
>  __x64_sys_madvise+0x62/0x70
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [...]
>  
>
> Allocated by task 574:
>  kasan_save_stack+0x33/0x60
>  kasan_set_track+0x25/0x30
>  __kasan_slab_alloc+0x6e/0x70
>  kmem_cache_alloc+0xfd/0x2b0
>  anon_vma_fork+0x88/0x270
>  dup_mmap+0x87c/0xc10
>  copy_process+0x3399/0x3590
>  kernel_clone+0x10e/0x480
>  __do_sys_clone+0xa1/0xe0
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
> Freed by task 0:
>  kasan_save_stack+0x33/0x60
>  kasan_set_track+0x25/0x30
>  kasan_save_free_info+0x2b/0x50
>  __kasan_slab_free+0xfe/0x180
>  slab_free_after_rcu_debug+0xad/0x200
>  rcu_core+0x638/0x1620
>  __do_softirq+0x14c/0x581
>
> Last potentially related work creation:
>  kasan_save_stack+0x33/0x60
>  __kasan_record_aux_stack+0x94/0xa0
>  __call_rcu_common.constprop.0+0x47/0x730
>  __put_anon_vma+0x6e/0x150
>  unlink_anon_vmas+0x277/0x2e0
>  vma_complete+0x341/0x580
>  vma_merge+0x613/0xff0
>  mprotect_fixup+0x1c0/0x510
>  do_mprotect_pkey+0x5a7/0x710
>  __x64_sys_mprotect+0x47/0x60
>  do_syscall_64+0x3b/0x90
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>
> Second to last potentially related work creation:
> [...]
>
> The buggy address belongs to the object at 888004e85b00
>  which belongs to the cache anon_vma of size 192
> The buggy address is located 0 bytes inside of
>  freed 192-byte region [888004e85b00, 888004e85bc0)
>
> The buggy address belongs to the physical page:
> [...]
>
> Memory state around the buggy address:
>  888004e85a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  888004e85a80: 00 00 00 00 00 00 00 00 fc 00 00 fc fc fc fc fc
> >888004e85b00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>^
>  888004e85b80: fb fb fb fb fb fb fb fb fc 00 00 fc fc fc fc fc
>  888004e85c00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>
>  include/linux/kasan.h|  6 
>  include/linux/slub_def.h |  3 ++
>  lib/Kconfig.kasan|  2 ++
>  mm/Kconfig.debug | 21 +
>  mm/kasan/common.c| 15 -
>  mm/slub.c| 66 +---

Nice!

It'd be good to add a test case to lib/test_kasan module. I think you
could just copy/adjust the test case "test_memcache_typesafe_by_rcu"
from the KFENCE KUnit test suite.

>  6 files changed, 107 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/kasan.h b/incl

Re: [PATCH] remoteproc: k3-r5: Wait for core0 power-up before powering up core1

2023-09-11 Thread Mathieu Poirier
Hi Apurva,

On Wed, Sep 06, 2023 at 06:17:56PM +0530, Apurva Nandan wrote:
> PSC controller has a limitation that it can only power-up the second core
> when the first core is in ON state. Power-state for core0 should be equal
> to or higher than core1, else the kernel is seen hanging during rproc
> loading.
> 
> Make the powering up of cores sequential, by waiting for the current core
> to power-up before proceeding to the next core, with a timeout of 2sec.
> Add a wait queue event in k3_r5_cluster_rproc_init call, that will wait
> for the current core to be released from reset before proceeding with the
> next core.
> 
> Fixes: 6dedbd1d5443 ("remoteproc: k3-r5: Add a remoteproc driver for R5F 
> subsystem")
> 
> Signed-off-by: Apurva Nandan 
> ---
> 
>  kpv report: 
> https://gist.githubusercontent.com/apurvanandan1997/feb3b304121c265b7827be43752b7ae8/raw
> 
>  drivers/remoteproc/ti_k3_r5_remoteproc.c | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/drivers/remoteproc/ti_k3_r5_remoteproc.c 
> b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> index ad3415a3851b..ba5e503f7c9c 100644
> --- a/drivers/remoteproc/ti_k3_r5_remoteproc.c
> +++ b/drivers/remoteproc/ti_k3_r5_remoteproc.c
> @@ -103,12 +103,14 @@ struct k3_r5_soc_data {
>   * @dev: cached device pointer
>   * @mode: Mode to configure the Cluster - Split or LockStep
>   * @cores: list of R5 cores within the cluster
> + * @core_transition: wait queue to sync core state changes
>   * @soc_data: SoC-specific feature data for a R5FSS
>   */
>  struct k3_r5_cluster {
>   struct device *dev;
>   enum cluster_mode mode;
>   struct list_head cores;
> + wait_queue_head_t core_transition;
>   const struct k3_r5_soc_data *soc_data;
>  };
>  
> @@ -128,6 +130,7 @@ struct k3_r5_cluster {
>   * @atcm_enable: flag to control ATCM enablement
>   * @btcm_enable: flag to control BTCM enablement
>   * @loczrama: flag to dictate which TCM is at device address 0x0
> + * @released_from_reset: flag to signal when core is out of reset
>   */
>  struct k3_r5_core {
>   struct list_head elem;
> @@ -144,6 +147,7 @@ struct k3_r5_core {
>   u32 atcm_enable;
>   u32 btcm_enable;
>   u32 loczrama;
> + bool released_from_reset;
>  };
>  
>  /**
> @@ -460,6 +464,8 @@ static int k3_r5_rproc_prepare(struct rproc *rproc)
>   ret);
>   return ret;
>   }
> + core->released_from_reset = true;
> + wake_up_interruptible(&cluster->core_transition);
>  
>   /*
>* Newer IP revisions like on J7200 SoCs support h/w auto-initialization
> @@ -1140,6 +1146,7 @@ static int k3_r5_rproc_configure_mode(struct 
> k3_r5_rproc *kproc)
>   return ret;
>   }
>  
> + core->released_from_reset = c_state;
>   ret = ti_sci_proc_get_status(core->tsp, &boot_vec, &cfg, &ctrl,
>&stat);
>   if (ret < 0) {
> @@ -1280,6 +1287,21 @@ static int k3_r5_cluster_rproc_init(struct 
> platform_device *pdev)
>   cluster->mode == CLUSTER_MODE_SINGLECPU ||
>   cluster->mode == CLUSTER_MODE_SINGLECORE)
>   break;
> +
> + /* R5 cores require to be powered on sequentially, core0
> +  * should be in higher power state than core1 in a cluster
> +  * So, wait for current core to power up before proceeding
> +  * to next core and put timeout of 2sec for each core.
> +  */

Wrong multi-line comment format.

> + ret = wait_event_interruptible_timeout(cluster->core_transition,
> +
> core->released_from_reset,
> +msecs_to_jiffies(2000));
> + if (ret <= 0) {
> + dev_err(dev,
> + "Timed out waiting for %s core to power up!\n",
> + rproc->name);
> + return ret;
> + }

>From my perspective, this is needed because rproc_auto_boot_callback() for 
>core1
can be called before core0 due to thread execution order.  Am I correct?  

If so please add this explanation to the comment you have above.  Also, let's
say a user decides to switch both cores off after reboot.  At that time, what
prevents a user from switching on core1 before core0 via sysfs?

Thanks,
Mathieu

>   }
>  
>   return 0;
> @@ -1709,6 +1731,7 @@ static int k3_r5_probe(struct platform_device *pdev)
>   cluster->dev = dev;
>   cluster->soc_data = data;
>   INIT_LIST_HEAD(&cluster->cores);
> + init_waitqueue_head(&cluster->core_transition);
>  
>   ret = of_property_read_u32(np, "ti,cluster-mode", &cluster->mode);
>   if (ret < 0 && ret != -EINVAL) {
> -- 
> 2.34.1
> 


[PATCH] x86/tdx: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
`strncpy` is deprecated and we should prefer more robust string apis.

In this case, `message.str` is not expected to be NUL-terminated as it
is simply a buffer of characters residing in a union which allows for
named fields representing 8 bytes each. There is only one caller of
`tdx_panic()` and they use a 59-length string for `msg`:
|   const char *msg = "TD misconfiguration: SEPT_VE_DISABLE attribute must 
be set.";

Given all this information, let's use `strtomem_pad` as this matches the
functionality of `strncpy` in this instances whilst being a more robust
and easier to understand interface.

Link: 
www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings[1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Cc: Nick Desaulniers 
Signed-off-by: Justin Stitt 
---
Note: build-tested
---
 arch/x86/coco/tdx/tdx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..2e1be592c220 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -119,7 +119,7 @@ static void __noreturn tdx_panic(const char *msg)
} message;
 
/* VMM assumes '\0' in byte 65, if the message took all 64 bytes */
-   strncpy(message.str, msg, 64);
+   strtomem_pad(message.str, msg, '\0');
 
args.r8  = message.r8;
args.r9  = message.r9;

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230911-strncpy-arch-x86-coco-tdx-tdx-c-98b0b966bb8d

Best regards,
--
Justin Stitt 



[PATCH] xen/efi: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

`efi_loader_signature` has space for 4 bytes. We are copying "Xen" (3 bytes)
plus a NUL-byte which makes 4 total bytes. With that being said, there is
currently not a bug with the current `strncpy()` implementation in terms of
buffer overreads but we should favor a more robust string interface
either way.

A suitable replacement is `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer while being functionally the
same in this case.

Link: 
www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings[1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 
---
Note: build-tested
---
 arch/x86/xen/efi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/xen/efi.c b/arch/x86/xen/efi.c
index 863d0d6b3edc..7250d0e0e1a9 100644
--- a/arch/x86/xen/efi.c
+++ b/arch/x86/xen/efi.c
@@ -138,7 +138,7 @@ void __init xen_efi_init(struct boot_params *boot_params)
if (efi_systab_xen == NULL)
return;
 
-   strncpy((char *)&boot_params->efi_info.efi_loader_signature, "Xen",
+   strscpy((char *)&boot_params->efi_info.efi_loader_signature, "Xen",
sizeof(boot_params->efi_info.efi_loader_signature));
boot_params->efi_info.efi_systab = (__u32)__pa(efi_systab_xen);
boot_params->efi_info.efi_systab_hi = (__u32)(__pa(efi_systab_xen) >> 
32);

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 20230911-strncpy-arch-x86-xen-efi-c-14292f5a79ee

Best regards,
--
Justin Stitt 



[PATCH] um,ethertap: refactor deprecated strncpy

2023-09-11 Thread Justin Stitt
`strncpy` is deprecated for use on NUL-terminated destination strings [1].

`gate_buf` should always be NUL-terminated and does not require
NUL-padding. It is used as a string arg inside an argv array given to
`run_helper()`. Due to this, let's use `strscpy` as it guarantees
NUL-terminated on the destination buffer preventing potential buffer
overreads [2].

This exact invocation was changed from `strcpy` to `strncpy` in commit
7879b1d94badb ("um,ethertap: use strncpy") back in 2015. Let's continue
hardening our `str*cpy` apis and use the newer and safer `strscpy`!

Link: 
www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings[1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-harden...@vger.kernel.org
Cc: Kees Cook 
Signed-off-by: Justin Stitt 
---
 arch/um/os-Linux/drivers/ethertap_user.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/um/os-Linux/drivers/ethertap_user.c 
b/arch/um/os-Linux/drivers/ethertap_user.c
index 9483021d86dd..3363851a4ae8 100644
--- a/arch/um/os-Linux/drivers/ethertap_user.c
+++ b/arch/um/os-Linux/drivers/ethertap_user.c
@@ -105,7 +105,7 @@ static int etap_tramp(char *dev, char *gate, int control_me,
sprintf(data_fd_buf, "%d", data_remote);
sprintf(version_buf, "%d", UML_NET_VERSION);
if (gate != NULL) {
-   strncpy(gate_buf, gate, 15);
+   strscpy(gate_buf, gate, sizeof(gate_buf));
args = setup_args;
}
else args = nosetup_args;

---
base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
change-id: 
20230911-strncpy-arch-um-os-linux-drivers-ethertap_user-c-859160d13f59

Best regards,
--
Justin Stitt 



Re: [PATCH 04/11] arm64: dts: qcom: pm7250b: make SID configurable

2023-09-11 Thread Luca Weiss
On Tue Sep 5, 2023 at 10:30 AM CEST, Luca Weiss wrote:
> On Thu Aug 31, 2023 at 2:27 PM CEST, Dmitry Baryshkov wrote:
> > On Thu, 31 Aug 2023 at 14:54, Krzysztof Kozlowski
> >  wrote:
> > >
> > > On 31/08/2023 13:33, Dmitry Baryshkov wrote:
> > > > On Thu, 31 Aug 2023 at 13:13, Luca Weiss  
> > > > wrote:
> > > >>
> > > >> On Wed Aug 30, 2023 at 12:06 PM CEST, Krzysztof Kozlowski wrote:
> > > >>> On 30/08/2023 11:58, Luca Weiss wrote:
> > >  Like other Qualcomm PMICs the PM7250B can be used on different 
> > >  addresses
> > >  on the SPMI bus. Use similar defines like the PMK8350 to make this
> > >  possible.
> > > 
> > >  Signed-off-by: Luca Weiss 
> > >  ---
> > >   arch/arm64/boot/dts/qcom/pm7250b.dtsi | 23 ---
> > >   1 file changed, 16 insertions(+), 7 deletions(-)
> > > 
> > >  diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
> > >  b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> > >  index e8540c36bd99..3514de536baa 100644
> > >  --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> > >  +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
> > >  @@ -7,6 +7,15 @@
> > >   #include 
> > >   #include 
> > > 
> > >  +/* This PMIC can be configured to be at different SIDs */
> > >  +#ifndef PM7250B_SID
> > >  +   #define PM7250B_SID 2
> > >  +#endif
> > > >>>
> > > >>> Why do you send the same patch as v1, without any reference to 
> > > >>> previous
> > > >>> discussions?
> > > >>>
> > > >>> You got here feedback already.
> > > >>>
> > > >>> https://lore.kernel.org/linux-arm-msm/f52524da-719b-790f-ad2c-0c3f313d9...@linaro.org/
> > > >>
> > > >> Hi Krzysztof,
> > > >>
> > > >> I did mention that original patch in the cover letter of this series.
> > > >> I'm definitely aware of the discussion earlier this year there but also
> > > >> tried to get an update lately if there's any update with no response.
> > > >
> > > > I think the overall consensus was that my proposal is too complicated
> > > > for the DT files.
> > >
> > > I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
> > > customize per address? No.
> >
> > At the same time, we do keep SoC files separate from the board files.
> > Yes, I'm slightly exaggerating here.
> >
> > I think that for PMIC files it makes sense to extract common parts if
> > that eases reuse of the common parts.
>
> Hi all,
>
> what can I do for v2 now?
>
> 1. Keep this patch as-is, and keep pm7250b in device dts.
>
> 2. Drop pm7250b patch and drop from device dts, until _someone_ figures
> out a solution talking to the PMIC on different SID.
>
> 3. Something else like copy-pasting pm7250b.dtsi to pm7250-8.dtsi and
> changing the SID there, and using that in device dts.
>
> Please let me know what to do.
>
> Regards
> Luca

Hi,

if there's no feedback I'll keep this patch in v2 of this series and we
can continue to discuss there (if necessary).

Regards
Luca

>
> >
> > >
> > > I definitely do not agree to these ifndef->define. Maybe using just
> > > define would work (so drop ifndef->define), because this makes it
> > > obvious and fail-safe if included in wrong place... except that it is
> > > still not the define we expect. This is not the coding style present in
> > > other DTSes.
> > >
> > > The true problem how these SPMI bindings were created. Requiring SID
> > > address in every child is clearly redundant and I think we do not follow
> > > such approach anywhere else.
> > >
> > > Best regards,
> > > Krzysztof
> > >



Re: [PATCH 04/11] arm64: dts: qcom: pm7250b: make SID configurable

2023-09-11 Thread Krzysztof Kozlowski
On 11/09/2023 10:34, Luca Weiss wrote:
> On Tue Sep 5, 2023 at 10:30 AM CEST, Luca Weiss wrote:
>> On Thu Aug 31, 2023 at 2:27 PM CEST, Dmitry Baryshkov wrote:
>>> On Thu, 31 Aug 2023 at 14:54, Krzysztof Kozlowski
>>>  wrote:

 On 31/08/2023 13:33, Dmitry Baryshkov wrote:
> On Thu, 31 Aug 2023 at 13:13, Luca Weiss  wrote:
>>
>> On Wed Aug 30, 2023 at 12:06 PM CEST, Krzysztof Kozlowski wrote:
>>> On 30/08/2023 11:58, Luca Weiss wrote:
 Like other Qualcomm PMICs the PM7250B can be used on different 
 addresses
 on the SPMI bus. Use similar defines like the PMK8350 to make this
 possible.

 Signed-off-by: Luca Weiss 
 ---
  arch/arm64/boot/dts/qcom/pm7250b.dtsi | 23 ---
  1 file changed, 16 insertions(+), 7 deletions(-)

 diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
 b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
 index e8540c36bd99..3514de536baa 100644
 --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
 +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
 @@ -7,6 +7,15 @@
  #include 
  #include 

 +/* This PMIC can be configured to be at different SIDs */
 +#ifndef PM7250B_SID
 +   #define PM7250B_SID 2
 +#endif
>>>
>>> Why do you send the same patch as v1, without any reference to previous
>>> discussions?
>>>
>>> You got here feedback already.
>>>
>>> https://lore.kernel.org/linux-arm-msm/f52524da-719b-790f-ad2c-0c3f313d9...@linaro.org/
>>
>> Hi Krzysztof,
>>
>> I did mention that original patch in the cover letter of this series.
>> I'm definitely aware of the discussion earlier this year there but also
>> tried to get an update lately if there's any update with no response.
>
> I think the overall consensus was that my proposal is too complicated
> for the DT files.

 I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
 customize per address? No.
>>>
>>> At the same time, we do keep SoC files separate from the board files.
>>> Yes, I'm slightly exaggerating here.
>>>
>>> I think that for PMIC files it makes sense to extract common parts if
>>> that eases reuse of the common parts.
>>
>> Hi all,
>>
>> what can I do for v2 now?
>>
>> 1. Keep this patch as-is, and keep pm7250b in device dts.

This was NAKed by me. What Qualcomm SoC maintainers decide (or not
decide) about other options, should not cause the wrong solution to be
re-posted...

>>
>> 2. Drop pm7250b patch and drop from device dts, until _someone_ figures
>> out a solution talking to the PMIC on different SID.
>>
>> 3. Something else like copy-pasting pm7250b.dtsi to pm7250-8.dtsi and
>> changing the SID there, and using that in device dts.
>>
>> Please let me know what to do.
>>
>> Regards
>> Luca
> 
> Hi,
> 
> if there's no feedback I'll keep this patch in v2 of this series and we
> can continue to discuss there (if necessary).

Sorry, I still do not agree and there were no arguments convincing me to
change the mind.

I gave you the solution from my perspective. Why do you decided to
ignore it and send it as is?


Best regards,
Krzysztof



Re: [PATCH 04/11] arm64: dts: qcom: pm7250b: make SID configurable

2023-09-11 Thread Luca Weiss
On Mon Sep 11, 2023 at 11:44 AM CEST, Krzysztof Kozlowski wrote:
> On 11/09/2023 10:34, Luca Weiss wrote:
> > On Tue Sep 5, 2023 at 10:30 AM CEST, Luca Weiss wrote:
> >> On Thu Aug 31, 2023 at 2:27 PM CEST, Dmitry Baryshkov wrote:
> >>> On Thu, 31 Aug 2023 at 14:54, Krzysztof Kozlowski
> >>>  wrote:
> 
>  On 31/08/2023 13:33, Dmitry Baryshkov wrote:
> > On Thu, 31 Aug 2023 at 13:13, Luca Weiss  
> > wrote:
> >>
> >> On Wed Aug 30, 2023 at 12:06 PM CEST, Krzysztof Kozlowski wrote:
> >>> On 30/08/2023 11:58, Luca Weiss wrote:
>  Like other Qualcomm PMICs the PM7250B can be used on different 
>  addresses
>  on the SPMI bus. Use similar defines like the PMK8350 to make this
>  possible.
> 
>  Signed-off-by: Luca Weiss 
>  ---
>   arch/arm64/boot/dts/qcom/pm7250b.dtsi | 23 ---
>   1 file changed, 16 insertions(+), 7 deletions(-)
> 
>  diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
>  b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>  index e8540c36bd99..3514de536baa 100644
>  --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>  +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>  @@ -7,6 +7,15 @@
>   #include 
>   #include 
> 
>  +/* This PMIC can be configured to be at different SIDs */
>  +#ifndef PM7250B_SID
>  +   #define PM7250B_SID 2
>  +#endif
> >>>
> >>> Why do you send the same patch as v1, without any reference to 
> >>> previous
> >>> discussions?
> >>>
> >>> You got here feedback already.
> >>>
> >>> https://lore.kernel.org/linux-arm-msm/f52524da-719b-790f-ad2c-0c3f313d9...@linaro.org/
> >>
> >> Hi Krzysztof,
> >>
> >> I did mention that original patch in the cover letter of this series.
> >> I'm definitely aware of the discussion earlier this year there but also
> >> tried to get an update lately if there's any update with no response.
> >
> > I think the overall consensus was that my proposal is too complicated
> > for the DT files.
> 
>  I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
>  customize per address? No.
> >>>
> >>> At the same time, we do keep SoC files separate from the board files.
> >>> Yes, I'm slightly exaggerating here.
> >>>
> >>> I think that for PMIC files it makes sense to extract common parts if
> >>> that eases reuse of the common parts.
> >>
> >> Hi all,
> >>
> >> what can I do for v2 now?
> >>
> >> 1. Keep this patch as-is, and keep pm7250b in device dts.
>
> This was NAKed by me. What Qualcomm SoC maintainers decide (or not
> decide) about other options, should not cause the wrong solution to be
> re-posted...
>
> >>
> >> 2. Drop pm7250b patch and drop from device dts, until _someone_ figures
> >> out a solution talking to the PMIC on different SID.
> >>
> >> 3. Something else like copy-pasting pm7250b.dtsi to pm7250-8.dtsi and
> >> changing the SID there, and using that in device dts.

@Konrad, @Bjorn: Can you give any feedback here what's preferable?
Otherwise I'm just blocked on this series.

> >>
> >> Please let me know what to do.
> >>
> >> Regards
> >> Luca
> > 
> > Hi,
> > 
> > if there's no feedback I'll keep this patch in v2 of this series and we
> > can continue to discuss there (if necessary).
>
> Sorry, I still do not agree and there were no arguments convincing me to
> change the mind.
>
> I gave you the solution from my perspective. Why do you decided to
> ignore it and send it as is?

I get it that you are not final decider for qcom dts changes but it's
quite difficult for someone sending patches to not get any feedback what
other change to replace this is appropriate. I doubt it's a good idea to
just implement some random pm7250-8.dtsi or whatever to potentially
immediately get a response that that way is also bad.

That's why I'm trying to get some info before working on something and
sending it. Hopefully Bjorn or Konrad can add their thoughts above.

Also I don't recall me ever reading a "solution" from your side but
maybe I need to dig through the old emails again.

Regards
Luca

>
>
> Best regards,
> Krzysztof



Re: [PATCH 04/11] arm64: dts: qcom: pm7250b: make SID configurable

2023-09-11 Thread Konrad Dybcio
On 11.09.2023 13:15, Krzysztof Kozlowski wrote:
> On 11/09/2023 11:59, Luca Weiss wrote:
>> On Mon Sep 11, 2023 at 11:44 AM CEST, Krzysztof Kozlowski wrote:
>>> On 11/09/2023 10:34, Luca Weiss wrote:
 On Tue Sep 5, 2023 at 10:30 AM CEST, Luca Weiss wrote:
> On Thu Aug 31, 2023 at 2:27 PM CEST, Dmitry Baryshkov wrote:
>> On Thu, 31 Aug 2023 at 14:54, Krzysztof Kozlowski
>>  wrote:
>>>
>>> On 31/08/2023 13:33, Dmitry Baryshkov wrote:
 On Thu, 31 Aug 2023 at 13:13, Luca Weiss  
 wrote:
>
> On Wed Aug 30, 2023 at 12:06 PM CEST, Krzysztof Kozlowski wrote:
>> On 30/08/2023 11:58, Luca Weiss wrote:
>>> Like other Qualcomm PMICs the PM7250B can be used on different 
>>> addresses
>>> on the SPMI bus. Use similar defines like the PMK8350 to make this
>>> possible.
>>>
>>> Signed-off-by: Luca Weiss 
>>> ---
>>>  arch/arm64/boot/dts/qcom/pm7250b.dtsi | 23 ---
>>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
>>> b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>>> index e8540c36bd99..3514de536baa 100644
>>> --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>>> +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>>> @@ -7,6 +7,15 @@
>>>  #include 
>>>  #include 
>>>
>>> +/* This PMIC can be configured to be at different SIDs */
>>> +#ifndef PM7250B_SID
>>> +   #define PM7250B_SID 2
>>> +#endif
>>
>> Why do you send the same patch as v1, without any reference to 
>> previous
>> discussions?
>>
>> You got here feedback already.
>>
>> https://lore.kernel.org/linux-arm-msm/f52524da-719b-790f-ad2c-0c3f313d9...@linaro.org/
>
> Hi Krzysztof,
>
> I did mention that original patch in the cover letter of this series.
> I'm definitely aware of the discussion earlier this year there but 
> also
> tried to get an update lately if there's any update with no response.

 I think the overall consensus was that my proposal is too complicated
 for the DT files.
>>>
>>> I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
>>> customize per address? No.
>>
>> At the same time, we do keep SoC files separate from the board files.
>> Yes, I'm slightly exaggerating here.
>>
>> I think that for PMIC files it makes sense to extract common parts if
>> that eases reuse of the common parts.
>
> Hi all,
>
> what can I do for v2 now?
>
> 1. Keep this patch as-is, and keep pm7250b in device dts.
>>>
>>> This was NAKed by me. What Qualcomm SoC maintainers decide (or not
>>> decide) about other options, should not cause the wrong solution to be
>>> re-posted...
>>>
>
> 2. Drop pm7250b patch and drop from device dts, until _someone_ figures
> out a solution talking to the PMIC on different SID.
>
> 3. Something else like copy-pasting pm7250b.dtsi to pm7250-8.dtsi and
> changing the SID there, and using that in device dts.
>>
>> @Konrad, @Bjorn: Can you give any feedback here what's preferable?
>> Otherwise I'm just blocked on this series.
I'm sure Krzysztof will disagree, but all of the solutions (which are
either duplicate the dt, add ifdefs or skip adding this pmic) are
equally band-aid-class.. A bright future where this PMIC thing is
handled on the driver side that will hopefully come soon(tm) should
resolve such problems..

>From my side, ifdef is the least burdensome, even if ugly..

Konrad


Re: [PATCH 04/11] arm64: dts: qcom: pm7250b: make SID configurable

2023-09-11 Thread Krzysztof Kozlowski
On 11/09/2023 11:59, Luca Weiss wrote:
> On Mon Sep 11, 2023 at 11:44 AM CEST, Krzysztof Kozlowski wrote:
>> On 11/09/2023 10:34, Luca Weiss wrote:
>>> On Tue Sep 5, 2023 at 10:30 AM CEST, Luca Weiss wrote:
 On Thu Aug 31, 2023 at 2:27 PM CEST, Dmitry Baryshkov wrote:
> On Thu, 31 Aug 2023 at 14:54, Krzysztof Kozlowski
>  wrote:
>>
>> On 31/08/2023 13:33, Dmitry Baryshkov wrote:
>>> On Thu, 31 Aug 2023 at 13:13, Luca Weiss  
>>> wrote:

 On Wed Aug 30, 2023 at 12:06 PM CEST, Krzysztof Kozlowski wrote:
> On 30/08/2023 11:58, Luca Weiss wrote:
>> Like other Qualcomm PMICs the PM7250B can be used on different 
>> addresses
>> on the SPMI bus. Use similar defines like the PMK8350 to make this
>> possible.
>>
>> Signed-off-by: Luca Weiss 
>> ---
>>  arch/arm64/boot/dts/qcom/pm7250b.dtsi | 23 ---
>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/boot/dts/qcom/pm7250b.dtsi 
>> b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>> index e8540c36bd99..3514de536baa 100644
>> --- a/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/pm7250b.dtsi
>> @@ -7,6 +7,15 @@
>>  #include 
>>  #include 
>>
>> +/* This PMIC can be configured to be at different SIDs */
>> +#ifndef PM7250B_SID
>> +   #define PM7250B_SID 2
>> +#endif
>
> Why do you send the same patch as v1, without any reference to 
> previous
> discussions?
>
> You got here feedback already.
>
> https://lore.kernel.org/linux-arm-msm/f52524da-719b-790f-ad2c-0c3f313d9...@linaro.org/

 Hi Krzysztof,

 I did mention that original patch in the cover letter of this series.
 I'm definitely aware of the discussion earlier this year there but also
 tried to get an update lately if there's any update with no response.
>>>
>>> I think the overall consensus was that my proposal is too complicated
>>> for the DT files.
>>
>> I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
>> customize per address? No.
>
> At the same time, we do keep SoC files separate from the board files.
> Yes, I'm slightly exaggerating here.
>
> I think that for PMIC files it makes sense to extract common parts if
> that eases reuse of the common parts.

 Hi all,

 what can I do for v2 now?

 1. Keep this patch as-is, and keep pm7250b in device dts.
>>
>> This was NAKed by me. What Qualcomm SoC maintainers decide (or not
>> decide) about other options, should not cause the wrong solution to be
>> re-posted...
>>

 2. Drop pm7250b patch and drop from device dts, until _someone_ figures
 out a solution talking to the PMIC on different SID.

 3. Something else like copy-pasting pm7250b.dtsi to pm7250-8.dtsi and
 changing the SID there, and using that in device dts.
> 
> @Konrad, @Bjorn: Can you give any feedback here what's preferable?
> Otherwise I'm just blocked on this series.
> 

 Please let me know what to do.

 Regards
 Luca
>>>
>>> Hi,
>>>
>>> if there's no feedback I'll keep this patch in v2 of this series and we
>>> can continue to discuss there (if necessary).
>>
>> Sorry, I still do not agree and there were no arguments convincing me to
>> change the mind.
>>
>> I gave you the solution from my perspective. Why do you decided to
>> ignore it and send it as is?
> 
> I get it that you are not final decider for qcom dts changes but it's
> quite difficult for someone sending patches to not get any feedback what
> other change to replace this is appropriate. I doubt it's a good idea to
> just implement some random pm7250-8.dtsi or whatever to potentially
> immediately get a response that that way is also bad.
> 
> That's why I'm trying to get some info before working on something and
> sending it. Hopefully Bjorn or Konrad can add their thoughts above.

I understand, and it is frustrating. If such case happens the solution
in upstream is not sending the same NAKed version but send something else.

> 
> Also I don't recall me ever reading a "solution" from your side but
> maybe I need to dig through the old emails again.

Here:
"I proposed to duplicate the entries. Do you keep QUP nodes in DTSI and
customize per address? No."

Dmitry responded that having PMICs extracted help re-using. He is right.
But here you hit the limit of such re-usage.

Best regards,
Krzysztof



Re: [FIX PATCH] selftests: tracing: Fix to unmount tracefs for recovering environment

2023-09-11 Thread Steven Rostedt
On Sat,  9 Sep 2023 18:36:39 +0900
"Masami Hiramatsu (Google)"  wrote:

> From: Masami Hiramatsu (Google) 
> 
> Fix to unmount the tracefs if the ftracetest mounted it for recovering
> system environment. If the tracefs is already mounted, this does nothing.
> 
> Suggested-by: Mark Brown 
> Link: 
> https://lore.kernel.org/all/29fce076-746c-4650-8358-b4e0fa215...@sirena.org.uk/
> Fixes: cbd965bde74c ("ftrace/selftests: Return the skip code when tracing 
> directory not configured in kernel")
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  tools/testing/selftests/ftrace/ftracetest |8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/ftrace/ftracetest 
> b/tools/testing/selftests/ftrace/ftracetest
> index cb5f18c06593..89c212d82256 100755
> --- a/tools/testing/selftests/ftrace/ftracetest
> +++ b/tools/testing/selftests/ftrace/ftracetest
> @@ -31,6 +31,9 @@ err_ret=1
>  # kselftest skip code is 4
>  err_skip=4
>  
> +# umount required
> +UMOUNT_DIR=""
> +
>  # cgroup RT scheduling prevents chrt commands from succeeding, which
>  # induces failures in test wakeup tests.  Disable for the duration of
>  # the tests.
> @@ -45,6 +48,9 @@ setup() {
>  
>  cleanup() {
>echo $sched_rt_runtime_orig > $sched_rt_runtime
> +  if [ "${UMOUNT_DIR}" ]; then

Shouldn't the above be:

 if [ ! -z "${UNMOUNT_DIR}" ]; then

?

-- Steve

> +umount ${UMOUNT_DIR} ||:
> +  fi
>  }
>  
>  errexit() { # message
> @@ -160,11 +166,13 @@ if [ -z "$TRACING_DIR" ]; then
>   mount -t tracefs nodev /sys/kernel/tracing ||
> errexit "Failed to mount /sys/kernel/tracing"
>   TRACING_DIR="/sys/kernel/tracing"
> + UMOUNT_DIR=${TRACING_DIR}
>   # If debugfs exists, then so does /sys/kernel/debug
>   elif [ -d "/sys/kernel/debug" ]; then
>   mount -t debugfs nodev /sys/kernel/debug ||
> errexit "Failed to mount /sys/kernel/debug"
>   TRACING_DIR="/sys/kernel/debug/tracing"
> + UMOUNT_DIR=${TRACING_DIR}
>   else
>   err_ret=$err_skip
>   errexit "debugfs and tracefs are not configured in this kernel"



Re: [PATCH RESEND v3 1/2] selftests/resctrl: Fix schemata write error check

2023-09-11 Thread Reinette Chatre
Hi Maciej,

On 9/1/2023 6:42 AM, Wieczor-Retman Maciej wrote:
> Writing bitmasks to the schemata can fail when the bitmask doesn't
> adhere to constraints defined by what a particular CPU supports.
> Some example of constraints are max length or having contiguous bits.
> The driver should properly return errors when any rule concerning
> bitmask format is broken.
> 
> Resctrl FS returns error codes from fprintf() only when fclose() is
> called. Current error checking scheme allows invalid bitmasks to be
> written into schemata file and the selftest doesn't notice because the
> fclose() error code isn't checked.
> 
> Substitute fopen(), flose() and fprintf() with open(), close() and
> write() to avoid error code buffering between fprintf() and fclose().
> 
> Remove newline character from the schema string after writing it to
> the schemata file so it prints correctly before function return.
> 
> Pass the string generated with strerror() to the "reason" buffer so
> the error message is more verbose. Extend "reason" buffer so it can hold
> longer messages.
> 
> Signed-off-by: Wieczor-Retman Maciej 

When I build the tests with this applied I encounter the following:

resctrlfs.c: In function ‘write_schemata’:
resctrlfs.c:475:14: warning: implicit declaration of function ‘open’; did you 
mean ‘popen’? [-Wimplicit-function-declaration]
  475 | fd = open(controlgroup, O_WRONLY);
  |  ^~~~
  |  popen
resctrlfs.c:475:33: error: ‘O_WRONLY’ undeclared (first use in this function)
  475 | fd = open(controlgroup, O_WRONLY);
  | ^~~~
resctrlfs.c:475:33: note: each undeclared identifier is reported only once for 
each function it appears in

> ---
> Changelog v3:
> - Rename fp to fd (Ilpo)
> - Remove strlen, strcspn and just use the snprintf value instead (Ilpo)
> 
> Changelog v2:
> - Rewrite patch message.
> - Double "reason" buffer size to fit longer error explanation.
> - Redo file interactions with syscalls instead of stdio functions.
> 
>  tools/testing/selftests/resctrl/resctrlfs.c | 26 +++--
>  1 file changed, 14 insertions(+), 12 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c 
> b/tools/testing/selftests/resctrl/resctrlfs.c
> index bd36ee206602..b0b14a5bcbf5 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -488,9 +488,8 @@ int write_bm_pid_to_resctrl(pid_t bm_pid, char *ctrlgrp, 
> char *mongrp,
>   */
>  int write_schemata(char *ctrlgrp, char *schemata, int cpu_no, char 
> *resctrl_val)
>  {
> - char controlgroup[1024], schema[1024], reason[64];
> - int resource_id, ret = 0;
> - FILE *fp;
> + char controlgroup[1024], schema[1024], reason[128];
> + int resource_id, fd, schema_len = -1, ret = 0;
>  
>   if (strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR)) &&
>   strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) &&
> @@ -518,27 +517,30 @@ int write_schemata(char *ctrlgrp, char *schemata, int 
> cpu_no, char *resctrl_val)
>  
>   if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR)) ||
>   !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR)))
> - sprintf(schema, "%s%d%c%s", "L3:", resource_id, '=', schemata);
> + schema_len = snprintf(schema, sizeof(schema), "%s%d%c%s\n",
> +   "L3:", resource_id, '=', schemata);
>   if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR)) ||
>   !strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)))
> - sprintf(schema, "%s%d%c%s", "MB:", resource_id, '=', schemata);
> + schema_len = snprintf(schema, sizeof(schema), "%s%d%c%s\n",
> +   "MB:", resource_id, '=', schemata);
>  
> - fp = fopen(controlgroup, "w");
> - if (!fp) {
> + fd = open(controlgroup, O_WRONLY);
> + if (!fd) {
>   sprintf(reason, "Failed to open control group");

It makes code easier to understand and maintain if it is kept
consistent. It is thus unexpected for open() error handling to
be untouched while write() error handling is modified. I think
the addition of errno in error handling of write() is helpful. 
Could you do the same for open()?

>   ret = -1;
>  
>   goto out;
>   }
> -
> - if (fprintf(fp, "%s\n", schema) < 0) {
> - sprintf(reason, "Failed to write schemata in control group");
> - fclose(fp);
> + if (write(fd, schema, schema_len) < 0) {
> + snprintf(reason, sizeof(reason),
> +  "write() failed : %s", strerror(errno));
> + close(fd);
>   ret = -1;
>  
>   goto out;
>   }
> - fclose(fp);
> + close(fd);
> + schema[schema_len - 1] = 0;
>  
>  out:
>   ksft_print_msg("Write schema \"%s\" to resctrl FS%s%s\n",

Reinette


[PATCH 0/5] selftests/resctrl: Fixes to failing tests

2023-09-11 Thread Ilpo Järvinen
Fix three issues with resctrl selftests.

The signal handling fix became necessary after the mount/umount fixes.

The other two came up when I ran resctrl selftests across the server
fleet in our lab to validate the upcoming CAT test rewrite (the rewrite
is not part of this series).

These are developed and should apply cleanly at least on top the
benchmark cleanup series (might apply cleanly also w/o the benchmark
series, I didn't test).

Ilpo Järvinen (5):
  selftests/resctrl: Extend signal handler coverage to unmount on
receiving signal
  selftests/resctrl: Remove duplicate feature check from CMT test
  selftests/resctrl: Refactor feature check to use resource and feature
name
  selftests/resctrl: Fix feature checks
  selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests

 tools/testing/selftests/resctrl/cat_test.c|  8 ---
 tools/testing/selftests/resctrl/cmt_test.c|  3 -
 tools/testing/selftests/resctrl/mba_test.c|  2 +-
 tools/testing/selftests/resctrl/mbm_test.c|  2 +-
 tools/testing/selftests/resctrl/resctrl.h |  6 +-
 .../testing/selftests/resctrl/resctrl_tests.c | 37 --
 tools/testing/selftests/resctrl/resctrl_val.c | 22 +++---
 tools/testing/selftests/resctrl/resctrlfs.c   | 69 ---
 8 files changed, 73 insertions(+), 76 deletions(-)

-- 
2.30.2



RE: [EXTERNAL] Re: [Patch v5 0/5] RDMA/mana_ib

2023-09-11 Thread Ajay Sharma
I have updated the last patch to use xarray, will post the update patch. We 
currently use aux bus for ib device. Gd_register_device is firmware specific. 
All the patches use RDMA/mana_ib format which is aligned with 
drivers/infiniband/hw/mana/ .

Thanks

> -Original Message-
> From: Leon Romanovsky 
> Sent: Monday, September 11, 2023 7:33 AM
> To: sharmaa...@linuxonhyperv.com
> Cc: Long Li ; Jason Gunthorpe ; Dexuan
> Cui ; Wei Liu ; David S. Miller
> ; Eric Dumazet ; Jakub
> Kicinski ; Paolo Abeni ; linux-
> r...@vger.kernel.org; linux-hyp...@vger.kernel.org; net...@vger.kernel.org;
> linux-kernel@vger.kernel.org; Ajay Sharma 
> Subject: [EXTERNAL] Re: [Patch v5 0/5] RDMA/mana_ib
> 
> On Thu, Sep 07, 2023 at 09:52:34AM -0700, sharmaa...@linuxonhyperv.com
> wrote:
> > From: Ajay Sharma 
> >
> > Change from v4:
> > Send qp fatal error event to the context that created the qp. Add
> > lookup table for qp.
> >
> > Ajay Sharma (5):
> >   RDMA/mana_ib : Rename all mana_ib_dev type variables to mib_dev
> >   RDMA/mana_ib : Register Mana IB  device with Management SW
> >   RDMA/mana_ib : Create adapter and Add error eq
> >   RDMA/mana_ib : Query adapter capabilities
> >   RDMA/mana_ib : Send event to qp
> 
> I didn't look very deep into the series and has three very initial comments.
> 1. Please do git log drivers/infiniband/hw/mana/ and use same format for
> commit messages.
> 2. Don't invent your own index-to-qp query mechanism in last patch and use
> xarray.
> 3. Once you decided to export mana_gd_register_device, it hinted me that it is
> time to move to auxbus infrastructure.
> 
> Thanks
> 
> >
> >  drivers/infiniband/hw/mana/cq.c   |  12 +-
> >  drivers/infiniband/hw/mana/device.c   |  81 +++--
> >  drivers/infiniband/hw/mana/main.c | 288 +-
> >  drivers/infiniband/hw/mana/mana_ib.h  | 102 ++-
> >  drivers/infiniband/hw/mana/mr.c   |  42 ++-
> >  drivers/infiniband/hw/mana/qp.c   |  86 +++---
> >  drivers/infiniband/hw/mana/wq.c   |  21 +-
> >  .../net/ethernet/microsoft/mana/gdma_main.c   | 152 +
> >  drivers/net/ethernet/microsoft/mana/mana_en.c |   3 +
> >  include/net/mana/gdma.h   |  16 +-
> >  10 files changed, 545 insertions(+), 258 deletions(-)
> >
> > --
> > 2.25.1
> >



Re: [Patch v5 0/5] RDMA/mana_ib

2023-09-11 Thread Leon Romanovsky
On Thu, Sep 07, 2023 at 09:52:34AM -0700, sharmaa...@linuxonhyperv.com wrote:
> From: Ajay Sharma 
> 
> Change from v4:
> Send qp fatal error event to the context that
> created the qp. Add lookup table for qp.
> 
> Ajay Sharma (5):
>   RDMA/mana_ib : Rename all mana_ib_dev type variables to mib_dev
>   RDMA/mana_ib : Register Mana IB  device with Management SW
>   RDMA/mana_ib : Create adapter and Add error eq
>   RDMA/mana_ib : Query adapter capabilities
>   RDMA/mana_ib : Send event to qp

I didn't look very deep into the series and has three very initial comments.
1. Please do git log drivers/infiniband/hw/mana/ and use same format for
commit messages.
2. Don't invent your own index-to-qp query mechanism in last patch and use 
xarray.
3. Once you decided to export mana_gd_register_device, it hinted me that
it is time to move to auxbus infrastructure.

Thanks

> 
>  drivers/infiniband/hw/mana/cq.c   |  12 +-
>  drivers/infiniband/hw/mana/device.c   |  81 +++--
>  drivers/infiniband/hw/mana/main.c | 288 +-
>  drivers/infiniband/hw/mana/mana_ib.h  | 102 ++-
>  drivers/infiniband/hw/mana/mr.c   |  42 ++-
>  drivers/infiniband/hw/mana/qp.c   |  86 +++---
>  drivers/infiniband/hw/mana/wq.c   |  21 +-
>  .../net/ethernet/microsoft/mana/gdma_main.c   | 152 +
>  drivers/net/ethernet/microsoft/mana/mana_en.c |   3 +
>  include/net/mana/gdma.h   |  16 +-
>  10 files changed, 545 insertions(+), 258 deletions(-)
> 
> -- 
> 2.25.1
> 



RE: [PATCH v2] uapi: fix __DECLARE_FLEX_ARRAY for C++

2023-09-11 Thread David Laight
...
> Okay, can you please split the patch so they can be backported
> separately? Then I'll get them landed, etc.

Since the header with just the extra #endif is badly broken on C++
isn't it best to ensure they get back-ported together?
So one patch is probably better.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



Re: [PATCH] x86/hyperv/vtl: Replace real_mode_header only under Hyper-V

2023-09-11 Thread Mathias Krause
On 08.09.23 17:02, Saurabh Singh Sengar wrote:
> On Fri, Sep 08, 2023 at 12:26:10PM +0200, Mathias Krause wrote:
>> Booting a CONFIG_HYPERV_VTL_MODE=y enabled kernel on bare metal or a
>> non-Hyper-V hypervisor leads to serve memory corruption as
> 
> FWIW, CONFIG_HYPERV_VTL_MODE is not expected to be enabled for non VTL
> platforms.

Fair enough, but there's really no excuse to randomly crashing the
kernel if one forgot to RTFM like I did. The code should (and easily
can) handle such situations, especially if it's just a matter of a two
line change.

> Referring Kconfig documentation:
> "A kernel built with this option must run at VTL2, and will not run as
> a normal guest."

So, maybe, the 'return 0' below should be a 'panic("Need to run on
Hyper-V!")' instead?

But then, looking at the code, most of the VTL specifics only run when
the Hyper-V hypervisor was actually detected during early boot. It's
just hv_vtl_early_init() that runs unconditionally and spoils the game.

Is there really a *hard* requirement / reason for having VTL support
disabled when not running under Hyper-V? At least I can't find any from
the code side. It'll all be fine with the below patch, also enabling
running the same kernel on multiple platforms -- bare metal, KVM, Hyper-V,..

Thanks,
Mathias

> 
> - Saurabh
> 
>> hv_vtl_early_init() will run even though hv_vtl_init_platform() did not.
>> This skips no-oping the 'realmode_reserve' and 'realmode_init' platform
>> hooks, making init_real_mode() -> setup_real_mode() try to copy
>> 'real_mode_blob' over 'real_mode_header' which we set to the stub
>> 'hv_vtl_real_mode_header'. However, as 'real_mode_blob' isn't just a
>> 'struct real_mode_header' -- it's the complete code! -- copying it over
>> 'hv_vtl_real_mode_header' will corrupt quite some memory following it.
>>
>> The real cause for this erroneous behaviour is that hv_vtl_early_init()
>> blindly assumes the kernel is running on Hyper-V, which it may not.
>>
>> Fix this by making sure the code only replaces the real mode header with
>> the stub one iff the kernel is running under Hyper-V.
>>
>> Fixes: 3be1bc2fe9d2 ("x86/hyperv: VTL support for Hyper-V")
>> Cc: Saurabh Sengar 
>> Cc: sta...@kernel.org
>> Signed-off-by: Mathias Krause 
>> ---
>>  arch/x86/hyperv/hv_vtl.c | 4 
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/hyperv/hv_vtl.c b/arch/x86/hyperv/hv_vtl.c
>> index 57df7821d66c..54c06f4b8b4c 100644
>> --- a/arch/x86/hyperv/hv_vtl.c
>> +++ b/arch/x86/hyperv/hv_vtl.c
>> @@ -12,6 +12,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>  #include 
>>  
>>  extern struct boot_params boot_params;
>> @@ -214,6 +215,9 @@ static int hv_vtl_wakeup_secondary_cpu(int apicid, 
>> unsigned long start_eip)
>>  
>>  static int __init hv_vtl_early_init(void)
>>  {
>> +if (!hypervisor_is_type(X86_HYPER_MS_HYPERV))
>> +return 0;
>> +
>>  /*
>>   * `boot_cpu_has` returns the runtime feature support,
>>   * and here is the earliest it can be used.
>> -- 
>> 2.30.2
>>



Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook

2023-09-11 Thread Sven Schnelle
Masami Hiramatsu (Google)  writes:

>> > IOW, it is ftrace save regs/restore regs code issue. I need to check how 
>> > the
>> > function_graph implements it.
>> 
>> gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
>> regardless of the FTRACE_WITH_REGS flags. The only difference is that
>> without the FTRACE_WITH_REGS flag the program status word (psw) is not
>> saved because collecting that is a rather expensive operation.
>
> Thanks for checking that! So s390 will recover those saved registers
> even if FTRACE_WITH_REGS flag is not set? (I wonder what is the requirement
> of the ftrace_regs when returning from ftrace_call() without
> FTRACE_WITH_REGS?)

Yes, it will recover these in all cases.

>> 
>> I used the following commands to test rethook (is that the correct
>> testcase?)
>> 
>> #!/bin/bash
>> cd /sys/kernel/tracing
>> 
>> echo 'r:icmp_rcv icmp_rcv' >kprobe_events
>> echo 1 >events/kprobes/icmp_rcv/enable
>> ping -c 1 127.0.0.1
>> cat trace
>
> No, the kprobe will path pt_regs to rethook.
> Cna you run
>
> echo "f:icmp_rcv%return icmp_rcv" >> dynamic_events

Ah, ok. Seems to work as well:

  ping-481 [001] ..s2.53.918480: icmp_rcv: 
(ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)
  ping-481 [001] ..s2.53.918491: icmp_rcv: 
(ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)