Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Shijie Huang



On 2024/1/19 12:42, Yury Norov wrote:

Regardless, I don't think that the approach is correct. As per your
description, some initialization functions erroneously call
cpu_to_node() instead of early_cpu_to_node() which exists specifically
for that case.


I checked the code again.

sparc, mips and s390 (all of which support NUMA) do not provide
early_cpu_to_node().


So we cannot use early_cpu_to_node() for these functions.


Thanks

Huang Shijie



Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Shijie Huang



On 2024/1/19 14:46, Shijie Huang wrote:


On 2024/1/19 12:42, Yury Norov wrote:

This adds another level of indirection, I think. Currently cpu_to_node
is a simple inline function. After the patch it would be a real function with
all the associated overhead. Can you share a bloat-o-meter output here?

# ./scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
Function                old     new   delta
numa_update_cpu         148     244     +96

 ... (too many to list, skipped)


Total: Before=32990130, After=32990710, chg +0.00%




Regardless, I don't think that the approach is correct. As per your
description, some initialization functions erroneously call
cpu_to_node() instead of early_cpu_to_node() which exists specifically
for that case.


Sorry, I missed something.

I am not sure whether early_cpu_to_node() works on all architectures.


Thanks

Huang Shijie



If the above is correct, it's clearly a caller problem, and the fix is to
simply switch all those callers to use the early version.


It is easy to switch to early_cpu_to_node() in sched_init(),
init_sched_fair_class() and workqueue_init_early(). These three call
cpu_to_node() directly from an __init function.


But it is a little harder to change early_trace_init(), since it calls
cpu_to_node() deep in the call stack:

  early_trace_init() --> ring_buffer_alloc() --> rb_allocate_cpu_buffer()


For early_trace_init(), we need to change more code.


Anyway, if we think it is not a good idea to change the common code, I
am okay with that too.





I would also initialize the numa_node with NUMA_NO_NODE at declaration,
so that if someone calls cpu_to_node() before the variable is properly
initialized at runtime, he'll get NO_NODE, which is obviously an error.


Even if we initialize numa_node to NUMA_NO_NODE, that does not always
produce an error.


Please see alloc_pages_node().


Thanks

Huang Shijie



Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Shijie Huang



On 2024/1/19 12:42, Yury Norov wrote:

This adds another level of indirection, I think. Currently cpu_to_node
is a simple inline function. After the patch it would be a real function with
all the associated overhead. Can you share a bloat-o-meter output here?

# ./scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 6/1 grow/shrink: 61/51 up/down: 1168/-588 (580)
Function                old     new   delta
numa_update_cpu         148     244     +96

 ... (too many to list, skipped)

Total: Before=32990130, After=32990710, chg +0.00%




Regardless, I don't think that the approach is correct. As per your
description, some initialization functions erroneously call
cpu_to_node() instead of early_cpu_to_node() which exists specifically
for that case.

If the above is correct, it's clearly a caller problem, and the fix is to
simply switch all those callers to use the early version.


It is easy to switch to early_cpu_to_node() in sched_init(),
init_sched_fair_class() and workqueue_init_early(). These three call
cpu_to_node() directly from an __init function.


But it is a little harder to change early_trace_init(), since it calls
cpu_to_node() deep in the call stack:

  early_trace_init() --> ring_buffer_alloc() --> rb_allocate_cpu_buffer()


For early_trace_init(), we need to change more code.


Anyway, if we think it is not a good idea to change the common code, I
am okay with that too.





I would also initialize the numa_node with NUMA_NO_NODE at declaration,
so that if someone calls cpu_to_node() before the variable is properly
initialized at runtime, he'll get NO_NODE, which is obviously an error.


Even if we initialize numa_node to NUMA_NO_NODE, that does not always
produce an error.


Please see alloc_pages_node().
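
To illustrate the point about alloc_pages_node(): the kernel helper treats
NUMA_NO_NODE as "no preference" and falls back to the local memory node
instead of failing. The following is only a toy model in Python, not kernel
code; pick_allocation_node() is a hypothetical name, and local_node stands in
for what numa_mem_id() would return.

```python
NUMA_NO_NODE = -1  # same sentinel value the kernel uses


def pick_allocation_node(nid, local_node):
    """Toy stand-in for the node selection done by alloc_pages_node()."""
    # NUMA_NO_NODE means "no preference": fall back to the local node
    # (the kernel uses numa_mem_id() here) instead of reporting an error.
    if nid == NUMA_NO_NODE:
        return local_node
    return nid


# A per-CPU numa_node that still holds NUMA_NO_NODE is silently accepted,
# so the early caller gets a valid (but possibly wrong) node, not an error.
early_value = NUMA_NO_NODE
chosen = pick_allocation_node(early_value, local_node=0)
print(chosen)
```

So in this model an early caller would see no failure at all, which is why
the sentinel alone would not expose every premature cpu_to_node() call.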


Thanks

Huang Shijie



Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Greg KH
On Fri, Jan 19, 2024 at 11:32:27AM +0800, Huang Shijie wrote:
> During the kernel booting, the generic cpu_to_node() is called too early in
> arm64, powerpc and riscv when CONFIG_NUMA is enabled.
> 
> There are at least four places in the common code where
> the generic cpu_to_node() is called before it is initialized:
>  1.) early_trace_init() in kernel/trace/trace.c
>  2.) sched_init()   in kernel/sched/core.c
>  3.) init_sched_fair_class()in kernel/sched/fair.c
>  4.) workqueue_init_early() in kernel/workqueue.c
> 
> In order to fix the bug, the patch changes generic cpu_to_node to
> function pointer, and export it for kernel modules.
> Introduce smp_prepare_boot_cpu_start() to wrap the original
> smp_prepare_boot_cpu(), and set cpu_to_node with early_cpu_to_node.
> Introduce smp_prepare_cpus_done() to wrap the original smp_prepare_cpus(),
> and set the cpu_to_node to formal _cpu_to_node().
> 
> Signed-off-by: Huang Shijie 
> ---
>  drivers/base/arch_numa.c | 11 +++
>  include/linux/topology.h |  6 ++
>  init/main.c  | 29 +++--
>  3 files changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index 5b59d133b6af..867a477fa975 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -61,6 +61,17 @@ EXPORT_SYMBOL(cpumask_of_node);
>  
>  #endif
>  
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> +int _cpu_to_node(int cpu)
> +{
> + return per_cpu(numa_node, cpu);
> +}
> +int (*cpu_to_node)(int cpu);
> +EXPORT_SYMBOL(cpu_to_node);
> +#endif
> +#endif
> +
>  static void numa_update_cpu(unsigned int cpu, bool remove)
>  {
>   int nid = cpu_to_node(cpu);
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..e7ce2bae11dd 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -91,10 +91,8 @@ static inline int numa_node_id(void)
>  #endif
>  
>  #ifndef cpu_to_node
> -static inline int cpu_to_node(int cpu)
> -{
> - return per_cpu(numa_node, cpu);
> -}
> +extern int (*cpu_to_node)(int cpu);
> +extern int _cpu_to_node(int cpu);
>  #endif
>  
>  #ifndef set_numa_node
> diff --git a/init/main.c b/init/main.c
> index e24b0780fdff..b142e9c51161 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -870,6 +870,18 @@ static void __init print_unknown_bootoptions(void)
>   memblock_free(unknown_options, len);
>  }
>  
> +static void __init smp_prepare_boot_cpu_start(void)
> +{
> + smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
> +
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> + /* The early_cpu_to_node should be ready now. */
> + cpu_to_node = early_cpu_to_node;
> +#endif
> +#endif
> +}
> +
>  asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
>  void start_kernel(void)
>  {
> @@ -899,7 +911,7 @@ void start_kernel(void)
>   setup_command_line(command_line);
>   setup_nr_cpu_ids();
>   setup_per_cpu_areas();
> - smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
> + smp_prepare_boot_cpu_start();
>   boot_cpu_hotplug_init();
>  
>   pr_notice("Kernel command line: %s\n", saved_command_line);
> @@ -1519,6 +1531,19 @@ void __init console_on_rootfs(void)
>   fput(file);
>  }
>  
> +static void __init smp_prepare_cpus_done(unsigned int setup_max_cpus)
> +{
> + /* Different ARCHs may override smp_prepare_cpus() */
> + smp_prepare_cpus(setup_max_cpus);
> +
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> + /* Change to the formal function. */
> + cpu_to_node = _cpu_to_node;
> +#endif
> +#endif
> +}
> +
>  static noinline void __init kernel_init_freeable(void)
>  {
>   /* Now the scheduler is fully set up and can do blocking allocations */
> @@ -1531,7 +1556,7 @@ static noinline void __init kernel_init_freeable(void)
>  
>   cad_pid = get_pid(task_pid(current));
>  
> - smp_prepare_cpus(setup_max_cpus);
> + smp_prepare_cpus_done(setup_max_cpus);
>  
>   workqueue_init();
>  
> -- 
> 2.40.1
> 

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/process/submitting-patches.rst f

Re: [PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Yury Norov
On Fri, Jan 19, 2024 at 11:32:27AM +0800, Huang Shijie wrote:
> During the kernel booting, the generic cpu_to_node() is called too early in
> arm64, powerpc and riscv when CONFIG_NUMA is enabled.
> 
> There are at least four places in the common code where
> the generic cpu_to_node() is called before it is initialized:
>  1.) early_trace_init() in kernel/trace/trace.c
>  2.) sched_init()   in kernel/sched/core.c
>  3.) init_sched_fair_class()in kernel/sched/fair.c
>  4.) workqueue_init_early() in kernel/workqueue.c
> 
> In order to fix the bug, the patch changes generic cpu_to_node to
> function pointer, and export it for kernel modules.
> Introduce smp_prepare_boot_cpu_start() to wrap the original
> smp_prepare_boot_cpu(), and set cpu_to_node with early_cpu_to_node.
> Introduce smp_prepare_cpus_done() to wrap the original smp_prepare_cpus(),
> and set the cpu_to_node to formal _cpu_to_node().

This adds another level of indirection, I think. Currently cpu_to_node
is a simple inline function. After the patch it would be a real function with
all the associated overhead. Can you share a bloat-o-meter output here?

Regardless, I don't think that the approach is correct. As per your
description, some initialization functions erroneously call
cpu_to_node() instead of early_cpu_to_node() which exists specifically
for that case.

If the above is correct, it's clearly a caller problem, and the fix is to
simply switch all those callers to use the early version.

I would also initialize the numa_node with NUMA_NO_NODE at declaration,
so that if someone calls cpu_to_node() before the variable is properly
initialized at runtime, he'll get NO_NODE, which is obviously an error.

Thanks,
Yury
 
> Signed-off-by: Huang Shijie 
> ---
>  drivers/base/arch_numa.c | 11 +++
>  include/linux/topology.h |  6 ++
>  init/main.c  | 29 +++--
>  3 files changed, 40 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
> index 5b59d133b6af..867a477fa975 100644
> --- a/drivers/base/arch_numa.c
> +++ b/drivers/base/arch_numa.c
> @@ -61,6 +61,17 @@ EXPORT_SYMBOL(cpumask_of_node);
>  
>  #endif
>  
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> +int _cpu_to_node(int cpu)
> +{
> + return per_cpu(numa_node, cpu);
> +}
> +int (*cpu_to_node)(int cpu);
> +EXPORT_SYMBOL(cpu_to_node);
> +#endif
> +#endif
> +
>  static void numa_update_cpu(unsigned int cpu, bool remove)
>  {
>   int nid = cpu_to_node(cpu);
> diff --git a/include/linux/topology.h b/include/linux/topology.h
> index 52f5850730b3..e7ce2bae11dd 100644
> --- a/include/linux/topology.h
> +++ b/include/linux/topology.h
> @@ -91,10 +91,8 @@ static inline int numa_node_id(void)
>  #endif
>  
>  #ifndef cpu_to_node
> -static inline int cpu_to_node(int cpu)
> -{
> - return per_cpu(numa_node, cpu);
> -}
> +extern int (*cpu_to_node)(int cpu);
> +extern int _cpu_to_node(int cpu);
>  #endif
>  
>  #ifndef set_numa_node
> diff --git a/init/main.c b/init/main.c
> index e24b0780fdff..b142e9c51161 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -870,6 +870,18 @@ static void __init print_unknown_bootoptions(void)
>   memblock_free(unknown_options, len);
>  }
>  
> +static void __init smp_prepare_boot_cpu_start(void)
> +{
> + smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
> +
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> + /* The early_cpu_to_node should be ready now. */
> + cpu_to_node = early_cpu_to_node;
> +#endif
> +#endif
> +}
> +
>  asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
>  void start_kernel(void)
>  {
> @@ -899,7 +911,7 @@ void start_kernel(void)
>   setup_command_line(command_line);
>   setup_nr_cpu_ids();
>   setup_per_cpu_areas();
> - smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
> + smp_prepare_boot_cpu_start();
>   boot_cpu_hotplug_init();
>  
>   pr_notice("Kernel command line: %s\n", saved_command_line);
> @@ -1519,6 +1531,19 @@ void __init console_on_rootfs(void)
>   fput(file);
>  }
>  
> +static void __init smp_prepare_cpus_done(unsigned int setup_max_cpus)
> +{
> + /* Different ARCHs may override smp_prepare_cpus() */
> + smp_prepare_cpus(setup_max_cpus);
> +
> +#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
> +#ifndef cpu_to_node
> + /* Change to the formal function. */
> + cpu_to_node = _cpu_to_node;
> +#endif
> +#endif
> +}
> +
>  static noinline void __init kernel_init_freeable(void)
>  {
>   /* Now the scheduler is fully set up and can do blocking allocations */
> @@ -1531,7 +1556,7 @@ static noinline void __init kernel_init_freeable(void)
>  
>   cad_pid = get_pid(t


[PATCH] NUMA: Early use of cpu_to_node() returns 0 instead of the correct node id

2024-01-18 Thread Huang Shijie
During the kernel booting, the generic cpu_to_node() is called too early in
arm64, powerpc and riscv when CONFIG_NUMA is enabled.

There are at least four places in the common code where
the generic cpu_to_node() is called before it is initialized:
   1.) early_trace_init() in kernel/trace/trace.c
   2.) sched_init()   in kernel/sched/core.c
   3.) init_sched_fair_class()in kernel/sched/fair.c
   4.) workqueue_init_early() in kernel/workqueue.c

In order to fix the bug, the patch changes generic cpu_to_node to
function pointer, and export it for kernel modules.
Introduce smp_prepare_boot_cpu_start() to wrap the original
smp_prepare_boot_cpu(), and set cpu_to_node with early_cpu_to_node.
Introduce smp_prepare_cpus_done() to wrap the original smp_prepare_cpus(),
and set the cpu_to_node to formal _cpu_to_node().

Signed-off-by: Huang Shijie 
---
 drivers/base/arch_numa.c | 11 +++
 include/linux/topology.h |  6 ++
 init/main.c  | 29 +++--
 3 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/drivers/base/arch_numa.c b/drivers/base/arch_numa.c
index 5b59d133b6af..867a477fa975 100644
--- a/drivers/base/arch_numa.c
+++ b/drivers/base/arch_numa.c
@@ -61,6 +61,17 @@ EXPORT_SYMBOL(cpumask_of_node);
 
 #endif
 
+#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
+#ifndef cpu_to_node
+int _cpu_to_node(int cpu)
+{
+   return per_cpu(numa_node, cpu);
+}
+int (*cpu_to_node)(int cpu);
+EXPORT_SYMBOL(cpu_to_node);
+#endif
+#endif
+
 static void numa_update_cpu(unsigned int cpu, bool remove)
 {
    int nid = cpu_to_node(cpu);
diff --git a/include/linux/topology.h b/include/linux/topology.h
index 52f5850730b3..e7ce2bae11dd 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -91,10 +91,8 @@ static inline int numa_node_id(void)
 #endif
 
 #ifndef cpu_to_node
-static inline int cpu_to_node(int cpu)
-{
-   return per_cpu(numa_node, cpu);
-}
+extern int (*cpu_to_node)(int cpu);
+extern int _cpu_to_node(int cpu);
 #endif
 
 #ifndef set_numa_node
diff --git a/init/main.c b/init/main.c
index e24b0780fdff..b142e9c51161 100644
--- a/init/main.c
+++ b/init/main.c
@@ -870,6 +870,18 @@ static void __init print_unknown_bootoptions(void)
    memblock_free(unknown_options, len);
 }
 
+static void __init smp_prepare_boot_cpu_start(void)
+{
+   smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
+
+#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
+#ifndef cpu_to_node
+   /* The early_cpu_to_node should be ready now. */
+   cpu_to_node = early_cpu_to_node;
+#endif
+#endif
+}
+
 asmlinkage __visible __init __no_sanitize_address __noreturn __no_stack_protector
 void start_kernel(void)
 {
@@ -899,7 +911,7 @@ void start_kernel(void)
    setup_command_line(command_line);
    setup_nr_cpu_ids();
    setup_per_cpu_areas();
-   smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */
+   smp_prepare_boot_cpu_start();
    boot_cpu_hotplug_init();
 
    pr_notice("Kernel command line: %s\n", saved_command_line);
@@ -1519,6 +1531,19 @@ void __init console_on_rootfs(void)
    fput(file);
 }
 
+static void __init smp_prepare_cpus_done(unsigned int setup_max_cpus)
+{
+   /* Different ARCHs may override smp_prepare_cpus() */
+   smp_prepare_cpus(setup_max_cpus);
+
+#ifdef CONFIG_USE_PERCPU_NUMA_NODE_ID
+#ifndef cpu_to_node
+   /* Change to the formal function. */
+   cpu_to_node = _cpu_to_node;
+#endif
+#endif
+}
+
 static noinline void __init kernel_init_freeable(void)
 {
    /* Now the scheduler is fully set up and can do blocking allocations */
@@ -1531,7 +1556,7 @@ static noinline void __init kernel_init_freeable(void)
 
    cad_pid = get_pid(task_pid(current));
 
-   smp_prepare_cpus(setup_max_cpus);
+   smp_prepare_cpus_done(setup_max_cpus);
 
    workqueue_init();
 
-- 
2.40.1



Re: [PATCH -fixes v2] RISC-V: KVM: Require HAVE_KVM

2024-01-18 Thread Anup Patel
On Thu, Jan 18, 2024 at 11:10 PM Sean Christopherson  wrote:
>
> On Thu, Jan 18, 2024, Anup Patel wrote:
> > On Thu, Jan 4, 2024 at 6:07 PM Andrew Jones  wrote:
> > >
> > > KVM requires EVENTFD, which is selected by HAVE_KVM. Other KVM
> > > supporting architectures select HAVE_KVM and then their KVM
> > > Kconfigs ensure it's there with a depends on HAVE_KVM. Make RISCV
> > > consistent with that approach which fixes configs which have KVM
> > > but not EVENTFD, as was discovered with a randconfig test.
> > >
> > > Fixes: 99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
> > > Reported-by: Randy Dunlap 
> > > Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d382553...@infradead.org/
> > > Signed-off-by: Andrew Jones 
> >
> > Queued this patch for Linux-6.8
>
> That should be unnecessary.  Commit caadf876bb74 ("KVM: introduce
> CONFIG_KVM_COMMON"), which is in Paolo's pull request for 6.8, addresses
> the EVENTFD issue.  And the rest of Paolo's series[*], which presumably
> will get queued for 6.9, eliminates HAVE_KVM entirely.
>
> [*] https://lore.kernel.org/all/20240108124740.114453-6-pbonz...@redhat.com

I was not sure about the timeline of when Paolo's series would be merged,
hence I thought of taking this patch as a fix.

For now, I will drop this patch from my queue. If required we can have it
as a 6.8-rc fix.

Regards,
Anup


Re: [RFC PATCH 2/3] fs: remove duplicate ifdefs

2024-01-18 Thread Darrick J. Wong
On Thu, Jan 18, 2024 at 01:33:25PM +0530, Shrikanth Hegde wrote:
> When an ifdef is used in the manner below, the second one can be
> considered a duplicate.
> 
> ifdef DEFINE_A
> ...code block...
> ifdef DEFINE_A
> ...code block...
> endif
> ...code block...
> endif
> 
> There are few places in fs code where above pattern was seen.
> No functional change is intended here. It only aims to improve code
> readability.
> 
> Signed-off-by: Shrikanth Hegde 
> ---
>  fs/ntfs/inode.c| 2 --
>  fs/xfs/xfs_sysfs.c | 4 
>  2 files changed, 6 deletions(-)
> 
> diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
> index aba1e22db4e9..d2c8622d53d1 100644
> --- a/fs/ntfs/inode.c
> +++ b/fs/ntfs/inode.c
> @@ -2859,11 +2859,9 @@ int ntfs_truncate(struct inode *vi)
>   *
>   * See ntfs_truncate() description above for details.
>   */
> -#ifdef NTFS_RW
>  void ntfs_truncate_vfs(struct inode *vi) {
>   ntfs_truncate(vi);
>  }
> -#endif
> 
>  /**
>   * ntfs_setattr - called from notify_change() when an attribute is being changed
> diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
> index 17485666b672..d2391eec37fe 100644
> --- a/fs/xfs/xfs_sysfs.c
> +++ b/fs/xfs/xfs_sysfs.c
> @@ -193,7 +193,6 @@ always_cow_show(
>  }
>  XFS_SYSFS_ATTR_RW(always_cow);
> 
> -#ifdef DEBUG
>  /*
>   * Override how many threads the parallel work queue is allowed to create.
>   * This has to be a debug-only global (instead of an errortag) because one of
> @@ -260,7 +259,6 @@ larp_show(
>   return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.larp);
>  }
>  XFS_SYSFS_ATTR_RW(larp);
> -#endif /* DEBUG */
> 
>  STATIC ssize_t
>  bload_leaf_slack_store(
> @@ -319,10 +317,8 @@ static struct attribute *xfs_dbg_attrs[] = {
>   ATTR_LIST(log_recovery_delay),
>   ATTR_LIST(mount_delay),
>   ATTR_LIST(always_cow),
> -#ifdef DEBUG
>   ATTR_LIST(pwork_threads),
>   ATTR_LIST(larp),
> -#endif

The xfs part seems fine to me because I think some bot already
complained about this...

Reviewed-by: Darrick J. Wong 

--D

>   ATTR_LIST(bload_leaf_slack),
>   ATTR_LIST(bload_node_slack),
>   NULL,
> --
> 2.39.3
> 
> 


[PATCH] KVM: PPC: Book3S HV: Fix L2 guest reboot failure due to empty 'arch_compat'

2024-01-18 Thread Amit Machhiwal
Currently, rebooting a pseries nested qemu-kvm guest (L2) results in the
error below, because L1 qemu sends PVR value 'arch_compat' == 0 via the
ppc_set_compat ioctl. This triggers a condition failure in
kvmppc_set_arch_compat(), resulting in EINVAL.

qemu-system-ppc64: Unable to set CPU compatibility mode in KVM: Invalid

This patch updates kvmppc_set_arch_compat() to use the host PVR value if
'compat_pvr' == 0 indicating that qemu doesn't want to enforce any
specific PVR compat mode.
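
The fallback described above can be sketched as a small Python function. This
is a toy model, not the kernel code: logical_pvr() is a hypothetical name, the
two booleans stand in for the cpu_has_feature() checks, and the PVR_ARCH_*
values are placeholders rather than the real constants.

```python
# Placeholder values for illustration only; the real PVR_ARCH_* constants
# are defined in the kernel headers.
PVR_ARCH_300 = "PVR_ARCH_300"
PVR_ARCH_31 = "PVR_ARCH_31"


def logical_pvr(arch_compat, host_has_arch_31, host_has_arch_300):
    """Return the logical PVR to report for a vCPU (toy model)."""
    if arch_compat:          # qemu asked for a specific compat mode
        return arch_compat
    if host_has_arch_31:     # otherwise derive it from host CPU features
        return PVR_ARCH_31
    if host_has_arch_300:
        return PVR_ARCH_300
    return 0                 # neither feature present: value stays 0


print(logical_pvr(0, host_has_arch_31=True, host_has_arch_300=True))
```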

Signed-off-by: Amit Machhiwal 
---
 arch/powerpc/kvm/book3s_hv.c  |  2 +-
 arch/powerpc/kvm/book3s_hv_nestedv2.c | 12 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 1ed6ec140701..9573d7f4764a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -439,7 +439,7 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
if (guest_pcr_bit > host_pcr_bit)
return -EINVAL;
 
-   if (kvmhv_on_pseries() && kvmhv_is_nestedv2()) {
+   if (kvmhv_on_pseries() && kvmhv_is_nestedv2() && arch_compat) {
if (!(cap & nested_capabilities))
return -EINVAL;
}
diff --git a/arch/powerpc/kvm/book3s_hv_nestedv2.c b/arch/powerpc/kvm/book3s_hv_nestedv2.c
index fd3c4f2d9480..069a1fcfd782 100644
--- a/arch/powerpc/kvm/book3s_hv_nestedv2.c
+++ b/arch/powerpc/kvm/book3s_hv_nestedv2.c
@@ -138,6 +138,7 @@ static int gs_msg_ops_vcpu_fill_info(struct kvmppc_gs_buff *gsb,
vector128 v;
int rc, i;
u16 iden;
+   u32 arch_compat = 0;
 
vcpu = gsm->data;
 
@@ -347,8 +348,15 @@ static int gs_msg_ops_vcpu_fill_info(struct kvmppc_gs_buff *gsb,
break;
}
case KVMPPC_GSID_LOGICAL_PVR:
-   rc = kvmppc_gse_put_u32(gsb, iden,
-   vcpu->arch.vcore->arch_compat);
+   if (!vcpu->arch.vcore->arch_compat) {
+   if (cpu_has_feature(CPU_FTR_ARCH_31))
+   arch_compat = PVR_ARCH_31;
+   else if (cpu_has_feature(CPU_FTR_ARCH_300))
+   arch_compat = PVR_ARCH_300;
+   } else {
+   arch_compat = vcpu->arch.vcore->arch_compat;
+   }
+   rc = kvmppc_gse_put_u32(gsb, iden, arch_compat);
break;
}
 
-- 
2.43.0



Re: [PATCH] init: refactor the generic cpu_to_node for NUMA

2024-01-18 Thread Shijie Huang

Hi Greg,

On 2024/1/18 17:27, Greg KH wrote:

On Thu, Jan 18, 2024 at 11:14:12AM +0800, Huang Shijie wrote:

(0) We list the ARCHs which support the NUMA:
arm64, loongarch, powerpc, riscv,
sparc, mips, s390, x86,

I do not understand this format, what are you saying here?


Sorry for the confusion.


I should have put the conclusion at the beginning:

  The generic cpu_to_node() has a bug in some situations.

  It does not work on arm64, powerpc and riscv when CONFIG_NUMA is
  enabled, because cpu_to_node() is called before it is initialized.

  As a result, the following four places all get the wrong node id from
  cpu_to_node():


   a.) early_trace_init()         in kernel/trace/trace.c
   b.) sched_init()               in kernel/sched/core.c
   c.) init_sched_fair_class()    in kernel/sched/fair.c
   d.) workqueue_init_early()     in kernel/workqueue.c

Thanks

Huang Shijie



[RFC PATCH 2/3] fs: remove duplicate ifdefs

2024-01-18 Thread Shrikanth Hegde
When an ifdef is used in the manner below, the second one can be
considered a duplicate.

ifdef DEFINE_A
...code block...
ifdef DEFINE_A
...code block...
endif
...code block...
endif

There are a few places in the fs code where the above pattern was seen.
No functional change is intended here. It only aims to improve code
readability.

Signed-off-by: Shrikanth Hegde 
---
 fs/ntfs/inode.c| 2 --
 fs/xfs/xfs_sysfs.c | 4 
 2 files changed, 6 deletions(-)

diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
index aba1e22db4e9..d2c8622d53d1 100644
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -2859,11 +2859,9 @@ int ntfs_truncate(struct inode *vi)
  *
  * See ntfs_truncate() description above for details.
  */
-#ifdef NTFS_RW
 void ntfs_truncate_vfs(struct inode *vi) {
ntfs_truncate(vi);
 }
-#endif

 /**
  * ntfs_setattr - called from notify_change() when an attribute is being changed
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index 17485666b672..d2391eec37fe 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -193,7 +193,6 @@ always_cow_show(
 }
 XFS_SYSFS_ATTR_RW(always_cow);

-#ifdef DEBUG
 /*
  * Override how many threads the parallel work queue is allowed to create.
  * This has to be a debug-only global (instead of an errortag) because one of
@@ -260,7 +259,6 @@ larp_show(
return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.larp);
 }
 XFS_SYSFS_ATTR_RW(larp);
-#endif /* DEBUG */

 STATIC ssize_t
 bload_leaf_slack_store(
@@ -319,10 +317,8 @@ static struct attribute *xfs_dbg_attrs[] = {
ATTR_LIST(log_recovery_delay),
ATTR_LIST(mount_delay),
ATTR_LIST(always_cow),
-#ifdef DEBUG
ATTR_LIST(pwork_threads),
ATTR_LIST(larp),
-#endif
ATTR_LIST(bload_leaf_slack),
ATTR_LIST(bload_node_slack),
NULL,
--
2.39.3



[RFC PATCH 0/3] remove duplicate ifdefs

2024-01-18 Thread Shrikanth Hegde
While going through the code, I observed a case in the scheduler
where #ifdef CONFIG_SMP was used inside another #ifdef CONFIG_SMP.
That didn't make sense, since the first one is sufficient and the second
one is a duplicate.

Removing such duplicates could improve code readability. No functional
change is intended. Maybe this is not an issue these days, as language
servers can parse the config and users can read the code without
worrying about what's true and what's not.

Does this change make sense?

Since this pattern might be present in other code areas, I wrote a very
basic Python script which helps in finding these cases. It doesn't handle any
complicated #defines or space separated "# if". In some places the
collected log had to be corrected manually due to space separated ifdefs.
That's why it's not a treewide change.
There might be similar opportunities in other files as well.

The logic is very simple. For every #ifdef, #if or #ifndef, add its
variable to a list. Upon every subsequent #ifdef, #if or #ifndef,
check whether the same variable is already in the list. If yes, flag
an error. Verification was done manually afterwards, checking for any #undef
or any error due to the script. The cases in this series are the ones that
were flagged and made sense after going through the code.

The ifdefs were collected using grep as shown below, and the resulting file
was used as the input to the script.
grep -rIwn --include="*.c*" --include="*.h"  -e "#if" -e "#ifndef" -e "#ifdef" -e "#else" -e "#endif" * > /tmp/input.txt
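
Each line of /tmp/input.txt has the form path:lineno:text. As a sketch of how
such a line could be split and the directive normalized, the helper below also
accepts the space separated "# if" form mentioned above. parse_grep_line() and
the sample path are made up for illustration; this is not part of the script
that was used for the series.

```python
import re

# Matches "#if", "# ifdef X", "#ifndef X", "#else", "#endif" at line start.
DIRECTIVE_RE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|else|endif)\b\s*(.*)$")


def parse_grep_line(line):
    """Split 'path:lineno:text'; return (path, lineno, directive, rest) or None."""
    parts = line.rstrip("\n").split(":", 2)
    if len(parts) != 3 or not parts[1].isdigit():
        return None
    path, lineno, text = parts
    match = DIRECTIVE_RE.match(text)
    if match is None:
        return None
    directive, rest = match.groups()
    # Drop a trailing /* ... */ or // comment from the condition.
    rest = re.split(r"/\*|//", rest)[0].strip()
    return path, int(lineno), "#" + directive, rest


print(parse_grep_line("kernel/example.c:12:# ifdef CONFIG_SMP /* hypothetical */"))
```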

-
script used:
-
import argparse

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--file",
                        help="file to input to script",
                        type=str)
    parser.add_argument("--verbose",
                        help="Print additional debugging info, 0 to disable",
                        type=int)
    args = parser.parse_args()
    return args

def parseFiles(args):
    # Input is the grep output above: one "path:lineno:directive" per line.
    with open(args.file, "r") as file_to_parse:
        lines = file_to_parse.readlines()
    ifdefs_list = []    # stack of conditions that are currently open

    for line in lines:
        fields = line.strip().split(":", 2)
        if len(fields) < 3:
            continue
        # Keep the directive and drop any trailing comment.
        last_word = fields[2].split("/")[0]

        if args.verbose:
            print(line)
        last_word_splits = last_word.split()
        if args.verbose:
            print(last_word_splits)
        if not last_word_splits:
            continue
        if last_word_splits[0] in ("#ifdef", "#ifndef", "#if"):
            symbol = last_word_splits[1] if len(last_word_splits) > 1 else ""
            if symbol and symbol in ifdefs_list:
                print("This is duplicate and may be fixed: %s, parent_list:\n"
                      % (line))
                print(ifdefs_list)
            ifdefs_list.append(symbol)
        if last_word_splits[0] == "#endif" and ifdefs_list:
            ifdefs_list.pop()

if __name__ == "__main__":
    args = parse_args()
    parseFiles(args)
-
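
The same idea can also be sketched as a function over plain source lines. This
variant is only an illustration under a few assumptions, not the script that
produced the series: it keeps (directive, condition) pairs on the stack so that
an #ifndef X nested in #ifdef X is not reported, and it inverts the enclosing
condition after #else. The function name and the sample input are made up.

```python
import re

DIRECTIVE_RE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b\s*(.*)$")
FLIPPED = {"ifdef": "ifndef", "ifndef": "ifdef"}


def find_duplicate_conditions(source_lines):
    """Return (line_number, directive, condition) for each nested repeat."""
    stack = []        # open conditions, innermost last
    duplicates = []
    for number, line in enumerate(source_lines, start=1):
        match = DIRECTIVE_RE.match(line)
        if match is None:
            continue
        kind, cond = match.groups()
        cond = re.split(r"/\*|//", cond)[0].strip()   # strip trailing comment
        if kind in ("ifdef", "ifndef", "if"):
            entry = (kind, cond)
            if entry in stack:
                duplicates.append((number, "#" + kind, cond))
            stack.append(entry)
        elif kind == "else" and stack:
            old_kind, old_cond = stack.pop()
            # After #else the enclosing condition is inverted.
            stack.append((FLIPPED.get(old_kind, "else"), old_cond))
        elif kind == "elif" and stack:
            stack.pop()
            stack.append(("if", cond))
        elif kind == "endif" and stack:
            stack.pop()
    return duplicates


sample = [
    "#ifdef CONFIG_SMP",
    "int a;",
    "#ifdef CONFIG_SMP   /* redundant */",
    "int b;",
    "#endif",
    "#ifndef CONFIG_SMP  /* not a duplicate, just dead code */",
    "#endif",
    "#endif /* CONFIG_SMP */",
]
print(find_duplicate_conditions(sample))
```

Like the original script, this does not look at #undef or evaluate complex
#if expressions, so every reported line still needs a manual check.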


Shrikanth Hegde (3):
  sched: remove duplicate ifdefs
  fs: remove duplicate ifdefs
  arch/powerpc: remove duplicate ifdefs

 arch/powerpc/include/asm/paca.h   | 4 
 arch/powerpc/kernel/asm-offsets.c | 2 --
 arch/powerpc/platforms/powermac/feature.c | 2 --
 arch/powerpc/xmon/xmon.c  | 2 --
 fs/ntfs/inode.c   | 2 --
 fs/xfs/xfs_sysfs.c| 4 
 kernel/sched/core.c   | 4 +---
 kernel/sched/fair.c   | 2 --
 8 files changed, 1 insertion(+), 21 deletions(-)

--
2.39.3



[RFC PATCH 3/3] arch/powerpc: remove duplicate ifdefs

2024-01-18 Thread Shrikanth Hegde
When an ifdef is used in the manner below, the second one can be
considered a duplicate.

ifdef DEFINE_A
...code block...
ifdef DEFINE_A
...code block...
endif
...code block...
endif

There are a few places in arch/powerpc where this pattern was seen. In
addition, in paca.h, two #ifdef CONFIG_PPC_BOOK3S_64 blocks appeared back to
back; the two ifdefs are merged.

No functional change is intended here. It only aims to improve code
readability.

Signed-off-by: Shrikanth Hegde 
---
 arch/powerpc/include/asm/paca.h   | 4 
 arch/powerpc/kernel/asm-offsets.c | 2 --
 arch/powerpc/platforms/powermac/feature.c | 2 --
 arch/powerpc/xmon/xmon.c  | 2 --
 4 files changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index e667d455ecb4..1d58da946739 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -163,9 +163,7 @@ struct paca_struct {
u64 kstack; /* Saved Kernel stack addr */
u64 saved_r1;   /* r1 save for RTAS calls or PM or EE=0 */
u64 saved_msr;  /* MSR saved here by enter_rtas */
-#ifdef CONFIG_PPC64
u64 exit_save_r1;   /* Syscall/interrupt R1 save */
-#endif
 #ifdef CONFIG_PPC_BOOK3E_64
u16 trap_save;  /* Used when bad stack is encountered */
 #endif
@@ -214,8 +212,6 @@ struct paca_struct {
/* Non-maskable exceptions that are not performance critical */
u64 exnmi[EX_SIZE]; /* used for system reset (nmi) */
u64 exmc[EX_SIZE];  /* used for machine checks */
-#endif
-#ifdef CONFIG_PPC_BOOK3S_64
/* Exclusive stacks for system reset and machine check exception. */
void *nmi_emergency_sp;
void *mc_emergency_sp;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 9f14d95b8b32..f029755f9e69 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -246,9 +246,7 @@ int main(void)
OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
OFFSET(PACA_DSCR_DEFAULT, paca_struct, dscr_default);
-#ifdef CONFIG_PPC64
OFFSET(PACA_EXIT_SAVE_R1, paca_struct, exit_save_r1);
-#endif
 #ifdef CONFIG_PPC_BOOK3E_64
OFFSET(PACA_TRAP_SAVE, paca_struct, trap_save);
 #endif
diff --git a/arch/powerpc/platforms/powermac/feature.c b/arch/powerpc/platforms/powermac/feature.c
index 81c9fbae88b1..2cc257f75c50 100644
--- a/arch/powerpc/platforms/powermac/feature.c
+++ b/arch/powerpc/platforms/powermac/feature.c
@@ -2333,7 +2333,6 @@ static struct pmac_mb_def pmac_mb_defs[] = {
PMAC_TYPE_POWERMAC_G5,  g5_features,
0,
},
-#ifdef CONFIG_PPC64
{   "PowerMac7,3",  "PowerMac G5",
PMAC_TYPE_POWERMAC_G5,  g5_features,
0,
@@ -2359,7 +2358,6 @@ static struct pmac_mb_def pmac_mb_defs[] = {
0,
},
 #endif /* CONFIG_PPC64 */
-#endif /* CONFIG_PPC64 */
 };

 /*
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index b3b94cd37713..f413c220165c 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -643,10 +643,8 @@ static int xmon_core(struct pt_regs *regs, volatile int fromipi)
touch_nmi_watchdog();
} else {
cmd = 1;
-#ifdef CONFIG_SMP
if (xmon_batch)
cmd = batch_cmds(regs);
-#endif
if (!locked_down && cmd)
cmd = cmds(regs);
if (locked_down || cmd != 0) {
--
2.39.3



[RFC PATCH 1/3] sched: remove duplicate ifdefs

2024-01-18 Thread Shrikanth Hegde
When an ifdef is used in the manner below, the second one can be considered a
duplicate.

ifdef DEFINE_A
...code block...
ifdef DEFINE_A
...code block...
endif
...code block...
endif

The scheduler code contains this pattern in two places, so the second ifdef
there is a duplicate and is not needed. Also update a comment to reflect the
else case.

No functional change is intended here. It only aims to improve code
readability.
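
A minimal userspace sketch of why the inner check is redundant. DEFINE_A and
both function names are hypothetical and only mirror the pattern above; this
is not scheduler code. Once the outer guard has been taken, a nested guard on
the same symbol is always true, so removing it leaves the compiled result
unchanged.

```c
#define DEFINE_A 1

/* Counts how many of the guarded statements were compiled in. */
static int count_enabled_blocks(void)
{
	int n = 0;
#ifdef DEFINE_A
	n++;		/* first part of the outer block */
#ifdef DEFINE_A		/* redundant: already inside an identical guard */
	n++;		/* inner block */
#endif
	n++;		/* rest of the outer block */
#endif
	return n;
}

/* Same function with the duplicate guard removed. */
static int count_enabled_blocks_dedup(void)
{
	int n = 0;
#ifdef DEFINE_A
	n++;
	n++;
	n++;
#endif
	return n;
}
```

Both functions return the same value whether or not DEFINE_A is defined,
which is the sense in which the change is not functional.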

Signed-off-by: Shrikanth Hegde 
---
 kernel/sched/core.c | 4 +---
 kernel/sched/fair.c | 2 --
 2 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 038eeaf76d2d..1bfb186fd67f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1792,7 +1792,6 @@ static void cpu_util_update_eff(struct cgroup_subsys_state *css);
 #endif

 #ifdef CONFIG_SYSCTL
-#ifdef CONFIG_UCLAMP_TASK
 #ifdef CONFIG_UCLAMP_TASK_GROUP
 static void uclamp_update_root_tg(void)
 {
@@ -1898,7 +1897,6 @@ static int sysctl_sched_uclamp_handler(struct ctl_table *table, int write,
return result;
 }
 #endif
-#endif

 static int uclamp_validate(struct task_struct *p,
   const struct sched_attr *attr)
@@ -2065,7 +2063,7 @@ static void __init init_uclamp(void)
}
 }

-#else /* CONFIG_UCLAMP_TASK */
+#else /* !CONFIG_UCLAMP_TASK */
 static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
 static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
 static inline int uclamp_validate(struct task_struct *p,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f2bb83675e4a..6158a6752c25 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10166,10 +10166,8 @@ static int idle_cpu_without(int cpu, struct task_struct *p)
 * be computed and tested before calling idle_cpu_without().
 */

-#ifdef CONFIG_SMP
if (rq->ttwu_pending)
return 0;
-#endif

return 1;
 }
--
2.39.3



Re: [PATCH] powerpc/pseries/iommu: DLPAR ADD of pci device doesn't completely initialize pci_controller structure

2024-01-18 Thread Nathan Lynch
Hi Gaurav,

A couple minor comments below.

Gaurav Batra  writes:
> diff --git a/arch/powerpc/include/asm/ppc-pci.h b/arch/powerpc/include/asm/ppc-pci.h
> index ce2b1b5eebdd..55a2ba36e9c4 100644
> --- a/arch/powerpc/include/asm/ppc-pci.h
> +++ b/arch/powerpc/include/asm/ppc-pci.h
> @@ -29,6 +29,9 @@ void *pci_traverse_device_nodes(struct device_node *start,
>   void *(*fn)(struct device_node *, void *),
>   void *data);
>  extern void pci_devs_phb_init_dynamic(struct pci_controller *phb);
> +extern void pci_register_device_dynamic(struct pci_controller *phb);
> +extern void pci_unregister_device_dynamic(struct pci_controller *phb);
> +
>  
>  /* From rtas_pci.h */
>  extern void init_pci_config_tokens (void);
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index ebe259bdd462..342739fe74c4 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -1388,6 +1388,21 @@ static const struct attribute_group *spapr_tce_iommu_groups[] = {
>   NULL,
>  };
>  
> +void pci_register_device_dynamic(struct pci_controller *phb)
> +{
> + iommu_device_sysfs_add(&phb->iommu, phb->parent,
> + spapr_tce_iommu_groups, "iommu-phb%04x",
> + phb->global_number);
> + iommu_device_register(&phb->iommu, &spapr_tce_iommu_ops,
> + phb->parent);
> +}
> +
> +void pci_unregister_device_dynamic(struct pci_controller *phb)
> +{
> + iommu_device_unregister(&phb->iommu);
> + iommu_device_sysfs_remove(&phb->iommu);
> +}
> +
>  /*
>   * This registers IOMMU devices of PHBs. This needs to happen
>   * after core_initcall(iommu_init) + postcore_initcall(pci_driver_init) and
> diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c b/arch/powerpc/platforms/pseries/pci_dlpar.c
> index 4ba824568119..ec70ca435b7e 100644
> --- a/arch/powerpc/platforms/pseries/pci_dlpar.c
> +++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
> @@ -35,6 +35,8 @@ struct pci_controller *init_phb_dynamic(struct device_node *dn)
>  
>   pseries_msi_allocate_domains(phb);
>  
> + pci_register_device_dynamic(phb);
> +
>   /* Create EEH devices for the PHB */
>   eeh_phb_pe_create(phb);
>  
> @@ -76,6 +78,8 @@ int remove_phb_dynamic(struct pci_controller *phb)
>   }
>   }
>  
> + pci_unregister_device_dynamic(phb);
> +
>   pseries_msi_free_domains(phb);
>  
>   /* Keep a reference so phb isn't freed yet */
> --

The change overall looks correct to me, but:

1. I don't think the new functions should use the "pci_" prefix; that
   should probably be reserved for code in the core PCI subsystem. Some
   existing code in arch/powerpc/kernel/iommu.c uses "ppc_iommu_" and
   "spapr_tce_", maybe one of those would work instead?

2. Your pci_register_device_dynamic() duplicates code from
   spapr_tce_setup_phb_iommus_initcall():

list_for_each_entry(hose, &hose_list, list_node) {
iommu_device_sysfs_add(&hose->iommu, hose->parent,
   spapr_tce_iommu_groups, "iommu-phb%04x",
   hose->global_number);
iommu_device_register(&hose->iommu, &spapr_tce_iommu_ops,
  hose->parent);
}

  Can the loop body be factored into a common function that can be
  used in both paths?
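
  A rough userspace sketch of that factoring, purely for illustration. The
  struct is a stand-in for the kernel's struct pci_controller, the stub
  replaces the iommu_device_sysfs_add() + iommu_device_register() pair, and
  ppc_iommu_register_device() is a hypothetical name, not an existing kernel
  function:

```c
#include <stddef.h>

/* Stand-in for the kernel's struct pci_controller; only what the sketch needs. */
struct pci_controller {
	int global_number;
	int iommu_registered;
};

/* Stub standing in for iommu_device_sysfs_add() + iommu_device_register(). */
static void stub_iommu_register(struct pci_controller *phb)
{
	phb->iommu_registered = 1;
}

/* Hypothetical common helper: the former loop body, callable per PHB. */
static void ppc_iommu_register_device(struct pci_controller *phb)
{
	stub_iommu_register(phb);
}

/* Boot path: walk all known PHBs (an array here, hose_list in the kernel). */
static void setup_phb_iommus(struct pci_controller *hoses, size_t n)
{
	for (size_t i = 0; i < n; i++)
		ppc_iommu_register_device(&hoses[i]);
}

/* DLPAR add path: register the single newly added PHB. */
static void init_phb_dynamic_sketch(struct pci_controller *phb)
{
	ppc_iommu_register_device(phb);
}
```

  With this shape, the initcall loop and the DLPAR path share one
  registration routine, so the two cannot drift apart.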


Re: [PATCH -fixes v2] RISC-V: KVM: Require HAVE_KVM

2024-01-18 Thread Sean Christopherson
On Thu, Jan 18, 2024, Anup Patel wrote:
> On Thu, Jan 4, 2024 at 6:07 PM Andrew Jones  wrote:
> >
> > KVM requires EVENTFD, which is selected by HAVE_KVM. Other KVM
> > supporting architectures select HAVE_KVM and then their KVM
> > Kconfigs ensure it's there with a depends on HAVE_KVM. Make RISCV
> > consistent with that approach which fixes configs which have KVM
> > but not EVENTFD, as was discovered with a randconfig test.
> >
> > Fixes: 99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
> > Reported-by: Randy Dunlap 
> > Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d382553...@infradead.org/
> > Signed-off-by: Andrew Jones 
> 
> Queued this patch for Linux-6.8

That should be unnecessary.  Commit caadf876bb74 ("KVM: introduce CONFIG_KVM_COMMON"),
which is in Paolo's pull request for 6.8, addresses the EVENTFD issue.  And the
rest of Paolo's series[*], which presumably will get queued for 6.9, eliminates
HAVE_KVM entirely.

[*] https://lore.kernel.org/all/20240108124740.114453-6-pbonz...@redhat.com


Re: [PATCH v6] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

2024-01-18 Thread Nathan Lynch
Haren Myneni  writes:
> VAS allocate, modify and deallocate HCALLs return
> H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
> delay and expect the OS to reissue the HCALL after that delay. But
> msleep() will often sleep at least 20 msecs even though the
> hypervisor suggests the OS reissue these HCALLs after 1 or 10 msecs.
>
> The open and close VAS window functions hold a mutex and then issue
> these HCALLs. So these operations can take longer than
> necessary when multiple threads call the open or close window APIs
> simultaneously, which can especially hurt performance when
> open/close is repeated for each compression request.
>
> Multiple tasks can open / close VAS windows at the same time,
> depending on the available VAS credits. For example, a 240
> core system provides 4800 VAS credits. It means 4800 tasks can
> execute open VAS window HCALLs with the mutex. Since each
> msleep() will often sleep more than 20 msecs, some tasks
> wait more than 120 secs to acquire the mutex. This can cause hung
> task traces in dmesg due to mutex contention around
> open/close HCALLs.
>
> Instead of msleep(), use usleep_range() to ensure the sleep matches
> the expected value before issuing the HCALL again. Since each
> task then sleeps 10 msecs at most, this patch allows more tasks to
> issue open/close VAS calls without any hung traces in
> dmesg.
>
> Signed-off-by: Haren Myneni 
> Suggested-by: Nathan Lynch 

Reviewed-by: Nathan Lynch 

IMO this can be converted to a more generic helper in the future, should
one emerge.
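
For a rough feel of the numbers in the commit message, here is a
back-of-the-envelope sketch. It assumes, as a simplification not stated in
the patch, that every task sleeps exactly once while holding the mutex and
that the waiters are fully serialized; worst_case_wait_ms() is a hypothetical
helper, not kernel code.

```c
/*
 * Upper bound, in milliseconds, on how long the last of `tasks` waiters
 * spends queued, assuming each holder sleeps once for `sleep_ms` while
 * holding the mutex.
 */
static unsigned long worst_case_wait_ms(unsigned long tasks,
					unsigned long sleep_ms)
{
	return tasks * sleep_ms;
}
```

Under those assumptions, 4800 tasks at 20 ms each give 96000 ms (96 s), so
any retry or longer sleep pushes the tail past the 120 secs cited above,
while a 10 ms cap gives 48000 ms (48 s).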


[PATCH RFC 0/5] dump_stack: Allow runtime updates of the hardware description

2024-01-18 Thread Nathan Lynch via B4 Relay
When the kernel emits a stack trace, typically it includes a hardware
description string, e.g.

  Kernel panic - not syncing: sysrq triggered crash
  CPU: 6 PID: 46433 Comm: bash Tainted: GW  6.7.0-rc2+ #83
> Hardware name: IBM,9040-MR9 POWER9 (architected) 0x4e2102 0xf05 of:IBM,FW950.01 (VM950_047) hv:phyp pSeries
  Call Trace:
   dump_stack_lvl+0xc4/0x170 (unreliable)
   panic+0x39c/0x584
   sysrq_handle_crash+0x80/0xe0
   __handle_sysrq+0x208/0x4bc
   [...]

This string is a statically allocated buffer populated during boot by
arch code calling dump_stack_set_arch_desc(). For most platforms this
is sufficient.

But the string may become inaccurate on the IBM PowerVM platform due
to live migration between machine models and firmware versions. Stack
dumps emitted after a migration reflect the machine on which the
kernel booted, not necessarily the machine on which it is currently
running. This is potentially confusing for anyone investigating
kernel issues on the platform.

To address this, this series introduces a new function that safely
updates the hardware description string and updates the powerpc
pseries platform code to call it after a migration. The series also
includes changes addressing minor latent issues identified during the
implementation.

Platforms which do not need the new functionality remain unchanged.

For this initial version at least, the powerpc/pseries part includes
some "self-test" code that 1. verifies that reconstructing the
hardware description string late in boot matches the one that was
built earlier, and 2. fully exercises the update path before any
migrations occur. This could be dropped or made configurable in the
future.

Signed-off-by: Nathan Lynch 
---
Nathan Lynch (5):
  dump_stack: Make arch description buffer __ro_after_init
  dump_stack: Allow update of arch description string at runtime
  powerpc/prom: Add CPU info to hardware description string later
  powerpc/pseries: Prepare pseries_add_hw_description() for runtime use
  powerpc/pseries: Update hardware description string after migration

 arch/powerpc/kernel/prom.c| 12 +++--
 arch/powerpc/platforms/pseries/mobility.c |  5 ++
 arch/powerpc/platforms/pseries/pseries.h  |  1 +
 arch/powerpc/platforms/pseries/setup.c| 80 +--
 include/linux/printk.h|  5 ++
 lib/dump_stack.c  | 57 --
 6 files changed, 146 insertions(+), 14 deletions(-)
---
base-commit: 44a1aad2fe6c10bfe0589d8047057b10a4c18a19
change-id: 20240111-update-dump-stack-arch-str-7f0880d23f30

Best regards,
-- 
Nathan Lynch 



[PATCH RFC 5/5] powerpc/pseries: Update hardware description string after migration

2024-01-18 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

Introduce code that rebuilds the short hardware description printed by
stack traces. This sort of duplicates some code from boot (prom.c
mainly), but that code populates the string as early as possible using
APIs that aren't available later. So sharing all the code between the
boot and runtime versions isn't feasible.

To prevent "drift" between the boot and runtime versions, rebuild the
description using the new runtime APIs in a late initcall and warn if
it doesn't match the one built earlier. The initcall also invokes
dump_stack_update_arch_desc() twice to fully exercise it before any
partition migration occurs. These checks could be dropped or made
configurable later.

Call pseries_update_hw_description() immediately after updating the
device tree when resuming from a partition migration.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/mobility.c |  5 +++
 arch/powerpc/platforms/pseries/pseries.h  |  1 +
 arch/powerpc/platforms/pseries/setup.c| 70 +++
 3 files changed, 76 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 1798f0f14d58..ff573cb5aee5 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -378,6 +378,11 @@ void post_mobility_fixup(void)
rc = pseries_devicetree_update(MIGRATION_SCOPE);
if (rc)
pr_err("device tree update failed: %d\n", rc);
+   /*
+* Rebuild the hardware description printed in stack traces
+* using the updated device tree.
+*/
+   pseries_update_hw_description();
 
cacheinfo_rebuild();
 
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index bba4ad192b0f..810a64fccc7e 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -56,6 +56,7 @@ extern int dlpar_acquire_drc(u32 drc_index);
 extern int dlpar_release_drc(u32 drc_index);
 extern int dlpar_unisolate_drc(u32 drc_index);
 extern void post_mobility_fixup(void);
+void pseries_update_hw_description(void);
 
 void queue_hotplug_event(struct pseries_hp_errorlog *hp_errlog);
 int handle_dlpar_errorlog(struct pseries_hp_errorlog *hp_errlog);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index 9ae1951f8312..72177411026e 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -1034,6 +1034,76 @@ static void pseries_add_hw_description(struct seq_buf *sb)
seq_buf_printf(sb, "hv:phyp ");
 }
 
+static void pseries_rebuild_hw_desc(struct seq_buf *sb)
+{
+   struct device_node *cpudn, *root;
+   const char *model;
+   u32 cpu_version;
+
+   seq_buf_clear(sb);
+
+   root = of_find_node_by_path("/");
+   if (!of_property_read_string(root, "model", &model))
+   seq_buf_printf(sb, "%s ", model);
+   of_node_put(root);
+
+   seq_buf_printf(sb, "%s 0x%04lx ", cur_cpu_spec->cpu_name, mfspr(SPRN_PVR));
+
+   cpudn = of_get_next_cpu_node(NULL);
+   if (!of_property_read_u32(cpudn, "cpu-version", &cpu_version)) {
+   if ((cpu_version & 0xff00) == 0x0f00)
+   seq_buf_printf(sb, "0x%04x ", cpu_version);
+   }
+   of_node_put(cpudn);
+
+   pseries_add_hw_description(sb);
+
+   seq_buf_puts(sb, ppc_md.name);
+}
+
+void pseries_update_hw_description(void)
+{
+   struct seq_buf sb = { // todo: use DECLARE_SEQ_BUF() once it's fixed
+   .buffer = (char[128]) { 0 },
+   .size = sizeof(char[128]),
+   };
+
+   pseries_rebuild_hw_desc(&sb);
+   dump_stack_update_arch_desc("%s", seq_buf_str(&sb));
+}
+
+static int __init pseries_test_update_hw_desc(void)
+{
+   struct seq_buf sb = { // todo: use DECLARE_SEQ_BUF() once it's fixed
+   .buffer = (char[128]) { 0 },
+   .size = sizeof(char[128]),
+   };
+   bool mismatch;
+
+   /*
+* Ensure the rebuilt description matches the one built during
+* boot.
+*/
+   pseries_rebuild_hw_desc(&sb);
+
+   mismatch = strcmp(seq_buf_str(&ppc_hw_desc), seq_buf_str(&sb));
+   if (WARN(mismatch, "rebuilt hardware description string mismatch")) {
+   pr_err("  boot:'%s'\n", ppc_hw_desc.buffer);
+   pr_err("  runtime: '%s'\n", sb.buffer);
+   return -EINVAL;
+   }
+
+   /*
+* Invoke dump_stack_update_arch_desc() *twice* to ensure it
+* exercises the free path.
+*/
+   dump_stack_update_arch_desc("%s", sb.buffer);
+   dump_stack_update_arch_desc("%s", sb.buffer);
+
+   return 0;
+}
+late_initcall(pseries_test_update_hw_desc);
+
 /*
  * Early initialization.  Relocation is on but do not reference unbolted pages
  */

-- 
2.43.0



[PATCH RFC 2/5] dump_stack: Allow update of arch description string at runtime

2024-01-18 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

The IBM PowerVM platform (targeted by powerpc/pseries) exposes the
physical machine model and firmware version to partitions (guests),
and this information is used to populate the arch description string,
e.g.

  IBM,8408-E8E POWER8E (raw) 0x4b0201 0xf04 \
of:IBM,FW860.50 (SV860_146) hv:phyp pSeries

The platform supports live migration of partitions between different
machine models and firmware versions, so the arch description string
set at boot can become inaccurate, potentially misleading anyone who's
analyzing stack traces produced after a migration.

Introduce an RCU-guarded pointer to the current arch description
string, initializing it to the static buffer populated at boot. Add to
dump_stack_print_info() an RCU read-side critical section that accesses
the buffer through this pointer. The majority of architectures, which
don't need to update the string after boot, incur only an additional
indirection.

As for platforms which do need that ability, they can use
dump_stack_update_arch_desc(), which allocates and formats a new
buffer, updates the pointer, and if appropriate frees the previous
buffer.
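
The update path can be modeled in userspace roughly as follows. This is a
single-threaded sketch under stated assumptions: there is no RCU here, so the
plain pointer assignment stands in for rcu_replace_pointer() and free()
stands in for the deferred kfree_rcu_mightsleep(). The names boot_desc,
cur_desc, update_desc() and desc_is() are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char boot_desc[128] = "boot-time description";
static const char *cur_desc = boot_desc;

/* Userspace model of the update path. Returns 0 on success, -1 on failure. */
static int update_desc(const char *fmt, ...)
{
	va_list args;
	const char *old;
	char *new;
	int len;

	/* First pass: measure the formatted length. */
	va_start(args, fmt);
	len = vsnprintf(NULL, 0, fmt, args);
	va_end(args);
	if (len < 0)
		return -1;

	new = malloc((size_t)len + 1);
	if (!new)
		return -1;

	/* Second pass: format into the freshly allocated buffer. */
	va_start(args, fmt);
	vsnprintf(new, (size_t)len + 1, fmt, args);
	va_end(args);

	/* The kernel version swaps under a lock with rcu_replace_pointer(). */
	old = cur_desc;
	cur_desc = new;

	/* Never free the static buffer that was populated at boot. */
	if (old != boot_desc)
		free((void *)old);	/* kernel: deferred until readers finish */
	return 0;
}

/* Reader side: compare the current description against an expected string. */
static int desc_is(const char *s)
{
	return strcmp(cur_desc, s) == 0;
}
```

Calling update_desc() twice in a row exercises both branches: the first call
replaces the static buffer without freeing it, and the second call frees the
buffer allocated by the first.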

Signed-off-by: Nathan Lynch 
---
 include/linux/printk.h |  5 +
 lib/dump_stack.c   | 54 ++
 2 files changed, 55 insertions(+), 4 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 8ef499ab3c1e..6138ae019d2a 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -187,6 +187,7 @@ u32 log_buf_len_get(void);
 void log_buf_vmcoreinfo_setup(void);
 void __init setup_log_buf(int early);
 __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
+__printf(1, 2) void dump_stack_update_arch_desc(const char *fmt, ...);
 void dump_stack_print_info(const char *log_lvl);
 void show_regs_print_info(const char *log_lvl);
 extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold;
@@ -253,6 +254,10 @@ static inline __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...)
 {
 }
 
+static inline __printf(1, 2) void dump_stack_update_arch_desc(const char *fmt, ...)
+{
+}
+
 static inline void dump_stack_print_info(const char *log_lvl)
 {
 }
diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 1057f102f6f2..bd497e7797ee 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -8,15 +8,18 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
 static char dump_stack_arch_desc_str[128] __ro_after_init;
+static const char *dump_stack_arch_desc_ptr = dump_stack_arch_desc_str;
 
 /**
  * dump_stack_set_arch_desc - set arch-specific str to show with task dumps
@@ -28,7 +31,7 @@ static char dump_stack_arch_desc_str[128] __ro_after_init;
  * arch wants to make use of such an ID string, it should initialize this
  * as soon as possible during boot.
  */
-void __init dump_stack_set_arch_desc(const char *fmt, ...)
+void dump_stack_set_arch_desc(const char *fmt, ...)
 {
va_list args;
 
@@ -38,6 +41,45 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
va_end(args);
 }
 
+/**
+ * dump_stack_update_arch_desc() - Update the arch description string at runtime.
+ * @fmt: printf-style format string
+ * @...: arguments for the format string
+ *
+ * A runtime counterpart of dump_stack_set_arch_desc(). Arch code
+ * should use this when the arch description set at boot potentially
+ * has become inaccurate, such as after a guest migration.
+ *
+ * Context: May sleep.
+ */
+void dump_stack_update_arch_desc(const char *fmt, ...)
+{
+   static DEFINE_SPINLOCK(arch_desc_update_lock);
+   const char *old;
+   const char *new;
+   va_list args;
+
+   va_start(args, fmt);
+   new = kvasprintf(GFP_KERNEL, fmt, args);
+   va_end(args);
+
+   if (!new)
+   return;
+
+   spin_lock(&arch_desc_update_lock);
+   old = rcu_replace_pointer(dump_stack_arch_desc_ptr, new,
+ lockdep_is_held(&arch_desc_update_lock));
+   spin_unlock(&arch_desc_update_lock);
+
+   /*
+* Avoid freeing the static buffer initialized during boot.
+*/
+   if (old == dump_stack_arch_desc_str)
+   return;
+
+   kfree_rcu_mightsleep(old);
+}
+
 #if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID)
 #define BUILD_ID_FMT " %20phN"
 #define BUILD_ID_VAL vmlinux_build_id
@@ -55,6 +97,8 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
  */
 void dump_stack_print_info(const char *log_lvl)
 {
+   const char *arch_str;
+
printk("%sCPU: %d PID: %d Comm: %.20s %s%s %s %.*s" BUILD_ID_FMT "\n",
   log_lvl, raw_smp_processor_id(), current->pid, current->comm,
   kexec_crash_loaded() ? "Kdump: loaded " : "",
@@ -63,9 +107,11 @@ void dump_stack_print_info(const char *log_lvl)
   (int)strcspn(init_utsname()->version, " "),
   init_utsname()->version, BUILD_ID_

[PATCH RFC 3/5] powerpc/prom: Add CPU info to hardware description string later

2024-01-18 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

cur_cpu_spec->cpu_name is appended to ppc_hw_desc before cur_cpu_spec
has taken on its final value. This is illustrated on pseries by
comparing the CPU name as reported at boot ("POWER8E (raw)") to the
contents of /proc/cpuinfo ("POWER8 (architected)"):

  $ dmesg | grep Hardware
  Hardware name: IBM,8408-E8E POWER8E (raw) 0x4b0201 0xf04 \
of:IBM,FW860.50 (SV860_146) hv:phyp pSeries

  $ grep -m 1 ^cpu /proc/cpuinfo
  cpu : POWER8 (architected), altivec supported

Some 44x models would appear to be affected as well; see
identical_pvr_fixup().

This results in incorrect CPU information in stack dumps --
ppc_hw_desc is an input to dump_stack_set_arch_desc().

Delay gathering the CPU name until after all potential calls to
identify_cpu().
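
The ordering problem can be illustrated with a small userspace model. The
struct, the two spec entries and both functions are hypothetical stand-ins
(the CPU names are taken from the example output above); the point is only
that a name recorded before the last identification step goes stale.

```c
#include <stdio.h>
#include <string.h>

struct cpu_spec {
	const char *cpu_name;
};

static const struct cpu_spec raw_spec  = { "POWER8E (raw)" };
static const struct cpu_spec arch_spec = { "POWER8 (architected)" };
static const struct cpu_spec *cur_spec = &raw_spec;

/* Stand-in for identify_cpu(): switches cur_spec to its final entry. */
static void identify_cpu_sketch(void)
{
	cur_spec = &arch_spec;
}

/*
 * Formats the CPU name into out, either before (record_late == 0) or
 * after (record_late != 0) the final identification step.
 */
static void describe(char *out, size_t n, int record_late)
{
	cur_spec = &raw_spec;		/* state at early boot */
	if (!record_late)
		snprintf(out, n, "%s", cur_spec->cpu_name);
	identify_cpu_sketch();
	if (record_late)
		snprintf(out, n, "%s", cur_spec->cpu_name);
}
```

Recording early yields the raw name, recording late yields the architected
one, which matches the dmesg versus /proc/cpuinfo discrepancy shown above.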

Signed-off-by: Nathan Lynch 
Fixes: bd649d40e0f2 ("powerpc: Add PVR & CPU name to hardware description")
---
 arch/powerpc/kernel/prom.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 0b5878c3125b..c12b4434336f 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -327,6 +327,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
  void *data)
 {
const char *type = of_get_flat_dt_prop(node, "device_type", NULL);
+   const __be32 *cpu_version = NULL;
const __be32 *prop;
const __be32 *intserv;
int i, nthreads;
@@ -398,7 +399,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
prop = of_get_flat_dt_prop(node, "cpu-version", NULL);
if (prop && (be32_to_cpup(prop) & 0xff00) == 0x0f00) {
identify_cpu(0, be32_to_cpup(prop));
-   seq_buf_printf(&ppc_hw_desc, "0x%04x ", be32_to_cpup(prop));
+   cpu_version = prop;
}
 
check_cpu_feature_properties(node);
@@ -409,6 +410,12 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
}
 
identical_pvr_fixup(node);
+
+   // We can now add the CPU name & PVR to the hardware description
+   seq_buf_printf(&ppc_hw_desc, "%s 0x%04lx ", cur_cpu_spec->cpu_name, mfspr(SPRN_PVR));
+   if (cpu_version)
+   seq_buf_printf(&ppc_hw_desc, "0x%04x ", be32_to_cpup(cpu_version));
+
init_mmu_slb_size(node);
 
 #ifdef CONFIG_PPC64
@@ -846,9 +853,6 @@ void __init early_init_devtree(void *params)
 
dt_cpu_ftrs_scan();
 
-   // We can now add the CPU name & PVR to the hardware description
-   seq_buf_printf(&ppc_hw_desc, "%s 0x%04lx ", cur_cpu_spec->cpu_name, mfspr(SPRN_PVR));
-
/* Retrieve CPU related informations from the flat tree
 * (altivec support, boot CPU ID, ...)
 */

-- 
2.43.0



[PATCH RFC 4/5] powerpc/pseries: Prepare pseries_add_hw_description() for runtime use

2024-01-18 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

pseries_add_hw_description() will be used after boot to update the
hardware description string emitted in stack dumps. Remove the __init
and make it take a seq_buf * parameter instead of referencing
ppc_hw_desc directly.
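
The shape of the change, sketched in userspace. struct buf is a minimal
stand-in for the kernel's struct seq_buf, and buf_append(),
add_hw_description() and buf_equal() are hypothetical names; only the idea of
passing the destination as a parameter comes from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Minimal stand-in for the kernel's struct seq_buf. */
struct buf {
	char data[128];
	size_t len;
};

/* Append s, truncating if it does not fit; data stays NUL-terminated. */
static void buf_append(struct buf *b, const char *s)
{
	size_t room = sizeof(b->data) - b->len;
	int n = snprintf(b->data + b->len, room, "%s", s);

	if (n < 0)
		return;
	b->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Takes the destination as a parameter, so boot and runtime callers share it. */
static void add_hw_description(struct buf *b)
{
	buf_append(b, "hv:phyp ");
}

/* Mirrors the idea of checking a rebuilt string against the boot-time one. */
static int buf_equal(const struct buf *a, const struct buf *b)
{
	return strcmp(a->data, b->data) == 0;
}
```

Because the function no longer hard-codes a global buffer, the same code can
fill the boot-time description and a temporary buffer built later.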

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/platforms/pseries/setup.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index ecea85c74c43..9ae1951f8312 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -1007,7 +1007,7 @@ static void __init pSeries_cmo_feature_init(void)
pr_debug(" <- fw_cmo_feature_init()\n");
 }
 
-static void __init pseries_add_hw_description(void)
+static void pseries_add_hw_description(struct seq_buf *sb)
 {
struct device_node *dn;
const char *s;
@@ -1015,7 +1015,7 @@ static void __init pseries_add_hw_description(void)
dn = of_find_node_by_path("/openprom");
if (dn) {
if (of_property_read_string(dn, "model", &s) == 0)
-   seq_buf_printf(&ppc_hw_desc, "of:%s ", s);
+   seq_buf_printf(sb, "of:%s ", s);
 
of_node_put(dn);
}
@@ -1023,7 +1023,7 @@ static void __init pseries_add_hw_description(void)
dn = of_find_node_by_path("/hypervisor");
if (dn) {
if (of_property_read_string(dn, "compatible", &s) == 0)
-   seq_buf_printf(&ppc_hw_desc, "hv:%s ", s);
+   seq_buf_printf(sb, "hv:%s ", s);
 
of_node_put(dn);
return;
@@ -1031,7 +1031,7 @@ static void __init pseries_add_hw_description(void)
 
if (of_property_read_bool(of_root, "ibm,powervm-partition") ||
of_property_read_bool(of_root, "ibm,fw-net-version"))
-   seq_buf_printf(&ppc_hw_desc, "hv:phyp ");
+   seq_buf_printf(sb, "hv:phyp ");
 }
 
 /*
@@ -1041,7 +1041,7 @@ static void __init pseries_init(void)
 {
pr_debug(" -> pseries_init()\n");
 
-   pseries_add_hw_description();
+   pseries_add_hw_description(&ppc_hw_desc);
 
 #ifdef CONFIG_HVC_CONSOLE
if (firmware_has_feature(FW_FEATURE_LPAR))

-- 
2.43.0



[PATCH RFC 1/5] dump_stack: Make arch description buffer __ro_after_init

2024-01-18 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

The static hardware description buffer is populated by arch code
during boot and should not change afterwards, so mark it
__ro_after_init.

Signed-off-by: Nathan Lynch 
---
 lib/dump_stack.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/dump_stack.c b/lib/dump_stack.c
index 83471e81501a..1057f102f6f2 100644
--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -15,7 +16,7 @@
 #include 
 #include 
 
-static char dump_stack_arch_desc_str[128];
+static char dump_stack_arch_desc_str[128] __ro_after_init;
 
 /**
  * dump_stack_set_arch_desc - set arch-specific str to show with task dumps

-- 
2.43.0



Re: [PATCH v2 06/13] mm/gup: Drop folio_fast_pin_allowed() in hugepd processing

2024-01-18 Thread Ryan Roberts
On 17/01/2024 13:22, Jason Gunthorpe wrote:
> On Tue, Jan 16, 2024 at 06:32:32PM +, Christophe Leroy wrote:
>>>> hugepd is a page directory dedicated to huge pages, where you have huge
>>>> pages listed instead of regular pages. For instance, on powerpc 32 with
>>>> each PGD entry covering 4Mbytes, a regular page table has 1024 PTEs. A
>>>> hugepd for 512k is a page table with 8 entries.
>>>>
>>>> And for 8Mbytes entries, the hugepd is a page table with only one entry.
>>>> And 2 consecutive PGD entries will point to the same hugepd to cover the
>>>> entire 8Mbytes.
>>>
>>> That still sounds a lot like the ARM thing - except ARM replicates the
>>> entry, you also said PPC replicates the entry like ARM to get to the
>>> 8M?
>>
>> Is it like ARM ? Not sure. The PTE is not in the PGD it must be in a L2 
>> directory, even for 8M.
> 
> Your diagram looks almost exactly like ARM to me.
> 
> The key thing is that the address for the L2 Table is *always* formed as:
> 
>L2 Table Base << 12 + L2 Index << 2 + 00
> 
> Then the L2 Descriptor must contain bits indicating the page
> size. The L2 Descriptor is replicated to every 4k entry that the page
> size covers.
> 
> The only difference I see is the 8M case which has a page size greater
> than a single L1 entry.
> 
>> Yes that's how it works on powerpc. For 8xx we used to do that for both 
>> 8M and 512k pages. Now for 512k pages we do kind of like ARM (which 
>> means replicating the entry 128 times) as that's needed to allow mixing 
>> different page sizes for a given PGD entry.
> 
> Right, you want to have granular page sizes or it becomes unusable in
> the general case
>  
>> But for 8M pages that would mean replicating the entry 2048 times. 
>> That's a bit too much isn't it ?
> 
> Indeed, de-duplicating the L2 Table is a neat optimization.
> 
>>> So if you imagine a pmd_leaf(), pmd_leaf_size() and a pte_leaf_size()
>>> that would return enough information for both.
>>
>> pmd_leaf() ? Unless I'm missing something I can't do leaf at PMD (PGD) 
>> level. It must be a two-level process even for pages bigger than a PMD 
>> entry.
> 
> Right, this is the normal THP/hugetlb situation on x86/etc. It
> wouldn't apply here since it seems the HW doesn't have a bit in the L1
> descriptor to indicate leaf.
> 
> Instead for PPC this hugepd stuff should start to follow Ryan's
> generic work for ARM contig:
> 
> https://lore.kernel.org/all/20231218105100.172635-1-ryan.robe...@arm.com/
> 
> Specifically the arch implementation:
> 
> https://lore.kernel.org/linux-mm/20231218105100.172635-15-ryan.robe...@arm.com/
> 
> Ie the arch should ultimately wire up the replication and variable
> page size bits within its implementation of set_ptes(). set_ptes()
> gets a contiguous run of addresses and should install it with maximum
> use of the variable page sizes. The core code will start to call
> set_ptes() in more cases as Ryan gets along his project.

Note that it's not just set_ptes() that you want to batch; there are other calls
that can benefit too. See patches 2 and 3 in the series you linked (although
I'm working with DavidH on this and the details are going to change a little).

> 
> For the purposes of GUP, where we are today and where we are going,
> it would be much better to not have a special PPC specific "hugepd"
> parser. Just process each of the 4k replicates one by one like ARM is
> starting with.
> 
> The arch would still have to return the correct page address from
> pte_phys() which I think Ryan is doing by having the replicates encode
> the full 4k based address in each entry.

Yes; although it's actually also a requirement of the arm architecture. Since the
contig bit is just a hint that the HW may or may not take any notice of, the
page tables have to be correct for the case where the HW just reads them in base
pages. Fixing up the bottom bits should be trivial using the PTE pointer, if
needed for ppc.

> The HW will ignore those low
> bits and pte_phys() then works properly. This would work for PPC as
> well, excluding the 8M optimization.
> 
> Going forward I'd expect to see some pte_page_size() that returns the
> size bits and GUP can have logic to skip reading replicates.

Yes; pte_batch_remaining() in patch 2 is an attempt at this. But as I said the
details will likely change a little.

> 
> The advantage of all this is that it stops making the feature special
> and the work Ryan is doing to generically push larger folios into
> set_ptes will become usable on these PPC platforms as well. And we can
> kill the PPC specific hugepd.
> 
> Jason
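
The entry counts quoted in this thread all come from the same division, which
can be checked with a small sketch. entries_covering() is a hypothetical
helper; the sizes are the ones mentioned above (4k base pages, 512k and 8M
huge pages, 4M per PGD entry).

```c
/* Number of `unit`-sized slots needed to cover `span` bytes. */
static unsigned long entries_covering(unsigned long span, unsigned long unit)
{
	return span / unit;
}
```

With these sizes, a 4M PGD entry holds 1024 4k PTEs or 8 512k hugepd
entries, a 512k page replicated per 4k slot needs 128 copies, an 8M page
would need 2048, and an 8M page spans 2 PGD entries.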



[PATCH v12 15/15] media: vim2m-audio: add virtual driver for audio memory to memory

2024-01-18 Thread Shengjiu Wang
The audio memory-to-memory virtual driver uses the video memory-to-memory
virtual driver vim2m.c as an example. The main differences are that the
device type is VFL_TYPE_AUDIO and the device cap type is V4L2_CAP_AUDIO_M2M.

The device_run function is a dummy function that simply
copies the data from the input buffer to the output buffer.
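
As a userspace illustration of such a pass-through step (not the driver's
actual code; dummy_device_run() and its parameters are hypothetical), the
copy amounts to:

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy one input buffer to one output buffer, clamped to the output
 * capacity. Returns the number of bytes copied.
 */
static size_t dummy_device_run(const void *src, size_t src_len,
			       void *dst, size_t dst_cap)
{
	size_t n = src_len < dst_cap ? src_len : dst_cap;

	memcpy(dst, src, n);
	return n;
}
```

The clamp to the destination capacity is an assumption of this sketch; it
keeps the copy safe when the two buffers differ in size.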

Signed-off-by: Shengjiu Wang 
---
 drivers/media/test-drivers/Kconfig   |  10 +
 drivers/media/test-drivers/Makefile  |   1 +
 drivers/media/test-drivers/vim2m-audio.c | 793 +++
 3 files changed, 804 insertions(+)
 create mode 100644 drivers/media/test-drivers/vim2m-audio.c

diff --git a/drivers/media/test-drivers/Kconfig 
b/drivers/media/test-drivers/Kconfig
index 5a5379524bde..b6b52a7ca042 100644
--- a/drivers/media/test-drivers/Kconfig
+++ b/drivers/media/test-drivers/Kconfig
@@ -16,6 +16,16 @@ config VIDEO_VIM2M
  This is a virtual test device for the memory-to-memory driver
  framework.
 
+config VIDEO_VIM2M_AUDIO
+   tristate "Virtual Memory-to-Memory Driver For Audio"
+   depends on VIDEO_DEV
+   select VIDEOBUF2_VMALLOC
+   select V4L2_MEM2MEM_DEV
+   select MEDIA_CONTROLLER
+   help
+ This is a virtual audio test device for the memory-to-memory driver
+ framework.
+
 source "drivers/media/test-drivers/vicodec/Kconfig"
 source "drivers/media/test-drivers/vimc/Kconfig"
 source "drivers/media/test-drivers/vivid/Kconfig"
diff --git a/drivers/media/test-drivers/Makefile 
b/drivers/media/test-drivers/Makefile
index 740714a4584d..0c61c9ada3e1 100644
--- a/drivers/media/test-drivers/Makefile
+++ b/drivers/media/test-drivers/Makefile
@@ -10,6 +10,7 @@ obj-$(CONFIG_DVB_VIDTV) += vidtv/
 
 obj-$(CONFIG_VIDEO_VICODEC) += vicodec/
 obj-$(CONFIG_VIDEO_VIM2M) += vim2m.o
+obj-$(CONFIG_VIDEO_VIM2M_AUDIO) += vim2m-audio.o
 obj-$(CONFIG_VIDEO_VIMC) += vimc/
 obj-$(CONFIG_VIDEO_VIVID) += vivid/
 obj-$(CONFIG_VIDEO_VISL) += visl/
diff --git a/drivers/media/test-drivers/vim2m-audio.c 
b/drivers/media/test-drivers/vim2m-audio.c
new file mode 100644
index ..6361df6320b3
--- /dev/null
+++ b/drivers/media/test-drivers/vim2m-audio.c
@@ -0,0 +1,793 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * A virtual v4l2-mem2mem example for audio device.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+MODULE_DESCRIPTION("Virtual device for audio mem2mem testing");
+MODULE_LICENSE("GPL");
+
+static unsigned int debug;
+module_param(debug, uint, 0644);
+MODULE_PARM_DESC(debug, "debug level");
+
+#define MEM2MEM_NAME "vim2m-audio"
+
+#define dprintk(dev, lvl, fmt, arg...) \
+   v4l2_dbg(lvl, debug, &(dev)->v4l2_dev, "%s: " fmt, __func__, ## arg)
+
+#define SAMPLE_NUM 4096
+
+static void audm2m_dev_release(struct device *dev)
+{}
+
+static struct platform_device audm2m_pdev = {
+   .name   = MEM2MEM_NAME,
+   .dev.release= audm2m_dev_release,
+};
+
+static u32 formats[] = {
+   V4L2_AUDIO_FMT_S16_LE,
+};
+
+#define NUM_FORMATS ARRAY_SIZE(formats)
+
+/* Per-queue, driver-specific private data */
+struct audm2m_q_data {
+   unsigned intrate;
+   unsigned intchannels;
+   unsigned intbuffersize;
+   unsigned intsequence;
+   u32 fourcc;
+};
+
+enum {
+   V4L2_M2M_SRC = 0,
+   V4L2_M2M_DST = 1,
+};
+
+static snd_pcm_format_t find_format(u32 fourcc)
+{
+   snd_pcm_format_t fmt;
+   unsigned int k;
+
+   for (k = 0; k < NUM_FORMATS; k++) {
+   if (formats[k] == fourcc)
+   break;
+   }
+
+   if (k == NUM_FORMATS)
+   return 0;
+
+   fmt = v4l2_fourcc_to_audfmt(formats[k]);
+
+   return fmt;
+}
+
+struct audm2m_dev {
+   struct v4l2_device  v4l2_dev;
+   struct video_device vfd;
+
+   struct mutexdev_mutex;
+
+   struct v4l2_m2m_dev *m2m_dev;
+#ifdef CONFIG_MEDIA_CONTROLLER
+   struct media_device mdev;
+#endif
+};
+
+struct audm2m_ctx {
+   struct v4l2_fh  fh;
+   struct v4l2_ctrl_handlerctrl_handler;
+   struct audm2m_dev   *dev;
+
+   struct mutexvb_mutex;
+
+   /* Source and destination queue data */
+   struct audm2m_q_data   q_data[2];
+};
+
+static inline struct audm2m_ctx *file2ctx(struct file *file)
+{
+   return container_of(file->private_data, struct audm2m_ctx, fh);
+}
+
+static struct audm2m_q_data *get_q_data(struct audm2m_ctx *ctx,
+   enum v4l2_buf_type type)
+{
+   if (type == V4L2_BUF_TYPE_AUDIO_OUTPUT)
+   return &ctx->q_data[V4L2_M2M_SRC];
+   return &ctx->q_data[V4L2_M2M_DST];
+}
+
+static const char *type_name(enum v4l2_buf_type type)
+{
+   if (type == V4L2_BUF_TYPE_AUDIO_OUTPUT)
+   return "Output";
+   return "Capture";
+}
+
+/*
+ * mem2mem callb

[PATCH v12 14/15] media: imx-asrc: Add memory to memory driver

2024-01-18 Thread Shengjiu Wang
Implement the ASRC memory-to-memory function using
the v4l2 framework. Users can access this function through
the v4l2 ioctl interface.

The user sends the output and capture buffers to the driver, and
the driver stores the converted data in the capture buffer.

This feature can be shared by the ASRC and EASRC drivers.

Signed-off-by: Shengjiu Wang 
---
 drivers/media/platform/nxp/Kconfig|   13 +
 drivers/media/platform/nxp/Makefile   |1 +
 drivers/media/platform/nxp/imx-asrc.c | 1256 +
 3 files changed, 1270 insertions(+)
 create mode 100644 drivers/media/platform/nxp/imx-asrc.c

diff --git a/drivers/media/platform/nxp/Kconfig 
b/drivers/media/platform/nxp/Kconfig
index 40e3436669e2..8d0ca335601f 100644
--- a/drivers/media/platform/nxp/Kconfig
+++ b/drivers/media/platform/nxp/Kconfig
@@ -67,3 +67,16 @@ config VIDEO_MX2_EMMAPRP
 
 source "drivers/media/platform/nxp/dw100/Kconfig"
 source "drivers/media/platform/nxp/imx-jpeg/Kconfig"
+
+config VIDEO_IMX_ASRC
+   tristate "NXP i.MX ASRC M2M support"
+   depends on V4L_MEM2MEM_DRIVERS
+   depends on MEDIA_SUPPORT
+   select VIDEOBUF2_DMA_CONTIG
+   select V4L2_MEM2MEM_DEV
+   select MEDIA_CONTROLLER
+   help
+   Say Y if you want to add ASRC M2M support for NXP CPUs.
+   It is a complement for ASRC M2P and ASRC P2M features.
+   This option is only useful for out-of-tree drivers since
+   in-tree drivers select it automatically.
diff --git a/drivers/media/platform/nxp/Makefile 
b/drivers/media/platform/nxp/Makefile
index 4d90eb713652..1325675e34f5 100644
--- a/drivers/media/platform/nxp/Makefile
+++ b/drivers/media/platform/nxp/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_VIDEO_IMX8MQ_MIPI_CSI2) += imx8mq-mipi-csi2.o
 obj-$(CONFIG_VIDEO_IMX_MIPI_CSIS) += imx-mipi-csis.o
 obj-$(CONFIG_VIDEO_IMX_PXP) += imx-pxp.o
 obj-$(CONFIG_VIDEO_MX2_EMMAPRP) += mx2_emmaprp.o
+obj-$(CONFIG_VIDEO_IMX_ASRC) += imx-asrc.o
diff --git a/drivers/media/platform/nxp/imx-asrc.c 
b/drivers/media/platform/nxp/imx-asrc.c
new file mode 100644
index ..0c25a36199b1
--- /dev/null
+++ b/drivers/media/platform/nxp/imx-asrc.c
@@ -0,0 +1,1256 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Copyright (C) 2014-2016 Freescale Semiconductor, Inc.
+// Copyright (C) 2019-2023 NXP
+//
+// Freescale ASRC Memory to Memory (M2M) driver
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define V4L_CAP OUT
+#define V4L_OUT IN
+
+#define ASRC_xPUT_DMA_CALLBACK(dir) \
+   (((dir) == V4L_OUT) ? asrc_input_dma_callback \
+   : asrc_output_dma_callback)
+
+#define DIR_STR(dir) (dir) == V4L_OUT ? "out" : "cap"
+
+/* Maximum output and capture buffer size */
+#define ASRC_M2M_BUFFER_SIZE (512 * 1024)
+
+/* Maximum output and capture period size */
+#define ASRC_M2M_PERIOD_SIZE (48 * 1024)
+
+struct asrc_pair_m2m {
+   struct fsl_asrc_pair *pair;
+   struct asrc_m2m *m2m;
+   struct v4l2_fh fh;
+   struct v4l2_ctrl_handler ctrl_handler;
+   int channels[2];
+   unsigned int sequence[2];
+   s64 src_rate_off_prev;  /* Q31.32 */
+   s64 dst_rate_off_prev;  /* Q31.32 */
+   s64 src_rate_off_cur;   /* Q31.32 */
+   s64 dst_rate_off_cur;   /* Q31.32 */
+};
+
+struct asrc_m2m {
+   struct fsl_asrc_m2m_pdata pdata;
+   struct v4l2_device v4l2_dev;
+   struct v4l2_m2m_dev *m2m_dev;
+   struct video_device *dec_vdev;
+   struct mutex mlock; /* v4l2 ioctls serialization */
+   struct platform_device *pdev;
+#ifdef CONFIG_MEDIA_CONTROLLER
+   struct media_device mdev;
+#endif
+};
+
+static u32 formats[] = {
+   V4L2_AUDIO_FMT_S8,
+   V4L2_AUDIO_FMT_S16_LE,
+   V4L2_AUDIO_FMT_U16_LE,
+   V4L2_AUDIO_FMT_S24_LE,
+   V4L2_AUDIO_FMT_S24_3LE,
+   V4L2_AUDIO_FMT_U24_LE,
+   V4L2_AUDIO_FMT_U24_3LE,
+   V4L2_AUDIO_FMT_S32_LE,
+   V4L2_AUDIO_FMT_U32_LE,
+   V4L2_AUDIO_FMT_S20_3LE,
+   V4L2_AUDIO_FMT_U20_3LE,
+   V4L2_AUDIO_FMT_FLOAT_LE,
+   V4L2_AUDIO_FMT_IEC958_SUBFRAME_LE,
+};
+
+#define NUM_FORMATS ARRAY_SIZE(formats)
+
+static const s64 asrc_v1_m2m_rates[] = {
+   5512, 8000, 11025, 12000, 16000,
+   22050, 24000, 32000, 44100,
+   48000, 64000, 88200, 96000,
+   128000, 176400, 192000,
+};
+
+static const s64 asrc_v2_m2m_rates[] = {
+   8000, 11025, 12000, 16000,
+   22050, 24000, 32000, 44100,
+   48000, 64000, 88200, 96000,
+   128000, 176400, 192000, 256000,
+   352800, 384000, 705600, 768000,
+};
+
+static u32 find_fourcc(snd_pcm_format_t format)
+{
+   snd_pcm_format_t fmt;
+   unsigned int k;
+
+   for (k = 0; k < NUM_FORMATS; k++) {
+   fmt = v4l2_fourcc_to_audfmt(formats[k]);
+   if (fmt == format)
+   return formats[k];
+   }
+
+   return 0;
+}
+
+static snd_pcm_format_t find_format(u32 fourcc)
+{
+   unsigned int k;
+

[PATCH v12 13/15] media: vivid: add fixed point test controls

2024-01-18 Thread Shengjiu Wang
Add fixed point test controls: one for the Q4.16 format and
another for the Q63 format.

Signed-off-by: Shengjiu Wang 
---
 drivers/media/test-drivers/vivid/vivid-core.h |  2 ++
 .../media/test-drivers/vivid/vivid-ctrls.c| 26 +++
 include/media/v4l2-ctrls.h|  6 +
 3 files changed, 34 insertions(+)

diff --git a/drivers/media/test-drivers/vivid/vivid-core.h 
b/drivers/media/test-drivers/vivid/vivid-core.h
index cfb8e66083f6..f65465191bc9 100644
--- a/drivers/media/test-drivers/vivid/vivid-core.h
+++ b/drivers/media/test-drivers/vivid/vivid-core.h
@@ -222,6 +222,8 @@ struct vivid_dev {
struct v4l2_ctrl*boolean;
struct v4l2_ctrl*int32;
struct v4l2_ctrl*int64;
+   struct v4l2_ctrl*int32_q16;
+   struct v4l2_ctrl*int64_q63;
struct v4l2_ctrl*menu;
struct v4l2_ctrl*string;
struct v4l2_ctrl*bitmask;
diff --git a/drivers/media/test-drivers/vivid/vivid-ctrls.c 
b/drivers/media/test-drivers/vivid/vivid-ctrls.c
index f2b20e25a7a4..2444ea95b285 100644
--- a/drivers/media/test-drivers/vivid/vivid-ctrls.c
+++ b/drivers/media/test-drivers/vivid/vivid-ctrls.c
@@ -38,6 +38,8 @@
 #define VIVID_CID_U8_PIXEL_ARRAY   (VIVID_CID_CUSTOM_BASE + 14)
 #define VIVID_CID_S32_ARRAY(VIVID_CID_CUSTOM_BASE + 15)
 #define VIVID_CID_S64_ARRAY(VIVID_CID_CUSTOM_BASE + 16)
+#define VIVID_CID_INT_Q4_16(VIVID_CID_CUSTOM_BASE + 17)
+#define VIVID_CID_INT64_Q63(VIVID_CID_CUSTOM_BASE + 18)
 
 #define VIVID_CID_VIVID_BASE   (0x00f0 | 0xf000)
 #define VIVID_CID_VIVID_CLASS  (0x00f0 | 1)
@@ -182,6 +184,28 @@ static const struct v4l2_ctrl_config vivid_ctrl_int64 = {
.step = 1,
 };
 
+static const struct v4l2_ctrl_config vivid_ctrl_int32_q16 = {
+   .ops = &vivid_user_gen_ctrl_ops,
+   .id = VIVID_CID_INT_Q4_16,
+   .name = "Integer 32 Bits Q4.16",
+   .type = V4L2_CTRL_TYPE_INTEGER,
+   .min = v4l2_ctrl_fp_compose(-16, 0, 16),
+   .max = v4l2_ctrl_fp_compose(15, 0x, 16),
+   .step = 1,
+   .fraction_bits = 16,
+};
+
+static const struct v4l2_ctrl_config vivid_ctrl_int64_q63 = {
+   .ops = &vivid_user_gen_ctrl_ops,
+   .id = VIVID_CID_INT64_Q63,
+   .name = "Integer 64 Bits Q63",
+   .type = V4L2_CTRL_TYPE_INTEGER64,
+   .min = v4l2_ctrl_fp_compose(-1, 0, 63),
+   .max = v4l2_ctrl_fp_compose(0, LLONG_MAX, 63),
+   .step = 1,
+   .fraction_bits = 63,
+};
+
 static const struct v4l2_ctrl_config vivid_ctrl_u32_array = {
.ops = &vivid_user_gen_ctrl_ops,
.id = VIVID_CID_U32_ARRAY,
@@ -1670,6 +1694,8 @@ int vivid_create_controls(struct vivid_dev *dev, bool 
show_ccs_cap,
dev->button = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_button, 
NULL);
dev->int32 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_int32, 
NULL);
dev->int64 = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_int64, 
NULL);
+   dev->int32_q16 = v4l2_ctrl_new_custom(hdl_user_gen, 
&vivid_ctrl_int32_q16, NULL);
+   dev->int64_q63 = v4l2_ctrl_new_custom(hdl_user_gen, 
&vivid_ctrl_int64_q63, NULL);
dev->boolean = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_boolean, 
NULL);
dev->menu = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_menu, NULL);
dev->string = v4l2_ctrl_new_custom(hdl_user_gen, &vivid_ctrl_string, 
NULL);
diff --git a/include/media/v4l2-ctrls.h b/include/media/v4l2-ctrls.h
index c35514c5bf88..197d8b67ac13 100644
--- a/include/media/v4l2-ctrls.h
+++ b/include/media/v4l2-ctrls.h
@@ -1593,4 +1593,10 @@ void v4l2_ctrl_type_op_log(const struct v4l2_ctrl *ctrl);
  */
 int v4l2_ctrl_type_op_validate(const struct v4l2_ctrl *ctrl, union 
v4l2_ctrl_ptr ptr);
 
+/*
+ * Fixed point compose helper define. This helper maps to the value
+ * i + f / (1 << fraction_bits).
+ */
+#define v4l2_ctrl_fp_compose(i, f, fraction_bits) (((s64)(i) << fraction_bits) 
+ (f))
+
 #endif
-- 
2.34.1
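
For readers unfamiliar with the encoding, the helper comment in this patch says a composed value represents i + f / (1 << fraction_bits). The following userspace sketch mirrors that arithmetic with hypothetical names (`sketch_fp_compose`, `sketch_fp_to_double`); it is not the kernel macro. It uses multiplication instead of a shift so that negative integer parts stay well defined in ISO C, and it is only valid for fewer than 63 fraction bits. The fraction value 0xffff used in the usage note is chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compose a fixed point value from an integer part i and a raw
 * fraction f, i.e. the stored value is i * 2^fraction_bits + f.
 */
static inline int64_t sketch_fp_compose(int64_t i, int64_t f,
					unsigned int fraction_bits)
{
	assert(fraction_bits < 63); /* keep 1 << fraction_bits in range */
	return i * ((int64_t)1 << fraction_bits) + f;
}

/* Convert a stored fixed point value back to a real number. */
static inline double sketch_fp_to_double(int64_t v,
					 unsigned int fraction_bits)
{
	assert(fraction_bits < 63);
	return (double)v / (double)((int64_t)1 << fraction_bits);
}
```

With 16 fraction bits, `sketch_fp_compose(-16, 0, 16)` is the stored form of -16.0, `sketch_fp_compose(15, 0xffff, 16)` is the largest value just below 16.0, and `sketch_fp_compose(1, 0x8000, 16)` converts back to 1.5.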



[PATCH v12 12/15] media: uapi: Add an entity type for audio resampler

2024-01-18 Thread Shengjiu Wang
Add and document a media entity type for an audio resampler.
It is MEDIA_ENT_F_PROC_AUDIO_RESAMPLER.

Signed-off-by: Shengjiu Wang 
---
 Documentation/userspace-api/media/mediactl/media-types.rst | 6 ++
 include/uapi/linux/media.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/Documentation/userspace-api/media/mediactl/media-types.rst 
b/Documentation/userspace-api/media/mediactl/media-types.rst
index f0880aea41d6..9ca6542517d5 100644
--- a/Documentation/userspace-api/media/mediactl/media-types.rst
+++ b/Documentation/userspace-api/media/mediactl/media-types.rst
@@ -40,6 +40,7 @@ Types and flags used to represent the media graph elements
 .. _MEDIA-ENT-F-PROC-VIDEO-ENCODER:
 .. _MEDIA-ENT-F-PROC-VIDEO-DECODER:
 .. _MEDIA-ENT-F-PROC-VIDEO-ISP:
+.. _MEDIA-ENT-F-PROC-AUDIO-RESAMPLER:
 .. _MEDIA-ENT-F-VID-MUX:
 .. _MEDIA-ENT-F-VID-IF-BRIDGE:
 .. _MEDIA-ENT-F-DV-DECODER:
@@ -208,6 +209,11 @@ Types and flags used to represent the media graph elements
  combination of custom V4L2 controls and IOCTLs, and parameters
  supplied in a metadata buffer.
 
+*  -  ``MEDIA_ENT_F_PROC_AUDIO_RESAMPLER``
+   -  An Audio Resampler device. An entity capable of
+ resampling an audio stream from one sample rate to another sample
+ rate. Must have one sink pad and at least one source pad.
+
 *  -  ``MEDIA_ENT_F_VID_MUX``
- Video multiplexer. An entity capable of multiplexing must have at
  least two sink pads and one source pad, and must pass the video
diff --git a/include/uapi/linux/media.h b/include/uapi/linux/media.h
index 9ff6dec7393a..a8266eaa8042 100644
--- a/include/uapi/linux/media.h
+++ b/include/uapi/linux/media.h
@@ -125,6 +125,7 @@ struct media_device_info {
 #define MEDIA_ENT_F_PROC_VIDEO_ENCODER (MEDIA_ENT_F_BASE + 0x4007)
 #define MEDIA_ENT_F_PROC_VIDEO_DECODER (MEDIA_ENT_F_BASE + 0x4008)
 #define MEDIA_ENT_F_PROC_VIDEO_ISP (MEDIA_ENT_F_BASE + 0x4009)
+#define MEDIA_ENT_F_PROC_AUDIO_RESAMPLER   (MEDIA_ENT_F_BASE + 0x400a)
 
 /*
  * Switch and bridge entity functions
-- 
2.34.1



[PATCH v12 11/15] media: uapi: Declare interface types for Audio

2024-01-18 Thread Shengjiu Wang
Declare the interface types that will be used by Audio.
The type is MEDIA_INTF_T_V4L_AUDIO.

Signed-off-by: Shengjiu Wang 
---
 .../userspace-api/media/mediactl/media-types.rst|  5 +
 drivers/media/v4l2-core/v4l2-dev.c  |  4 
 drivers/media/v4l2-core/v4l2-mem2mem.c  | 13 +
 include/uapi/linux/media.h  |  1 +
 4 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/Documentation/userspace-api/media/mediactl/media-types.rst 
b/Documentation/userspace-api/media/mediactl/media-types.rst
index 0ffeece1e0c8..f0880aea41d6 100644
--- a/Documentation/userspace-api/media/mediactl/media-types.rst
+++ b/Documentation/userspace-api/media/mediactl/media-types.rst
@@ -265,6 +265,7 @@ Types and flags used to represent the media graph elements
 .. _MEDIA-INTF-T-V4L-SUBDEV:
 .. _MEDIA-INTF-T-V4L-SWRADIO:
 .. _MEDIA-INTF-T-V4L-TOUCH:
+.. _MEDIA-INTF-T-V4L-AUDIO:
 .. _MEDIA-INTF-T-ALSA-PCM-CAPTURE:
 .. _MEDIA-INTF-T-ALSA-PCM-PLAYBACK:
 .. _MEDIA-INTF-T-ALSA-CONTROL:
@@ -322,6 +323,10 @@ Types and flags used to represent the media graph elements
-  Device node interface for Touch device (V4L)
-  typically, /dev/v4l-touch?
 
+*  -  ``MEDIA_INTF_T_V4L_AUDIO``
+   -  Device node interface for Audio device (V4L)
+   -  typically, /dev/v4l-audio?
+
 *  -  ``MEDIA_INTF_T_ALSA_PCM_CAPTURE``
-  Device node interface for ALSA PCM Capture
-  typically, /dev/snd/pcmC?D?c
diff --git a/drivers/media/v4l2-core/v4l2-dev.c 
b/drivers/media/v4l2-core/v4l2-dev.c
index bac008fcedc6..ca8462a61e1f 100644
--- a/drivers/media/v4l2-core/v4l2-dev.c
+++ b/drivers/media/v4l2-core/v4l2-dev.c
@@ -844,6 +844,10 @@ static int video_register_media_controller(struct 
video_device *vdev)
intf_type = MEDIA_INTF_T_V4L_SUBDEV;
/* Entity will be created via v4l2_device_register_subdev() */
break;
+   case VFL_TYPE_AUDIO:
+   intf_type = MEDIA_INTF_T_V4L_AUDIO;
+   /* Entity will be created via v4l2_device_register_subdev() */
+   break;
default:
return 0;
}
diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c 
b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 9e983176542b..e899674c7d22 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -1137,10 +1137,15 @@ int v4l2_m2m_register_media_controller(struct 
v4l2_m2m_dev *m2m_dev,
if (ret)
goto err_rm_links0;
 
-   /* Create video interface */
-   m2m_dev->intf_devnode = media_devnode_create(mdev,
-   MEDIA_INTF_T_V4L_VIDEO, 0,
-   VIDEO_MAJOR, vdev->minor);
+   if (vdev->vfl_type == VFL_TYPE_AUDIO)
+   m2m_dev->intf_devnode = media_devnode_create(mdev,
+   MEDIA_INTF_T_V4L_AUDIO, 0,
+   VIDEO_MAJOR, vdev->minor);
+   else
+   /* Create video interface */
+   m2m_dev->intf_devnode = media_devnode_create(mdev,
+   MEDIA_INTF_T_V4L_VIDEO, 0,
+   VIDEO_MAJOR, vdev->minor);
if (!m2m_dev->intf_devnode) {
ret = -ENOMEM;
goto err_rm_links1;
diff --git a/include/uapi/linux/media.h b/include/uapi/linux/media.h
index 1c80b1d6bbaf..9ff6dec7393a 100644
--- a/include/uapi/linux/media.h
+++ b/include/uapi/linux/media.h
@@ -260,6 +260,7 @@ struct media_links_enum {
 #define MEDIA_INTF_T_V4L_SUBDEV(MEDIA_INTF_T_V4L_BASE 
+ 3)
 #define MEDIA_INTF_T_V4L_SWRADIO   (MEDIA_INTF_T_V4L_BASE + 4)
 #define MEDIA_INTF_T_V4L_TOUCH (MEDIA_INTF_T_V4L_BASE + 5)
+#define MEDIA_INTF_T_V4L_AUDIO (MEDIA_INTF_T_V4L_BASE + 6)
 
 #define MEDIA_INTF_T_ALSA_BASE 0x0300
 #define MEDIA_INTF_T_ALSA_PCM_CAPTURE  (MEDIA_INTF_T_ALSA_BASE)
-- 
2.34.1



[PATCH v12 10/15] media: uapi: Add audio rate controls support

2024-01-18 Thread Shengjiu Wang
Add the new control IDs V4L2_CID_M2M_AUDIO_SOURCE_RATE and
V4L2_CID_M2M_AUDIO_DEST_RATE for rate control.

Add V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET and
V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET to compensate for clock drift.

Signed-off-by: Shengjiu Wang 
---
 .../media/v4l/ext-ctrls-audio-m2m.rst | 20 +++
 drivers/media/v4l2-core/v4l2-ctrls-defs.c |  6 ++
 include/uapi/linux/v4l2-controls.h|  5 +
 3 files changed, 31 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst 
b/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
index 82d2ecedbfee..de579ab8fb94 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
@@ -19,3 +19,23 @@ Audio M2M Control IDs
 The Audio M2M class descriptor. Calling
 :ref:`VIDIOC_QUERYCTRL` for this control will
 return a description of this control class.
+
+.. _v4l2-audio-asrc:
+
+``V4L2_CID_M2M_AUDIO_SOURCE_RATE (integer menu)``
+Sets the audio source sample rate, unit is Hz
+
+``V4L2_CID_M2M_AUDIO_DEST_RATE (integer menu)``
+Sets the audio destination sample rate, unit is Hz
+
+``V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET (fixed point)``
+Sets the offset from the audio source sample rate, unit is Hz.
+The offset compensates for any clock drift. The actual source audio
+sample rate is the ideal source audio sample rate from
+``V4L2_CID_M2M_AUDIO_SOURCE_RATE`` plus this fixed point offset.
+
+``V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET (fixed point)``
+Sets the offset from the audio destination sample rate, unit is Hz.
+The offset compensates for any clock drift. The actual destination audio
+sample rate is the ideal destination audio sample rate from
+``V4L2_CID_M2M_AUDIO_DEST_RATE`` plus this fixed point offset.
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-defs.c 
b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
index 2a85ea3dc92f..91e1f5348c23 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-defs.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
@@ -1245,6 +1245,8 @@ const char *v4l2_ctrl_get_name(u32 id)
 
/* Audio M2M controls */
case V4L2_CID_M2M_AUDIO_CLASS:  return "Audio M2M Controls";
+   case V4L2_CID_M2M_AUDIO_SOURCE_RATE:return "Audio Source Sample 
Rate";
+   case V4L2_CID_M2M_AUDIO_DEST_RATE:  return "Audio Destination 
Sample Rate";
default:
return NULL;
}
@@ -1606,6 +1608,10 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum 
v4l2_ctrl_type *type,
case V4L2_CID_COLORIMETRY_HDR10_MASTERING_DISPLAY:
*type = V4L2_CTRL_TYPE_HDR10_MASTERING_DISPLAY;
break;
+   case V4L2_CID_M2M_AUDIO_SOURCE_RATE:
+   case V4L2_CID_M2M_AUDIO_DEST_RATE:
+   *type = V4L2_CTRL_TYPE_INTEGER_MENU;
+   break;
default:
*type = V4L2_CTRL_TYPE_INTEGER;
break;
diff --git a/include/uapi/linux/v4l2-controls.h 
b/include/uapi/linux/v4l2-controls.h
index a8b4b830c757..30129ccdc282 100644
--- a/include/uapi/linux/v4l2-controls.h
+++ b/include/uapi/linux/v4l2-controls.h
@@ -3495,6 +3495,11 @@ struct v4l2_ctrl_av1_film_grain {
 #define V4L2_CID_M2M_AUDIO_CLASS_BASE  (V4L2_CTRL_CLASS_M2M_AUDIO | 0x900)
 #define V4L2_CID_M2M_AUDIO_CLASS   (V4L2_CTRL_CLASS_M2M_AUDIO | 1)
 
+#define V4L2_CID_M2M_AUDIO_SOURCE_RATE (V4L2_CID_M2M_AUDIO_CLASS_BASE + 0)
+#define V4L2_CID_M2M_AUDIO_DEST_RATE   (V4L2_CID_M2M_AUDIO_CLASS_BASE + 1)
+#define V4L2_CID_M2M_AUDIO_SOURCE_RATE_OFFSET  (V4L2_CID_M2M_AUDIO_CLASS_BASE 
+ 2)
+#define V4L2_CID_M2M_AUDIO_DEST_RATE_OFFSET(V4L2_CID_M2M_AUDIO_CLASS_BASE 
+ 3)
+
 /* MPEG-compression definitions kept for backwards compatibility */
 #ifndef __KERNEL__
 #define V4L2_CTRL_CLASS_MPEGV4L2_CTRL_CLASS_CODEC
-- 
2.34.1
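
The documentation in this patch describes the actual rate as the ideal rate plus a fixed point offset in Hz. A small userspace sketch of that sum follows. It assumes the offset uses 32 fraction bits, matching the "Q31.32" comments on the rate offset fields in the imx-asrc patch of this series; the helper names and the milli-Hz input are hypothetical conveniences, not part of the proposed API.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RATE_FRACTION_BITS 32u /* assumed Q31.32 layout */

/* Build a Q31.32 offset from a drift expressed in milli-Hz. */
static inline int64_t sketch_rate_offset_from_millihz(int64_t millihz)
{
	return (millihz * ((int64_t)1 << SKETCH_RATE_FRACTION_BITS)) / 1000;
}

/* Actual rate in Hz = ideal rate from the menu control + offset. */
static inline double sketch_effective_rate_hz(int64_t ideal_rate_hz,
					      int64_t offset_q32)
{
	assert(ideal_rate_hz > 0);
	return (double)ideal_rate_hz +
	       (double)offset_q32 /
	       (double)((int64_t)1 << SKETCH_RATE_FRACTION_BITS);
}
```

For instance, an ideal rate of 48000 Hz with a +500 mHz drift gives an effective 48000.5 Hz, and 44100 Hz with a -250 mHz drift gives 44099.75 Hz.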



[PATCH v12 09/15] media: uapi: Add V4L2_CTRL_CLASS_M2M_AUDIO

2024-01-18 Thread Shengjiu Wang
The Audio M2M class includes controls for audio memory-to-memory
use cases. The controls can be used for audio codecs, audio
preprocessing, and audio postprocessing.

Signed-off-by: Shengjiu Wang 
---
 .../userspace-api/media/v4l/common.rst|  1 +
 .../media/v4l/ext-ctrls-audio-m2m.rst | 21 +++
 .../media/v4l/vidioc-g-ext-ctrls.rst  |  4 
 drivers/media/v4l2-core/v4l2-ctrls-defs.c |  4 
 include/uapi/linux/v4l2-controls.h|  4 
 5 files changed, 34 insertions(+)
 create mode 100644 
Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst

diff --git a/Documentation/userspace-api/media/v4l/common.rst 
b/Documentation/userspace-api/media/v4l/common.rst
index ea0435182e44..d5366e96a596 100644
--- a/Documentation/userspace-api/media/v4l/common.rst
+++ b/Documentation/userspace-api/media/v4l/common.rst
@@ -52,6 +52,7 @@ applicable to all devices.
 ext-ctrls-fm-rx
 ext-ctrls-detect
 ext-ctrls-colorimetry
+ext-ctrls-audio-m2m
 fourcc
 format
 planar-apis
diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst 
b/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
new file mode 100644
index ..82d2ecedbfee
--- /dev/null
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-audio-m2m.rst
@@ -0,0 +1,21 @@
+.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later
+
+.. _audiom2m-controls:
+
+***
+Audio M2M Control Reference
+***
+
+The Audio M2M class includes controls for audio memory-to-memory
+use cases. The controls can be used for audio codecs, audio
+preprocessing, audio postprocessing.
+
+Audio M2M Control IDs
+---
+
+.. _audiom2m-control-id:
+
+``V4L2_CID_M2M_AUDIO_CLASS (class)``
+The Audio M2M class descriptor. Calling
+:ref:`VIDIOC_QUERYCTRL` for this control will
+return a description of this control class.
diff --git a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst 
b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
index 4d56c0528ad7..aeb1ad8e7d29 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-g-ext-ctrls.rst
@@ -488,6 +488,10 @@ still cause this situation.
   - 0xa5
   - The class containing colorimetry controls. These controls are
described in :ref:`colorimetry-controls`.
+* - ``V4L2_CTRL_CLASS_M2M_AUDIO``
+  - 0xa6
+  - The class containing audio m2m controls. These controls are
+   described in :ref:`audiom2m-controls`.
 
 Return Value
 
diff --git a/drivers/media/v4l2-core/v4l2-ctrls-defs.c 
b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
index 8696eb1cdd61..2a85ea3dc92f 100644
--- a/drivers/media/v4l2-core/v4l2-ctrls-defs.c
+++ b/drivers/media/v4l2-core/v4l2-ctrls-defs.c
@@ -1242,6 +1242,9 @@ const char *v4l2_ctrl_get_name(u32 id)
case V4L2_CID_COLORIMETRY_CLASS:return "Colorimetry Controls";
case V4L2_CID_COLORIMETRY_HDR10_CLL_INFO:   return "HDR10 
Content Light Info";
case V4L2_CID_COLORIMETRY_HDR10_MASTERING_DISPLAY:  return "HDR10 
Mastering Display";
+
+   /* Audio M2M controls */
+   case V4L2_CID_M2M_AUDIO_CLASS:  return "Audio M2M Controls";
default:
return NULL;
}
@@ -1451,6 +1454,7 @@ void v4l2_ctrl_fill(u32 id, const char **name, enum 
v4l2_ctrl_type *type,
case V4L2_CID_DETECT_CLASS:
case V4L2_CID_CODEC_STATELESS_CLASS:
case V4L2_CID_COLORIMETRY_CLASS:
+   case V4L2_CID_M2M_AUDIO_CLASS:
*type = V4L2_CTRL_TYPE_CTRL_CLASS;
/* You can neither read nor write these */
*flags |= V4L2_CTRL_FLAG_READ_ONLY | V4L2_CTRL_FLAG_WRITE_ONLY;
diff --git a/include/uapi/linux/v4l2-controls.h 
b/include/uapi/linux/v4l2-controls.h
index 99c3f5e99da7..a8b4b830c757 100644
--- a/include/uapi/linux/v4l2-controls.h
+++ b/include/uapi/linux/v4l2-controls.h
@@ -30,6 +30,7 @@
 #define V4L2_CTRL_CLASS_DETECT 0x00a3  /* Detection controls */
 #define V4L2_CTRL_CLASS_CODEC_STATELESS 0x00a4 /* Stateless codecs 
controls */
 #define V4L2_CTRL_CLASS_COLORIMETRY0x00a5  /* Colorimetry controls 
*/
+#define V4L2_CTRL_CLASS_M2M_AUDIO  0x00a6  /* Audio M2M controls */
 
 /* User-class control IDs */
 
@@ -3491,6 +3492,9 @@ struct v4l2_ctrl_av1_film_grain {
__u8 reserved[4];
 };
 
+#define V4L2_CID_M2M_AUDIO_CLASS_BASE  (V4L2_CTRL_CLASS_M2M_AUDIO | 0x900)
+#define V4L2_CID_M2M_AUDIO_CLASS   (V4L2_CTRL_CLASS_M2M_AUDIO | 1)
+
 /* MPEG-compression definitions kept for backwards compatibility */
 #ifndef __KERNEL__
 #define V4L2_CTRL_CLASS_MPEGV4L2_CTRL_CLASS_CODEC
-- 
2.34.1



[PATCH v12 08/15] media: uapi: Define audio sample format fourcc type

2024-01-18 Thread Shengjiu Wang
The audio sample format definitions come from ALSA
(include/uapi/sound/asound.h), but this header file is
not included directly, because user space has
another copy in alsa-lib. Including videodev2.h together
with asound.h and asoundlib.h would cause
conflicts in userspace.

So the fourcc format is still used here.

Signed-off-by: Shengjiu Wang 
---
 .../userspace-api/media/v4l/pixfmt-audio.rst  | 87 +++
 .../userspace-api/media/v4l/pixfmt.rst|  1 +
 drivers/media/v4l2-core/v4l2-ioctl.c  | 13 +++
 include/uapi/linux/videodev2.h| 23 +
 4 files changed, 124 insertions(+)
 create mode 100644 Documentation/userspace-api/media/v4l/pixfmt-audio.rst

diff --git a/Documentation/userspace-api/media/v4l/pixfmt-audio.rst 
b/Documentation/userspace-api/media/v4l/pixfmt-audio.rst
new file mode 100644
index ..04b4a7fbd8f4
--- /dev/null
+++ b/Documentation/userspace-api/media/v4l/pixfmt-audio.rst
@@ -0,0 +1,87 @@
+.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later
+
+.. _pixfmt-audio:
+
+*
+Audio Formats
+*
+
+These formats are used for :ref:`audiomem2mem` interface only.
+
+.. tabularcolumns:: |p{5.8cm}|p{1.2cm}|p{10.3cm}|
+
+.. cssclass:: longtable
+
+.. flat-table:: Audio Format
+:header-rows:  1
+:stub-columns: 0
+:widths:   3 1 4
+
+* - Identifier
+  - Code
+  - Details
+* .. _V4L2-AUDIO-FMT-S8:
+
+  - ``V4L2_AUDIO_FMT_S8``
+  - 'S8'
+  - Corresponds to SNDRV_PCM_FORMAT_S8 in ALSA
+* .. _V4L2-AUDIO-FMT-S16-LE:
+
+  - ``V4L2_AUDIO_FMT_S16_LE``
+  - 'S16_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_S16_LE in ALSA
+* .. _V4L2-AUDIO-FMT-U16-LE:
+
+  - ``V4L2_AUDIO_FMT_U16_LE``
+  - 'U16_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_U16_LE in ALSA
+* .. _V4L2-AUDIO-FMT-S24-LE:
+
+  - ``V4L2_AUDIO_FMT_S24_LE``
+  - 'S24_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_S24_LE in ALSA
+* .. _V4L2-AUDIO-FMT-U24-LE:
+
+  - ``V4L2_AUDIO_FMT_U24_LE``
+  - 'U24_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_U24_LE in ALSA
+* .. _V4L2-AUDIO-FMT-S32-LE:
+
+  - ``V4L2_AUDIO_FMT_S32_LE``
+  - 'S32_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_S32_LE in ALSA
+* .. _V4L2-AUDIO-FMT-U32-LE:
+
+  - ``V4L2_AUDIO_FMT_U32_LE``
+  - 'U32_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_U32_LE in ALSA
+* .. _V4L2-AUDIO-FMT-FLOAT-LE:
+
+  - ``V4L2_AUDIO_FMT_FLOAT_LE``
+  - 'FLOAT_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_FLOAT_LE in ALSA
+* .. _V4L2-AUDIO-FMT-IEC958-SUBFRAME-LE:
+
+  - ``V4L2_AUDIO_FMT_IEC958_SUBFRAME_LE``
+  - 'IEC958_SUBFRAME_LE'
+  - Corresponds to SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE in ALSA
+* .. _V4L2-AUDIO-FMT-S24-3LE:
+
+  - ``V4L2_AUDIO_FMT_S24_3LE``
+  - 'S24_3LE'
+  - Corresponds to SNDRV_PCM_FORMAT_S24_3LE in ALSA
+* .. _V4L2-AUDIO-FMT-U24-3LE:
+
+  - ``V4L2_AUDIO_FMT_U24_3LE``
+  - 'U24_3LE'
+  - Corresponds to SNDRV_PCM_FORMAT_U24_3LE in ALSA
+* .. _V4L2-AUDIO-FMT-S20-3LE:
+
+  - ``V4L2_AUDIO_FMT_S20_3LE``
+  - 'S20_3LE'
+  - Corresponds to SNDRV_PCM_FORMAT_S20_3LE in ALSA
+* .. _V4L2-AUDIO-FMT-U20-3LE:
+
+  - ``V4L2_AUDIO_FMT_U20_3LE``
+  - 'U20_3LE'
+  - Corresponds to SNDRV_PCM_FORMAT_U20_3LE in ALSA
diff --git a/Documentation/userspace-api/media/v4l/pixfmt.rst 
b/Documentation/userspace-api/media/v4l/pixfmt.rst
index 11dab4a90630..2eb6fdd3b43d 100644
--- a/Documentation/userspace-api/media/v4l/pixfmt.rst
+++ b/Documentation/userspace-api/media/v4l/pixfmt.rst
@@ -36,3 +36,4 @@ see also :ref:`VIDIOC_G_FBUF `.)
 colorspaces
 colorspaces-defs
 colorspaces-details
+pixfmt-audio
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c 
b/drivers/media/v4l2-core/v4l2-ioctl.c
index e7be7c2f302d..e5094f0e75c1 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -1471,6 +1471,19 @@ static void v4l_fill_fmtdesc(struct v4l2_fmtdesc *fmt)
case V4L2_PIX_FMT_Y210: descr = "10-bit YUYV Packed"; break;
case V4L2_PIX_FMT_Y212: descr = "12-bit YUYV Packed"; break;
case V4L2_PIX_FMT_Y216: descr = "16-bit YUYV Packed"; break;
+   case V4L2_AUDIO_FMT_S8: descr = "8-bit Signed"; break;
+   case V4L2_AUDIO_FMT_S16_LE: descr = "16-bit Signed LE"; break;
+   case V4L2_AUDIO_FMT_U16_LE: descr = "16-bit Unsigned LE"; 
break;
+   case V4L2_AUDIO_FMT_S24_LE: descr = "24(32)-bit Signed LE"; 
break;
+   case V4L2_AUDIO_FMT_U24_LE: descr = "24(32)-bit Unsigned 
LE"; break;
+   case V4L2_AUDIO_FMT_S32_LE: descr = "32-bit Signed LE"; 
break;
+   case V4L2_AUDIO_FMT_U32_LE: descr = "32-bit Unsigned LE"; 
break;
+   case V4L2_AUDIO_FMT_FLOAT_LE:   descr = "32-bit Float LE"; 
break;
+   case V4L2_AUDIO_FMT_IEC958_SUBFRAME_LE: descr = "32-bi

[PATCH v12 07/15] media: v4l2: Add audio capture and output support

2024-01-18 Thread Shengjiu Wang
Audio signal processing has a requirement for memory-to-memory
operation, similar to video.

This patch adds this support to the v4l2 framework: it defines the
new buffer types V4L2_BUF_TYPE_AUDIO_CAPTURE and
V4L2_BUF_TYPE_AUDIO_OUTPUT, and a new format, v4l2_audio_format,
for audio use cases.

The created audio device is named "/dev/v4l-audioX".

Signed-off-by: Shengjiu Wang 
---
 .../userspace-api/media/v4l/buffer.rst|  6 ++
 .../media/v4l/dev-audio-mem2mem.rst   | 71 +++
 .../userspace-api/media/v4l/devices.rst   |  1 +
 .../media/v4l/vidioc-enum-fmt.rst |  2 +
 .../userspace-api/media/v4l/vidioc-g-fmt.rst  |  4 ++
 .../media/videodev2.h.rst.exceptions  |  2 +
 .../media/common/videobuf2/videobuf2-v4l2.c   |  4 ++
 drivers/media/v4l2-core/v4l2-compat-ioctl32.c |  9 +++
 drivers/media/v4l2-core/v4l2-dev.c| 17 +
 drivers/media/v4l2-core/v4l2-ioctl.c  | 53 ++
 include/media/v4l2-dev.h  |  2 +
 include/media/v4l2-ioctl.h| 34 +
 include/uapi/linux/videodev2.h| 17 +
 13 files changed, 222 insertions(+)
 create mode 100644 Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst

diff --git a/Documentation/userspace-api/media/v4l/buffer.rst b/Documentation/userspace-api/media/v4l/buffer.rst
index 52bbee81c080..a3754ca6f0d6 100644
--- a/Documentation/userspace-api/media/v4l/buffer.rst
+++ b/Documentation/userspace-api/media/v4l/buffer.rst
@@ -438,6 +438,12 @@ enum v4l2_buf_type
 * - ``V4L2_BUF_TYPE_META_OUTPUT``
   - 14
   - Buffer for metadata output, see :ref:`metadata`.
+* - ``V4L2_BUF_TYPE_AUDIO_CAPTURE``
+  - 15
+  - Buffer for audio capture, see :ref:`audio`.
+* - ``V4L2_BUF_TYPE_AUDIO_OUTPUT``
+  - 16
+  - Buffer for audio output, see :ref:`audio`.
 
 
 .. _buffer-flags:
diff --git a/Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst b/Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst
new file mode 100644
index 000000000000..68faecfe3a02
--- /dev/null
+++ b/Documentation/userspace-api/media/v4l/dev-audio-mem2mem.rst
@@ -0,0 +1,71 @@
+.. SPDX-License-Identifier: GFDL-1.1-no-invariants-or-later
+
+.. _audiomem2mem:
+
+
+Audio Memory-To-Memory Interface
+
+
+An audio memory-to-memory device can compress, decompress, transform, or
+otherwise convert audio data from one format into another format, in memory.
+Such memory-to-memory devices set the ``V4L2_CAP_AUDIO_M2M`` capability.
+Examples of memory-to-memory devices are audio codecs, audio preprocessing,
+audio postprocessing.
+
+A memory-to-memory audio node supports both output (sending audio frames from
+memory to the hardware) and capture (receiving the processed audio frames
+from the hardware into memory) stream I/O. An application will have to
+setup the stream I/O for both sides and finally call
+:ref:`VIDIOC_STREAMON ` for both capture and output to
+start the hardware.
+
+Memory-to-memory devices function as a shared resource: you can
+open the audio node multiple times, each application setting up their
+own properties that are local to the file handle, and each can use
+it independently from the others. The driver will arbitrate access to
+the hardware and reprogram it whenever another file handler gets access.
+
+Audio memory-to-memory devices are accessed through character device
+special files named ``/dev/v4l-audio``
+
+Querying Capabilities
+=====================
+
+Device nodes supporting the audio memory-to-memory interface set the
+``V4L2_CAP_AUDIO_M2M`` flag in the ``device_caps`` field of the
+:c:type:`v4l2_capability` structure returned by the :c:func:`VIDIOC_QUERYCAP`
+ioctl.
+
+Data Format Negotiation
+=======================
+
+The audio device uses the :ref:`format` ioctls to select the capture format.
+The audio buffer content format is bound to that selected format. In addition
+to the basic :ref:`format` ioctls, the :c:func:`VIDIOC_ENUM_FMT` ioctl must be
+supported as well.
+
+To use the :ref:`format` ioctls applications set the ``type`` field of the
+:c:type:`v4l2_format` structure to ``V4L2_BUF_TYPE_AUDIO_CAPTURE`` or to
+``V4L2_BUF_TYPE_AUDIO_OUTPUT``. Both drivers and applications must set the
+remainder of the :c:type:`v4l2_format` structure to 0.
+
+.. c:type:: v4l2_audio_format
+
+.. tabularcolumns:: |p{1.4cm}|p{2.4cm}|p{13.5cm}|
+
+.. flat-table:: struct v4l2_audio_format
+:header-rows:  0
+:stub-columns: 0
+:widths:   1 1 2
+
+* - __u32
+  - ``pixelformat``
+  - The sample format, set by the application. see :ref:`pixfmt-audio`
+* - __u32
+  - ``channels``
+  - The channel number, set by the application. channel number range is
+[1, 32].
+* - __u32
+  - ``buffersize``
+  - Maximum buffer size in bytes required for data. The value is set by the
+driver.
diff --git a/Documentation/userspac

[PATCH v12 06/15] media: uapi: Add V4L2_CAP_AUDIO_M2M capability flag

2024-01-18 Thread Shengjiu Wang
V4L2_CAP_AUDIO_M2M is similar to the V4L2_CAP_VIDEO_M2M flag.

It is used for the audio memory-to-memory case.

Signed-off-by: Shengjiu Wang 
---
 Documentation/userspace-api/media/v4l/vidioc-querycap.rst| 3 +++
 Documentation/userspace-api/media/videodev2.h.rst.exceptions | 1 +
 include/uapi/linux/videodev2.h   | 1 +
 3 files changed, 5 insertions(+)

diff --git a/Documentation/userspace-api/media/v4l/vidioc-querycap.rst b/Documentation/userspace-api/media/v4l/vidioc-querycap.rst
index 6c57b8428356..1c0d97bf192a 100644
--- a/Documentation/userspace-api/media/v4l/vidioc-querycap.rst
+++ b/Documentation/userspace-api/media/v4l/vidioc-querycap.rst
@@ -173,6 +173,9 @@ specification the ioctl returns an ``EINVAL`` error code.
interface. A video overlay device typically stores captured images
directly in the video memory of a graphics card, with hardware
clipping and scaling.
+* - ``V4L2_CAP_AUDIO_M2M``
+  - 0x0008
+  - The device supports the audio Memory-To-Memory interface.
 * - ``V4L2_CAP_VBI_CAPTURE``
   - 0x0010
   - The device supports the :ref:`Raw VBI Capture `
diff --git a/Documentation/userspace-api/media/videodev2.h.rst.exceptions b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
index 3e58aac4ef0b..da6d0b8e4c2c 100644
--- a/Documentation/userspace-api/media/videodev2.h.rst.exceptions
+++ b/Documentation/userspace-api/media/videodev2.h.rst.exceptions
@@ -197,6 +197,7 @@ replace define V4L2_CAP_META_OUTPUT device-capabilities
 replace define V4L2_CAP_DEVICE_CAPS device-capabilities
 replace define V4L2_CAP_TOUCH device-capabilities
 replace define V4L2_CAP_IO_MC device-capabilities
+replace define V4L2_CAP_AUDIO_M2M device-capabilities
 
 # V4L2 pix flags
 replace define V4L2_PIX_FMT_PRIV_MAGIC :c:type:`v4l2_pix_format`
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 2c23f0b369e4..6cd65969c2b5 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -473,6 +473,7 @@ struct v4l2_capability {
 #define V4L2_CAP_VIDEO_CAPTURE 0x0001  /* Is a video capture device */
 #define V4L2_CAP_VIDEO_OUTPUT  0x0002  /* Is a video output device */
 #define V4L2_CAP_VIDEO_OVERLAY 0x0004  /* Can do video overlay */
+#define V4L2_CAP_AUDIO_M2M 0x0008  /* audio memory to memory */
 #define V4L2_CAP_VBI_CAPTURE   0x0010  /* Is a raw VBI capture device */
 #define V4L2_CAP_VBI_OUTPUT0x0020  /* Is a raw VBI output device */
 #define V4L2_CAP_SLICED_VBI_CAPTURE0x0040  /* Is a sliced VBI capture device */
-- 
2.34.1



[PATCH v12 05/15] ASoC: fsl_easrc: register m2m platform device

2024-01-18 Thread Shengjiu Wang
Register the m2m platform device so that users can
use the M2M feature.

Signed-off-by: Shengjiu Wang 
Acked-by: Mark Brown 
---
 sound/soc/fsl/fsl_easrc.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/sound/soc/fsl/fsl_easrc.c b/sound/soc/fsl/fsl_easrc.c
index cf7ad30a323b..ccbf45c7abf4 100644
--- a/sound/soc/fsl/fsl_easrc.c
+++ b/sound/soc/fsl/fsl_easrc.c
@@ -2075,6 +2075,7 @@ MODULE_DEVICE_TABLE(of, fsl_easrc_dt_ids);
 static int fsl_easrc_probe(struct platform_device *pdev)
 {
struct fsl_easrc_priv *easrc_priv;
+   struct fsl_asrc_m2m_pdata m2m_pdata;
struct device *dev = &pdev->dev;
struct fsl_asrc *easrc;
struct resource *res;
@@ -2190,6 +2191,19 @@ static int fsl_easrc_probe(struct platform_device *pdev)
goto err_pm_disable;
}
 
+   m2m_pdata.asrc = easrc;
+   m2m_pdata.fmt_in = FSL_EASRC_FORMATS;
+   m2m_pdata.fmt_out = FSL_EASRC_FORMATS | SNDRV_PCM_FMTBIT_IEC958_SUBFRAME_LE;
+   m2m_pdata.rate_min = 8000;
+   m2m_pdata.rate_max = 768000;
+   m2m_pdata.chan_min = 1;
+   m2m_pdata.chan_max = 32;
+   easrc->m2m_pdev = platform_device_register_data(&pdev->dev,
+   M2M_DRV_NAME,
+   PLATFORM_DEVID_AUTO,
+   &m2m_pdata,
+   sizeof(m2m_pdata));
+
return 0;
 
 err_pm_disable:
@@ -2199,6 +2213,11 @@ static int fsl_easrc_probe(struct platform_device *pdev)
 
 static void fsl_easrc_remove(struct platform_device *pdev)
 {
+   struct fsl_asrc *easrc = dev_get_drvdata(&pdev->dev);
+
+   if (easrc->m2m_pdev && !IS_ERR(easrc->m2m_pdev))
+   platform_device_unregister(easrc->m2m_pdev);
+
pm_runtime_disable(&pdev->dev);
 }
 
-- 
2.34.1



[PATCH v12 04/15] ASoC: fsl_asrc: register m2m platform device

2024-01-18 Thread Shengjiu Wang
Register the m2m platform device so that users can
use the M2M feature.

Define the platform data structure and the platform
driver name.

Signed-off-by: Shengjiu Wang 
Acked-by: Mark Brown 
---
 include/sound/fsl_asrc_common.h | 23 +++
 sound/soc/fsl/fsl_asrc.c| 18 ++
 2 files changed, 41 insertions(+)

diff --git a/include/sound/fsl_asrc_common.h b/include/sound/fsl_asrc_common.h
index 3b53d366182f..c709b8906929 100644
--- a/include/sound/fsl_asrc_common.h
+++ b/include/sound/fsl_asrc_common.h
@@ -71,6 +71,7 @@ struct fsl_asrc_pair {
  * @dma_params_rx: DMA parameters for receive channel
  * @dma_params_tx: DMA parameters for transmit channel
  * @pdev: platform device pointer
+ * @m2m_pdev: m2m platform device pointer
  * @regmap: regmap handler
  * @paddr: physical address to the base address of registers
  * @mem_clk: clock source to access register
@@ -103,6 +104,7 @@ struct fsl_asrc {
struct snd_dmaengine_dai_dma_data dma_params_rx;
struct snd_dmaengine_dai_dma_data dma_params_tx;
struct platform_device *pdev;
+   struct platform_device *m2m_pdev;
struct regmap *regmap;
unsigned long paddr;
struct clk *mem_clk;
@@ -139,6 +141,27 @@ struct fsl_asrc {
void *private;
 };
 
+/**
+ * struct fsl_asrc_m2m_pdata - platform data
+ * @asrc: pointer to struct fsl_asrc
+ * @fmt_in: input sample format
+ * @fmt_out: output sample format
+ * @chan_min: minimum channel number
+ * @chan_max: maximum channel number
+ * @rate_min: minimum rate
+ * @rate_max: maximum rate
+ */
+struct fsl_asrc_m2m_pdata {
+   struct fsl_asrc *asrc;
+   u64 fmt_in;
+   u64 fmt_out;
+   int chan_min;
+   int chan_max;
+   int rate_min;
+   int rate_max;
+};
+
+#define M2M_DRV_NAME "fsl_asrc_m2m"
 #define DRV_NAME "fsl-asrc-dai"
 extern struct snd_soc_component_driver fsl_asrc_component;
 
diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index 7d8643ee0ba0..5ecb5d869607 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -1187,6 +1187,7 @@ static int fsl_asrc_runtime_suspend(struct device *dev);
 static int fsl_asrc_probe(struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
+   struct fsl_asrc_m2m_pdata m2m_pdata;
struct fsl_asrc_priv *asrc_priv;
struct fsl_asrc *asrc;
struct resource *res;
@@ -1368,6 +1369,18 @@ static int fsl_asrc_probe(struct platform_device *pdev)
goto err_pm_get_sync;
}
 
+   m2m_pdata.asrc = asrc;
+   m2m_pdata.fmt_in = FSL_ASRC_FORMATS;
+   m2m_pdata.fmt_out = FSL_ASRC_FORMATS | SNDRV_PCM_FMTBIT_S8;
+   m2m_pdata.rate_min = 5512;
+   m2m_pdata.rate_max = 192000;
+   m2m_pdata.chan_min = 1;
+   m2m_pdata.chan_max = 10;
+   asrc->m2m_pdev = platform_device_register_data(&pdev->dev,
+  M2M_DRV_NAME,
+  PLATFORM_DEVID_AUTO,
+  &m2m_pdata,
+  sizeof(m2m_pdata));
return 0;
 
 err_pm_get_sync:
@@ -1380,6 +1393,11 @@ static int fsl_asrc_probe(struct platform_device *pdev)
 
 static void fsl_asrc_remove(struct platform_device *pdev)
 {
+   struct fsl_asrc *asrc = dev_get_drvdata(&pdev->dev);
+
+   if (asrc->m2m_pdev && !IS_ERR(asrc->m2m_pdev))
+   platform_device_unregister(asrc->m2m_pdev);
+
pm_runtime_disable(&pdev->dev);
if (!pm_runtime_status_suspended(&pdev->dev))
fsl_asrc_runtime_suspend(&pdev->dev);
-- 
2.34.1



[PATCH v12 03/15] ASoC: fsl_asrc: move fsl_asrc_common.h to include/sound

2024-01-18 Thread Shengjiu Wang
Move fsl_asrc_common.h to include/sound so that it can be
included from other drivers.

Signed-off-by: Shengjiu Wang 
Acked-by: Mark Brown 
---
 {sound/soc/fsl => include/sound}/fsl_asrc_common.h | 0
 sound/soc/fsl/fsl_asrc.h   | 2 +-
 sound/soc/fsl/fsl_asrc_dma.c   | 2 +-
 sound/soc/fsl/fsl_easrc.h  | 2 +-
 4 files changed, 3 insertions(+), 3 deletions(-)
 rename {sound/soc/fsl => include/sound}/fsl_asrc_common.h (100%)

diff --git a/sound/soc/fsl/fsl_asrc_common.h b/include/sound/fsl_asrc_common.h
similarity index 100%
rename from sound/soc/fsl/fsl_asrc_common.h
rename to include/sound/fsl_asrc_common.h
diff --git a/sound/soc/fsl/fsl_asrc.h b/sound/soc/fsl/fsl_asrc.h
index 1c492eb237f5..66544624de7b 100644
--- a/sound/soc/fsl/fsl_asrc.h
+++ b/sound/soc/fsl/fsl_asrc.h
@@ -10,7 +10,7 @@
 #ifndef _FSL_ASRC_H
 #define _FSL_ASRC_H
 
-#include  "fsl_asrc_common.h"
+#include  <sound/fsl_asrc_common.h>
 
 #define ASRC_M2M_INPUTFIFO_WML 0x4
 #define ASRC_M2M_OUTPUTFIFO_WML0x2
diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index f501f47242fb..f067bf1ecea7 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -12,7 +12,7 @@
 #include 
 #include 
 
-#include "fsl_asrc_common.h"
+#include <sound/fsl_asrc_common.h>
 
 #define FSL_ASRC_DMABUF_SIZE   (256 * 1024)
 
diff --git a/sound/soc/fsl/fsl_easrc.h b/sound/soc/fsl/fsl_easrc.h
index c9f770862662..a24e540876a4 100644
--- a/sound/soc/fsl/fsl_easrc.h
+++ b/sound/soc/fsl/fsl_easrc.h
@@ -9,7 +9,7 @@
 #include 
 #include 
 
-#include "fsl_asrc_common.h"
+#include <sound/fsl_asrc_common.h>
 
 /* EASRC Register Map */
 
-- 
2.34.1



[PATCH v12 02/15] ASoC: fsl_easrc: define functions for memory to memory usage

2024-01-18 Thread Shengjiu Wang
The ASRC can be used in the memory-to-memory case. Define several
functions for m2m usage and export them as function pointers.
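
One of these functions, fsl_easrc_m2m_calc_out_len() in the diff further
down, begins by expressing the input/output rate ratio in fixed point,
with the number of fractional bits depending on the resampler tap count
(39, 38 or 37 bits for 32, 64 or 128 taps). The sketch below reproduces
only that first step in plain C. The enum and function names are
hypothetical stand-ins, and the ratio_mod adjustment that the driver adds
afterwards is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's EASRC_RS_*_TAPS values. */
enum rs_taps_sketch { RS_32_TAPS, RS_64_TAPS, RS_128_TAPS };

/* Fractional bits per tap configuration, as in the switch in the diff. */
static int rs_frac_bits(enum rs_taps_sketch taps)
{
	switch (taps) {
	case RS_32_TAPS:
		return 39;	/* 5 integer bits */
	case RS_64_TAPS:
		return 38;	/* 6 integer bits */
	case RS_128_TAPS:
		return 37;	/* 7 integer bits */
	}
	return -1;		/* unknown tap configuration */
}

/* in_rate / out_rate as an unsigned fixed-point value with frac_bits. */
static uint64_t rate_ratio_fixed(uint32_t in_rate, uint32_t out_rate,
				 int frac_bits)
{
	assert(out_rate != 0);
	assert(frac_bits >= 0 && frac_bits <= 39);
	/* Keep the shifted value inside 64 bits. */
	assert(in_rate < (1u << 24));

	return ((uint64_t)in_rate << frac_bits) / out_rate;
}
```

With equal input and output rates the result is exactly 1 << frac_bits,
which is a convenient sanity check on the representation.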

Signed-off-by: Shengjiu Wang 
Acked-by: Mark Brown 
---
 sound/soc/fsl/fsl_easrc.c | 214 ++
 sound/soc/fsl/fsl_easrc.h |   4 +
 2 files changed, 218 insertions(+)

diff --git a/sound/soc/fsl/fsl_easrc.c b/sound/soc/fsl/fsl_easrc.c
index ec53bda46a46..cf7ad30a323b 100644
--- a/sound/soc/fsl/fsl_easrc.c
+++ b/sound/soc/fsl/fsl_easrc.c
@@ -1861,6 +1861,211 @@ static int fsl_easrc_get_fifo_addr(u8 dir, enum asrc_pair_index index)
return REG_EASRC_FIFO(dir, index);
 }
 
+/* Get sample numbers in FIFO */
+static unsigned int fsl_easrc_get_output_fifo_size(struct fsl_asrc_pair *pair)
+{
+   struct fsl_asrc *asrc = pair->asrc;
+   enum asrc_pair_index index = pair->index;
+   u32 val;
+
+   regmap_read(asrc->regmap, REG_EASRC_SFS(index), &val);
+   val &= EASRC_SFS_NSGO_MASK;
+
+   return val >> EASRC_SFS_NSGO_SHIFT;
+}
+
+static int fsl_easrc_m2m_prepare(struct fsl_asrc_pair *pair)
+{
+   struct fsl_easrc_ctx_priv *ctx_priv = pair->private;
+   struct fsl_asrc *asrc = pair->asrc;
+   struct device *dev = &asrc->pdev->dev;
+   int ret;
+
+   ctx_priv->in_params.sample_rate = pair->rate[IN];
+   ctx_priv->in_params.sample_format = pair->sample_format[IN];
+   ctx_priv->out_params.sample_rate = pair->rate[OUT];
+   ctx_priv->out_params.sample_format = pair->sample_format[OUT];
+
+   ctx_priv->in_params.fifo_wtmk = FSL_EASRC_INPUTFIFO_WML;
+   ctx_priv->out_params.fifo_wtmk = FSL_EASRC_OUTPUTFIFO_WML;
+   /* Fill the right half of the re-sampler with zeros */
+   ctx_priv->rs_init_mode = 0x2;
+   /* Zero fill the right half of the prefilter */
+   ctx_priv->pf_init_mode = 0x2;
+
+   ret = fsl_easrc_set_ctx_format(pair,
+  &ctx_priv->in_params.sample_format,
+  &ctx_priv->out_params.sample_format);
+   if (ret) {
+   dev_err(dev, "failed to set context format: %d\n", ret);
+   return ret;
+   }
+
+   ret = fsl_easrc_config_context(asrc, pair->index);
+   if (ret) {
+   dev_err(dev, "failed to config context %d\n", ret);
+   return ret;
+   }
+
+   ctx_priv->in_params.iterations = 1;
+   ctx_priv->in_params.group_len = pair->channels;
+   ctx_priv->in_params.access_len = pair->channels;
+   ctx_priv->out_params.iterations = 1;
+   ctx_priv->out_params.group_len = pair->channels;
+   ctx_priv->out_params.access_len = pair->channels;
+
+   ret = fsl_easrc_set_ctx_organziation(pair);
+   if (ret) {
+   dev_err(dev, "failed to set fifo organization\n");
+   return ret;
+   }
+
+   /* The context start flag */
+   pair->first_convert = 1;
+   return 0;
+}
+
+static int fsl_easrc_m2m_start(struct fsl_asrc_pair *pair)
+{
+   /* start context once */
+   if (pair->first_convert) {
+   fsl_easrc_start_context(pair);
+   pair->first_convert = 0;
+   }
+
+   return 0;
+}
+
+static int fsl_easrc_m2m_stop(struct fsl_asrc_pair *pair)
+{
+   /* Stop pair/context */
+   if (!pair->first_convert) {
+   fsl_easrc_stop_context(pair);
+   pair->first_convert = 1;
+   }
+
+   return 0;
+}
+
+/* calculate capture data length according to output data length and sample rate */
+static int fsl_easrc_m2m_calc_out_len(struct fsl_asrc_pair *pair, int input_buffer_length)
+{
+   struct fsl_asrc *easrc = pair->asrc;
+   struct fsl_easrc_priv *easrc_priv = easrc->private;
+   struct fsl_easrc_ctx_priv *ctx_priv = pair->private;
+   unsigned int in_rate = ctx_priv->in_params.norm_rate;
+   unsigned int out_rate = ctx_priv->out_params.norm_rate;
+   unsigned int channels = pair->channels;
+   unsigned int in_samples, out_samples;
+   unsigned int in_width, out_width;
+   unsigned int out_length;
+   unsigned int frac_bits;
+   u64 val1, val2;
+
+   switch (easrc_priv->rs_num_taps) {
+   case EASRC_RS_32_TAPS:
+   /* integer bits = 5; */
+   frac_bits = 39;
+   break;
+   case EASRC_RS_64_TAPS:
+   /* integer bits = 6; */
+   frac_bits = 38;
+   break;
+   case EASRC_RS_128_TAPS:
+   /* integer bits = 7; */
+   frac_bits = 37;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   val1 = (u64)in_rate << frac_bits;
+   do_div(val1, out_rate);
+   val1 += (s64)ctx_priv->ratio_mod << (frac_bits - 31);
+
+   in_width = snd_pcm_format_physical_width(ctx_priv->in_params.sample_format) / 8;
+   out_width = snd_pcm_format_physical_width(ctx_priv->out_params.sample_format) / 8;
+
+   ctx_priv->in_filled_len += input_b

[PATCH v12 00/15] Add audio support in v4l2 framework

2024-01-18 Thread Shengjiu Wang
Audio signal processing, like video, also needs memory-to-memory
support.

The ASRC memory-to-memory case (memory -> asrc -> memory) is a
non-real-time use case.

The user submits an input buffer to the asrc module; after conversion, the asrc
returns the output buffer to the user. So it is not a traditional ALSA playback
and capture case.
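
To give a feel for how the input and output buffer sizes relate in such a
conversion, here is a small sketch modelled on the arithmetic of
fsl_asrc_m2m_calc_out_len() from patch 01. It is an illustration under
assumptions: the function name and the tail_samples parameter are made up
for this example (the driver subtracts a fixed ASRC_OUTPUT_LAST_SAMPLE
whose value is not shown in this thread), and unlike the driver it uses a
64-bit intermediate and clamps the result at zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the output length in bytes for a given input length.
 * Sample widths are in bytes per sample; tail_samples is the number of
 * output samples held back at the end (hypothetical parameter).
 */
static unsigned int calc_out_len_sketch(unsigned int in_len_bytes,
					unsigned int in_width,
					unsigned int out_width,
					unsigned int channels,
					unsigned int rate_in,
					unsigned int rate_out,
					unsigned int tail_samples)
{
	unsigned int in_samples, out_samples;

	assert(in_width != 0 && out_width != 0);
	assert(channels != 0 && rate_in != 0);

	/* Frames available in the input buffer. */
	in_samples = in_len_bytes / in_width / channels;

	/* Scale by the rate ratio; 64-bit math avoids overflow. */
	out_samples = (unsigned int)((uint64_t)rate_out * in_samples / rate_in);

	if (out_samples <= tail_samples)
		return 0;

	return (out_samples - tail_samples) * out_width * channels;
}
```

For example, 4096 bytes of 16-bit stereo input at 8 kHz hold 1024 frames;
converting to 16 kHz yields 2048 frames, and holding back 8 of them leaves
2040 frames, or 8160 bytes of 16-bit stereo output.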

It is a specific use case with no existing reference in the current kernel.
v4l2 memory to memory is the closest implementation. v4l2 currently
supports video, image, radio, tuner and touch devices, so it is not
complicated to add support for this specific audio case.

We have already implemented the "memory -> asrc -> i2s device -> codec"
use case in ALSA. The "memory -> asrc -> memory" case needs
to reuse the code in the asrc driver, so the first 3 patches refine
the code so that it can be shared with the "memory -> asrc -> memory"
driver.

The main change is on the v4l2 side: a /dev/v4l-audioX node will be created,
and user applications only use the ioctls of the v4l2 framework.

The other change is to add memory-to-memory support for the two kinds of
i.MX ASRC module.

changes in v12
- minor changes according to comments
- drop min_buffers_needed = 1 and V4L2_CTRL_FLAG_UPDATE flag
- drop bus_info

changes in v11
- add add-fixed-point-test-controls in vivid.
- add v4l2_ctrl_fp_compose() helper function for min and max

changes in v10
- remove FIXED_POINT type
- change code based on media: v4l2-ctrls: add support for fraction_bits
- fix issue reported by kernel test robot
- remove module_alias

changes in v9:
- add MEDIA_ENT_F_PROC_AUDIO_RESAMPLER.
- add MEDIA_INTF_T_V4L_AUDIO
- add media controller support
- refine the vim2m-audio to support 8k<->16k conversion.

changes in v8:
- refine V4L2_CAP_AUDIO_M2M to be 0x0008
- update doc for FIXED_POINT
- address comments for imx-asrc

changes in v7:
- add acked-by from Mark
- separate commit for fixed point, m2m audio class, audio rate controls
- use INTEGER_MENU for rate,  FIXED_POINT for rate offset
- remove used fmts
- address other comments from Hans

changes in v6:
- use m2m_prepare/m2m_unprepare/m2m_start/m2m_stop to replace
  m2m_start_part_one/m2m_stop_part_one, m2m_start_part_two/m2m_stop_part_two.
- change V4L2_CTRL_TYPE_ASRC_RATE to V4L2_CTRL_TYPE_FIXED_POINT
- fix warning reported by kernel test robot
- remove some unused format V4L2_AUDIO_FMT_XX
- Get SNDRV_PCM_FORMAT from V4L2_AUDIO_FMT in driver.
- rename audm2m to viaudm2m.

changes in v5:
- remove V4L2_AUDIO_FMT_LPCM
- define audio pixel format like V4L2_AUDIO_FMT_S8...
- remove rate and format in struct v4l2_audio_format.
- Add V4L2_CID_ASRC_SOURCE_RATE and V4L2_CID_ASRC_DEST_RATE controls
- update document accordingly.

changes in v4:
- update document style
- separate V4L2_AUDIO_FMT_LPCM and V4L2_CAP_AUDIO_M2M in separate commit

changes in v3:
- Modify documents for adding audio m2m support
- Add audio virtual m2m driver
- Defined V4L2_AUDIO_FMT_LPCM format type for audio.
- Defined V4L2_CAP_AUDIO_M2M capability type for audio m2m case.
- with modification in v4l-utils, pass v4l2-compliance test.

changes in v2:
- decouple the implementation in v4l2 and ALSA
- implement the memory to memory driver as a platform driver
  and move it to driver/media
- move fsl_asrc_common.h to include/sound folder

Shengjiu Wang (15):
  ASoC: fsl_asrc: define functions for memory to memory usage
  ASoC: fsl_easrc: define functions for memory to memory usage
  ASoC: fsl_asrc: move fsl_asrc_common.h to include/sound
  ASoC: fsl_asrc: register m2m platform device
  ASoC: fsl_easrc: register m2m platform device
  media: uapi: Add V4L2_CAP_AUDIO_M2M capability flag
  media: v4l2: Add audio capture and output support
  media: uapi: Define audio sample format fourcc type
  media: uapi: Add V4L2_CTRL_CLASS_M2M_AUDIO
  media: uapi: Add audio rate controls support
  media: uapi: Declare interface types for Audio
  media: uapi: Add an entity type for audio resampler
  media: vivid: add fixed point test controls
  media: imx-asrc: Add memory to memory driver
  media: vim2m-audio: add virtual driver for audio memory to memory

 .../media/mediactl/media-types.rst|   11 +
 .../userspace-api/media/v4l/buffer.rst|6 +
 .../userspace-api/media/v4l/common.rst|1 +
 .../media/v4l/dev-audio-mem2mem.rst   |   71 +
 .../userspace-api/media/v4l/devices.rst   |1 +
 .../media/v4l/ext-ctrls-audio-m2m.rst |   41 +
 .../userspace-api/media/v4l/pixfmt-audio.rst  |   87 ++
 .../userspace-api/media/v4l/pixfmt.rst|1 +
 .../media/v4l/vidioc-enum-fmt.rst |2 +
 .../media/v4l/vidioc-g-ext-ctrls.rst  |4 +
 .../userspace-api/media/v4l/vidioc-g-fmt.rst  |4 +
 .../media/v4l/vidioc-querycap.rst |3 +
 .../media/videodev2.h.rst.exceptions  |3 +
 .../media/common/videobuf2/videobuf2-v4l2.c   |4 +
 drivers/media/platform/nxp/Kconfig|   13 +
 drivers/media/platform/nxp/Makefile   |1 +
 drivers/media/platf

[PATCH v12 01/15] ASoC: fsl_asrc: define functions for memory to memory usage

2024-01-18 Thread Shengjiu Wang
The ASRC can be used in the memory-to-memory case. Define several
functions for m2m usage.

m2m_prepare: prepare for the start step
m2m_start: the start step
m2m_unprepare: unprepare for stop step, optional
m2m_stop: stop step
m2m_check_format: check whether the format is supported
m2m_calc_out_len: calculate output length according to input length
m2m_get_maxburst: burst size for dma
m2m_pair_suspend: suspend function of pair, optional.
m2m_pair_resume: resume function of pair
get_output_fifo_size: get remaining data size in FIFO
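
The hooks listed above could be grouped into a callback table along the
following lines. This is only a sketch: the struct, the stub callbacks
and the call order in m2m_run_once() are assumptions made for
illustration, not the layout used by the driver. The stubs do reproduce
the first_convert handling visible in the diff, where start only kicks
the hardware once and stop re-arms the flag.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for struct fsl_asrc_pair. */
struct pair_sketch {
	int first_convert;	/* 1 until the pair has been started */
	int hw_running;		/* stand-in for the hardware state */
};

/* Hypothetical callback table covering a few of the listed hooks. */
struct m2m_ops_sketch {
	int (*prepare)(struct pair_sketch *pair);
	int (*start)(struct pair_sketch *pair);
	int (*stop)(struct pair_sketch *pair);
	int (*unprepare)(struct pair_sketch *pair);	/* optional, may be NULL */
};

static int sketch_prepare(struct pair_sketch *pair)
{
	pair->first_convert = 1;
	return 0;
}

static int sketch_start(struct pair_sketch *pair)
{
	/* Start the hardware only on the first conversion. */
	if (pair->first_convert) {
		pair->hw_running = 1;
		pair->first_convert = 0;
	}
	return 0;
}

static int sketch_stop(struct pair_sketch *pair)
{
	if (!pair->first_convert) {
		pair->hw_running = 0;
		pair->first_convert = 1;
	}
	return 0;
}

static const struct m2m_ops_sketch sketch_ops = {
	.prepare = sketch_prepare,
	.start = sketch_start,
	.stop = sketch_stop,
	.unprepare = NULL,	/* optional hook left unset */
};

/* Run one prepare/start/stop cycle, skipping unset optional hooks. */
static int m2m_run_once(const struct m2m_ops_sketch *ops,
			struct pair_sketch *pair)
{
	int ret;

	assert(ops != NULL && pair != NULL);

	ret = ops->prepare(pair);
	if (ret)
		return ret;
	ret = ops->start(pair);
	if (ret)
		return ret;
	ret = ops->stop(pair);
	if (ret)
		return ret;
	if (ops->unprepare)
		ret = ops->unprepare(pair);
	return ret;
}
```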

Signed-off-by: Shengjiu Wang 
Acked-by: Mark Brown 
---
 sound/soc/fsl/fsl_asrc.c| 126 
 sound/soc/fsl/fsl_asrc.h|   2 +
 sound/soc/fsl/fsl_asrc_common.h |  37 ++
 3 files changed, 165 insertions(+)

diff --git a/sound/soc/fsl/fsl_asrc.c b/sound/soc/fsl/fsl_asrc.c
index b793263291dc..7d8643ee0ba0 100644
--- a/sound/soc/fsl/fsl_asrc.c
+++ b/sound/soc/fsl/fsl_asrc.c
@@ -1063,6 +1063,124 @@ static int fsl_asrc_get_fifo_addr(u8 dir, enum asrc_pair_index index)
return REG_ASRDx(dir, index);
 }
 
+/* Get sample numbers in FIFO */
+static unsigned int fsl_asrc_get_output_fifo_size(struct fsl_asrc_pair *pair)
+{
+   struct fsl_asrc *asrc = pair->asrc;
+   enum asrc_pair_index index = pair->index;
+   u32 val;
+
+   regmap_read(asrc->regmap, REG_ASRFST(index), &val);
+
+   val &= ASRFSTi_OUTPUT_FIFO_MASK;
+
+   return val >> ASRFSTi_OUTPUT_FIFO_SHIFT;
+}
+
+static int fsl_asrc_m2m_prepare(struct fsl_asrc_pair *pair)
+{
+   struct fsl_asrc_pair_priv *pair_priv = pair->private;
+   struct fsl_asrc *asrc = pair->asrc;
+   struct device *dev = &asrc->pdev->dev;
+   struct asrc_config config;
+   int ret;
+
+   /* fill config */
+   config.pair = pair->index;
+   config.channel_num = pair->channels;
+   config.input_sample_rate = pair->rate[IN];
+   config.output_sample_rate = pair->rate[OUT];
+   config.input_format = pair->sample_format[IN];
+   config.output_format = pair->sample_format[OUT];
+   config.inclk = INCLK_NONE;
+   config.outclk = OUTCLK_ASRCK1_CLK;
+
+   pair_priv->config = &config;
+   ret = fsl_asrc_config_pair(pair, true);
+   if (ret) {
+   dev_err(dev, "failed to config pair: %d\n", ret);
+   return ret;
+   }
+
+   pair->first_convert = 1;
+
+   return 0;
+}
+
+static int fsl_asrc_m2m_start(struct fsl_asrc_pair *pair)
+{
+   if (pair->first_convert) {
+   fsl_asrc_start_pair(pair);
+   pair->first_convert = 0;
+   }
+   /*
+* Clear DMA request during the stall state of ASRC:
+* During STALL state, the remaining in input fifo would never be
+* smaller than the input threshold while the output fifo would not
+* be bigger than output one. Thus the DMA request would be cleared.
+*/
+   fsl_asrc_set_watermarks(pair, ASRC_FIFO_THRESHOLD_MIN,
+   ASRC_FIFO_THRESHOLD_MAX);
+
+   /* Update the real input threshold to raise DMA request */
+   fsl_asrc_set_watermarks(pair, ASRC_M2M_INPUTFIFO_WML,
+   ASRC_M2M_OUTPUTFIFO_WML);
+
+   return 0;
+}
+
+static int fsl_asrc_m2m_stop(struct fsl_asrc_pair *pair)
+{
+   if (!pair->first_convert) {
+   fsl_asrc_stop_pair(pair);
+   pair->first_convert = 1;
+   }
+
+   return 0;
+}
+
+/* calculate capture data length according to output data length and sample rate */
+static int fsl_asrc_m2m_calc_out_len(struct fsl_asrc_pair *pair, int input_buffer_length)
+{
+   unsigned int in_width, out_width;
+   unsigned int channels = pair->channels;
+   unsigned int in_samples, out_samples;
+   unsigned int out_length;
+
+   in_width = snd_pcm_format_physical_width(pair->sample_format[IN]) / 8;
+   out_width = snd_pcm_format_physical_width(pair->sample_format[OUT]) / 8;
+
+   in_samples = input_buffer_length / in_width / channels;
+   out_samples = pair->rate[OUT] * in_samples / pair->rate[IN];
+   out_length = (out_samples - ASRC_OUTPUT_LAST_SAMPLE) * out_width * channels;
+
+   return out_length;
+}
+
+static int fsl_asrc_m2m_get_maxburst(u8 dir, struct fsl_asrc_pair *pair)
+{
+   struct fsl_asrc *asrc = pair->asrc;
+   struct fsl_asrc_priv *asrc_priv = asrc->private;
+   int wml = (dir == IN) ? ASRC_M2M_INPUTFIFO_WML : ASRC_M2M_OUTPUTFIFO_WML;
+
+   if (!asrc_priv->soc->use_edma)
+   return wml * pair->channels;
+   else
+   return 1;
+}
+
+static int fsl_asrc_m2m_pair_resume(struct fsl_asrc_pair *pair)
+{
+   struct fsl_asrc *asrc = pair->asrc;
+   int i;
+
+   for (i = 0; i < pair->channels * 4; i++)
+   regmap_write(asrc->regmap, REG_ASRDI(pair->index), 0);
+
+   pair->first_convert = 1;
+   return 0;
+}
+
 static int fsl_asrc_runtime_resume(struct device *dev);
 static in

Re: [PATCH -fixes v2] RISC-V: KVM: Require HAVE_KVM

2024-01-18 Thread Anup Patel
On Thu, Jan 4, 2024 at 6:07 PM Andrew Jones  wrote:
>
> KVM requires EVENTFD, which is selected by HAVE_KVM. Other KVM
> supporting architectures select HAVE_KVM and then their KVM
> Kconfigs ensure it's there with a depends on HAVE_KVM. Make RISCV
> consistent with that approach which fixes configs which have KVM
> but not EVENTFD, as was discovered with a randconfig test.
>
> Fixes: 99cdc6c18c2d ("RISC-V: Add initial skeletal KVM support")
> Reported-by: Randy Dunlap 
> Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d382553...@infradead.org/
> Signed-off-by: Andrew Jones 

Queued this patch for Linux-6.8

Regards,
Anup

> ---
>
> v2:
>  - Added Fixes tag and -fixes prefix [Alexandre/Anup]
>
>  arch/riscv/Kconfig | 1 +
>  arch/riscv/kvm/Kconfig | 2 +-
>  2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index a935a5f736b9..daba06a3b76f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -128,6 +128,7 @@ config RISCV
> select HAVE_KPROBES if !XIP_KERNEL
> select HAVE_KPROBES_ON_FTRACE if !XIP_KERNEL
> select HAVE_KRETPROBES if !XIP_KERNEL
> +   select HAVE_KVM
> # https://github.com/ClangBuiltLinux/linux/issues/1881
> select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if !LD_IS_LLD
> select HAVE_MOVE_PMD
> diff --git a/arch/riscv/kvm/Kconfig b/arch/riscv/kvm/Kconfig
> index 1fd76aee3b71..36fa8ec9e5ba 100644
> --- a/arch/riscv/kvm/Kconfig
> +++ b/arch/riscv/kvm/Kconfig
> @@ -19,7 +19,7 @@ if VIRTUALIZATION
>
>  config KVM
> tristate "Kernel-based Virtual Machine (KVM) support (EXPERIMENTAL)"
> -   depends on RISCV_SBI && MMU
> +   depends on HAVE_KVM && RISCV_SBI && MMU
> select HAVE_KVM_IRQCHIP
> select HAVE_KVM_IRQ_ROUTING
> select HAVE_KVM_MSI
> --
> 2.43.0
>


[PATCH 1/1] PCI/DPC: Fix TLP Prefix register reading offset

2024-01-18 Thread Ilpo Järvinen
The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec
7.9.14.13) but the loop in dpc_process_rp_pio_error() keeps reading
from the first DWORD. Add the iteration count based offset calculation
into the config read.
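
The effect of the fix can be shown with a small userspace model, given
here as a sketch under assumptions: config space is modelled as a byte
array, and the base offset and helper names are hypothetical rather than
the real register offset or PCI accessors. Each loop iteration reads the
DWORD at base + i * 4, so successive iterations return successive log
entries instead of the first one repeatedly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical offset of the first prefix log DWORD in the model. */
#define PREFIX_LOG_BASE_SKETCH 0x10

/* Read one 32-bit value from the modelled config space. */
static uint32_t cfg_read_dword(const uint8_t *cfg, size_t cfg_len,
			       unsigned int off)
{
	uint32_t v;

	assert(cfg != NULL);
	assert(off + sizeof(v) <= cfg_len);
	memcpy(&v, cfg + off, sizeof(v));
	return v;
}

/* Read ndw consecutive log DWORDs, advancing 4 bytes per iteration. */
static void read_prefix_log(const uint8_t *cfg, size_t cfg_len,
			    uint32_t *out, unsigned int ndw)
{
	unsigned int i;

	for (i = 0; i < ndw; i++)
		out[i] = cfg_read_dword(cfg, cfg_len,
					PREFIX_LOG_BASE_SKETCH + i * 4);
}
```

Dropping the "+ i * 4" term in the loop would fill every element of out
with the value of the first DWORD, which is the behaviour the patch
corrects.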

Fixes: f20c4ea49ec4 ("PCI/DPC: Add eDPC support")
Signed-off-by: Ilpo Järvinen 
---
 drivers/pci/pcie/dpc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/dpc.c b/drivers/pci/pcie/dpc.c
index 94111e438241..e5d7c12854fa 100644
--- a/drivers/pci/pcie/dpc.c
+++ b/drivers/pci/pcie/dpc.c
@@ -234,7 +234,7 @@ static void dpc_process_rp_pio_error(struct pci_dev *pdev)
 
for (i = 0; i < pdev->dpc_rp_log_size - 5; i++) {
pci_read_config_dword(pdev,
-   cap + PCI_EXP_DPC_RP_PIO_TLPPREFIX_LOG, &prefix);
+   cap + PCI_EXP_DPC_RP_PIO_TLPPREFIX_LOG + i * 4, &prefix);
pci_err(pdev, "TLP Prefix Header: dw%d, %#010x\n", i, prefix);
}
  clear_status:
-- 
2.39.2



Re: [PATCH] init: refactor the generic cpu_to_node for NUMA

2024-01-18 Thread Greg KH
On Thu, Jan 18, 2024 at 11:14:12AM +0800, Huang Shijie wrote:
> (0) We list the ARCHs which support the NUMA:
>arm64, loongarch, powerpc, riscv,
>sparc, mips, s390, x86,

I do not understand this format, what are you saying here?

Have you read the kernel documentation for how to write changelog texts?
It doesn't say "list a bunch of things", it's a bit more descriptive.

> 
> (1) Some ARCHs in (0) override the generic cpu_to_node(), such as:
>sparc, mips, s390, x86.
> 
> Since these ARCHs have their own cpu_to_node(), we do not care
> about them.
> 
> (2) The ARCHs enable NUMA and use the generic cpu_to_node.
> From (0) and (1), we can know that four ARCHs support NUMA and
> use the generic cpu_to_node:
> arm64, loongarch, powerpc, riscv,
> 
> The generic cpu_to_node depends on percpu "numa_node".
> 
> (2.1) The loongarch sets "numa_node" in:
>   start_kernel --> smp_prepare_boot_cpu()
> 
> (2.2) The arm64, powerpc, riscv set "numa_node" in:
> start_kernel --> arch_call_rest_init() --> rest_init()
>  --> kernel_init() --> kernel_init_freeable()
>--> smp_prepare_cpus()
> 
> (2.3) The first place calling the cpu_to_node() is early_trace_init():
>   start_kernel --> early_trace_init()--> __ring_buffer_alloc()
>  --> rb_allocate_cpu_buffer()
> 
> (2.4) So it is safe for loongarch. But for arm64, powerpc and riscv,
>   there are at least four places in the common code where
> the cpu_to_node() is called before it is initialized:
>  a.) early_trace_init() in kernel/trace/trace.c
>  b.) sched_init()   in kernel/sched/core.c
>  c.) init_sched_fair_class()in kernel/sched/fair.c
>  d.) workqueue_init_early() in kernel/workqueue.c
> 
> (3) In order to fix the issue, the patch refactors the generic cpu_to_node():
>     (3.1) change cpu_to_node to a function pointer,
>           and export it for kernel modules.
> 
>     (3.2) introduce _cpu_to_node(), which is the original cpu_to_node().
> 
>     (3.3) introduce smp_prepare_boot_cpu_start() to wrap the original
>           smp_prepare_boot_cpu(), and set cpu_to_node to
>           early_cpu_to_node(), which works fine for arm64, powerpc,
>           riscv and loongarch.
> 
>     (3.4) introduce smp_prepare_cpus_done() to wrap the original
>           smp_prepare_cpus().
>           "numa_node" is ready after smp_prepare_cpus(),
>           so cpu_to_node is then set to _cpu_to_node().

When you start listing different things in a changelog, that's a hint to
the reviewer to say "please break this up" as patches need to do only
one thing at a time.  As I can't follow the above text at all, that's
all the review comments I'm able to give here, sorry.

But as-is, this isn't acceptable :(

thanks,

greg k-h
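
For readers trying to follow item (3) of the quoted changelog, the idea of switching a cpu_to_node function pointer between an early lookup and the percpu lookup can be sketched in user space roughly as below. This is only an illustration under stated assumptions, not the patch itself: every name here (early_node_map, percpu_numa_node, cpu_to_node_fn, boot_cpu_start, prepare_cpus_done, check_boot_order) is hypothetical, and the two hooks merely stand in for the smp_prepare_boot_cpu_start() and smp_prepare_cpus_done() wrappers that the changelog describes.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS 4

/* Hypothetical early table, assumed to be filled from firmware topology. */
static const int early_node_map[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-in for the percpu "numa_node": all zero until the late hook runs. */
static int percpu_numa_node[NR_CPUS];

int early_lookup(int cpu)  { return early_node_map[cpu]; }
int percpu_lookup(int cpu) { return percpu_numa_node[cpu]; }

/* (3.1) cpu_to_node becomes a function pointer; it starts at the percpu lookup. */
int (*cpu_to_node_fn)(int cpu) = percpu_lookup;

/* (3.3) Early hook: answer from the early table until percpu data is ready. */
void boot_cpu_start(void)
{
	cpu_to_node_fn = early_lookup;
}

/* (3.4) Late hook: populate the percpu values, then switch back to them. */
void prepare_cpus_done(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		percpu_numa_node[cpu] = early_node_map[cpu];
	cpu_to_node_fn = percpu_lookup;
}

/* Walk the boot order once, checking the answer for CPU 2 at each stage. */
void check_boot_order(void)
{
	memset(percpu_numa_node, 0, sizeof(percpu_numa_node));
	cpu_to_node_fn = percpu_lookup;

	assert(cpu_to_node_fn(2) == 0);	/* symptom: node 0 before setup */
	boot_cpu_start();
	assert(cpu_to_node_fn(2) == 1);	/* early table gives the real node */
	prepare_cpus_done();
	assert(cpu_to_node_fn(2) == 1);	/* percpu copy now agrees */
}
```

Calling check_boot_order() from a small main() exercises the three stages in order. The sketch also makes the trade-off raised earlier in the thread visible: every lookup now goes through an indirect call instead of an inline percpu read.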


Re: [PATCH v11 15/15] media: vim2m-audio: add virtual driver for audio memory to memory

2024-01-18 Thread Shengjiu Wang
On Thu, Jan 18, 2024 at 3:56 PM Hans Verkuil wrote:
>
> On 18/01/2024 07:13, Shengjiu Wang wrote:
> > On Wed, Jan 17, 2024 at 6:32 PM Hans Verkuil wrote:
> >>
> >> On 22/11/2023 08:23, Shengjiu Wang wrote:
> >>> The audio memory-to-memory virtual driver uses the video
> >>> memory-to-memory virtual driver vim2m.c as an example. The main
> >>> difference is that the device type is VFL_TYPE_AUDIO and the device
> >>> cap type is V4L2_CAP_AUDIO_M2M.
> >>>
> >>> The device_run function is a dummy function that simply copies
> >>> the data from the input buffer to the output buffer.
> >>>
> >>> Signed-off-by: Shengjiu Wang 
> >>> ---
> >>>  drivers/media/test-drivers/Kconfig   |  11 +
> >>>  drivers/media/test-drivers/Makefile  |   1 +
> >>>  drivers/media/test-drivers/vim2m-audio.c | 799 +++
> >>>  3 files changed, 811 insertions(+)
> >>>  create mode 100644 drivers/media/test-drivers/vim2m-audio.c
> >>>
> >>> diff --git a/drivers/media/test-drivers/Kconfig b/drivers/media/test-drivers/Kconfig
> >>> index 459b433e9fae..55f8af6ee4e2 100644
> >>> --- a/drivers/media/test-drivers/Kconfig
> >>> +++ b/drivers/media/test-drivers/Kconfig
> >>> @@ -17,6 +17,17 @@ config VIDEO_VIM2M
> >>> This is a virtual test device for the memory-to-memory driver
> >>> framework.
> >>>
> >>> +config VIDEO_VIM2M_AUDIO
> >>> + tristate "Virtual Memory-to-Memory Driver For Audio"
> >>> + depends on VIDEO_DEV
> >>> + select VIDEOBUF2_VMALLOC
> >>> + select V4L2_MEM2MEM_DEV
> >>> + select MEDIA_CONTROLLER
> >>> + select MEDIA_CONTROLLER_REQUEST_API
> >>
> >> Drop this. This option has been removed.
> >>
> >>> + help
> >>> +   This is a virtual audio test device for the memory-to-memory driver
> >>> +   framework.
> >>> +
> >>>  source "drivers/media/test-drivers/vicodec/Kconfig"
> >>>  source "drivers/media/test-drivers/vimc/Kconfig"
> >>>  source "drivers/media/test-drivers/vivid/Kconfig"
> >>> diff --git a/drivers/media/test-drivers/Makefile b/drivers/media/test-drivers/Makefile
> >>> index 740714a4584d..0c61c9ada3e1 100644
> >>> --- a/drivers/media/test-drivers/Makefile
> >>> +++ b/drivers/media/test-drivers/Makefile
> >>> @@ -10,6 +10,7 @@ obj-$(CONFIG_DVB_VIDTV) += vidtv/
> >>>
> >>>  obj-$(CONFIG_VIDEO_VICODEC) += vicodec/
> >>>  obj-$(CONFIG_VIDEO_VIM2M) += vim2m.o
> >>> +obj-$(CONFIG_VIDEO_VIM2M_AUDIO) += vim2m-audio.o
> >>>  obj-$(CONFIG_VIDEO_VIMC) += vimc/
> >>>  obj-$(CONFIG_VIDEO_VIVID) += vivid/
> >>>  obj-$(CONFIG_VIDEO_VISL) += visl/
> >>> diff --git a/drivers/media/test-drivers/vim2m-audio.c b/drivers/media/test-drivers/vim2m-audio.c
> >>> new file mode 100644
> >>> index ..72806ada8628
> >>> --- /dev/null
> >>> +++ b/drivers/media/test-drivers/vim2m-audio.c
> >>> @@ -0,0 +1,799 @@
> >>> +// SPDX-License-Identifier: GPL-2.0+
> >>> +/*
> >>> + * A virtual v4l2-mem2mem example for audio device.
> >>> + */
> >>> +
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>> +
> >>> +MODULE_DESCRIPTION("Virtual device for audio mem2mem testing");
> >>> +MODULE_LICENSE("GPL");
> >>> +
> >>> +static unsigned int debug;
> >>> +module_param(debug, uint, 0644);
> >>> +MODULE_PARM_DESC(debug, "debug level");
> >>> +
> >>> +#define MEM2MEM_NAME "vim2m-audio"
> >>> +
> >>> +#define dprintk(dev, lvl, fmt, arg...) \
> >>> + v4l2_dbg(lvl, debug, &(dev)->v4l2_dev, "%s: " fmt, __func__, ## arg)
> >>> +
> >>> +#define SAMPLE_NUM 4096
> >>> +
> >>> +static void audm2m_dev_release(struct device *dev)
> >>> +{}
> >>> +
> >>> +static struct platform_device audm2m_pdev = {
> >>> + .name   = MEM2MEM_NAME,
> >>> + .dev.release = audm2m_dev_release,
> >>> +};
> >>> +
> >>> +static u32 formats[] = {
> >>> + V4L2_AUDIO_FMT_S16_LE,
> >>> +};
> >>> +
> >>> +#define NUM_FORMATS ARRAY_SIZE(formats)
> >>> +
> >>> +/* Per-queue, driver-specific private data */
> >>> +struct audm2m_q_data {
> >>> + unsigned int rate;
> >>> + unsigned int channels;
> >>> + unsigned int buffersize;
> >>> + unsigned int sequence;
> >>> + u32 fourcc;
> >>> +};
> >>> +
> >>> +enum {
> >>> + V4L2_M2M_SRC = 0,
> >>> + V4L2_M2M_DST = 1,
> >>> +};
> >>> +
> >>> +static snd_pcm_format_t find_format(u32 fourcc)
> >>> +{
> >>> +	snd_pcm_format_t fmt;
> >>> +	unsigned int k;
> >>> +
> >>> +	for (k = 0; k < NUM_FORMATS; k++) {
> >>> +		if (formats[k] == fourcc)
> >>> +			break;
> >>> +	}
> >>> +
> >>> +	if (k == NUM_FORMATS)
> >>> +		return 0;
> >>> +
> >>> +	fmt = v4l2_fourcc_to_audfmt(formats[k]);
> >>> +
> >>> +	return fmt;
> >>> +}
> >>> +
> >>> +struct audm2m_dev {
> >>> + struct v4l2_device  v4l2_dev;
> >>> + struct video_device