Re: [PATCH] PowerPC: Replace kretprobe with rethook

2024-05-16 Thread kernel test robot
Hi Abhishek,

kernel test robot noticed the following build errors:

[auto build test ERROR on powerpc/next]
[also build test ERROR on powerpc/fixes linus/master v6.9 next-20240516]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Abhishek-Dubey/PowerPC-Replace-kretprobe-with-rethook/20240516-214818
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:
https://lore.kernel.org/r/20240516134646.1059114-1-adubey%40linux.ibm.com
patch subject: [PATCH] PowerPC: Replace kretprobe with rethook
config: powerpc-asp8347_defconfig 
(https://download.01.org/0day-ci/archive/20240517/202405171203.casoixjg-...@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 
6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240517/202405171203.casoixjg-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405171203.casoixjg-...@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: arch_rethook_trampoline
   >>> referenced by stacktrace.c
   >>>               arch/powerpc/kernel/stacktrace.o:(arch_stack_walk_reliable) in archive vmlinux.a
   >>> referenced by stacktrace.c
   >>>               arch/powerpc/kernel/stacktrace.o:(arch_stack_walk_reliable) in archive vmlinux.a

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH] PowerPC: Replace kretprobe with rethook

2024-05-16 Thread kernel test robot
Hi Abhishek,

kernel test robot noticed the following build errors:

[auto build test ERROR on powerpc/next]
[also build test ERROR on powerpc/fixes linus/master v6.9 next-20240516]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Abhishek-Dubey/PowerPC-Replace-kretprobe-with-rethook/20240516-214818
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:
https://lore.kernel.org/r/20240516134646.1059114-1-adubey%40linux.ibm.com
patch subject: [PATCH] PowerPC: Replace kretprobe with rethook
config: powerpc-allnoconfig 
(https://download.01.org/0day-ci/archive/20240517/202405171247.spwntdjg-...@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240517/202405171247.spwntdjg-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405171247.spwntdjg-...@intel.com/

All errors (new ones prefixed by >>):

   powerpc-linux-ld: arch/powerpc/kernel/stacktrace.o: in function `arch_stack_walk_reliable':
>> stacktrace.c:(.text+0x172): undefined reference to `arch_rethook_trampoline'
>> powerpc-linux-ld: stacktrace.c:(.text+0x17e): undefined reference to `arch_rethook_trampoline'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


[powerpc:next] BUILD SUCCESS 61700f816e6f58f6b1aaa881a69a784d146e30f0

2024-05-16 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 61700f816e6f58f6b1aaa881a69a784d146e30f0  powerpc/fadump: Fix section mismatch warning

elapsed time: 734m

configs tested: 190
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

tested configs:
alpha    allnoconfig                         gcc
alpha    allyesconfig                        gcc
alpha    defconfig                           gcc
arc      allmodconfig                        gcc
arc      allnoconfig                         gcc
arc      allyesconfig                        gcc
arc      defconfig                           gcc
arc      randconfig-001-20240517             gcc
arc      randconfig-002-20240517             gcc
arm      allmodconfig                        gcc
arm      allnoconfig                         clang
arm      allyesconfig                        gcc
arm      collie_defconfig                    gcc
arm      davinci_all_defconfig               clang
arm      defconfig                           clang
arm      dove_defconfig                      gcc
arm      hisi_defconfig                      gcc
arm      lpc32xx_defconfig                   clang
arm      nhk8815_defconfig                   clang
arm      pxa_defconfig                       gcc
arm      randconfig-001-20240517             clang
arm      randconfig-002-20240517             clang
arm      randconfig-003-20240517             clang
arm      randconfig-004-20240517             clang
arm      sama7_defconfig                     clang
arm      spear13xx_defconfig                 gcc
arm      tegra_defconfig                     gcc
arm      vf610m4_defconfig                   gcc
arm      wpcm450_defconfig                   gcc
arm64    allmodconfig                        clang
arm64    allnoconfig                         gcc
arm64    defconfig                           gcc
arm64    randconfig-001-20240517             clang
arm64    randconfig-002-20240517             gcc
arm64    randconfig-003-20240517             clang
arm64    randconfig-004-20240517             clang
csky     allmodconfig                        gcc
csky     allnoconfig                         gcc
csky     allyesconfig                        gcc
csky     defconfig                           gcc
csky     randconfig-001-20240517             gcc
csky     randconfig-002-20240517             gcc
hexagon  allmodconfig                        clang
hexagon  allnoconfig                         clang
hexagon  allyesconfig                        clang
hexagon  defconfig                           clang
hexagon  randconfig-001-20240517             clang
hexagon  randconfig-002-20240517             clang
i386     allmodconfig                        gcc
i386     allnoconfig                         gcc
i386     allyesconfig                        gcc
i386     buildonly-randconfig-001-20240516   clang
i386     buildonly-randconfig-001-20240517   clang
i386     buildonly-randconfig-002-20240516   clang
i386     buildonly-randconfig-002-20240517   clang
i386     buildonly-randconfig-003-20240516   clang
i386     buildonly-randconfig-003-20240517   gcc
i386     buildonly-randconfig-004-20240516   gcc
i386     buildonly-randconfig-004-20240517   clang
i386     buildonly-randconfig-005-20240516   gcc
i386     buildonly-randconfig-005-20240517   clang
i386     buildonly-randconfig-006-20240516   gcc
i386     buildonly-randconfig-006-20240517   gcc
i386     defconfig                           clang
i386     randconfig-001-20240516             gcc
i386     randconfig-001-20240517             gcc
i386     randconfig-002-20240516             gcc
i386     randconfig-002-20240517             gcc
i386     randconfig-003-20240516             clang
i386     randconfig-003-20240517             gcc
i386     randconfig-004-20240516             clang
i386     randconfig-004-20240517             gcc
i386     randconfig-005-20240516             clang
i386     randconfig-005-20240517             gcc
i386     randconfig-006-20240516             clang
i386     randconfig-006-20240517             gcc
i386     randconfig-011-20240516             gcc
i386     randconfig-011-20240517             gcc
i386     randconfig-012-20240516             gcc
i386     randconfig-012-20240517             clang
i386     randconfig-013-20240516             clang
i386     randconfig-013-20240517             gcc
i386     randconfig-014-20240516             gcc
i386     randconfig-014-20240517             gcc
i386     randconfig-015-20240516             gcc
i386  randconfig

Re: [PATCH v2 2/2] powerpc: hotplug driver bridge support

2024-05-16 Thread Oliver O'Halloran
On Tue, May 14, 2024 at 11:54 PM Krishna Kumar  wrote:
>
> There is an issue with the hotplug operation when it's done on the
> bridge/switch slot. The bridge-port and devices behind the bridge, which
> become offline by hot-unplug operation, don't get hot-plugged/enabled by
> doing hot-plug operation on that slot. Only the first port of the bridge
> gets enabled and the remaining port/devices remain unplugged. The hot
> plug/unplug operation is done by the hotplug driver
> (drivers/pci/hotplug/pnv_php.c).
>
> Root Cause Analysis: This behavior is due to missing code for the DPC
> switch/bridge.

I don't see anything touching DPC in this series?

> *snip*
>
> Command for reproducing the issue :
>
> For hot unplug/disable - echo 0 > /sys/bus/pci/slots/C5/power
> For hot plug/enable - echo 1 > /sys/bus/pci/slots/C5/power
>
> where C5 is slot associated with bridge.
>
> Scenario/Tests:
> Output of lspci -nn before test is given below. This snippet contains
> devices used for testing on Powernv machine.
>
> 0004:02:00.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:01.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:02.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:02:03.0 PCI bridge [0604]: PMC-Sierra Inc. Device [11f8:4052]
> 0004:08:00.0 Serial Attached SCSI controller [0107]:
> Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01)
> 0004:09:00.0 Serial Attached SCSI controller [0107]:
> Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3 [1000:00c9] (rev 01)
>
> Output of lspci -tv before test is as follows:
>
> # lspci -tv
>  +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--+-00.0-[03-07]--
>  |                           |               +-01.0-[08]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>  |                           |               +-02.0-[09]----00.0  Broadcom / LSI SAS3216 PCI-Express Fusion-MPT SAS-3
>  |                           |               \-03.0-[0a-0e]--
>  |                           \-00.1  PMC-Sierra Inc. Device 4052
>
> C5(bridge) and C6(End Point) slot address are as below:
> # cat /sys/bus/pci/slots/C5/address
> 0004:02:00
> # cat /sys/bus/pci/slots/C6/address
> 0004:09:00

Uh, if I'm reading this right it looks like your "slot" C5 is actually
the PCIe switch's internal bus which is definitely not hot pluggable.
I find it helps to look at the PCI topology in terms of where the
physical PCIe links are. Here we've got:

- A link between the PHB (0004:00:00.0) and the switch upstream port
(0004:01:00.0)
- A link from switch downstream port 0 (0004:02:00.0) to nothing
- A link from switch downstream port 1 (0004:02:01.0) to a SAS card
- A link from switch downstream port 2 (0004:02:02.0) to a SAS card
- A link from switch downstream port 3 (0004:02:03.0) to nothing

Note that there's no PCIe link between the switch upstream port
(0004:01:00.0) and the downstream ports on bus 0004:02. The connection
between those is invisible to us because it's custom bus logic
internal to the PCIe switch ASIC. What I think has happened here is
that system firmware has supplied bad PCIe slot information to OPAL
which has resulted in pnv_php advertising a slot in the wrong place.
Assuming this follows the usual IBM convention, I'd expect the bridge
device for C5 to be the PHB's root port and the bus should be 0004:01.
It might be worth adding some logic to pnv_php to verify the PCI
bridge upstream of the slot actually has the PCIe slot capability to
guard against this problem.
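
A minimal sketch of the kind of check suggested above, written as standalone C rather than as pnv_php code. It assumes the caller has already read the 16-bit PCI Express Capabilities register of the bridge upstream of the slot. The helper name bridge_can_host_slot is hypothetical, and the bit definitions are copied from the kernel's include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror include/uapi/linux/pci_regs.h (PCI Express Capabilities register). */
#define PCI_EXP_FLAGS_TYPE      0x00f0  /* Device/Port type field */
#define PCI_EXP_FLAGS_SLOT      0x0100  /* Slot Implemented bit */
#define PCI_EXP_TYPE_ROOT_PORT  0x4
#define PCI_EXP_TYPE_DOWNSTREAM 0x6

/*
 * Hypothetical helper, not part of pnv_php: returns true only when the
 * bridge is a root port or a switch downstream port that also advertises
 * "Slot Implemented".
 */
static bool bridge_can_host_slot(uint16_t pcie_flags)
{
	unsigned int type = (pcie_flags & PCI_EXP_FLAGS_TYPE) >> 4;

	if (type != PCI_EXP_TYPE_ROOT_PORT && type != PCI_EXP_TYPE_DOWNSTREAM)
		return false;

	return (pcie_flags & PCI_EXP_FLAGS_SLOT) != 0;
}
```

With a check like this, a switch upstream port (type 0x5) such as 0004:01:00.0 would be rejected as the parent of a hotplug slot regardless of what firmware reports.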

> Hot-unplug operation on slot associated with bridge:
> # echo 0 > /sys/bus/pci/slots/C5/power
> # lspci -tv
>  +-[0004:00]---00.0-[01-0e]--+-00.0-[02-0e]--
>  |                           \-00.1  PMC-Sierra Inc. Device 4052

Yep, "powering off" C5 doesn't remove the upstream port device. This
would create problems if you physically removed the card from C5 since
the kernel would assume the switch device is still present.

> *snip*


> diff --git a/arch/powerpc/kernel/pci_dn.c b/arch/powerpc/kernel/pci_dn.c
> index 38561d6a2079..bea612759832 100644
> --- a/arch/powerpc/kernel/pci_dn.c
> +++ b/arch/powerpc/kernel/pci_dn.c
> @@ -493,4 +493,36 @@ static void pci_dev_pdn_setup(struct pci_dev *pdev)
> pdn = pci_get_pdn(pdev);
> pdev->dev.archdata.pci_data = pdn;
>  }
> +
> +void pci_traverse_sibling_nodes_and_scan_slot(struct device_node *start, struct pci_bus *bus)
> +{
> +   struct device_node *dn;
> +   int slotno;
> +
> +   u32 class = 0;
> +
> +   if (!of_property_read_u32(start->child, "class-code", &class)) {
> +   /* Call of pci_scan_slot for non-bridge/EP case */
> +   if (!((class >> 8) == PCI_CLASS_BRIDGE_PCI)) {
> +   slotno = PCI_SLOT(PCI_DN(start->child)->devfn);
> +   pci_scan_slot(bus, PCI_DEVFN(slotno, 0));
> +   return;
> +   }
> +   }
> +
> +   /* Iterate all siblings */
> +   
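
The bridge test in the quoted hunk depends on how the "class-code" property is laid out. As a rough standalone illustration (the helper name class_code_is_pci_bridge is made up here, and only the PCI_CLASS_BRIDGE_PCI value is taken from the kernel headers), the comparison can be read like this:

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_BRIDGE_PCI 0x0604	/* base class 0x06, sub-class 0x04 */

/*
 * class_code is the 24-bit value from the "class-code" device-tree
 * property: base class, sub-class and programming interface, one byte
 * each. Shifting out the programming interface leaves the 16-bit class.
 */
static bool class_code_is_pci_bridge(uint32_t class_code)
{
	return (class_code >> 8) == PCI_CLASS_BRIDGE_PCI;
}
```

This matches the lspci -nn output earlier in the mail: the switch ports report class [0604] and take the bridge path, while the SAS controllers report [0107] and would go through the pci_scan_slot() branch.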

Re: [PATCH v8] arch/powerpc/kvm: Add support for reading VPA counters for pseries guests

2024-05-16 Thread kernel test robot
Hi Gautam,

kernel test robot noticed the following build errors:

[auto build test ERROR on powerpc/topic/ppc-kvm]
[also build test ERROR on powerpc/next powerpc/fixes kvm/queue 
mst-vhost/linux-next linus/master v6.9 next-20240516]
[cannot apply to kvm/linux-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Gautam-Menghani/arch-powerpc-kvm-Add-support-for-reading-VPA-counters-for-pseries-guests/20240510-185213
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
topic/ppc-kvm
patch link:
https://lore.kernel.org/r/20240510104941.78410-1-gautam%40linux.ibm.com
patch subject: [PATCH v8] arch/powerpc/kvm: Add support for reading VPA 
counters for pseries guests
config: powerpc-allmodconfig 
(https://download.01.org/0day-ci/archive/20240517/202405170932.tl7g99ij-...@intel.com/config)
compiler: powerpc64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240517/202405170932.tl7g99ij-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405170932.tl7g99ij-...@intel.com/

All errors (new ones prefixed by >>):

   powerpc64-linux-ld: warning: discarding dynamic section .glink
   powerpc64-linux-ld: warning: discarding dynamic section .plt
   powerpc64-linux-ld: linkage table error against `__traceiter_kvmppc_vcpu_stats'
   powerpc64-linux-ld: stubs don't match calculated size
   powerpc64-linux-ld: can not build stubs: bad value
   powerpc64-linux-ld: arch/powerpc/kvm/book3s_hv_nestedv2.o: in function `do_trace_nested_cs_time':
>> book3s_hv_nestedv2.c:(.text.do_trace_nested_cs_time+0x264): undefined reference to `__traceiter_kvmppc_vcpu_stats'
>> powerpc64-linux-ld: arch/powerpc/kvm/book3s_hv_nestedv2.o:(__jump_table+0x8): undefined reference to `__tracepoint_kvmppc_vcpu_stats'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH RESEND v8 16/16] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of

2024-05-16 Thread Klara Modin
3: 800081223dd0 x22: 3a198a40 x21: 
[   44.430908] x20: d4202000 x19: 8000814dd880 x18: 0006
[   44.439122] x17:  x16: 0020 x15: 0002
[   44.447338] x14: 8000811a6370 x13: 2000 x12: 
[   44.43] x11: 8000811a6370 x10: 0166 x9 : 8000811fe370
[   44.463771] x8 : 00017fe8 x7 : f000 x6 : 8000811fe370
[   44.471989] x5 :  x4 :  x3 : 
[   44.480208] x2 :  x1 :  x0 : 02203240
[   44.488420] Call trace:
[   44.491847] vfree (mm/vmalloc.c:3324 (discriminator 1)) 
[   44.495900] execmem_free (mm/execmem.c:70) 
[   44.500394] bpf_jit_free_exec+0x10/0x1c 
[   44.505329] bpf_prog_pack_free (kernel/bpf/core.c:1006) 
[   44.510507] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195) 
[   44.516017] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
[   44.520424] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
[   44.525864] process_one_work (kernel/workqueue.c:3273)
[   44.530754] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
[   44.535364] kthread (kernel/kthread.c:388) 
[   44.539417] ret_from_fork (arch/arm64/kernel/entry.S:861) 
[   44.543791] ---[ end trace  ]---
# bad: [dbd9e2e056d8577375ae4b31ada94f8aa3769e8a] Add linux-next specific files for 20240516
git bisect start 'next/master'
# status: waiting for good commit(s), bad commit known
# good: [8c06da67d0bd3139a97f301b4aa9c482b9d4f29e] Merge tag 'livepatching-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
git bisect good 8c06da67d0bd3139a97f301b4aa9c482b9d4f29e
# good: [147d3734724040bb0aff1252299e48947a6c8858] Merge branch 'master' of git://linuxtv.org/mchehab/media-next.git
git bisect good 147d3734724040bb0aff1252299e48947a6c8858
# bad: [729cf96da8de5e7ae70fef40a1b864bc00c2dca1] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect bad 729cf96da8de5e7ae70fef40a1b864bc00c2dca1
# good: [4364438497c638785b1394aab764a15b6baefaf3] Merge branch 'drm-xe-next' of https://gitlab.freedesktop.org/drm/xe/kernel
git bisect good 4364438497c638785b1394aab764a15b6baefaf3
# bad: [b3ead6c10eccbfa446ce30927f94472c278cd3d7] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
git bisect bad b3ead6c10eccbfa446ce30927f94472c278cd3d7
# bad: [d83384f475a4cfa0e9bda1cab538d99360fa2c48] Merge branch 'for-mfd-next' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git
git bisect bad d83384f475a4cfa0e9bda1cab538d99360fa2c48
# bad: [9564f97e8e3ec6bdbf0c105b45fa2516d64c4685] Merge branch 'for-next' of git://git.kernel.dk/linux-block.git
git bisect bad 9564f97e8e3ec6bdbf0c105b45fa2516d64c4685
# bad: [0e6c77dedcb11f510c0dbdaf6455b918b28f1b62] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect bad 0e6c77dedcb11f510c0dbdaf6455b918b28f1b62
# good: [5852f2afcdd9b7c9dedec4fdf14b8b079349828f] Input: drop explicit initialization of struct i2c_device_id::driver_data to 0
git bisect good 5852f2afcdd9b7c9dedec4fdf14b8b079349828f
# good: [223b5e57d0d50b0c07b933350dbcde92018d3080] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
git bisect good 223b5e57d0d50b0c07b933350dbcde92018d3080
# good: [14e56fb2ed1dbc3c3171d12ab435b0f691f6f215] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
git bisect good 14e56fb2ed1dbc3c3171d12ab435b0f691f6f215
# good: [7582b7be16d0ba90e3dbd9575a730cabd9eb852a] kprobes: remove dependency on CONFIG_MODULES
git bisect good 7582b7be16d0ba90e3dbd9575a730cabd9eb852a
# bad: [86d899efdd58c98a0d196e31945009fc47a56264] Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git
git bisect bad 86d899efdd58c98a0d196e31945009fc47a56264
# bad: [2c9e5d4a008293407836d29d35dfd4353615bd2f] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
git bisect bad 2c9e5d4a008293407836d29d35dfd4353615bd2f
# first bad commit: [2c9e5d4a008293407836d29d35dfd4353615bd2f] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of




Re: [PATCH v2 1/1] arch/fault: don't print logs for pte marker poison errors

2024-05-16 Thread Axel Rasmussen
On Wed, May 15, 2024 at 1:19 PM Borislav Petkov  wrote:
>
> On Wed, May 15, 2024 at 12:19:16PM -0700, Axel Rasmussen wrote:
> > An unprivileged process can allocate a VMA, use the userfaultfd API to
> > install one of these PTE markers, and then register a no-op SIGBUS
> > handler. Now it can access that address in a tight loop,
>
> Maybe the userfaultfd should not allow this, I dunno. You made me look
> at this thing and to me it all sounds weird. One thread does page fault
> handling for the other and that helps with live migration somehow. OMG,
> whaaat?
>
> Maybe I don't understand it and probably never will...
>
> But, for example, membarrier used to do a stupid thing of allowing one
> thread to hammer another with an IPI storm. Bad bad idea. So it got
> fixed.
>
> All I'm saying is, if unprivileged processes can do crap, they should be
> prevented from doing crap. Like ratelimiting the pagefaults or whatnot.
>
> One of the recovery action strategies from memory poison is, well, you
> kill the process. If you can detect the hammering process which
> installed that page marker, you kill it. Problem solved.
>
> But again, this userfaultfd thing sounds really weird so I could very
> well be way wrong.
>
> > Even in a non-contrived / non-malicious case, use of this API could
> > have similar effects. If nothing else, the log message can be
> > confusing to administrators: they state that an MCE occurred, whereas
> > with the simulated poison API, this is not the case; it isn't a "real"
> > MCE / hardware error.
>
> Yeah, I read that part in
>
> Documentation/admin-guide/mm/userfaultfd.rst
>
> Simulated poison huh? Another WTF.
>
> > In the KVM use case, the host can't just allocate a new page, because
> > it doesn't know what the guest might have had stored there. Best we
>
> Ok, let's think of real hw poison.
>
> When doing the recovery, you don't care what's stored there because as
> far as the hardware is concerned, if you consume that poison the *whole*
> machine might go down.
>
> So you lose the page. Plain and simple. And the guest can go visit the
> bureau of complaints and grievances.
>
> Still better than killing the guest or even the whole host with other
> guests running on it.
>
> > can do is propagate the poison into the guest, and let the guest OS
> > deal with it as it sees fit, and mark the page poisoned on the host.
>
> You mark the page as poison on the host and you yank it from under the
> guest. That physical frame is gone and the faster all the actors
> involved understand that, the better.
>
> > I don't disagree the guest *shouldn't* reaccess it in this case. :)
> > But if it did, it should get another poison event just as you say.
>
> Yes, it shouldn't. Look at memory_failure(). This will kill whole
> processes if it has to, depending on what the page is used for.
>
> > And, live migration between physical hosts should be transparent to
> > the guest. So if the guest gets a poison, and then we live migrate it,
>
> So if I were to design this, I'd do it this way:
>
> 0. guest gets hw poison injected
>
> 1. it runs memory_failure() and it kills the processes using the page.
>
> 2. page is marked poisoned on the host so no other guest gets it.
>
> That's it. No second accesses whatsoever. At least this is how it works
> on baremetal.

I agree with almost all of the above. But one point is, I don't think
we can trust the guest to be reasonable. :)

Public cloud provider customers might run some OS other than Linux, or
an old / buggy kernel, or one with out-of-tree patches which make it
do who knows what. There can also be users who are actively malicious.

Some customers may try to do fancy "poison recovery" where they can
avoid killing the in-guest process when a poison event occurs. These
implementations can be buggy :) and unintentionally reaccess.


>
> This hw poisoning emulation is just silly and unnecessary.
>
> But again, I probably am missing some aspects. It all just sounded
> really weird to me that's why I thought I should ask what's behind all
> that.
>
> Thx.
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Segher Boessenkool
Hi!

On Thu, May 16, 2024 at 10:06:58PM +1000, Michael Ellerman wrote:
> Andy Polyakov  writes:
> >>> +.abiversion  2
> >>
> >> I'd prefer that was left to the compiler flags.
> >
> > Problem is that it's the compiler that is responsible for providing this
> > directive in the intermediate .s prior invoking the assembler. And there
> > is no assembler flag to pass through -Wa.
> 
> Hmm, right. But none of our existing .S files include .abiversion
> directives.
> 
> We build .S files with gcc, passing -mabi=elfv2, but it seems to have no
> effect.

Yup.  You could include some header file, maybe?  Since you run the
assembler code through the C preprocessor anyway, for some weird reason :-)

> But the actual code follows ELFv2, because we wrote it that way, and I
> guess the linker doesn't look at the actual ABI version of the .o ?

It isn't a version.  It is an actual different ABI.

GNU LD allows linking together whatever, yes.

> Is .abiversion documented anywhere? I can't see it in the manual.

Yeah me neither.  https://sourceware.org/bugzilla/enter_bug.cgi ?
A commandline flag (to GAS) would seem best?


Segher


Re: [PATCH 2/3] crypto: X25519 core functions for ppc64le

2024-05-16 Thread Segher Boessenkool
On Wed, May 15, 2024 at 10:29:56AM +0200, Andy Polyakov wrote:
> >+static void cswap(fe51 p, fe51 q, unsigned int bit)
> 
> The "c" in cswap stands for "constant-time," and the problem is that 
> contemporary compilers have exhibited the ability to produce 
> non-constant-time machine code as result of compilation of the above 
> kind of technique.

This can happen with *any* compiler, on *any* platform.  In general,
you have to write machine code if you want to be sure what machine code
will eventually be executed.

>  The outcome is platform-specific and ironically some 
> of PPC code generators were observed to generate "most" 
> non-constant-time code. "Most" in sense that execution time variations 
> would be most easy to catch. One way to work around the problem, at 
> least for the time being, is to add 'asm volatile("" : "+r"(c))' after 
> you calculate 'c'. But there is no guarantee that the next compiler 
> version won't see through it, hence the permanent solution is to do it 
> in assembly. I can put together something...

Such tricks can help ameliorate the problem, sure.  But it is not a
solution ever.
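
To make the technique under discussion concrete, here is a minimal sketch of a mask-based conditional swap with the empty-asm barrier Andy describes. It is not the code from the patch: the function name fe51_cswap and the mask variable c are illustrative, and it assumes a GCC- or Clang-compatible compiler. As noted above, the barrier only discourages the optimizer and guarantees nothing about the machine code that is finally emitted.

```c
#include <stdint.h>

typedef uint64_t fe51[5];

/*
 * Swap p and q when bit == 1, leave them unchanged when bit == 0,
 * without branching on bit. Illustrative only.
 */
static void fe51_cswap(fe51 p, fe51 q, unsigned int bit)
{
	uint64_t c = 0 - (uint64_t)(bit & 1);	/* all ones or all zeros */
	int i;

	/* Empty asm hides the value of c from the optimizer (GCC/Clang extension). */
	__asm__ __volatile__("" : "+r"(c));

	for (i = 0; i < 5; i++) {
		uint64_t t = c & (p[i] ^ q[i]);

		p[i] ^= t;
		q[i] ^= t;
	}
}
```

Checking that such a routine really is branch-free still means inspecting the generated assembly for each compiler and optimization level in use.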


Segher


[PATCH v2 3/3] crypto: Update Kconfig and Makefile for ppc64le x25519.

2024-05-16 Thread Danny Tsen
Defined CRYPTO_CURVE25519_PPC64 to support X25519 for ppc64le.

Added new module curve25519-ppc64le for X25519.

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/Kconfig  | 11 +++
 arch/powerpc/crypto/Makefile |  2 ++
 2 files changed, 13 insertions(+)

diff --git a/arch/powerpc/crypto/Kconfig b/arch/powerpc/crypto/Kconfig
index 1e201b7ae2fc..09ebcbdfb34f 100644
--- a/arch/powerpc/crypto/Kconfig
+++ b/arch/powerpc/crypto/Kconfig
@@ -2,6 +2,17 @@
 
 menu "Accelerated Cryptographic Algorithms for CPU (powerpc)"
 
+config CRYPTO_CURVE25519_PPC64
+   tristate "Public key crypto: Curve25519 (PowerPC64)"
+   depends on PPC64 && CPU_LITTLE_ENDIAN
+   select CRYPTO_LIB_CURVE25519_GENERIC
+   select CRYPTO_ARCH_HAVE_LIB_CURVE25519
+   help
+ Curve25519 algorithm
+
+ Architecture: PowerPC64
+ - Little-endian
+
 config CRYPTO_CRC32C_VPMSUM
tristate "CRC32c"
depends on PPC64 && ALTIVEC
diff --git a/arch/powerpc/crypto/Makefile b/arch/powerpc/crypto/Makefile
index fca0e9739869..59808592f0a1 100644
--- a/arch/powerpc/crypto/Makefile
+++ b/arch/powerpc/crypto/Makefile
@@ -17,6 +17,7 @@ obj-$(CONFIG_CRYPTO_AES_GCM_P10) += aes-gcm-p10-crypto.o
 obj-$(CONFIG_CRYPTO_CHACHA20_P10) += chacha-p10-crypto.o
 obj-$(CONFIG_CRYPTO_POLY1305_P10) += poly1305-p10-crypto.o
 obj-$(CONFIG_CRYPTO_DEV_VMX_ENCRYPT) += vmx-crypto.o
+obj-$(CONFIG_CRYPTO_CURVE25519_PPC64) += curve25519-ppc64le.o
 
 aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes-spe-glue.o
 md5-ppc-y := md5-asm.o md5-glue.o
@@ -29,6 +30,7 @@ aes-gcm-p10-crypto-y := aes-gcm-p10-glue.o aes-gcm-p10.o ghashp10-ppc.o aesp10-p
 chacha-p10-crypto-y := chacha-p10-glue.o chacha-p10le-8x.o
 poly1305-p10-crypto-y := poly1305-p10-glue.o poly1305-p10le_64.o
 vmx-crypto-objs := vmx.o aesp8-ppc.o ghashp8-ppc.o aes.o aes_cbc.o aes_ctr.o aes_xts.o ghash.o
+curve25519-ppc64le-y := curve25519-ppc64le-core.o curve25519-ppc64le_asm.o
 
 ifeq ($(CONFIG_CPU_LITTLE_ENDIAN),y)
 override flavour := linux-ppc64le
-- 
2.31.1



[PATCH v2 2/3] crypto: X25519 core functions for ppc64le

2024-05-16 Thread Danny Tsen
X25519 core functions to handle scalar multiplication for ppc64le.

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/curve25519-ppc64le-core.c | 299 ++
 1 file changed, 299 insertions(+)
 create mode 100644 arch/powerpc/crypto/curve25519-ppc64le-core.c

diff --git a/arch/powerpc/crypto/curve25519-ppc64le-core.c b/arch/powerpc/crypto/curve25519-ppc64le-core.c
new file mode 100644
index 000000000000..4e3e44ea4484
--- /dev/null
+++ b/arch/powerpc/crypto/curve25519-ppc64le-core.c
@@ -0,0 +1,299 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2024- IBM Corp.
+ *
+ * X25519 scalar multiplication with 51 bits limbs for PPC64le.
+ *   Based on RFC7748 and AArch64 optimized implementation for X25519
+ * - Algorithm 1 Scalar multiplication of a variable point
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+typedef uint64_t fe51[5];
+
+asmlinkage void x25519_fe51_mul(fe51 h, const fe51 f, const fe51 g);
+asmlinkage void x25519_fe51_sqr(fe51 h, const fe51 f);
+asmlinkage void x25519_fe51_mul121666(fe51 h, fe51 f);
+asmlinkage void x25519_fe51_sqr_times(fe51 h, const fe51 f, int n);
+asmlinkage void x25519_fe51_frombytes(fe51 h, const uint8_t *s);
+asmlinkage void x25519_fe51_tobytes(uint8_t *s, const fe51 h);
+asmlinkage void x25519_cswap(fe51 p, fe51 q, unsigned int bit);
+
+#define fmul x25519_fe51_mul
+#define fsqr x25519_fe51_sqr
+#define fmul121666 x25519_fe51_mul121666
+#define fe51_tobytes x25519_fe51_tobytes
+
+static void fadd(fe51 h, const fe51 f, const fe51 g)
+{
+   h[0] = f[0] + g[0];
+   h[1] = f[1] + g[1];
+   h[2] = f[2] + g[2];
+   h[3] = f[3] + g[3];
+   h[4] = f[4] + g[4];
+}
+
+/*
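+ * Prime = 2 ** 255 - 19, 255 bits
+ *(0x7fffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffed)
+ *
+ * Prime in 5 51-bit limbs
+ */
+static fe51 prime51 = { 0x7ffffffffffed, 0x7ffffffffffff, 0x7ffffffffffff, 0x7ffffffffffff, 0x7ffffffffffff};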
+
+static void fsub(fe51 h, const fe51 f, const fe51 g)
+{
+   h[0] = (f[0] + ((prime51[0] * 2))) - g[0];
+   h[1] = (f[1] + ((prime51[1] * 2))) - g[1];
+   h[2] = (f[2] + ((prime51[2] * 2))) - g[2];
+   h[3] = (f[3] + ((prime51[3] * 2))) - g[3];
+   h[4] = (f[4] + ((prime51[4] * 2))) - g[4];
+}
+
+static void fe51_frombytes(fe51 h, const uint8_t *s)
+{
+   /*
+* Make sure 64-bit aligned.
+*/
+   unsigned char sbuf[32+8];
+   unsigned char *sb = PTR_ALIGN((void *)sbuf, 8);
+
+   memcpy(sb, s, 32);
+   x25519_fe51_frombytes(h, sb);
+}
+
+static void finv(fe51 o, const fe51 i)
+{
+   fe51 a0, b, c, t00;
+
+   fsqr(a0, i);
+   x25519_fe51_sqr_times(t00, a0, 2);
+
+   fmul(b, t00, i);
+   fmul(a0, b, a0);
+
+   fsqr(t00, a0);
+
+   fmul(b, t00, b);
+   x25519_fe51_sqr_times(t00, b, 5);
+
+   fmul(b, t00, b);
+   x25519_fe51_sqr_times(t00, b, 10);
+
+   fmul(c, t00, b);
+   x25519_fe51_sqr_times(t00, c, 20);
+
+   fmul(t00, t00, c);
+   x25519_fe51_sqr_times(t00, t00, 10);
+
+   fmul(b, t00, b);
+   x25519_fe51_sqr_times(t00, b, 50);
+
+   fmul(c, t00, b);
+   x25519_fe51_sqr_times(t00, c, 100);
+
+   fmul(t00, t00, c);
+   x25519_fe51_sqr_times(t00, t00, 50);
+
+   fmul(t00, t00, b);
+   x25519_fe51_sqr_times(t00, t00, 5);
+
+   fmul(o, t00, a0);
+}
+
+static void curve25519_fe51(uint8_t out[32], const uint8_t scalar[32],
+   const uint8_t point[32])
+{
+   fe51 x1, x2, z2, x3, z3;
+   uint8_t s[32];
+   unsigned int swap = 0;
+   int i;
+
+   memcpy(s, scalar, 32);
+   s[0]  &= 0xf8;
+   s[31] &= 0x7f;
+   s[31] |= 0x40;
+   fe51_frombytes(x1, point);
+
+   z2[0] = z2[1] = z2[2] = z2[3] = z2[4] = 0;
+   x3[0] = x1[0];
+   x3[1] = x1[1];
+   x3[2] = x1[2];
+   x3[3] = x1[3];
+   x3[4] = x1[4];
+
+   x2[0] = z3[0] = 1;
+   x2[1] = z3[1] = 0;
+   x2[2] = z3[2] = 0;
+   x2[3] = z3[3] = 0;
+   x2[4] = z3[4] = 0;
+
+   for (i = 254; i >= 0; --i) {
+   unsigned int k_t = 1 & (s[i / 8] >> (i & 7));
+   fe51 a, b, c, d, e;
+   fe51 da, cb, aa, bb;
+   fe51 dacb_p, dacb_m;
+
+   swap ^= k_t;
+   x25519_cswap(x2, x3, swap);
+   x25519_cswap(z2, z3, swap);
+   swap = k_t;
+
+   fsub(b, x2, z2);// B = x_2 - z_2
+   fadd(a, x2, z2);// A = x_2 + z_2
+   fsub(d, x3, z3);// D = x_3 - z_3
+   fadd(c, x3, z3);// C = x_3 + z_3
+
+   fsqr(bb, b);// BB = B^2
+   fsqr(aa, a);// AA = A^2
+   fmul(da, d, a); // DA = D * A
+   fmul(cb, c, b); // CB = C * B
+
+   fsub(e, aa, 
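
The scalar handling in curve25519_fe51() above can be tried outside the kernel. The following sketch reproduces only the clamping step and the per-bit extraction used by the ladder loop. The helper names x25519_clamp and scalar_bit are made up for illustration and do not appear in the patch.

```c
#include <stdint.h>
#include <string.h>

/* Clamp a 32-byte scalar as RFC 7748 prescribes for X25519. */
static void x25519_clamp(uint8_t s[32], const uint8_t scalar[32])
{
	memcpy(s, scalar, 32);
	s[0]  &= 0xf8;	/* clear the three low bits */
	s[31] &= 0x7f;	/* clear bit 255 */
	s[31] |= 0x40;	/* set bit 254 */
}

/* Return bit i (0 = least significant) of the little-endian scalar. */
static unsigned int scalar_bit(const uint8_t s[32], int i)
{
	return 1 & (s[i / 8] >> (i & 7));
}
```

Because clamping always sets bit 254, the first iteration of the loop (i = 254) always sees k_t = 1, whatever scalar the caller supplied.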

[PATCH v2 0/3] crypto: X25519 supports for ppc64le

2024-05-16 Thread Danny Tsen
This patch series provides X25519 support for ppc64le with a new module
curve25519-ppc64le.

The implementation is based on CRYPTOGAMs perl output from x25519-ppc64.pl.
(see https://github.com/dot-asm/cryptogams/)
Modified and added 4 supporting functions.

This patch has passed the selftest by running modprobe
curve25519-ppc64le.

Danny Tsen (3):
  X25519 low-level primitives for ppc64le.
  X25519 core functions for ppc64le
  Update Kconfig and Makefile for ppc64le x25519.

 arch/powerpc/crypto/Kconfig   |  11 +
 arch/powerpc/crypto/Makefile  |   2 +
 arch/powerpc/crypto/curve25519-ppc64le-core.c | 299 
 arch/powerpc/crypto/curve25519-ppc64le_asm.S  | 671 ++
 4 files changed, 983 insertions(+)
 create mode 100644 arch/powerpc/crypto/curve25519-ppc64le-core.c
 create mode 100644 arch/powerpc/crypto/curve25519-ppc64le_asm.S

-- 
2.31.1



[PATCH v2 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Danny Tsen
Use the perl output of x25519-ppc64.pl from CRYPTOGAMs
(see https://github.com/dot-asm/cryptogams/) and added four
supporting functions, x25519_fe51_sqr_times, x25519_fe51_frombytes,
x25519_fe51_tobytes and x25519_cswap.
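
For readers who want to see what the conditional swap does, here is a
minimal portable C sketch. It is not the assembly from this patch: the
name fe51_cswap_sketch and the fe51 typedef (five 64-bit limbs) are
assumptions made for illustration only. The idea is to build an all-ones
or all-zeros mask from the secret bit, so the swap never branches on it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed representation: five limbs, each holding 51 bits of the value. */
typedef uint64_t fe51[5];

/*
 * Swap a and b when swap == 1, leave both untouched when swap == 0.
 * The mask is either 0 or all ones, so no branch depends on the bit.
 */
void fe51_cswap_sketch(fe51 a, fe51 b, unsigned int swap)
{
	uint64_t mask = 0 - (uint64_t)(swap & 1);
	int i;

	for (i = 0; i < 5; i++) {
		uint64_t t = mask & (a[i] ^ b[i]);

		a[i] ^= t;
		b[i] ^= t;
	}
}
```

In a Montgomery ladder the swap argument would be the XOR of the current
and previous scalar bits, so the two working points are exchanged only
when the bit value changes.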

Signed-off-by: Danny Tsen 
---
 arch/powerpc/crypto/curve25519-ppc64le_asm.S | 671 +++
 1 file changed, 671 insertions(+)
 create mode 100644 arch/powerpc/crypto/curve25519-ppc64le_asm.S

diff --git a/arch/powerpc/crypto/curve25519-ppc64le_asm.S 
b/arch/powerpc/crypto/curve25519-ppc64le_asm.S
new file mode 100644
index ..06c1febe24b9
--- /dev/null
+++ b/arch/powerpc/crypto/curve25519-ppc64le_asm.S
@@ -0,0 +1,671 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#
+# This code is taken from CRYPTOGAMs[1] and is included here using the option
+# in the license to distribute the code under the GPL. Therefore this program
+# is free software; you can redistribute it and/or modify it under the terms of
+# the GNU General Public License version 2 as published by the Free Software
+# Foundation.
+#
+# [1] https://github.com/dot-asm/cryptogams/
+
+# Copyright (c) 2006-2017, CRYPTOGAMS by 
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+#   * Redistributions of source code must retain copyright notices,
+# this list of conditions and the following disclaimer.
+#
+#   * Redistributions in binary form must reproduce the above
+# copyright notice, this list of conditions and the following
+# disclaimer in the documentation and/or other materials
+# provided with the distribution.
+#
+#   * Neither the name of the CRYPTOGAMS nor the names of its
+# copyright holder and contributors may be used to endorse or
+# promote products derived from this software without specific
+# prior written permission.
+#
+# ALTERNATIVELY, provided that this notice is retained in full, this
+# product may be distributed under the terms of the GNU General Public
+# License (GPL), in which case the provisions of the GPL apply INSTEAD OF
+# those given above.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# 
+# Written by Andy Polyakov  for the OpenSSL
+# project. The module is, however, dual licensed under OpenSSL and
+# CRYPTOGAMS licenses depending on where you obtain it. For further
+# details see https://www.openssl.org/~appro/cryptogams/.
+# 
+
+#
+# 
+# Written and Modified by Danny Tsen 
+# - Added x25519_fe51_sqr_times, x25519_fe51_frombytes, x25519_fe51_tobytes
+#   and x25519_cswap
+#
+# Copyright 2024- IBM Corp.
+#
+# X25519 lower-level primitives for PPC64.
+#
+
+#include 
+
+.text
+
+.align 5
+SYM_FUNC_START(x25519_fe51_mul)
+
+   stdu    1,-144(1)
+   std 21,56(1)
+   std 22,64(1)
+   std 23,72(1)
+   std 24,80(1)
+   std 25,88(1)
+   std 26,96(1)
+   std 27,104(1)
+   std 28,112(1)
+   std 29,120(1)
+   std 30,128(1)
+   std 31,136(1)
+
+   ld  6,0(5)
+   ld  7,0(4)
+   ld  8,8(4)
+   ld  9,16(4)
+   ld  10,24(4)
+   ld  11,32(4)
+
+   mulld   22,7,6
+   mulhdu  23,7,6
+
+   mulld   24,8,6
+   mulhdu  25,8,6
+
+   mulld   30,11,6
+   mulhdu  31,11,6
+   ld  4,8(5)
+   mulli   11,11,19
+
+   mulld   26,9,6
+   mulhdu  27,9,6
+
+   mulld   28,10,6
+   mulhdu  29,10,6
+   mulld   12,11,4
+   mulhdu  21,11,4
+   addc    22,22,12
+   adde    23,23,21
+
+   mulld   12,7,4
+   mulhdu  21,7,4
+   addc    24,24,12
+   adde    25,25,21
+
+   mulld   12,10,4
+   mulhdu  21,10,4
+   ld  6,16(5)
+   mulli   10,10,19
+   addc    30,30,12
+   adde    31,31,21
+
+   mulld   12,8,4
+   mulhdu  21,8,4
+   addc    26,26,12
+   adde    27,27,21
+
+   mulld   12,9,4
+   mulhdu  21,9,4
+   addc 

Re: [PATCH v15 00/16] Add audio support in v4l2 framework

2024-05-16 Thread Jaroslav Kysela

On 15. 05. 24 15:34, Shengjiu Wang wrote:

On Wed, May 15, 2024 at 6:46 PM Jaroslav Kysela  wrote:


On 15. 05. 24 12:19, Takashi Iwai wrote:

On Wed, 15 May 2024 11:50:52 +0200,
Jaroslav Kysela wrote:


On 15. 05. 24 11:17, Hans Verkuil wrote:

Hi Jaroslav,

On 5/13/24 13:56, Jaroslav Kysela wrote:

On 09. 05. 24 13:13, Jaroslav Kysela wrote:

On 09. 05. 24 12:44, Shengjiu Wang wrote:

mem2mem is just like the decoder in the compress pipeline. which is
one of the components in the pipeline.


I was thinking of loopback with endpoints using compress streams,
without physical endpoint, something like:

compress playback (to feed data from userspace) -> DSP (processing) ->
compress capture (send data back to userspace)

Unless I'm missing something, you should be able to process data as fast
as you can feed it and consume it in such case.



Actually in the beginning I tried this,  but it did not work well.
ALSA needs time control for playback and capture, playback and capture
needs to synchronize.  Usually the playback and capture pipeline is
independent in ALSA design,  but in this case, the playback and capture
should synchronize, they are not independent.


The core compress API has no strict timing constraints. You can eventually
have two half-duplex compress devices, if you like to have really independent
mechanism. If something is missing in API, you can extend this API (like to
inform the user space that it's a producer/consumer processing without any
relation to the real time). I like this idea.


I was thinking more about this. If I am right, the mentioned use in gstreamer
is supposed to run the conversion (DSP) job in "one shot" (can be handled
using one system call like blocking ioctl).  The goal is just to offload the
CPU work to the DSP (co-processor). If there are no requirements for the
queuing, we can implement this ioctl in the compress ALSA API easily using the
data management through the dma-buf API. We can eventually define a new
direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
handling this new data scheme. The API may be extended later on real demand, of
course.

Otherwise all pieces are already in the current ALSA compress API
(capabilities, params, enumeration). The realtime controls may be created
using ALSA control API.


So does this mean that Shengjiu should attempt to use this ALSA approach first?


I've not seen any argument to use v4l2 mem2mem buffer scheme for this
data conversion forcefully. It looks like a simple job and ALSA APIs
may be extended for this simple purpose.

Shengjiu, what are your requirements for gstreamer support? Would be a
new blocking ioctl enough for the initial support in the compress ALSA
API?


If it works with compress API, it'd be great, yeah.
So, your idea is to open compress-offload devices for read and write,
and then let them convert a la batch jobs without timing control?

For full-duplex usages, we might need some more extensions, so that
both read and write parameters can be synchronized.  (So far the
compress stream is unidirectional, and the runtime buffer is for a
single stream.)

And the buffer management is based on the fixed size fragments.  I
hope this doesn't matter much for the intended operation?


It's a question, if the standard I/O is really required for this case. My
quick idea was to just implement a new "direction" for this job supporting
only one ioctl for the data processing which will execute the job in "one
shot" at the moment. The I/O may be handled through dma-buf API (which seems
to be standard nowadays for this purpose and allows future chaining).

So something like:

struct dsp_job {
 int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
 int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
 ... maybe some extra data size members here ...
 ... maybe some special parameters here ...
};

#define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)

This ioctl will be blocking (thus synced). My question is, if it's feasible
for gstreamer or not. For this particular case, if the rate conversion is
implemented in software, it will block the gstreamer data processing, too.
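
As a rough userspace illustration of the one-shot ioctl proposed above,
the call could look like the sketch below. Only source_fd, target_fd and
the ioctl number come from the proposal; the two size members and the
dsp_job_run() helper are hypothetical placeholders for the "extra data
size members" left open in the struct, and none of this is an existing
ALSA interface.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>   /* _IOWR and the _IOC_* decoding helpers */

/* Hypothetical layout: only the two dma-buf FDs are from the proposal. */
struct dsp_job {
	int32_t  source_fd;    /* dma-buf FD holding the input data */
	int32_t  target_fd;    /* dma-buf FD receiving the converted data */
	uint64_t source_size;  /* assumed: valid bytes in the source buffer */
	uint64_t target_size;  /* assumed: capacity in, bytes produced out */
};

/* Fixed-width members keep the layout identical for 32/64-bit callers. */
static_assert(sizeof(struct dsp_job) == 24, "no implicit padding expected");

#define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)

/*
 * Submit one conversion job and block until the DSP has finished.
 * Returns 0 on success, or -1 with errno set (as ioctl() does).
 */
int dsp_job_run(int compr_fd, struct dsp_job *job)
{
	return ioctl(compr_fd, SNDRV_COMPRESS_DSPJOB, job);
}
```

Because the call blocks, the caller needs no separate completion
notification; the trade-off is one system call round trip per job.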



Thanks.

I have several questions:
1.  Compress API always binds to a sound card.  Can we avoid that?
  For ASRC, it is just one component,


Is this a real issue? Usually, I would expect a sound hardware (card) presence 
when ASRC is available, or not? Eventually, a separate sound card with one 
compress device may be created, too. For enumeration - the user space may just 
iterate through all sound cards / compress devices to find ASRC in the system.


The devices/interfaces in the sound card are independent. Also, USB MIDI 
converters offer only one serial MIDI interface for example, too.



2.  Compress API doesn't seem to support mmap().  Is this a problem
  for sending and getting data to/from the driver?


I proposed to use dma-buf for I/O (separate 

Re: [PATCH v15 00/16] Add audio support in v4l2 framework

2024-05-16 Thread Jaroslav Kysela

On 15. 05. 24 22:33, Nicolas Dufresne wrote:

Hi,

GStreamer hat on ...

On Wednesday, 15 May 2024 at 12:46 +0200, Jaroslav Kysela wrote:

On 15. 05. 24 12:19, Takashi Iwai wrote:

On Wed, 15 May 2024 11:50:52 +0200,
Jaroslav Kysela wrote:


On 15. 05. 24 11:17, Hans Verkuil wrote:

Hi Jaroslav,

On 5/13/24 13:56, Jaroslav Kysela wrote:

On 09. 05. 24 13:13, Jaroslav Kysela wrote:

On 09. 05. 24 12:44, Shengjiu Wang wrote:

mem2mem is just like the decoder in the compress pipeline. which is
one of the components in the pipeline.


I was thinking of loopback with endpoints using compress streams,
without physical endpoint, something like:

compress playback (to feed data from userspace) -> DSP (processing) ->
compress capture (send data back to userspace)

Unless I'm missing something, you should be able to process data as fast
as you can feed it and consume it in such case.



Actually in the beginning I tried this,  but it did not work well.
ALSA needs time control for playback and capture, playback and capture
needs to synchronize.  Usually the playback and capture pipeline is
independent in ALSA design,  but in this case, the playback and capture
should synchronize, they are not independent.


The core compress API has no strict timing constraints. You can eventually
have two half-duplex compress devices, if you like to have really independent
mechanism. If something is missing in API, you can extend this API (like to
inform the user space that it's a producer/consumer processing without any
relation to the real time). I like this idea.


I was thinking more about this. If I am right, the mentioned use in gstreamer
is supposed to run the conversion (DSP) job in "one shot" (can be handled
using one system call like blocking ioctl).  The goal is just to offload the
CPU work to the DSP (co-processor). If there are no requirements for the
queuing, we can implement this ioctl in the compress ALSA API easily using the
data management through the dma-buf API. We can eventually define a new
direction (enum snd_compr_direction) like SND_COMPRESS_CONVERT or so to allow
handling this new data scheme. The API may be extended later on real demand, of
course.

Otherwise all pieces are already in the current ALSA compress API
(capabilities, params, enumeration). The realtime controls may be created
using ALSA control API.


So does this mean that Shengjiu should attempt to use this ALSA approach first?


I've not seen any argument to use v4l2 mem2mem buffer scheme for this
data conversion forcefully. It looks like a simple job and ALSA APIs
may be extended for this simple purpose.

Shengjiu, what are your requirements for gstreamer support? Would be a
new blocking ioctl enough for the initial support in the compress ALSA
API?


If it works with compress API, it'd be great, yeah.
So, your idea is to open compress-offload devices for read and write,
and then let them convert a la batch jobs without timing control?

For full-duplex usages, we might need some more extensions, so that
both read and write parameters can be synchronized.  (So far the
compress stream is unidirectional, and the runtime buffer is for a
single stream.)

And the buffer management is based on the fixed size fragments.  I
hope this doesn't matter much for the intended operation?


It's a question, if the standard I/O is really required for this case. My
quick idea was to just implement a new "direction" for this job supporting
only one ioctl for the data processing which will execute the job in "one
shot" at the moment. The I/O may be handled through dma-buf API (which seems
to be standard nowadays for this purpose and allows future chaining).

So something like:

struct dsp_job {
 int source_fd; /* dma-buf FD with source data - for dma_buf_get() */
 int target_fd; /* dma-buf FD for target data - for dma_buf_get() */
 ... maybe some extra data size members here ...
 ... maybe some special parameters here ...
};

#define SNDRV_COMPRESS_DSPJOB _IOWR('C', 0x60, struct dsp_job)

This ioctl will be blocking (thus synced). My question is, if it's feasible
for gstreamer or not. For this particular case, if the rate conversion is
implemented in software, it will block the gstreamer data processing, too.


Yes, GStreamer threading is using a push-back model, so blocking for the time of
the processing is fine. Note that the extra simplicity will suffer from ioctl()
latency.

In GFX, they solve this issue with fences. That allow setting up the next
operation in the chain before the data has been produced.


The fences look really nice and seem more modern. It should be possible with 
dma-buf/sync_file.c interface to handle multiple jobs simultaneously and share 
the state between user space and kernel driver.


In this case, I think that two non-blocking ioctls should be enough - add a 
new job with source/target dma buffers guarded by one fence and abort (flush) 
all active jobs.


I'll try to propose an API extension for the 

[PATCH] Perf: Calling available function for stats printing

2024-05-16 Thread Abhishek Dubey
For printing dump_trace, use existing stats_print()
function.

Signed-off-by: Abhishek Dubey 
---
 tools/perf/builtin-report.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index dcd93ee5fc24..3cabd5b0bfec 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1088,10 +1088,7 @@ static int __cmd_report(struct report *rep)
perf_session__fprintf_dsos(session, stdout);
 
if (dump_trace) {
-   perf_session__fprintf_nr_events(session, stdout,
-   rep->skip_empty);
-   evlist__fprintf_nr_events(session->evlist, stdout,
- rep->skip_empty);
+   stats_print(rep);
return 0;
}
}
-- 
2.44.0



[PATCH] PowerPC: Replace kretprobe with rethook

2024-05-16 Thread Abhishek Dubey
This is an adaptation of commit f3a112c0c40d ("x86,rethook,kprobes:
Replace kretprobe with rethook on x86") to Power.

Replaces the kretprobe code with rethook on Power. With this patch,
kretprobe on Power uses the rethook instead of kretprobe specific
trampoline code.

Reference to other archs:
commit b57c2f124098 ("riscv: add riscv rethook implementation")
commit 7b0a096436c2 ("LoongArch: Replace kretprobe with rethook")

Signed-off-by: Abhishek Dubey 
---
 arch/powerpc/Kconfig |  1 +
 arch/powerpc/kernel/Makefile |  1 +
 arch/powerpc/kernel/kprobes.c| 65 +
 arch/powerpc/kernel/optprobes.c  |  2 +-
 arch/powerpc/kernel/rethook.c| 71 
 arch/powerpc/kernel/stacktrace.c |  6 +--
 include/linux/rethook.h  |  1 -
 7 files changed, 78 insertions(+), 69 deletions(-)
 create mode 100644 arch/powerpc/kernel/rethook.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1c4be3373686..108de491965a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -268,6 +268,7 @@ config PPC
select HAVE_PERF_EVENTS_NMI if PPC64
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+   select HAVE_RETHOOK
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE
select HAVE_RSEQ
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index d3282fbea4f2..181d764be3a6 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -142,6 +142,7 @@ obj-$(CONFIG_KPROBES)   += kprobes.o
 obj-$(CONFIG_OPTPROBES)+= optprobes.o optprobes_head.o
 obj-$(CONFIG_KPROBES_ON_FTRACE)+= kprobes-ftrace.o
 obj-$(CONFIG_UPROBES)  += uprobes.o
+obj-$(CONFIG_RETHOOK)   += rethook.o
 obj-$(CONFIG_PPC_UDBG_16550)   += legacy_serial.o udbg_16550.o
 obj-$(CONFIG_SWIOTLB)  += dma-swiotlb.o
 obj-$(CONFIG_ARCH_HAS_DMA_SET_MASK) += dma-mask.o
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index bbca90a5e2ec..614bb68ad0e6 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -248,16 +248,6 @@ static nokprobe_inline void set_current_kprobe(struct 
kprobe *p, struct pt_regs
kcb->kprobe_saved_msr = regs->msr;
 }
 
-void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs 
*regs)
-{
-   ri->ret_addr = (kprobe_opcode_t *)regs->link;
-   ri->fp = NULL;
-
-   /* Replace the return addr with trampoline addr */
-   regs->link = (unsigned long)__kretprobe_trampoline;
-}
-NOKPROBE_SYMBOL(arch_prepare_kretprobe);
-
 static int try_to_emulate(struct kprobe *p, struct pt_regs *regs)
 {
int ret;
@@ -414,49 +404,6 @@ int kprobe_handler(struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(kprobe_handler);
 
-/*
- * Function return probe trampoline:
- * - init_kprobes() establishes a probepoint here
- * - When the probed function returns, this probe
- * causes the handlers to fire
- */
-asm(".global __kretprobe_trampoline\n"
-   ".type __kretprobe_trampoline, @function\n"
-   "__kretprobe_trampoline:\n"
-   "nop\n"
-   "blr\n"
-   ".size __kretprobe_trampoline, .-__kretprobe_trampoline\n");
-
-/*
- * Called when the probe at kretprobe trampoline is hit
- */
-static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
-{
-   unsigned long orig_ret_address;
-
-   orig_ret_address = __kretprobe_trampoline_handler(regs, NULL);
-   /*
-* We get here through one of two paths:
-* 1. by taking a trap -> kprobe_handler() -> here
-* 2. by optprobe branch -> optimized_callback() -> opt_pre_handler() 
-> here
-*
-* When going back through (1), we need regs->nip to be setup properly
-* as it is used to determine the return address from the trap.
-* For (2), since nip is not honoured with optprobes, we instead setup
-* the link register properly so that the subsequent 'blr' in
-* __kretprobe_trampoline jumps back to the right instruction.
-*
-* For nip, we should set the address to the previous instruction since
-* we end up emulating it in kprobe_handler(), which increments the nip
-* again.
-*/
-   regs_set_return_ip(regs, orig_ret_address - 4);
-   regs->link = orig_ret_address;
-
-   return 0;
-}
-NOKPROBE_SYMBOL(trampoline_probe_handler);
-
 /*
  * Called after single-stepping.  p->addr is the address of the
  * instruction whose first byte has been replaced by the "breakpoint"
@@ -559,19 +506,9 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 }
 NOKPROBE_SYMBOL(kprobe_fault_handler);
 
-static struct kprobe trampoline_p = {
-   .addr = (kprobe_opcode_t *) &__kretprobe_trampoline,
-   .pre_handler = trampoline_probe_handler
-};
-
-int __init arch_init_kprobes(void)
-{
-   return register_kprobe(&trampoline_p);

Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Andy Polyakov

Hi,


+.abiversion 2


I'd prefer that was left to the compiler flags.


Problem is that it's the compiler that is responsible for providing this
directive in the intermediate .s prior to invoking the assembler. And there
is no assembler flag to pass through -Wa.


Hmm, right. But none of our existing .S files include .abiversion
directives.

We build .S files with gcc, passing -mabi=elfv2, but it seems to have no
effect.

So all the intermediate .o's generated from .S files are not ELFv2:

   $ find .build/ -name '*.o' | xargs file | grep Unspecified
   .build/arch/powerpc/kernel/vdso/note-64.o:ELF 64-bit 
LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, 
version 1 (SYSV), not stripped


I would guess that a contemporary linker is more forgiving than it was 
back then when the .abiversion directive was added. If it works now, 
then it of course can be omitted. I suppose my original remark should be 
viewed rather as "you can't replace it with a command line option" than 
"you can't make it work without it." :-)



But the actual code follows ELFv2, because we wrote it that way, and I
guess the linker doesn't look at the actual ABI version of the .o ?

So it currently works. But it's kind of gross that those .o files are
not ELFv2 for an ELFv2 build.


Well, as far as passing base types and pointers to/from assembly goes, 
there are no differences between the versions. Then it's a question of 
meaning assigned to r2 and r13, but as long as you don't touch them, you 
can freely reuse the code with either ABI. With this in mind the 
.abiversion directive is effectively reduced to just a marker in the .o 
file. In other words the instruction sequences by themselves are 
customarily ABI-neutral, at least in "general calculation" modules such 
as the suggested one, so that if it works 100% without the .abiversion 
directive, then it can be safely omitted.


Cheers.



[PATCH] powerpc/fadump: Fix section mismatch warning

2024-05-16 Thread Michael Ellerman
With some compilers/configs fadump_setup_param_area() isn't inlined into
its caller (which is __init), leading to a section mismatch warning:

  WARNING: modpost: vmlinux: section mismatch in reference:
  fadump_setup_param_area+0x200 (section: .text.fadump_setup_param_area)
  -> memblock_phys_alloc_range (section: .init.text)

Fix it by adding an __init annotation.

Fixes: 683eab94da75 ("powerpc/fadump: setup additional parameters for dump 
capture kernel")
Reported-by: Stephen Rothwell 
Closes: https://lore.kernel.org/all/20240515163708.3380c...@canb.auug.org.au/
Reported-by: kernel test robot 
Closes: https://lore.kernel.org/all/202405140922.ouclox4y-...@intel.com/
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/kernel/fadump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 2276bacc4170..60f974775fc8 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1740,7 +1740,7 @@ static void __init fadump_process(void)
  * Reserve memory to store additional parameters to be passed
  * for fadump/capture kernel.
  */
-static void fadump_setup_param_area(void)
+static void __init fadump_setup_param_area(void)
 {
phys_addr_t range_start, range_end;
 
-- 
2.45.0



Re: [PATCHv4 8/9] ASoC: fsl-asoc-card: add DT property "cpu-system-clock-direction-out"

2024-05-16 Thread Mark Brown
On Wed, May 15, 2024 at 03:54:10PM +0200, Elinor Montmasson wrote:
> Add new optional DT property "cpu-system-clock-direction-out" to set
> sysclk direction as "out" for the CPU DAI when using the generic codec.
> It is set for both Tx and Rx.
> If not set, the direction is "in".
> The way the direction value is used is up to the CPU DAI driver
> implementation.

This feels like we should be using the clock bindings to specify the
clock input of whatever is using the output from the SoC, though that's
a lot more work.


signature.asc
Description: PGP signature


[PATCH 1/1] powerpc/numa: Online a node if PHB is attached.

2024-05-16 Thread Nilay Shroff
In the current design, a numa-node is made online only if
that node is attached to cpu/memory. With this design, if
any PCI/IO device is found to be attached to a numa-node
which is not online then the numa-node id of the corresponding
PCI/IO device is set to NUMA_NO_NODE(-1). This design may
negatively impact the performance of PCIe device if the
numa-node assigned to PCIe device is -1 because in such case
we may not be able to accurately calculate the distance
between two nodes.
The multi-controller NVMe PCIe disk has an issue with
calculating the node distance if the PCIe NVMe controller
is attached to a PCI host bridge which has numa-node id
value set to NUMA_NO_NODE. This patch fixes this by ensuring
that a cpu/memory-less numa node is made online if it's
attached to a PCI host bridge.

Signed-off-by: Nilay Shroff 
---
 arch/powerpc/mm/numa.c | 14 +-
 arch/powerpc/platforms/pseries/pci_dlpar.c | 14 ++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index a490724e84ad..9e5e366cee43 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -896,7 +896,7 @@ static int __init numa_setup_drmem_lmb(struct drmem_lmb 
*lmb,
 
 static int __init parse_numa_properties(void)
 {
-   struct device_node *memory;
+   struct device_node *memory, *pci;
int default_nid = 0;
unsigned long i;
const __be32 *associativity;
@@ -1010,6 +1010,18 @@ static int __init parse_numa_properties(void)
goto new_range;
}
 
+   for_each_node_by_name(pci, "pci") {
+   int nid = NUMA_NO_NODE;
+
+   associativity = of_get_associativity(pci);
+   if (associativity) {
+   nid = associativity_to_nid(associativity);
+   initialize_form1_numa_distance(associativity);
+   }
+   if (likely(nid >= 0) && !node_online(nid))
+   node_set_online(nid);
+   }
+
/*
 * Now do the same thing for each MEMBLOCK listed in the
 * ibm,dynamic-memory property in the
diff --git a/arch/powerpc/platforms/pseries/pci_dlpar.c 
b/arch/powerpc/platforms/pseries/pci_dlpar.c
index 4448386268d9..52e2623a741d 100644
--- a/arch/powerpc/platforms/pseries/pci_dlpar.c
+++ b/arch/powerpc/platforms/pseries/pci_dlpar.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -21,9 +22,22 @@
 struct pci_controller *init_phb_dynamic(struct device_node *dn)
 {
struct pci_controller *phb;
+   int nid;
 
pr_debug("PCI: Initializing new hotplug PHB %pOF\n", dn);
 
+   nid = of_node_to_nid(dn);
+   if (likely((nid) >= 0)) {
+   if (!node_online(nid)) {
+   if (__register_one_node(nid)) {
+   pr_err("PCI: Failed to register node %d\n", 
nid);
+   } else {
+   update_numa_distance(dn);
+   node_set_online(nid);
+   }
+   }
+   }
+
phb = pcibios_alloc_controller(dn);
if (!phb)
return NULL;
-- 
2.44.0



[PATCH 0/1] powerpc/numa: Make cpu/memory less numa-node online

2024-05-16 Thread Nilay Shroff
Hi,

On a NUMA-aware system, we make a numa-node online only if that node is 
attached to cpu/memory. However it's possible that we have some PCI/IO 
device affinitized to a numa-node which is not currently online. In such 
case we set the numa-node id of the corresponding PCI device to -1 
(NUMA_NO_NODE). Not assigning the correct numa-node id to PCI device may 
impact the performance of such device. For instance, we have a multi 
controller NVMe disk where each controller of the disk is attached to 
different PHB (PCI host bridge). Each of these PHBs has numa-node id 
assigned during PCI enumeration. During PCI enumeration if we find that
the numa-node is not online then we set the numa-node id of the PHB to -1. 
If we create shared namespace and attach to multi controller NVMe disk 
then that namespace could be accessed through each controller and as each 
controller is connected to different PHBs, it's possible to access the 
same namespace using multiple PCI channels. While sending IO to a shared 
namespace, NVMe driver would calculate the optimal IO path using numa-node 
distance. However if the numa-node id is not correctly assigned to NVMe 
PCIe controller then it's possible that driver would calculate incorrect 
NUMA distance and hence select the non-optimal path for sending IO. If 
this happens then we could potentially observe degraded IO performance.

Please find below the performance of a multi-controller NVMe disk w/ and 
w/o the proposed patch applied:

# lspci 
0524:28:00.0 Non-Volatile memory controller: KIOXIA Corporation NVMe SSD 
Controller CM7 2.5" (rev 01)
0584:28:00.0 Non-Volatile memory controller: KIOXIA Corporation NVMe SSD 
Controller CM7 2.5" (rev 01)

# nvme list -v 
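Subsystem     Subsystem-NQN                                     Controllers
------------  ------------------------------------------------  ------------
nvme-subsys1  nqn.2019-10.com.kioxia:KCM7DRUG1T92:3D60A04906N1  nvme0, nvme1

Device  SN            MN                          FR        TxPort  Asdress       Slot  Subsystem     Namespaces
------  ------------  --------------------------  --------  ------  ------------  ----  ------------  ----------
nvme0   3D60A04906N1  1.6TB NVMe Gen4 U.2 SSD IV  REV.CAS2  pcie    0524:28:00.0        nvme-subsys1  nvme1n3
nvme1   3D60A04906N1  1.6TB NVMe Gen4 U.2 SSD IV  REV.CAS2  pcie    0584:28:00.0        nvme-subsys1  nvme1n3

Device        Generic     NSID  Usage              Format       Controllers
------------  ----------  ----  -----------------  -----------  ------------
/dev/nvme1n3  /dev/ng1n3  0x3   5.75 GB / 5.75 GB  4 KiB + 0 B  nvme0, nvme1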

We can see above the nvme disk has two controllers nvme0 and nvme1. Both 
these controllers can be accessed from two different PCI channels (0524:28 
and 0584:28). 
I have also created a shared namespace (/dev/nvme1n3) which is connected 
behind controllers nvme0 and nvme1.

Test-1: Measure IO performance w/o proposed patch:
--
# numactl -H 
available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 
25 26 27 28 29 30 31
node 0 size: 31565 MB
node 0 free: 28452 MB
node distances:
node   0 
  0:  10 

On this machine we only have node 0 online. 

# cat /sys/class/nvme/nvme1/numa_node 
-1
# cat /sys/class/nvme/nvme0/numa_node 
0 
# cat /sys/class/nvme-subsystem/nvme-subsys1/iopolicy 
numa

We can find above the numa node id assigned to nvme1 is -1, however, the 
numa node id assigned to nvme0 is 0. Also the iopolicy is set to numa.

Now we would run IO perf test and measure the performance:

# fio --filename=/dev/nvme1n3 --direct=1 --rw=randwrite  --bs=4k 
--ioengine=io_uring --iodepth=512 --runtime=60 --numjobs=4 --time_based 
--group_reporting --name=iops-test-job --eta-newline=1 --cpus_allowed=0-3 
iops-test-job: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=io_uring, iodepth=512
...
fio-3.35
Starting 4 processes
[...]
[...]
iops-test-job: (groupid=0, jobs=4): err= 0: pid=5665: Tue Apr 30 04:07:31 2024
  write: IOPS=632k, BW=2469MiB/s (2589MB/s)(145GiB/60003msec); 0 zone resets
slat (usec): min=2, max=10031, avg= 4.62, stdev= 5.40
clat (usec): min=12, max=15687, avg=3233.58, stdev=877.78
 lat (usec): min=16, max=15693, avg=3238.19, stdev=879.06
clat percentiles (usec):
 |  1.00th=[ 2868],  5.00th=[ 2900], 10.00th=[ 2900], 20.00th=[ 2900],
 | 30.00th=[ 2933], 40.00th=[ 2933], 50.00th=[ 2933], 60.00th=[ 2933],
 | 70.00th=[ 2933], 80.00th=[ 2966], 90.00th=[ 5604], 95.00th=[ 5669],
 | 99.00th=[ 5735], 99.50th=[ 5735], 99.90th=[ 5866], 99.95th=[ 6456],
 

Re: [PATCHv4 7/9] ASoC: fsl-asoc-card: add DT clock "cpu_sysclk" with generic codec

2024-05-16 Thread Mark Brown
On Wed, May 15, 2024 at 03:54:09PM +0200, Elinor Montmasson wrote:

> Add an optional DT clock "cpu_sysclk" to get the CPU DAI system-clock
> frequency when using the generic codec.
> It is set for both Tx and Rx.
> The way the frequency value is used is up to the CPU DAI driver
> implementation.

> + struct clk *cpu_sysclk = clk_get(&pdev->dev, "cpu_sysclk");
> + if (!IS_ERR(cpu_sysclk)) {
> + priv->cpu_priv.sysclk_freq[TX] = clk_get_rate(cpu_sysclk);
> + priv->cpu_priv.sysclk_freq[RX] = priv->cpu_priv.sysclk_freq[TX];
> + clk_put(cpu_sysclk);
> + }

I don't really understand the goal here - this is just reading whatever
frequency happens to be set in the hardware when the driver starts up
which if nothing else seems rather fragile?




Re: [PATCHv4 9/9] ASoC: dt-bindings: fsl-asoc-card: add compatible for generic codec

2024-05-16 Thread Mark Brown
On Wed, May 15, 2024 at 03:54:11PM +0200, Elinor Montmasson wrote:

> Add documentation about new dts bindings following new support
> for compatible "fsl,imx-audio-generic".

>audio-codec:
> -$ref: /schemas/types.yaml#/definitions/phandle
> -description: The phandle of an audio codec
> +$ref: /schemas/types.yaml#/definitions/phandle-array
> +description: |
> +  The phandle of an audio codec.
> +  If using the "fsl,imx-audio-generic" compatible, give instead a pair of
> +  phandles with the spdif_transmitter first (driver SPDIF DIT) and the
> +  spdif_receiver second (driver SPDIF DIR).
> +items:
> +  maxItems: 1

This description (and the code) don't feel like they're actually generic
- they're clearly specific to the bidirectional S/PDIF case.  I'd expect
something called -generic to cope with single CODECs as well as double,
and not to have any constraints on what those are.




Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Michael Ellerman
Andy Polyakov writes:
> Hi,
>
>>> +.abiversion 2
>>
>> I'd prefer that was left to the compiler flags.
>
> Problem is that it's the compiler that is responsible for providing this
> directive in the intermediate .s prior to invoking the assembler. And there
> is no assembler flag to pass through -Wa.

Hmm, right. But none of our existing .S files include .abiversion
directives.

We build .S files with gcc, passing -mabi=elfv2, but it seems to have no
effect.

So all the intermediate .o's generated from .S files are not ELFv2:

  $ find .build/ -name '*.o' | xargs file | grep Unspecified
  .build/arch/powerpc/kernel/vdso/note-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped
  .build/arch/powerpc/kernel/vdso/sigtramp64-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped
  .build/arch/powerpc/kernel/vdso/getcpu-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped
  .build/arch/powerpc/kernel/vdso/gettimeofday-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped
  .build/arch/powerpc/kernel/vdso/datapage-64.o: ELF 64-bit LSB relocatable, 64-bit PowerPC or cisco 7500, Unspecified or Power ELF V1 ABI, version 1 (SYSV), not stripped
  ...

But the actual code follows ELFv2, because we wrote it that way, and I
guess the linker doesn't look at the actual ABI version of the .o?

So it currently works. But it's kind of gross that those .o files are
not ELFv2 for an ELFv2 build.
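
For reference, the ABI version that `file` prints comes from the low two bits
of e_flags in the ELF header (0 = unspecified, 1 = ELFv1, 2 = ELFv2). A
minimal sketch of that check, assuming a raw ELF64 header already read into
memory. The function and macro names are made up for illustration, and this
says nothing about what the linker itself does with the field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EM_PPC64      21  /* e_machine value for 64-bit PowerPC */
#define DEMO_PPC64_ABI_MSK 3   /* low two bits of e_flags: ABI version */

/*
 * Return the two-bit ABI field (0 = unspecified, 1 = ELFv1, 2 = ELFv2)
 * of a 64-bit PowerPC ELF header, or -1 if the buffer does not look
 * like one.
 */
int ppc64_elf_abi(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);

    if (len < 64)                       /* sizeof(Elf64_Ehdr) */
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[4] != 2)                    /* EI_CLASS: ELFCLASS64 */
        return -1;
    if (hdr[5] != 1 && hdr[5] != 2)     /* EI_DATA: 1 = LSB, 2 = MSB */
        return -1;

    int big = (hdr[5] == 2);

    /* e_machine is a 16-bit field at offset 18. */
    uint16_t machine = big ? (uint16_t)(hdr[18] << 8 | hdr[19])
                           : (uint16_t)(hdr[19] << 8 | hdr[18]);
    if (machine != DEMO_EM_PPC64)
        return -1;

    /* e_flags is a 32-bit field at offset 48 in an ELF64 header. */
    uint32_t flags;
    if (big)
        flags = (uint32_t)hdr[48] << 24 | (uint32_t)hdr[49] << 16 |
                (uint32_t)hdr[50] << 8  | (uint32_t)hdr[51];
    else
        flags = (uint32_t)hdr[51] << 24 | (uint32_t)hdr[50] << 16 |
                (uint32_t)hdr[49] << 8  | (uint32_t)hdr[48];

    return (int)(flags & DEMO_PPC64_ABI_MSK);
}
```

A result of 0 for the .o files listed above would match the "Unspecified"
wording in the `file` output.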

> If the concern is ABI neutrality,
> then the solution would rather be #if (_CALL_ELF-0) == 2/#endif. One can
> also make a case for
>
> #ifdef _CALL_ELF
> .abiversion _CALL_ELF
> #endif

Is .abiversion documented anywhere? I can't see it in the manual.

We used to use _CALL_ELF, but the kernel config is supposed to be the
source of truth, so we'd use:

  #ifdef CONFIG_PPC64_ELF_ABI_V2
  .abiversion 2
  #endif

And probably put it in a macro like:

  #ifdef CONFIG_PPC64_ELF_ABI_V2
  #define ASM_ABI_VERSION .abiversion 2
  #else
  #define ASM_ABI_VERSION
  #endif

Or something like that. But it's annoying that we need to go and
sprinkle that in every .S file.
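
For readers unfamiliar with the `(_CALL_ELF-0) == 2` form quoted earlier in
this message: the `-0` keeps the #if expression valid whether the macro is
undefined, defined but empty, or defined to a number. A small sketch of that
behaviour follows. The DEMO_* macros and function names are hypothetical
stand-ins so the three cases can be shown on any compiler; only the last
function looks at the real _CALL_ELF, which is assumed to be set by the
compiler on powerpc64 targets.

```c
/* DEMO_UNSET is deliberately left undefined. */
#define DEMO_EMPTY      /* defined, but expands to nothing */
#define DEMO_V2 2       /* defined to the value of interest */

/* Undefined identifiers evaluate to 0 in #if, so this is (0-0) == 2. */
#if (DEMO_UNSET-0) == 2
#define UNSET_IS_V2 1
#else
#define UNSET_IS_V2 0
#endif

/* An empty macro leaves (-0) == 2; a plain "DEMO_EMPTY == 2" would not parse. */
#if (DEMO_EMPTY-0) == 2
#define EMPTY_IS_V2 1
#else
#define EMPTY_IS_V2 0
#endif

/* A numeric macro gives (2-0) == 2. */
#if (DEMO_V2-0) == 2
#define V2_IS_V2 1
#else
#define V2_IS_V2 0
#endif

/* The same test applied to the real compiler-provided macro. */
#if (_CALL_ELF-0) == 2
#define BUILT_FOR_ELFV2 1
#else
#define BUILT_FOR_ELFV2 0
#endif

int unset_is_v2(void)     { return UNSET_IS_V2; }
int empty_is_v2(void)     { return EMPTY_IS_V2; }
int v2_is_v2(void)        { return V2_IS_V2; }
int built_for_elfv2(void) { return BUILT_FOR_ELFV2; }
```

The CONFIG_PPC64_ELF_ABI_V2 variant above avoids the question entirely, since
the kernel config symbol is either defined or not.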

Anyway, my comment can be ignored as far as this series is concerned; it
seems we have to clean this up everywhere.

cheers


Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Danny Tsen

Hi Andy,

I learned something here.  Will fix this.  Thanks.

-Danny

On 5/16/24 3:38 AM, Andy Polyakov wrote:
> Hi,
>
>>> +.abiversion 2
>>
>> I'd prefer that was left to the compiler flags.
>
> Problem is that it's the compiler that is responsible for providing
> this directive in the intermediate .s prior to invoking the assembler.
> And there is no assembler flag to pass through -Wa. If the concern is
> ABI neutrality, then the solution would rather be
> #if (_CALL_ELF-0) == 2/#endif. One can also make a case for
>
> #ifdef _CALL_ELF
> .abiversion _CALL_ELF
> #endif
>
> Cheers.



Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Danny Tsen



On 5/15/24 11:53 PM, Michael Ellerman wrote:
> Hi Danny,
>
> Danny Tsen writes:
>> Use the perl output of x25519-ppc64.pl from CRYPTOGAMs and added three
>> supporting functions, x25519_fe51_sqr_times, x25519_fe51_frombytes
>> and x25519_fe51_tobytes.
>
> For other algorithms we have checked-in the perl script and generated
> the code at runtime. Is there a reason you've done it differently this time?

Hi Michael,

It's easier for me to read and work with plain assembly that is not mixed
with perl, and it's also easier for me to debug and test. I also copied some
code and made some modifications.

>> Signed-off-by: Danny Tsen
>> ---
>>   arch/powerpc/crypto/curve25519-ppc64le_asm.S | 648 +++
>>   1 file changed, 648 insertions(+)
>>   create mode 100644 arch/powerpc/crypto/curve25519-ppc64le_asm.S
>>
>> diff --git a/arch/powerpc/crypto/curve25519-ppc64le_asm.S b/arch/powerpc/crypto/curve25519-ppc64le_asm.S
>> new file mode 100644
>> index ..8a018104838a
>> --- /dev/null
>> +++ b/arch/powerpc/crypto/curve25519-ppc64le_asm.S
>> @@ -0,0 +1,648 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +#
>> +# Copyright 2024- IBM Corp.  All Rights Reserved.
>
> I'm not a lawyer, but AFAIK "All Rights Reserved" is not required and
> can be confusing - because we are not reserving all rights, we are
> granting some rights under the GPL.
>
> I also think the IBM copyright should be down below where your
> modifications are described.

Will change that.

>> +# This code is taken from CRYPTOGAMs[1] and is included here using the option
>> +# in the license to distribute the code under the GPL. Therefore this program
>> +# is free software; you can redistribute it and/or modify it under the terms of
>> +# the GNU General Public License version 2 as published by the Free Software
>> +# Foundation.
>> +#
>> +# [1] https://www.openssl.org/~appro/cryptogams/
>> +
>> +# Copyright (c) 2006-2017, CRYPTOGAMS by
>> +# All rights reserved.
>> +#
>> +# Redistribution and use in source and binary forms, with or without
>> +# modification, are permitted provided that the following conditions
>> +# are met:
>> +#
>> +#   * Redistributions of source code must retain copyright notices,
>> +# this list of conditions and the following disclaimer.
>> +#
>> +#   * Redistributions in binary form must reproduce the above
>> +# copyright notice, this list of conditions and the following
>> +# disclaimer in the documentation and/or other materials
>> +# provided with the distribution.
>> +#
>> +#   * Neither the name of the CRYPTOGAMS nor the names of its
>> +# copyright holder and contributors may be used to endorse or
>> +# promote products derived from this software without specific
>> +# prior written permission.
>> +#
>> +# ALTERNATIVELY, provided that this notice is retained in full, this
>> +# product may be distributed under the terms of the GNU General Public
>> +# License (GPL), in which case the provisions of the GPL apply INSTEAD OF
>> +# those given above.
>> +#
>> +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS
>> +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +
>> +#
>> +# Written by Andy Polyakov  for the OpenSSL
>> +# project. The module is, however, dual licensed under OpenSSL and
>> +# CRYPTOGAMS licenses depending on where you obtain it. For further
>> +# details see https://www.openssl.org/~appro/cryptogams/.
>> +#
>> +
>> +#
>> +#
>> +# Written and Modified by Danny Tsen
>> +# - Added x25519_fe51_sqr_times, x25519_fe51_frombytes, x25519_fe51_tobytes
>
> i.e. here.
>
>> +# X25519 lower-level primitives for PPC64.
>> +#
>> +
>> +#include
>> +
>> +.machine "any"
>
> Please don't add new .machine directives unless they are required.
>
>> +.abiversion 2
>
> I'd prefer that was left to the compiler flags.

Ok.

Thanks.

-Danny

> cheers



Re: [PATCH 1/3] crypto: X25519 low-level primitives for ppc64le.

2024-05-16 Thread Andy Polyakov

Hi,

>> +.abiversion 2
>
> I'd prefer that was left to the compiler flags.

Problem is that it's the compiler that is responsible for providing this
directive in the intermediate .s prior to invoking the assembler. And there
is no assembler flag to pass through -Wa. If the concern is ABI neutrality,
then the solution would rather be #if (_CALL_ELF-0) == 2/#endif. One can
also make a case for

#ifdef _CALL_ELF
.abiversion _CALL_ELF
#endif

Cheers.