Re: [PATCH v3] vp_vdpa: fix the method of calculating vectors
better subject: vp_vdpa: don't allocate unused msix vectors to make it clear it's not a bugfix. more comments below, but most importantly this looks like it adds a bug. On Tue, Apr 09, 2024 at 09:49:35AM +0800, lyx634449800 wrote: > When there is a ctlq and it doesn't require interrupt > callbacks,the original method of calculating vectors > wastes hardware msi or msix resources as well as system > IRQ resources. > > When conducting performance testing using testpmd in the > guest os, it was found that the performance was lower compared > to directly using vfio-pci to passthrough the device > > In scenarios where the virtio device in the guest os does > not utilize interrupts, the vdpa driver still configures > the hardware's msix vector. Therefore, the hardware still > sends interrupts to the host os. Because of this unnecessary > action by the hardware, hardware performance decreases, and > it also affects the performance of the host os. > > Before modification:(interrupt mode) > 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 > 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 > 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config > > After modification:(interrupt mode) > 32: 0 0 1 7 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-0 > 33: 36 0 3 0 PCI-MSI 32769-edge vp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[:00:02.0]-config > > Before modification:(virtio pmd mode for guest os) > 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 > 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 > 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config > > After modification:(virtio pmd mode for guest os) > 32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-config > > To verify the use of the virtio PMD mode in the guest operating > system, the following patch needs to be applied to QEMU: > https://lore.kernel.org/all/20240408073311.2049-1-yuxue@jaguarmicro.com > > Signed-off-by: lyx634449800 Bad S.O.B format. Should be Signed-off-by: Real Name > --- > > V3: delete unused variables and add validation records > V2: fix when allocating IRQs, scan all queues > > drivers/vdpa/virtio_pci/vp_vdpa.c | 35 +++ > 1 file changed, 22 insertions(+), 13 deletions(-) > > diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c > b/drivers/vdpa/virtio_pci/vp_vdpa.c > index df5f4a3bccb5..cd3aeb3b8f21 100644 > --- a/drivers/vdpa/virtio_pci/vp_vdpa.c > +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c > @@ -160,22 +160,31 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) > struct pci_dev *pdev = mdev->pci_dev; > int i, ret, irq; > int queues = vp_vdpa->queues; > - int vectors = queues + 1; > + int msix_vec, allocated_vectors = 0; I would actually call allocated_vectors -> vectors, make the patch smaller. > > - ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX); > - if (ret != vectors) { > + for (i = 0; i < queues; i++) { > + if (vp_vdpa->vring[i].cb.callback) > + allocated_vectors++; > + } > + allocated_vectors = allocated_vectors + 1; better: allocated_vectors++; /* extra one for config */ > + > + ret = pci_alloc_irq_vectors(pdev, allocated_vectors, allocated_vectors, > + PCI_IRQ_MSIX); > + if (ret != allocated_vectors) { > dev_err(>dev, > "vp_vdpa: fail to allocate irq vectors want %d but > %d\n", > - vectors, ret); > + allocated_vectors, ret); > return ret; > } > - > - vp_vdpa->vectors = vectors; > + vp_vdpa->vectors = allocated_vectors; > > for (i = 0; i < queues; i++) { > + if (!vp_vdpa->vring[i].cb.callback) > + continue; > + > snprintf(vp_vdpa->vring[i].msix_name, VP_VDPA_NAME_SIZE, > "vp-vdpa[%s]-%d\n", pci_name(pdev), i); > - irq = pci_irq_vector(pdev, i); > + irq = pci_irq_vector(pdev, msix_vec); using uninitialized msix_vec here? I would expect compiler to warn about it. pay attention to compiler warnings pls. > ret = devm_request_irq(>dev, irq, > vp_vdpa_vq_handler, > 0, vp_vdpa->vring[i].msix_name, > @@ -185,23 +194,23 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) > "vp_vdpa: fail to request irq for vq %d\n", i); > goto err; > } > - vp_modern_queue_vector(mdev, i, i); > + vp_modern_queue_vector(mdev, i, msix_vec); >
Re: [PATCH v3 11/25] media: i2c: imx258: Add get_selection for pixel array information
On 4/3/24 12:46, Pavel Machek wrote: > Hi! > >> Libcamera requires the cropping information for each mode, so >> add this information to the driver. > >> @@ -116,6 +124,9 @@ struct imx258_mode { >> u32 link_freq_index; >> /* Default register values */ >> struct imx258_reg_list reg_list; >> + >> +/* Analog crop rectangle. */ > > No need for "." at the end, as it is not above. > >> +struct v4l2_rect crop; >> }; > > If the crop is same in all modes, should we have it in common place? > > Best regards, > Pavel I gave this a try similar to what was done on imx219 but its having issues and makes a bunch of changes to the following patches so we might need to hold off on that change for now.
[PATCH v2 fs/proc/bootconfig 1/2] fs/proc: remove redundant comments from /proc/bootconfig
From: Zhenhua Huang commit 717c7c894d4b ("fs/proc: Add boot loader arguments as comment to /proc/bootconfig") adds bootloader argument comments into /proc/bootconfig. /proc/bootconfig shows boot_command_line[] multiple times following every xbc key value pair, that's duplicated and not necessary. Remove redundant ones. Output before and after the fix is like: key1 = value1 *bootloader argument comments* key2 = value2 *bootloader argument comments* key3 = value3 *bootloader argument comments* ... key1 = value1 key2 = value2 key3 = value3 *bootloader argument comments* ... Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as comment to /proc/bootconfig") Signed-off-by: Zhenhua Huang Signed-off-by: Paul E. McKenney Cc: Cc: Acked-by: Masami Hiramatsu (Google) --- fs/proc/bootconfig.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c index 902b326e1e560..e5635a6b127b0 100644 --- a/fs/proc/bootconfig.c +++ b/fs/proc/bootconfig.c @@ -62,12 +62,12 @@ static int __init copy_xbc_key_value_list(char *dst, size_t size) break; dst += ret; } - if (ret >= 0 && boot_command_line[0]) { - ret = snprintf(dst, rest(dst, end), "# Parameters from bootloader:\n# %s\n", - boot_command_line); - if (ret > 0) - dst += ret; - } + } + if (ret >= 0 && boot_command_line[0]) { + ret = snprintf(dst, rest(dst, end), "# Parameters from bootloader:\n# %s\n", + boot_command_line); + if (ret > 0) + dst += ret; } out: kfree(key); -- 2.40.1
[PATCH v2 fs/proc/bootconfig 2/2] fs/proc: Skip bootloader comment if no embedded kernel parameters
From: Masami Hiramatsu If the "bootconfig" kernel command-line argument was specified or if the kernel was built with CONFIG_BOOT_CONFIG_FORCE, but if there are no embedded kernel parameter, omit the "# Parameters from bootloader:" comment from the /proc/bootconfig file. This will cause automation to fall back to the /proc/cmdline file, which will be identical to the comment in this no-embedded-kernel-parameters case. Signed-off-by: Masami Hiramatsu Signed-off-by: Paul E. McKenney --- fs/proc/bootconfig.c | 2 +- include/linux/bootconfig.h | 1 + init/main.c| 5 + 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c index e5635a6b127b0..87dcaae32ff87 100644 --- a/fs/proc/bootconfig.c +++ b/fs/proc/bootconfig.c @@ -63,7 +63,7 @@ static int __init copy_xbc_key_value_list(char *dst, size_t size) dst += ret; } } - if (ret >= 0 && boot_command_line[0]) { + if (cmdline_has_extra_options() && ret >= 0 && boot_command_line[0]) { ret = snprintf(dst, rest(dst, end), "# Parameters from bootloader:\n# %s\n", boot_command_line); if (ret > 0) diff --git a/include/linux/bootconfig.h b/include/linux/bootconfig.h index ca73940e26df8..e5ee2c694401e 100644 --- a/include/linux/bootconfig.h +++ b/include/linux/bootconfig.h @@ -10,6 +10,7 @@ #ifdef __KERNEL__ #include #include +bool __init cmdline_has_extra_options(void); #else /* !__KERNEL__ */ /* * NOTE: This is only for tools/bootconfig, because tools/bootconfig will diff --git a/init/main.c b/init/main.c index 2ca52474d0c30..881f6230ee59e 100644 --- a/init/main.c +++ b/init/main.c @@ -487,6 +487,11 @@ static int __init warn_bootconfig(char *str) early_param("bootconfig", warn_bootconfig); +bool __init cmdline_has_extra_options(void) +{ + return extra_command_line || extra_init_args; +} + /* Change NUL term back to "=", to make "param" the whole string. */ static void __init repair_env_string(char *param, char *val) { -- 2.40.1
[PATCH v2 fs/proc/bootconfig 0/2] remove redundant comments from /proc/bootconfig
Hello! This series removes redundant comments from /proc/bootconfig: 1. fs/proc: remove redundant comments from /proc/bootconfig, courtesy of Zhenhua Huang. 2. fs/proc: Skip bootloader comment if no embedded kernel parameters, courtesy of Masami Hiramatsu. Thanx, Paul b/fs/proc/bootconfig.c | 12 ++-- b/include/linux/bootconfig.h |1 + b/init/main.c|5 + fs/proc/bootconfig.c |2 +- 4 files changed, 13 insertions(+), 7 deletions(-)
Re: [PATCH v10 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
On Mon, 08 Apr 2024 17:37:10 -0500, Huang, Kai wrote: On 9/04/2024 6:03 am, Haitao Huang wrote: The misc root cgroup is a static similar to sgx_cg_root. So misc_cg_root() won't be NULL However, based on how css_misc() was check NULL, I suppose sgx_get_current_cg() may be NULL when cgroup is disabled (again not 100% sure but we handle it anyway) Could you help to check? Sorry I am busy on something else thus won't be able to do any actual check. It's always non-NULL based on testing. It's hard for me to say definitely by reading the code. But IIUC cgroup_disable command-line only blocks operations in /sys/fs/cgroup so user space can't set up controllers and config limits, etc., for the diasabled ones. Each task->cgroups would still have a non-NULL pointer to the static root object for each cgroup that is enabled by KConfig, so get_current_misc_cg() thus sgx_get_current_cg() should not return NULL regardless 'cgroup_disable=misc'. Maybe @Michal or @tj can confirm? Thanks Haitao
Re: [PATCH] drivers/virtio: delayed configuration descriptor flags
On Tue, Apr 9, 2024 at 1:27 AM ni.liqiang wrote: > > In our testing of the virtio hardware accelerator, we found that > configuring the flags of the descriptor after addr and len, > as implemented in DPDK, seems to be more friendly to the hardware. > > In our Virtio hardware implementation tests, using the default > open-source code, the hardware's bulk reads ensure performance > but correctness is compromised. If we refer to the implementation code > of DPDK, placing the flags configuration of the descriptor > after addr and len, virtio backend can function properly based on > our hardware accelerator. > > I am somewhat puzzled by this. From a software process perspective, > it seems that there should be no difference whether > the flags configuration of the descriptor is before or after addr and len. > However, this is not the case according to experimental test results. > We would like to know if such a change in the configuration order > is reasonable and acceptable? Harmless but a hint that there's a bug in your hardware? More below > > Thanks. > > Signed-off-by: ni.liqiang > Reviewed-by: jin.qi > Tested-by: jin.qi > Cc: ni.liqiang > --- > drivers/virtio/virtio_ring.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 6f7e5010a673..bea2c2fb084e 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -1472,15 +1472,16 @@ static inline int virtqueue_add_packed(struct > virtqueue *_vq, > flags = cpu_to_le16(vq->packed.avail_used_flags | > (++c == total_sg ? 0 : VRING_DESC_F_NEXT) > | > (n < out_sgs ? 0 : VRING_DESC_F_WRITE)); > - if (i == head) > - head_flags = flags; > - else > - desc[i].flags = flags; > > desc[i].addr = cpu_to_le64(addr); > desc[i].len = cpu_to_le32(sg->length); > desc[i].id = cpu_to_le16(id); > > + if (i == head) > + head_flags = flags; > + else > + desc[i].flags = flags; > + The head_flags are not updated at this time, so descriptors are not available, the device should not start to read the chain: /* * A driver MUST NOT make the first descriptor in the list * available before all subsequent descriptors comprising * the list are made available. */ virtio_wmb(vq->weak_barriers); vq->packed.vring.desc[head].flags = head_flags; vq->num_added += descs_used; It looks like your device does speculation reading on the descriptors that are not available? Thanks > if (unlikely(vq->use_dma_api)) { > vq->packed.desc_extra[curr].addr = addr; > vq->packed.desc_extra[curr].len = sg->length; > -- > 2.34.1 > >
Re: [PATCH v3] vp_vdpa: fix the method of calculating vectors
On Tue, Apr 9, 2024 at 9:49 AM lyx634449800 wrote: > > When there is a ctlq and it doesn't require interrupt > callbacks,the original method of calculating vectors > wastes hardware msi or msix resources as well as system > IRQ resources. > > When conducting performance testing using testpmd in the > guest os, it was found that the performance was lower compared > to directly using vfio-pci to passthrough the device > > In scenarios where the virtio device in the guest os does > not utilize interrupts, the vdpa driver still configures > the hardware's msix vector. Therefore, the hardware still > sends interrupts to the host os. Because of this unnecessary > action by the hardware, hardware performance decreases, and > it also affects the performance of the host os. > > Before modification:(interrupt mode) > 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 > 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 > 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config > > After modification:(interrupt mode) > 32: 0 0 1 7 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-0 > 33: 36 0 3 0 PCI-MSI 32769-edge vp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[:00:02.0]-config > > Before modification:(virtio pmd mode for guest os) > 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 > 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 > 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 > 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config > > After modification:(virtio pmd mode for guest os) > 32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-config > > To verify the use of the virtio PMD mode in the guest operating > system, the following patch needs to be applied to QEMU: > https://lore.kernel.org/all/20240408073311.2049-1-yuxue@jaguarmicro.com > > Signed-off-by: lyx634449800 > --- > > V3: delete unused variables and add validation records > V2: fix when allocating IRQs, scan all queues > > drivers/vdpa/virtio_pci/vp_vdpa.c | 35 +++ > 1 file changed, 22 insertions(+), 13 deletions(-) > > diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c > b/drivers/vdpa/virtio_pci/vp_vdpa.c > index df5f4a3bccb5..cd3aeb3b8f21 100644 > --- a/drivers/vdpa/virtio_pci/vp_vdpa.c > +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c > @@ -160,22 +160,31 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) > struct pci_dev *pdev = mdev->pci_dev; > int i, ret, irq; > int queues = vp_vdpa->queues; > - int vectors = queues + 1; > + int msix_vec, allocated_vectors = 0; > > - ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX); > - if (ret != vectors) { > + for (i = 0; i < queues; i++) { > + if (vp_vdpa->vring[i].cb.callback) > + allocated_vectors++; > + } > + allocated_vectors = allocated_vectors + 1; > + > + ret = pci_alloc_irq_vectors(pdev, allocated_vectors, > allocated_vectors, > + PCI_IRQ_MSIX); > + if (ret != allocated_vectors) { > dev_err(>dev, > "vp_vdpa: fail to allocate irq vectors want %d but > %d\n", > - vectors, ret); > + allocated_vectors, ret); > return ret; > } > - > - vp_vdpa->vectors = vectors; > + vp_vdpa->vectors = allocated_vectors; > > for (i = 0; i < queues; i++) { > + if (!vp_vdpa->vring[i].cb.callback) > + continue; > + > snprintf(vp_vdpa->vring[i].msix_name, VP_VDPA_NAME_SIZE, > "vp-vdpa[%s]-%d\n", pci_name(pdev), i); > - irq = pci_irq_vector(pdev, i); > + irq = pci_irq_vector(pdev, msix_vec); > ret = devm_request_irq(>dev, irq, >vp_vdpa_vq_handler, >0, vp_vdpa->vring[i].msix_name, > @@ -185,23 +194,23 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) > "vp_vdpa: fail to request irq for vq %d\n", > i); > goto err; > } > - vp_modern_queue_vector(mdev, i, i); > + vp_modern_queue_vector(mdev, i, msix_vec); > vp_vdpa->vring[i].irq = irq; > + msix_vec++; > } > > snprintf(vp_vdpa->msix_name, VP_VDPA_NAME_SIZE, > "vp-vdpa[%s]-config\n", > -pci_name(pdev)); > - irq = pci_irq_vector(pdev, queues); > + pci_name(pdev)); > + irq = pci_irq_vector(pdev, msix_vec); > ret = devm_request_irq(>dev, irq, vp_vdpa_config_handler, 0, >
[PATCH v3] vp_vdpa: fix the method of calculating vectors
When there is a ctlq and it doesn't require interrupt callbacks,the original method of calculating vectors wastes hardware msi or msix resources as well as system IRQ resources. When conducting performance testing using testpmd in the guest os, it was found that the performance was lower compared to directly using vfio-pci to passthrough the device In scenarios where the virtio device in the guest os does not utilize interrupts, the vdpa driver still configures the hardware's msix vector. Therefore, the hardware still sends interrupts to the host os. Because of this unnecessary action by the hardware, hardware performance decreases, and it also affects the performance of the host os. Before modification:(interrupt mode) 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config After modification:(interrupt mode) 32: 0 0 1 7 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-0 33: 36 0 3 0 PCI-MSI 32769-edge vp-vdpa[:00:02.0]-1 34: 0 0 0 0 PCI-MSI 32770-edge vp-vdpa[:00:02.0]-config Before modification:(virtio pmd mode for guest os) 32: 0 0 0 0 PCI-MSI 32768-edgevp-vdpa[:00:02.0]-0 33: 0 0 0 0 PCI-MSI 32769-edgevp-vdpa[:00:02.0]-1 34: 0 0 0 0 PCI-MSI 32770-edgevp-vdpa[:00:02.0]-2 35: 0 0 0 0 PCI-MSI 32771-edgevp-vdpa[:00:02.0]-config After modification:(virtio pmd mode for guest os) 32: 0 0 0 0 PCI-MSI 32768-edge vp-vdpa[:00:02.0]-config To verify the use of the virtio PMD mode in the guest operating system, the following patch needs to be applied to QEMU: https://lore.kernel.org/all/20240408073311.2049-1-yuxue@jaguarmicro.com Signed-off-by: lyx634449800 --- V3: delete unused variables and add validation records V2: fix when allocating IRQs, scan all queues drivers/vdpa/virtio_pci/vp_vdpa.c | 35 +++ 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/drivers/vdpa/virtio_pci/vp_vdpa.c b/drivers/vdpa/virtio_pci/vp_vdpa.c index df5f4a3bccb5..cd3aeb3b8f21 100644 --- a/drivers/vdpa/virtio_pci/vp_vdpa.c +++ b/drivers/vdpa/virtio_pci/vp_vdpa.c @@ -160,22 +160,31 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) struct pci_dev *pdev = mdev->pci_dev; int i, ret, irq; int queues = vp_vdpa->queues; - int vectors = queues + 1; + int msix_vec, allocated_vectors = 0; - ret = pci_alloc_irq_vectors(pdev, vectors, vectors, PCI_IRQ_MSIX); - if (ret != vectors) { + for (i = 0; i < queues; i++) { + if (vp_vdpa->vring[i].cb.callback) + allocated_vectors++; + } + allocated_vectors = allocated_vectors + 1; + + ret = pci_alloc_irq_vectors(pdev, allocated_vectors, allocated_vectors, + PCI_IRQ_MSIX); + if (ret != allocated_vectors) { dev_err(>dev, "vp_vdpa: fail to allocate irq vectors want %d but %d\n", - vectors, ret); + allocated_vectors, ret); return ret; } - - vp_vdpa->vectors = vectors; + vp_vdpa->vectors = allocated_vectors; for (i = 0; i < queues; i++) { + if (!vp_vdpa->vring[i].cb.callback) + continue; + snprintf(vp_vdpa->vring[i].msix_name, VP_VDPA_NAME_SIZE, "vp-vdpa[%s]-%d\n", pci_name(pdev), i); - irq = pci_irq_vector(pdev, i); + irq = pci_irq_vector(pdev, msix_vec); ret = devm_request_irq(>dev, irq, vp_vdpa_vq_handler, 0, vp_vdpa->vring[i].msix_name, @@ -185,23 +194,23 @@ static int vp_vdpa_request_irq(struct vp_vdpa *vp_vdpa) "vp_vdpa: fail to request irq for vq %d\n", i); goto err; } - vp_modern_queue_vector(mdev, i, i); + vp_modern_queue_vector(mdev, i, msix_vec); vp_vdpa->vring[i].irq = irq; + msix_vec++; } snprintf(vp_vdpa->msix_name, VP_VDPA_NAME_SIZE, "vp-vdpa[%s]-config\n", -pci_name(pdev)); - irq = pci_irq_vector(pdev, queues); + pci_name(pdev)); + irq = pci_irq_vector(pdev, msix_vec); ret = devm_request_irq(>dev, irq, vp_vdpa_config_handler, 0, vp_vdpa->msix_name, vp_vdpa); if (ret) { dev_err(>dev, - "vp_vdpa: fail to request irq for vq %d\n", i); + "vp_vdpa: fail to request irq for config\n"); goto err; } -
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
Hi, On 4/8/2024 3:45 PM, Michael S. Tsirkin wrote: > On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: >> From: Hou Tao >> >> Hi, >> >> The patch set aims to fix the warning related to an abnormal size >> parameter of kmalloc() in virtiofs. The warning occurred when attempting >> to insert a 10MB sized kernel module kept in a virtiofs with cache >> disabled. As analyzed in patch #1, the root cause is that the length of >> the read buffer is no limited, and the read buffer is passed directly to >> virtiofs through out_args[0].value. Therefore patch #1 limits the >> length of the read buffer passed to virtiofs by using max_pages. However >> it is not enough, because now the maximal value of max_pages is 256. >> Consequently, when reading a 10MB-sized kernel module, the length of the >> bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will >> try to allocate 2MB from memory subsystem. The request for 2MB of >> physically contiguous memory significantly stress the memory subsystem >> and may fail indefinitely on hosts with fragmented memory. To address >> this, patch #2~#5 use scattered pages in a bio_vec to replace the >> kmalloc-allocated bounce buffer when the length of the bounce buffer for >> KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the >> allocation of the bounce buffer and sg array in virtiofs is that >> GFP_ATOMIC is used even when the allocation occurs in a kworker context. >> Therefore the last patch uses GFP_NOFS for the allocation of both sg >> array and bounce buffer when initiated by the kworker. For more details, >> please check the individual patches. >> >> As usual, comments are always welcome. >> >> Change Log: > Bernd should I just merge the patchset as is? > It seems to fix a real problem and no one has the > time to work on a better fix WDYT? Sorry for the long delay. I am just start to prepare for v3. In v3, I plan to avoid the unnecessary memory copy between fuse args and bio_vec. Will post it before next week. > > >> v2: >> * limit the length of ITER_KVEC dio by max_pages instead of the >> newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC >> dio being consistent with other rw operations. >> * replace kmalloc-allocated bounce buffer by using a bounce buffer >> backed by scattered pages when the length of the bounce buffer for >> KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with >> fragmented memory, the KVEC_ITER dio can be handled normally by >> virtiofs. (Bernd Schubert) >> * merge the GFP_NOFS patch [1] into this patch-set and use >> memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS >> (Benjamin Coddington) >> >> v1: >> https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ >> >> [1]: >> https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ >> >> Hou Tao (6): >> fuse: limit the length of ITER_KVEC dio by max_pages >> virtiofs: move alloc/free of argbuf into separated helpers >> virtiofs: factor out more common methods for argbuf >> virtiofs: support bounce buffer backed by scattered pages >> virtiofs: use scattered bounce buffer for ITER_KVEC dio >> virtiofs: use GFP_NOFS when enqueuing request through kworker >> >> fs/fuse/file.c | 12 +- >> fs/fuse/virtio_fs.c | 336 +--- >> 2 files changed, 296 insertions(+), 52 deletions(-) >> >> -- >> 2.29.2
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
Hi Michal, kernel test robot noticed the following build errors: [auto build test ERROR on fec50db7033ea478773b159e0e2efb135270e3b7] url: https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031 base: fec50db7033ea478773b159e0e2efb135270e3b7 patch link: https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value config: arm-allnoconfig (https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/config) compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project 8b3b4a92adee40483c27f26c478a384cd69c6f05) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240409/202404090903.3jz667sn-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202404090903.3jz667sn-...@intel.com/ All errors (new ones prefixed by >>): In file included from kernel/sysctl.c:23: In file included from include/linux/mm.h:2208: include/linux/vmstat.h:522:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion] 522 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_" | ~~~ ^ ~~~ >> kernel/sysctl.c:1819:14: error: initializing 'void *' with an expression of >> type 'const int *' discards qualifiers >> [-Werror,-Wincompatible-pointer-types-discards-qualifiers] 1819 | .extra2 = _max_max, | ^~~~ 1 warning and 1 error generated. vim +1819 kernel/sysctl.c f461d2dcd511c0 Christoph Hellwig 2020-04-24 1617 f461d2dcd511c0 Christoph Hellwig 2020-04-24 1618 static struct ctl_table kern_table[] = { ^1da177e4c3f41 Linus Torvalds 2005-04-16 1619 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1620 .procname = "panic", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1621 .data = _timeout, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1622 .maxlen = sizeof(int), 49f0ce5f92321c Jerome Marchand 2014-01-21 1623 .mode = 0644, 6d4561110a3e9f Eric W. Biederman 2009-11-16 1624 .proc_handler = proc_dointvec, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1625 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1626 #ifdef CONFIG_PROC_SYSCTL ^1da177e4c3f41 Linus Torvalds 2005-04-16 1627 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1628 .procname = "tainted", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1629 .maxlen = sizeof(long), ^1da177e4c3f41 Linus Torvalds 2005-04-16 1630 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1631 .proc_handler = proc_taint, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1632 }, 2da02997e08d3e David Rientjes 2009-01-06 1633 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1634 .procname = "sysctl_writes_strict", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1635 .data = _writes_strict, 9e3961a0979817 Prarit Bhargava 2014-12-10 1636 .maxlen = sizeof(int), 2da02997e08d3e David Rientjes 2009-01-06 1637 .mode = 0644, 9e3961a0979817 Prarit Bhargava 2014-12-10 1638 .proc_handler = proc_dointvec_minmax, 78e36f3b0dae58 Xiaoming Ni 2022-01-21 1639 .extra1 = SYSCTL_NEG_ONE, eec4844fae7c03 Matteo Croce2019-07-18 1640 .extra2 = SYSCTL_ONE, 2da02997e08d3e David Rientjes 2009-01-06 1641 }, 964c9dff009189 Alexander Popov 2018-08-17 1642 #endif 1efff914afac8a Theodore Ts'o 2015-03-17 1643 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1644 .procname = "print-fatal-signals", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1645 .data = _fatal_signals, 964c9dff009189 Alexander Popov 2018-08-17 1646 .maxlen = sizeof(int), 1efff914afac8a Theodore Ts'o 2015-03-17 1647 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1648 .proc_handler = proc_dointvec, 1efff914afac8a Theodore Ts'o 2015-03-17 1649 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1650 #ifdef CONFIG_SPARC ^1da177e4c3f41 Linus Torvalds 2005-04-16 1651 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1652 .procname = "reb
Re: [PATCH fs/proc/bootconfig] remove redundant comments from /proc/bootconfig
On Tue, Apr 09, 2024 at 09:25:40AM +0900, Masami Hiramatsu wrote: > On Mon, 8 Apr 2024 12:18:19 -0700 > "Paul E. McKenney" wrote: > > > On Sat, Apr 06, 2024 at 11:11:08AM +0900, Masami Hiramatsu wrote: > > > On Thu, 4 Apr 2024 21:25:41 -0700 > > > "Paul E. McKenney" wrote: > > > > > > > On Fri, Apr 05, 2024 at 11:57:45AM +0900, Masami Hiramatsu wrote: > > > > > On Fri, 5 Apr 2024 10:23:24 +0900 > > > > > Masami Hiramatsu (Google) wrote: > > > > > > > > > > > On Thu, 4 Apr 2024 10:43:14 -0700 > > > > > > "Paul E. McKenney" wrote: > > > > > > > > > > > > > On Thu, Apr 04, 2024 at 08:55:22AM +0900, Masami Hiramatsu wrote: > > > > > > > > On Wed, 3 Apr 2024 12:16:28 -0700 > > > > > > > > "Paul E. McKenney" wrote: > > > > > > > > > > > > > > > > > commit 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > > > comment to > > > > > > > > > /proc/bootconfig") adds bootloader argument comments into > > > > > > > > > /proc/bootconfig. > > > > > > > > > > > > > > > > > > /proc/bootconfig shows boot_command_line[] multiple times > > > > > > > > > following > > > > > > > > > every xbc key value pair, that's duplicated and not necessary. > > > > > > > > > Remove redundant ones. > > > > > > > > > > > > > > > > > > Output before and after the fix is like: > > > > > > > > > key1 = value1 > > > > > > > > > *bootloader argument comments* > > > > > > > > > key2 = value2 > > > > > > > > > *bootloader argument comments* > > > > > > > > > key3 = value3 > > > > > > > > > *bootloader argument comments* > > > > > > > > > ... > > > > > > > > > > > > > > > > > > key1 = value1 > > > > > > > > > key2 = value2 > > > > > > > > > key3 = value3 > > > > > > > > > *bootloader argument comments* > > > > > > > > > ... > > > > > > > > > > > > > > > > > > Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > > > comment to /proc/bootconfig") > > > > > > > > > Signed-off-by: Zhenhua Huang > > > > > > > > > Signed-off-by: Paul E. McKenney > > > > > > > > > Cc: Masami Hiramatsu > > > > > > > > > Cc: > > > > > > > > > Cc: > > > > > > > > > > > > > > > > OOps, good catch! Let me pick it. > > > > > > > > > > > > > > > > Acked-by: Masami Hiramatsu (Google) > > > > > > > > > > > > > > Thank you, and I have applied your ack and pulled this into its > > > > > > > own > > > > > > > bootconfig.2024.04.04a. > > > > > > > > > > > > > > My guess is that you will push this via your own tree, and so I > > > > > > > will > > > > > > > drop my copy as soon as yours hits -next. > > > > > > > > > > > > Thanks! I would like to make PR this soon as bootconfig fixes for > > > > > > v6.9-rc2. > > > > > > > > > > Hmm I found that this always shows the command line comment in > > > > > /proc/bootconfig even without "bootconfig" option. > > > > > I think that is easier for user-tools but changes the behavior and > > > > > a bit redundant. > > > > > > > > > > We should skip showing this original argument comment if bootconfig is > > > > > not initialized (no "bootconfig" in cmdline) as it is now. > > > > > > > > So something like this folded into that patch? > > > > > > Hm, I expected just checking it in the loop as below. > > > > > > > > > diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c > > > index e5635a6b127b..98e0780f7e07 100644 > > > --- a/fs/proc/bootconfig.c > > > +++ b/fs/proc/bootconfig.c > > > @@ -27,6 +27,7 @@ static int __init copy_xbc_key_value_list(char *dst, > > > size_t size) > > > { > > > struct xbc_node *leaf, *vnode; > > > char *key, *end = dst + size; > > > + bool empty = true; > > > const char *val; > > > char q; > > > int ret = 0; > > > @@ -62,8 +63,9 @@ static int __init copy_xbc_key_value_list(char *dst, > > > size_t size) > > > break; > > > dst += ret; > > > } > > > + empty = false; > > > } > > > - if (ret >= 0 && boot_command_line[0]) { > > > + if (!empty && ret >= 0 && boot_command_line[0]) { > > > ret = snprintf(dst, rest(dst, end), "# Parameters from > > > bootloader:\n# %s\n", > > > boot_command_line); > > > if (ret > 0) > > > > > > > > > > > > The difference is checking "bootconfig" cmdline option or checking > > > the "bootconfig" is actually empty. So the behaviors are different > > > when the "bootconfig" is specified but there is no bootconfig data. > > > > Ah, understood, the point is to avoid the comment in cases where its > > content would be identical to /proc/cmdline. > > > > > Another idea is to check whether the cmdline is actually updated by > > > bootconfig and show original one only if it is updated. > > > (I think this fits the purpose of the original patch better.) > > > > > > > > > diff --git
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
Hi Michal, kernel test robot noticed the following build warnings: [auto build test WARNING on fec50db7033ea478773b159e0e2efb135270e3b7] url: https://github.com/intel-lab-lkp/linux/commits/Michal-Koutn/tracing-Remove-dependency-of-saved_cmdlines_buffer-on-PID_MAX_DEFAULT/20240408-230031 base: fec50db7033ea478773b159e0e2efb135270e3b7 patch link: https://lore.kernel.org/r/20240408145819.8787-3-mkoutny%40suse.com patch subject: [PATCH 2/3] kernel/pid: Remove default pid_max value config: alpha-allnoconfig (https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/config) compiler: alpha-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240409/202404090849.mgj3z0xi-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202404090849.mgj3z0xi-...@intel.com/ All warnings (new ones prefixed by >>): >> kernel/sysctl.c:1819:35: warning: initialization discards 'const' qualifier >> from pointer target type [-Wdiscarded-qualifiers] 1819 | .extra2 = _max_max, | ^ vim +/const +1819 kernel/sysctl.c f461d2dcd511c0 Christoph Hellwig 2020-04-24 1617 f461d2dcd511c0 Christoph Hellwig 2020-04-24 1618 static struct ctl_table kern_table[] = { ^1da177e4c3f41 Linus Torvalds 2005-04-16 1619 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1620 .procname = "panic", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1621 .data = _timeout, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1622 .maxlen = sizeof(int), 49f0ce5f92321c Jerome Marchand 2014-01-21 1623 .mode = 0644, 6d4561110a3e9f Eric W. Biederman 2009-11-16 1624 .proc_handler = proc_dointvec, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1625 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1626 #ifdef CONFIG_PROC_SYSCTL ^1da177e4c3f41 Linus Torvalds 2005-04-16 1627 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1628 .procname = "tainted", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1629 .maxlen = sizeof(long), ^1da177e4c3f41 Linus Torvalds 2005-04-16 1630 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1631 .proc_handler = proc_taint, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1632 }, 2da02997e08d3e David Rientjes 2009-01-06 1633 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1634 .procname = "sysctl_writes_strict", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1635 .data = _writes_strict, 9e3961a0979817 Prarit Bhargava 2014-12-10 1636 .maxlen = sizeof(int), 2da02997e08d3e David Rientjes 2009-01-06 1637 .mode = 0644, 9e3961a0979817 Prarit Bhargava 2014-12-10 1638 .proc_handler = proc_dointvec_minmax, 78e36f3b0dae58 Xiaoming Ni 2022-01-21 1639 .extra1 = SYSCTL_NEG_ONE, eec4844fae7c03 Matteo Croce2019-07-18 1640 .extra2 = SYSCTL_ONE, 2da02997e08d3e David Rientjes 2009-01-06 1641 }, 964c9dff009189 Alexander Popov 2018-08-17 1642 #endif 1efff914afac8a Theodore Ts'o 2015-03-17 1643 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1644 .procname = "print-fatal-signals", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1645 .data = _fatal_signals, 964c9dff009189 Alexander Popov 2018-08-17 1646 .maxlen = sizeof(int), 1efff914afac8a Theodore Ts'o 2015-03-17 1647 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1648 .proc_handler = proc_dointvec, 1efff914afac8a Theodore Ts'o 2015-03-17 1649 }, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1650 #ifdef CONFIG_SPARC ^1da177e4c3f41 Linus Torvalds 2005-04-16 1651 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1652 .procname = "reboot-cmd", f461d2dcd511c0 Christoph Hellwig 2020-04-24 1653 .data = reboot_command, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1654 .maxlen = 256, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1655 .mode = 0644, f461d2dcd511c0 Christoph Hellwig 2020-04-24 1656 .proc_handler = proc_dostring, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1657 }, ^1da177e4c3f41 Linus Torvalds 2005-04-16 1658 { f461d2dcd511c0 Christoph Hellwig 2020-04-24 1659 .procname
Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe
On Mon, 8 Apr 2024 18:02:13 +0200 Jiri Olsa wrote: > On Fri, Apr 05, 2024 at 01:02:30PM +0200, Oleg Nesterov wrote: > > On 04/05, Jiri Olsa wrote: > > > > > > On Fri, Apr 05, 2024 at 10:22:03AM +0900, Masami Hiramatsu wrote: > > > > > > > > I think this expects setjmp/longjmp as below > > > > > > > > foo() { <- retprobe1 > > > > setjmp() > > > > bar() { <- retprobe2 > > > > longjmp() > > > > } > > > > } <- return to trampoline > > > > > > > > In this case, we need to skip retprobe2's instance. > > > > Yes, > > > > > > My concern is, if we can not find appropriate return instance, what > > > > happen? > > > > e.g. > > > > > > > > foo() { <-- retprobe1 > > > >bar() { # sp is decremented > > > >sys_uretprobe() <-- ?? > > > > } > > > > } > > > > > > > > It seems sys_uretprobe() will handle retprobe1 at that point instead of > > > > SIGILL. > > > > > > yes, and I think it's fine, you get the consumer called in wrong place, > > > but it's your fault and kernel won't crash > > > > Agreed. > > > > With or without this patch userpace can also do > > > > foo() { <-- retprobe1 > > bar() { > > jump to xol_area > > } > > } > > > > handle_trampoline() will handle retprobe1. > > > > > this can be fixed by checking the syscall is called from the trampoline > > > and prevent handle_trampoline call if it's not > > > > Yes, but I still do not think this makes a lot of sense. But I won't argue. > > > > And what should sys_uretprobe() do if it is not called from the trampoline? > > I'd prefer force_sig(SIGILL) to punish the abuser ;) OK, OK, EINVAL. > > so the similar behaviour with int3 ends up with immediate SIGTRAP > and not invoking pending uretprobe consumers, like: > > - setup uretprobe for foo > - foo() { > executes int 3 -> sends SIGTRAP > } > > because the int3 handler checks if it got executed from the uretprobe's > trampoline.. if not it treats that int3 as regular trap Yeah, that is consistent behavior. Sounds good to me. > > while for uretprobe syscall we have at the moment following behaviour: > > - setup uretprobe for foo > - foo() { > uretprobe_syscall -> executes foo's uretprobe consumers > } > - at some point we get to the 'ret' instruction that jump into uretprobe > trampoline and the uretprobe_syscall won't find pending uretprobe and > will send SIGILL > > > so I think we should mimic int3 behaviour and: > > - setup uretprobe for foo > - foo() { > uretprobe_syscall -> check if we got executed from uretprobe's > trampoline and send SIGILL if that's not the case OK, this looks good to me. > > I think it's better to have the offending process killed right away, > rather than having more undefined behaviour, waiting for final 'ret' > instruction that jumps to uretprobe trampoline and causes SIGILL > > > > > I agree very much with Andrii, > > > >sigreturn() exists only to allow the implementation of signal > > handlers. It should never be > >called directly. Details of the arguments (if any) passed to > > sigreturn() vary depending on > >the architecture. > > > > this is how sys_uretprobe() should be treated/documented. > > yes, will include man page patch in new version And please follow Documentation/process/adding-syscalls.rst in new version, then we can avoid repeating the same discussion :-) Thank you! > > jirka > > > > > sigreturn() can be "improved" too. Say, it could validate sigcontext->ip > > and return -EINVAL if this addr is not valid. But why? > > > > Oleg. > > -- Masami Hiramatsu (Google)
Re: [PATCH fs/proc/bootconfig] remove redundant comments from /proc/bootconfig
On Mon, 8 Apr 2024 12:18:19 -0700 "Paul E. McKenney" wrote: > On Sat, Apr 06, 2024 at 11:11:08AM +0900, Masami Hiramatsu wrote: > > On Thu, 4 Apr 2024 21:25:41 -0700 > > "Paul E. McKenney" wrote: > > > > > On Fri, Apr 05, 2024 at 11:57:45AM +0900, Masami Hiramatsu wrote: > > > > On Fri, 5 Apr 2024 10:23:24 +0900 > > > > Masami Hiramatsu (Google) wrote: > > > > > > > > > On Thu, 4 Apr 2024 10:43:14 -0700 > > > > > "Paul E. McKenney" wrote: > > > > > > > > > > > On Thu, Apr 04, 2024 at 08:55:22AM +0900, Masami Hiramatsu wrote: > > > > > > > On Wed, 3 Apr 2024 12:16:28 -0700 > > > > > > > "Paul E. McKenney" wrote: > > > > > > > > > > > > > > > commit 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > > comment to > > > > > > > > /proc/bootconfig") adds bootloader argument comments into > > > > > > > > /proc/bootconfig. > > > > > > > > > > > > > > > > /proc/bootconfig shows boot_command_line[] multiple times > > > > > > > > following > > > > > > > > every xbc key value pair, that's duplicated and not necessary. > > > > > > > > Remove redundant ones. > > > > > > > > > > > > > > > > Output before and after the fix is like: > > > > > > > > key1 = value1 > > > > > > > > *bootloader argument comments* > > > > > > > > key2 = value2 > > > > > > > > *bootloader argument comments* > > > > > > > > key3 = value3 > > > > > > > > *bootloader argument comments* > > > > > > > > ... > > > > > > > > > > > > > > > > key1 = value1 > > > > > > > > key2 = value2 > > > > > > > > key3 = value3 > > > > > > > > *bootloader argument comments* > > > > > > > > ... > > > > > > > > > > > > > > > > Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > > comment to /proc/bootconfig") > > > > > > > > Signed-off-by: Zhenhua Huang > > > > > > > > Signed-off-by: Paul E. McKenney > > > > > > > > Cc: Masami Hiramatsu > > > > > > > > Cc: > > > > > > > > Cc: > > > > > > > > > > > > > > OOps, good catch! Let me pick it. > > > > > > > > > > > > > > Acked-by: Masami Hiramatsu (Google) > > > > > > > > > > > > Thank you, and I have applied your ack and pulled this into its own > > > > > > bootconfig.2024.04.04a. > > > > > > > > > > > > My guess is that you will push this via your own tree, and so I will > > > > > > drop my copy as soon as yours hits -next. > > > > > > > > > > Thanks! I would like to make PR this soon as bootconfig fixes for > > > > > v6.9-rc2. > > > > > > > > Hmm I found that this always shows the command line comment in > > > > /proc/bootconfig even without "bootconfig" option. > > > > I think that is easier for user-tools but changes the behavior and > > > > a bit redundant. > > > > > > > > We should skip showing this original argument comment if bootconfig is > > > > not initialized (no "bootconfig" in cmdline) as it is now. > > > > > > So something like this folded into that patch? > > > > Hm, I expected just checking it in the loop as below. > > > > > > diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c > > index e5635a6b127b..98e0780f7e07 100644 > > --- a/fs/proc/bootconfig.c > > +++ b/fs/proc/bootconfig.c > > @@ -27,6 +27,7 @@ static int __init copy_xbc_key_value_list(char *dst, > > size_t size) > > { > > struct xbc_node *leaf, *vnode; > > char *key, *end = dst + size; > > + bool empty = true; > > const char *val; > > char q; > > int ret = 0; > > @@ -62,8 +63,9 @@ static int __init copy_xbc_key_value_list(char *dst, > > size_t size) > > break; > > dst += ret; > > } > > + empty = false; > > } > > - if (ret >= 0 && boot_command_line[0]) { > > + if (!empty && ret >= 0 && boot_command_line[0]) { > > ret = snprintf(dst, rest(dst, end), "# Parameters from > > bootloader:\n# %s\n", > >boot_command_line); > > if (ret > 0) > > > > > > > > The difference is checking "bootconfig" cmdline option or checking > > the "bootconfig" is actually empty. So the behaviors are different > > when the "bootconfig" is specified but there is no bootconfig data. > > Ah, understood, the point is to avoid the comment in cases where its > content would be identical to /proc/cmdline. > > > Another idea is to check whether the cmdline is actually updated by > > bootconfig and show original one only if it is updated. > > (I think this fits the purpose of the original patch better.) > > > > > > diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c > > index e5635a6b127b..95d6a231210c 100644 > > --- a/fs/proc/bootconfig.c > > +++ b/fs/proc/bootconfig.c > > @@ -10,6 +10,9 @@ > > #include > > #include > > > > +/* defined in main/init.c */ > > +bool __init cmdline_has_extra_options(void); > > + > > static
Re: [PATCH v2] rethook: Remove warning messages printed for finding return address of a frame.
On Mon, 8 Apr 2024 10:51:40 -0700 Kui-Feng Lee wrote: > The function rethook_find_ret_addr() prints a warning message and returns 0 > when the target task is running and is not the "current" task in order to > prevent the incorrect return address, although it still may return an > incorrect address. > > However, the warning message turns into noise when BPF profiling programs > call bpf_get_task_stack() on running tasks in a firm with a large number of > hosts. > > The callers should be aware and willing to take the risk of receiving an > incorrect return address from a task that is currently running other than > the "current" one. A warning is not needed here as the callers are intent > on it. > OK, looks good to me. Let me pick it to probes/for-next. Thanks! > Acked-by: Andrii Nakryiko > Acked-by: John Fastabend > Signed-off-by: Kui-Feng Lee > > --- > Changes from v1: > > - Rephrased the commit log. > >- Removed the confusing last part of the first paragraph. > >- Removed "frequently" from the 2nd paragraph, replaced by "a firm with > a large number of hosts". > > v1: https://lore.kernel.org/all/20240401191621.758056-1-thinker...@gmail.com/ > --- > kernel/trace/rethook.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c > index fa03094e9e69..4297a132a7ae 100644 > --- a/kernel/trace/rethook.c > +++ b/kernel/trace/rethook.c > @@ -248,7 +248,7 @@ unsigned long rethook_find_ret_addr(struct task_struct > *tsk, unsigned long frame > if (WARN_ON_ONCE(!cur)) > return 0; > > - if (WARN_ON_ONCE(tsk != current && task_is_running(tsk))) > + if (tsk != current && task_is_running(tsk)) > return 0; > > do { > -- > 2.34.1 > -- Masami Hiramatsu (Google)
Re: [PATCH v10 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
On 9/04/2024 6:03 am, Haitao Huang wrote: The misc root cgroup is a static similar to sgx_cg_root. So misc_cg_root() won't be NULL However, based on how css_misc() was check NULL, I suppose sgx_get_current_cg() may be NULL when cgroup is disabled (again not 100% sure but we handle it anyway) Could you help to check? Sorry I am busy on something else thus won't be able to do any actual check.
Re: UBSAN array-index-out-of-bounds in read_blocklist
On 08/04/2024 01:56, Ubisectech Sirius wrote: Hello. We are Ubisectech Sirius Team, the vulnerability lab of China ValiantSec. Recently, our team has discovered a issue in Linux kernel 6.8. Attached to the email were a PoC file of the issue. Hello, I have found the cause of the problem and fixed it. A patch will be sent to the LKML shortly. Thanks Phillip Squashfs maintainer. Stack dump: [ cut here ] UBSAN: array-index-out-of-bounds in fs/squashfs/file.c:256:34 index 4294967295 is out of range for type 'meta_entry [127]' CPU: 1 PID: 16927 Comm: syz-executor.0 Not tainted 6.8.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:347 fill_meta_index fs/squashfs/file.c:256 [inline] read_blocklist+0x175e/0x1790 fs/squashfs/file.c:333 squashfs_readahead+0x14fa/0x22d0 fs/squashfs/file.c:591 read_pages+0x1a2/0xd70 mm/readahead.c:160 page_cache_ra_unbounded+0x477/0x5f0 mm/readahead.c:269 do_page_cache_ra mm/readahead.c:299 [inline] page_cache_ra_order+0x772/0xa00 mm/readahead.c:544 do_sync_mmap_readahead mm/filemap.c:3153 [inline] filemap_fault+0x1691/0x3390 mm/filemap.c:3245 __do_fault+0x108/0x490 mm/memory.c:4396 do_read_fault mm/memory.c:4758 [inline] do_fault mm/memory.c:4888 [inline] do_pte_missing mm/memory.c:3745 [inline] handle_pte_fault mm/memory.c:5164 [inline] __handle_mm_fault+0x340a/0x48b0 mm/memory.c:5305 handle_mm_fault+0x3c2/0xa40 mm/memory.c:5470 do_user_addr_fault+0x2ed/0x1010 arch/x86/mm/fault.c:1355 handle_page_fault arch/x86/mm/fault.c:1498 [inline] exc_page_fault+0x99/0x180 arch/x86/mm/fault.c:1554 asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:570 RIP: 0033:0x7f8446a28202 Code: 48 63 c3 5b c3 0f 1f 00 f3 0f 1e fa 48 89 7c 24 f0 48 89 74 24 e8 48 89 54 24 e0 48 8b 4c 24 f0 48 8b 54 24 e8 48 8b 74 24 e0 <8b> 41 40 23 81 00 01 00 00 f3 0f 6f 06 c1 e0 06 48 01 d0 0f 11 00 RSP: 002b:7f8447755018 EFLAGS: 00010212 RAX: 7f8446a281e0 RBX: 7f8446bcbf80 RCX: 20ff RDX: RSI: 2000 RDI: 20ff RBP: 7f8446af14a6 R08: R09: R10: 20ff R11: 2000 R12: R13: 000b R14: 7f8446bcbf80 R15: 7f8447735000 ---[ end trace ]--- Thank you for taking the time to read this email and we look forward to working with you further.
[PATCH v14 4/4] remoteproc: zynqmp: parse TCM from device tree
ZynqMP TCM information was fixed in driver. Now ZynqMP TCM information is available in device-tree. Parse TCM information in driver as per new bindings. Signed-off-by: Tanmay Shah --- Changes in v14: - Add Versal platform support - Add Versal-NET platform support - Maintain backward compatibility for ZynqMP platform and use hardcode TCM addresses - Configure TCM based on xlnx,tcm-mode property for Versal - Avoid TCM configuration if that property isn't available in DT drivers/remoteproc/xlnx_r5_remoteproc.c | 173 ++-- 1 file changed, 132 insertions(+), 41 deletions(-) diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c index 0f942440b4e2..504492f930ac 100644 --- a/drivers/remoteproc/xlnx_r5_remoteproc.c +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c @@ -74,8 +74,8 @@ struct mbox_info { }; /* - * Hardcoded TCM bank values. This will be removed once TCM bindings are - * accepted for system-dt specifications and upstreamed in linux kernel + * Hardcoded TCM bank values. This will stay in driver to maintain backward + * compatibility with device-tree that does not have TCM information. */ static const struct mem_bank_data zynqmp_tcm_banks_split[] = { {0xffe0UL, 0x0, 0x1UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each */ @@ -300,36 +300,6 @@ static void zynqmp_r5_rproc_kick(struct rproc *rproc, int vqid) dev_warn(dev, "failed to send message\n"); } -/* - * zynqmp_r5_set_mode() - * - * set RPU cluster and TCM operation mode - * - * @r5_core: pointer to zynqmp_r5_core type object - * @fw_reg_val: value expected by firmware to configure RPU cluster mode - * @tcm_mode: value expected by fw to configure TCM mode (lockstep or split) - * - * Return: 0 for success and < 0 for failure - */ -static int zynqmp_r5_set_mode(struct zynqmp_r5_core *r5_core, - enum rpu_oper_mode fw_reg_val, - enum rpu_tcm_comb tcm_mode) -{ - int ret; - - ret = zynqmp_pm_set_rpu_mode(r5_core->pm_domain_id, fw_reg_val); - if (ret < 0) { - dev_err(r5_core->dev, "failed to set RPU mode\n"); - return ret; - } - - ret = zynqmp_pm_set_tcm_config(r5_core->pm_domain_id, tcm_mode); - if (ret < 0) - dev_err(r5_core->dev, "failed to configure TCM\n"); - - return ret; -} - /* * zynqmp_r5_rproc_start() * @rproc: single R5 core's corresponding rproc instance @@ -761,6 +731,103 @@ static struct zynqmp_r5_core *zynqmp_r5_add_rproc_core(struct device *cdev) return ERR_PTR(ret); } +static int zynqmp_r5_get_tcm_node_from_dt(struct zynqmp_r5_cluster *cluster) +{ + int i, j, tcm_bank_count, ret, tcm_pd_idx, pd_count; + struct of_phandle_args out_args; + struct zynqmp_r5_core *r5_core; + struct platform_device *cpdev; + struct mem_bank_data *tcm; + struct device_node *np; + struct resource *res; + u64 abs_addr, size; + struct device *dev; + + for (i = 0; i < cluster->core_count; i++) { + r5_core = cluster->r5_cores[i]; + dev = r5_core->dev; + np = r5_core->np; + + pd_count = of_count_phandle_with_args(np, "power-domains", + "#power-domain-cells"); + + if (pd_count <= 0) { + dev_err(dev, "invalid power-domains property, %d\n", pd_count); + return -EINVAL; + } + + /* First entry in power-domains list is for r5 core, rest for TCM. */ + tcm_bank_count = pd_count - 1; + + if (tcm_bank_count <= 0) { + dev_err(dev, "invalid TCM count %d\n", tcm_bank_count); + return -EINVAL; + } + + r5_core->tcm_banks = devm_kcalloc(dev, tcm_bank_count, + sizeof(struct mem_bank_data *), + GFP_KERNEL); + if (!r5_core->tcm_banks) + return -ENOMEM; + + r5_core->tcm_bank_count = tcm_bank_count; + for (j = 0, tcm_pd_idx = 1; j < tcm_bank_count; j++, tcm_pd_idx++) { + tcm = devm_kzalloc(dev, sizeof(struct mem_bank_data), + GFP_KERNEL); + if (!tcm) + return -ENOMEM; + + r5_core->tcm_banks[j] = tcm; + + /* Get power-domains id of TCM. */ + ret = of_parse_phandle_with_args(np, "power-domains", +"#power-domain-cells", +tcm_pd_idx, _args); + if (ret) { + dev_err(r5_core->dev, +
[PATCH v14 3/4] dts: zynqmp: add properties for TCM in remoteproc
Add properties as per new bindings in zynqmp remoteproc node to represent TCM address and size. This patch also adds alternative remoteproc node to represent remoteproc cluster in split mode. By default lockstep mode is enabled and users should disable it before using split mode dts. Both device-tree nodes can't be used simultaneously one of them must be disabled. For zcu102-1.0 and zcu102-1.1 board remoteproc split mode dts node is enabled and lockstep mode dts is disabled. Signed-off-by: Tanmay Shah --- Changes in v14: - Add xlnx,tcm-mode property in remoteproc node .../boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts | 8 +++ arch/arm64/boot/dts/xilinx/zynqmp.dtsi| 67 +-- 2 files changed, 70 insertions(+), 5 deletions(-) diff --git a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts index c8f71a1aec89..495ca94b45db 100644 --- a/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts +++ b/arch/arm64/boot/dts/xilinx/zynqmp-zcu102-rev1.0.dts @@ -14,6 +14,14 @@ / { compatible = "xlnx,zynqmp-zcu102-rev1.0", "xlnx,zynqmp-zcu102", "xlnx,zynqmp"; }; +_split { + status = "okay"; +}; + +_lockstep { + status = "disabled"; +}; + { #address-cells = <1>; #size-cells = <1>; diff --git a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi index 25d20d803230..ef31b0fc73d1 100644 --- a/arch/arm64/boot/dts/xilinx/zynqmp.dtsi +++ b/arch/arm64/boot/dts/xilinx/zynqmp.dtsi @@ -260,19 +260,76 @@ fpga_full: fpga-full { ranges; }; - remoteproc { + rproc_lockstep: remoteproc@ffe0 { compatible = "xlnx,zynqmp-r5fss"; xlnx,cluster-mode = <1>; + xlnx,tcm-mode = <1>; - r5f-0 { + #address-cells = <2>; + #size-cells = <2>; + + ranges = <0x0 0x0 0x0 0xffe0 0x0 0x1>, +<0x0 0x2 0x0 0xffe2 0x0 0x1>, +<0x0 0x1 0x0 0xffe1 0x0 0x1>, +<0x0 0x3 0x0 0xffe3 0x0 0x1>; + + r5f@0 { + compatible = "xlnx,zynqmp-r5f"; + reg = <0x0 0x0 0x0 0x1>, + <0x0 0x2 0x0 0x1>, + <0x0 0x1 0x0 0x1>, + <0x0 0x3 0x0 0x1>; + reg-names = "atcm0", "btcm0", "atcm1", "btcm1"; + power-domains = <_firmware PD_RPU_0>, + <_firmware PD_R5_0_ATCM>, + <_firmware PD_R5_0_BTCM>, + <_firmware PD_R5_1_ATCM>, + <_firmware PD_R5_1_BTCM>; + memory-region = <_0_fw_image>; + }; + + r5f@1 { + compatible = "xlnx,zynqmp-r5f"; + reg = <0x1 0x0 0x0 0x1>, <0x1 0x2 0x0 0x1>; + reg-names = "atcm0", "btcm0"; + power-domains = <_firmware PD_RPU_1>, + <_firmware PD_R5_1_ATCM>, + <_firmware PD_R5_1_BTCM>; + memory-region = <_1_fw_image>; + }; + }; + + rproc_split: remoteproc-split@ffe0 { + status = "disabled"; + compatible = "xlnx,zynqmp-r5fss"; + xlnx,cluster-mode = <0>; + xlnx,tcm-mode = <0>; + + #address-cells = <2>; + #size-cells = <2>; + + ranges = <0x0 0x0 0x0 0xffe0 0x0 0x1>, +<0x0 0x2 0x0 0xffe2 0x0 0x1>, +<0x1 0x0 0x0 0xffe9 0x0 0x1>, +<0x1 0x2 0x0 0xffeb 0x0 0x1>; + + r5f@0 { compatible = "xlnx,zynqmp-r5f"; - power-domains = <_firmware PD_RPU_0>; + reg = <0x0 0x0 0x0 0x1>, <0x0 0x2 0x0 0x1>; + reg-names = "atcm0", "btcm0"; + power-domains = <_firmware PD_RPU_0>, + <_firmware PD_R5_0_ATCM>, + <_firmware PD_R5_0_BTCM>; memory-region = <_0_fw_image>; }; - r5f-1 { + r5f@1 { compatible = "xlnx,zynqmp-r5f"; - power-domains = <_firmware PD_RPU_1>; + reg = <0x1 0x0 0x0 0x1>, <0x1 0x2 0x0 0x1>; + reg-names = "atcm0", "btcm0"; + power-domains = <_firmware PD_RPU_1>, + <_firmware PD_R5_1_ATCM>, +
[PATCH v14 2/4] dt-bindings: remoteproc: add Tightly Coupled Memory (TCM) bindings
From: Radhey Shyam Pandey Introduce bindings for TCM memory address space on AMD-xilinx Zynq UltraScale+ platform. It will help in defining TCM in device-tree and make it's access platform agnostic and data-driven. Tightly-coupled memories(TCMs) are low-latency memory that provides predictable instruction execution and predictable data load/store timing. Each Cortex-R5F processor contains two 64-bit wide 64 KB memory banks on the ATCM and BTCM ports, for a total of 128 KB of memory. The TCM resources(reg, reg-names and power-domain) are documented for each TCM in the R5 node. The reg and reg-names are made as required properties as we don't want to hardcode TCM addresses for future platforms and for zu+ legacy implementation will ensure that the old dts w/o reg/reg-names works and stable ABI is maintained. It also extends the examples for TCM split and lockstep modes. Signed-off-by: Radhey Shyam Pandey Signed-off-by: Tanmay Shah --- Changes in v14: - Remove previous RB tag - Add xlnx,tcm-mode property - Add Versal platform support - Add Versal-NET platform support .../remoteproc/xlnx,zynqmp-r5fss.yaml | 279 -- 1 file changed, 257 insertions(+), 22 deletions(-) diff --git a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml index 78aac69f1060..6f13da11f593 100644 --- a/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml +++ b/Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml @@ -18,11 +18,26 @@ description: | properties: compatible: -const: xlnx,zynqmp-r5fss +enum: + - xlnx,zynqmp-r5fss + - xlnx,versal-r5fss + - xlnx,versal-net-r52fss + + "#address-cells": +const: 2 + + "#size-cells": +const: 2 + + ranges: +description: | + Standard ranges definition providing address translations for + local R5F TCM address spaces to bus addresses. xlnx,cluster-mode: $ref: /schemas/types.yaml#/definitions/uint32 enum: [0, 1, 2] +default: 1 description: | The RPU MPCore can operate in split mode (Dual-processor performance), Safety lock-step mode(Both RPU cores execute the same code in lock-step, @@ -36,8 +51,16 @@ properties: 1: lockstep mode (default) 2: single cpu mode + xlnx,tcm-mode: +$ref: /schemas/types.yaml#/definitions/uint32 +enum: [0, 1] +description: | + Configure RPU TCM + 0: split mode + 1: lockstep mode + patternProperties: - "^r5f-[a-f0-9]+$": + "^r(.*)@[0-9a-f]+$": type: object description: | The RPU is located in the Low Power Domain of the Processor Subsystem. @@ -52,10 +75,22 @@ patternProperties: properties: compatible: -const: xlnx,zynqmp-r5f +enum: + - xlnx,zynqmp-r5f + - xlnx,versal-r5f + - xlnx,versal-net-r52f + + reg: +minItems: 1 +maxItems: 4 + + reg-names: +minItems: 1 +maxItems: 4 power-domains: -maxItems: 1 +minItems: 2 +maxItems: 5 mboxes: minItems: 1 @@ -101,35 +136,235 @@ patternProperties: required: - compatible + - reg + - reg-names - power-domains -unevaluatedProperties: false - required: - compatible + - "#address-cells" + - "#size-cells" + - ranges + +allOf: + - if: + properties: +compatible: + contains: +enum: + - xlnx,versal-net-r52fss +then: + properties: +xlnx,tcm-mode: false + + patternProperties: +"^r52f@[0-9a-f]+$": + type: object + + properties: +reg: + minItems: 1 + items: +- description: ATCM internal memory +- description: BTCM internal memory +- description: CTCM internal memory + +reg-names: + minItems: 1 + items: +- const: atcm0 +- const: btcm0 +- const: ctcm0 + +power-domains: + minItems: 2 + items: +- description: RPU core power domain +- description: ATCM power domain +- description: BTCM power domain +- description: CTCM power domain + + - if: + properties: +compatible: + contains: +enum: + - xlnx,zynqmp-r5fss + - xlnx,versal-r5fss +then: + if: +properties: + xlnx,cluster-mode: +enum: [1, 2] + then: +properties: + xlnx,tcm-mode: +enum: [1] + +patternProperties: + "^r5f@[0-9a-f]+$": +type: object + +properties: + reg: +minItems: 1 +items: + - description: ATCM
[PATCH v14 0/4] add zynqmp TCM bindings
Tightly-Coupled Memories(TCMs) are low-latency memory that provides predictable instruction execution and predictable data load/store timing. Each Cortex-R5F processor contains exclusive two 64 KB memory banks on the ATCM and BTCM ports, for a total of 128 KB of memory. In lockstep mode, both 128KB memory is accessible to the cluster. As per ZynqMP Ultrascale+ Technical Reference Manual UG1085, following is address space of TCM memory. The bindings in this patch series introduces properties to accommodate following address space with address translation between Linux and Cortex-R5 views. | | | | | --- | --- | --- | | *Mode*| *R5 View* | *Linux view* | Notes | | *Split Mode* | *start addr*| *start addr* | | | R5_0 ATCM (64 KB) | 0x_ | 0xFFE0_ | | | R5_0 BTCM (64 KB) | 0x0002_ | 0xFFE2_ | | | R5_1 ATCM (64 KB) | 0x_ | 0xFFE9_ | alias of 0xFFE1_ | | R5_1 BTCM (64 KB) | 0x0002_ | 0xFFEB_ | alias of 0xFFE3_ | | ___ | ___ |___ | | | *Lockstep Mode*| | | | | R5_0 ATCM (128 KB) | 0x_ | 0xFFE0_ | | | R5_0 BTCM (128 KB) | 0x0002_ | 0xFFE2_ | | References: UG1085 TCM address space: https://docs.xilinx.com/r/en-US/ug1085-zynq-ultrascale-trm/Tightly-Coupled-Memory-Address-Map --- prerequisite-patch-link: https://lore.kernel.org/all/d4556268-8274-4089-949f-3b97d6779...@gmail.com/ Base Branch: 6.9.rc2 Changes in v14: - Add xlnx,tcm-mode property and use it for TCM configuration - Add Versal and Versal-NET platform support - Maintain backward compatibility for ZynqMP platform and use hardcode TCM addresses Changes in v13: - Have power-domains property for lockstep case instead of keeping it flexible. - Add "items:" list in power-domains property Changes in v12: - add "reg", "reg-names" and "power-domains" in pattern properties - add "reg" and "reg-names" in required list - keep "power-domains" in required list as it was before the change Changes in v11: - Fix yamllint warning and reduce indentation as needed - Remove redundant initialization of the variable - Return correct error code if memory allocation failed Changs in v10: - Add new patch (1/4) to series that changes hardcode TCM addresses in lockstep mode and removes separate handling of TCM in lockstep and split mode - modify number of "reg", "reg-names" and "power-domains" entries based on cluster mode - Add extra optional atcm and btcm in "reg" property for lockstep mode - Add "reg-names" for extra optional atcm and btcm for lockstep mode - Drop previous Ack as bindings has new change - Add individual tcm regions via "reg" and "reg-names" for lockstep mode - Add each tcm's power-domains in lockstep mode - Drop previous Ack as new change in dts patchset - Remove redundant changes in driver to handle TCM in lockstep mode Changes in v9: - Fix rproc lockstep dts - Introduce new API to request and release core1 TCM power-domains in lockstep mode. This will be used during prepare -> add_tcm_banks callback to enable TCM in lockstep mode. - Parse TCM from device-tree in lockstep mode and split mode in uniform way. - Fix TCM representation in device-tree in lockstep mode. - Fix comments as suggested Changes in v8: - Remove use of pm_domains framework - Remove checking of pm_domain_id validation to power on/off tcm - Remove spurious change - parse power-domains property from device-tree and use EEMI calls to power on/off TCM instead of using pm domains framework Changes in v7: - %s/pm_dev1/pm_dev_core0/r - %s/pm_dev_link1/pm_dev_core0_link/r - %s/pm_dev2/pm_dev_core1/r - %s/pm_dev_link2/pm_dev_core1_link/r - remove pm_domain_id check to move next patch - add comment about how 1st entry in pm domain list is used - fix loop when jump to fail_add_pm_domains loop - move checking of pm_domain_id from previous patch - fix mem_bank_data memory allocation Changes in v6: - Introduce new node entry for r5f cluster split mode dts and keep it disabled by default. - Keep remoteproc lockstep mode enabled by default to maintian back compatibility. - Enable split mode only for zcu102 board to demo split mode use - Remove spurious change - Handle errors in add_pm_domains function - Remove redundant code to handle errors from remove_pm_domains - Missing . at the end of the commit message - remove redundant initialization of variables - remove fail_tcm label and relevant code to free memory acquired using devm_* API. As this will be freed when device free it - add extra check to see if "reg" property is supported or not Changes in v5: - maintain Rob's Ack on bindings patch as no changes in bindings -
[PATCH v14 1/4] remoteproc: zynqmp: fix lockstep mode memory region
In lockstep mode, r5 core0 uses TCM of R5 core1. Following is lockstep mode memory region as per hardware reference manual. | *TCM* | *R5 View* | *Linux view* | | R5_0 ATCM (128 KB) | 0x_ | 0xFFE0_ | | R5_0 BTCM (128 KB) | 0x0002_ | 0xFFE2_ | However, driver shouldn't model it as above because R5 core0 TCM and core1 TCM has different power-domains mapped to it. Hence, TCM address space in lockstep mode should be modeled as 64KB regions only where each region has its own power-domain as following: | *TCM* | *R5 View* | *Linux view* | | R5_0 ATCM0 (64 KB) | 0x_ | 0xFFE0_ | | R5_0 BTCM0 (64 KB) | 0x0002_ | 0xFFE2_ | | R5_0 ATCM1 (64 KB) | 0x0001_ | 0xFFE1_ | | R5_0 BTCM1 (64 KB) | 0x0003_ | 0xFFE3_ | This makes driver maintanance easy and makes design robust for future platorms as well. Signed-off-by: Tanmay Shah --- drivers/remoteproc/xlnx_r5_remoteproc.c | 146 ++-- 1 file changed, 12 insertions(+), 134 deletions(-) diff --git a/drivers/remoteproc/xlnx_r5_remoteproc.c b/drivers/remoteproc/xlnx_r5_remoteproc.c index cfbd97b89c26..0f942440b4e2 100644 --- a/drivers/remoteproc/xlnx_r5_remoteproc.c +++ b/drivers/remoteproc/xlnx_r5_remoteproc.c @@ -84,12 +84,12 @@ static const struct mem_bank_data zynqmp_tcm_banks_split[] = { {0xffebUL, 0x2, 0x1UL, PD_R5_1_BTCM, "btcm1"}, }; -/* In lockstep mode cluster combines each 64KB TCM and makes 128KB TCM */ +/* In lockstep mode cluster uses each 64KB TCM from second core as well */ static const struct mem_bank_data zynqmp_tcm_banks_lockstep[] = { - {0xffe0UL, 0x0, 0x2UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 128KB each */ - {0xffe2UL, 0x2, 0x2UL, PD_R5_0_BTCM, "btcm0"}, - {0, 0, 0, PD_R5_1_ATCM, ""}, - {0, 0, 0, PD_R5_1_BTCM, ""}, + {0xffe0UL, 0x0, 0x1UL, PD_R5_0_ATCM, "atcm0"}, /* TCM 64KB each */ + {0xffe2UL, 0x2, 0x1UL, PD_R5_0_BTCM, "btcm0"}, + {0xffe1UL, 0x1, 0x1UL, PD_R5_1_ATCM, "atcm1"}, + {0xffe3UL, 0x3, 0x1UL, PD_R5_1_BTCM, "btcm1"}, }; /** @@ -541,14 +541,14 @@ static int tcm_mem_map(struct rproc *rproc, } /* - * add_tcm_carveout_split_mode() + * add_tcm_banks() * @rproc: single R5 core's corresponding rproc instance * - * allocate and add remoteproc carveout for TCM memory in split mode + * allocate and add remoteproc carveout for TCM memory * * return 0 on success, otherwise non-zero value on failure */ -static int add_tcm_carveout_split_mode(struct rproc *rproc) +static int add_tcm_banks(struct rproc *rproc) { struct rproc_mem_entry *rproc_mem; struct zynqmp_r5_core *r5_core; @@ -581,10 +581,10 @@ static int add_tcm_carveout_split_mode(struct rproc *rproc) ZYNQMP_PM_REQUEST_ACK_BLOCKING); if (ret < 0) { dev_err(dev, "failed to turn on TCM 0x%x", pm_domain_id); - goto release_tcm_split; + goto release_tcm; } - dev_dbg(dev, "TCM carveout split mode %s addr=%llx, da=0x%x, size=0x%lx", + dev_dbg(dev, "TCM carveout %s addr=%llx, da=0x%x, size=0x%lx", bank_name, bank_addr, da, bank_size); rproc_mem = rproc_mem_entry_init(dev, NULL, bank_addr, @@ -594,99 +594,16 @@ static int add_tcm_carveout_split_mode(struct rproc *rproc) if (!rproc_mem) { ret = -ENOMEM; zynqmp_pm_release_node(pm_domain_id); - goto release_tcm_split; - } - - rproc_add_carveout(rproc, rproc_mem); - rproc_coredump_add_segment(rproc, da, bank_size); - } - - return 0; - -release_tcm_split: - /* If failed, Turn off all TCM banks turned on before */ - for (i--; i >= 0; i--) { - pm_domain_id = r5_core->tcm_banks[i]->pm_domain_id; - zynqmp_pm_release_node(pm_domain_id); - } - return ret; -} - -/* - * add_tcm_carveout_lockstep_mode() - * @rproc: single R5 core's corresponding rproc instance - * - * allocate and add remoteproc carveout for TCM memory in lockstep mode - * - * return 0 on success, otherwise non-zero value on failure - */ -static int add_tcm_carveout_lockstep_mode(struct rproc *rproc) -{ - struct rproc_mem_entry *rproc_mem; - struct zynqmp_r5_core *r5_core; - int i, num_banks, ret; - phys_addr_t bank_addr; - size_t bank_size = 0; - struct device *dev; - u32 pm_domain_id; - char *bank_name; - u32 da; - - r5_core = rproc->priv; - dev = r5_core->dev; - - /* Go through zynqmp banks for r5 node */ - num_banks = r5_core->tcm_bank_count; - - /* -* In lockstep mode, TCM is contiguous memory
Re: [PATCH 2/3] kernel/pid: Remove default pid_max value
On Mon, 8 Apr 2024 16:58:18 +0200 Michal Koutný wrote: > The kernel provides mechanisms, while it should not imply policies -- > default pid_max seems to be an example of the policy that does not fit > all. At the same time pid_max must have some value assigned, so use the > end of the allowed range -- pid_max_max. > > This change thus increases initial pid_max from 32k to 4M (x86_64 > defconfig). That seems like a large change. It isn't clear why we'd want to merge this patchset. Does it improve anyone's life and if so, how?
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 17:17, Ondřej Jirman wrote: > On Mon, Apr 08, 2024 at 03:27:00PM GMT, Krzysztof Kozlowski wrote: >> On 08/04/2024 14:48, Ondřej Jirman wrote: >>> Yeah, I understand where the confusion is. The driver is not for anx7688 >>> chip >>> really. The driver is named anx7688, but that's mostly a historical >>> accident at >>> this point. >>> >>> I guess there can be a driver for anx7688 chip that can directly use the >>> chip's >>> resources from the host by directly manipulating its registers and >>> implementing >>> type-c functionality via eg. Linux's TCPM or TCPCI stack, etc. (eg. like >>> fusb302 driver, or various tcpci subdrivers). >>> >>> But in this case the chip is driven by an optional on-chip microcontroller's >>> firmware and *this driver* is specifically for *the Type-C port on >>> Pinephone* >> >> We do not talk here about the driver, but bindings, so hardware. > > Got it. Bindings should be the same regardless of what driver would be used, > whether this OCM based one, or some future one based on the above mentioned > TCPCI in-kernel implementation. Hardware is the same in both cases. > > Just trying to imagine how to actually solve the issues... > > Basic thing with the I2C regulator thing is that needs to be enabled as long > as anx7688 needs to communicate over I2C. Other user of this power rail is > touchscreen controller for its normal power supply, and it needs to be able > to disable it during system suspend. This does not look like anything specific to this particular device... but even without this, please think how you want it to be solved. Same supply which has to be on always, because your anx7688 can talk over I2C, and in the same time sometimes off, so touchscreen can be shutdown. If this is regulator for the I2C bus, then I think we already had this discussion some time ago. I think it is not a property of the I2C device, but the controller. > > Now for things to not fail during suspend/resume based on PM callbacks > invocation order, anx7688 driver needs to enable this regulator too, as long > as it needs it. No, the I2C bus driver needs to manage it. Not one individual I2C device. Again, why anx7688 is specific? If you next phone has anx8867, using different driver, you also add there i2c-supply? And if it is nxp,ptn5100 as well? > > I can put bus-supply to I2C controller node, and read it from the ANX7688 > driver > I guess, by going up a DT node. Whether that's going to be acceptable, I don't > know. > > > VCONN regulator I don't know where else to put either. It doesn't seem to > belong > anywhere. It's not something directly connected to Type-C connector, so > not part of connector bindings, and there's nothing else I can see, other > than anx7688 device which needs it for core functionality. That sounds like a GPIO, not regulator. anx7688 has GPIOs, right? On Pinephone they go to regulator, but on FooPhone also using anx7688 they go somewhere else, so why this anx7688 assumes this is a regulator? You need to properly represent the hardware, not bend it to your one use-case for one hardware. > > ANX7688 chip desing doesn't have integrated VCONN mosfet switches so it always > needs external supply + switches that are controlled by the chip itself. > There's > no sensible design where someone would not want this and the driver needs > to get this regulator reference from somewhere. The switches are sort of an > extension of the chip. Best regards, Krzysztof
Re: [PATCH] dt-bindings: iio: imu: mpu6050: Improve i2c-gate disallow list
Hello Luca, good catch, thanks for the patch! Acked-by: Jean-Baptiste Maneyrol From: Luca Weiss Sent: Monday, April 8, 2024 18:34 To: ~postmarketos/upstream...@lists.sr.ht <~postmarketos/upstream...@lists.sr.ht>; phone-de...@vger.kernel.org ; Jonathan Cameron ; Lars-Peter Clausen ; Rob Herring ; Krzysztof Kozlowski ; Conor Dooley ; Jean-Baptiste Maneyrol Cc: linux-...@vger.kernel.org ; devicet...@vger.kernel.org ; linux-kernel@vger.kernel.org ; Luca Weiss Subject: [PATCH] dt-bindings: iio: imu: mpu6050: Improve i2c-gate disallow list This Message Is From an Untrusted Sender You have not previously corresponded with this sender. Before all supported sensors except for MPU{9150,9250,9255} were not allowed to use i2c-gate in the bindings which excluded quite a few supported sensors where this functionality is supported. Switch the list of sensors to ones where the Linux driver explicitly disallows support for the auxiliary bus ("inv_mpu_i2c_aux_bus"). Since the driver is also based on "default: return true" this should scale better into the future. Signed-off-by: Luca Weiss --- This fixes dt validation error on qcom-msm8974-lge-nexus5-hammerhead which uses mpu6515 arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dtb: mpu6515@68: i2c-gate: False schema does not allow {'#address-cells': [[1]], '#size-cells': [[0]], 'ak8963@f': {'compatible': ['asahi-kasei,ak8963'], 'reg': [[15]], 'gpios': [[40, 67, 0]], 'vid-supply': [[50]], 'vdd-supply': [[49]]}, 'bmp280@76': {'compatible': ['bosch,bmp280'], 'reg': [[118]], 'vdda-supply': [[50]], 'vddd-supply': [[49]]}} from schema $id: https://urldefense.com/v3/__http://devicetree.org/schemas/iio/imu/invensense,mpu6050.yaml*__;Iw!!FtrhtPsWDhZ6tw!Athn1pwCL_LPpZ97exHEFSkirApIqFF2ISY01IuyHtFBxpbPkcPWh_FmzB_TiCzb8uv1HO0AHY4IeIlv1-o$[devicetree[.]org] --- .../devicetree/bindings/iio/imu/invensense,mpu6050.yaml | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml b/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml index 297b8a1a7ffb..587ff2bced2d 100644 --- a/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml +++ b/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml @@ -62,14 +62,15 @@ properties: allOf: - $ref: /schemas/spi/spi-peripheral-props.yaml# - if: - not: -properties: - compatible: -contains: - enum: -- invensense,mpu9150 -- invensense,mpu9250 -- invensense,mpu9255 + properties: +compatible: + contains: +enum: + - invensense,iam20680 + - invensense,icm20602 + - invensense,icm20608 + - invensense,icm20609 + - invensense,icm20689 then: properties: i2c-gate: false --- base-commit: 8568bb2ccc278f344e6ac44af6ed010a90aa88dc change-id: 20240408-mpu6050-i2c-gate-4ea473e492f4 Best regards, -- Luca Weiss
[PATCH 2/2] ARM: dts: qcom: msm8974: Use proper compatible for APCS syscon
Use the apcs-kpss-global compatible for the APCS global mailbox block found on this SoC. This also resolves a dt-binding checker warning: arch/arm/boot/dts/qcom/qcom-msm8974pro-fairphone-fp2.dtb: syscon@f9011000: compatible: 'anyOf' conditional failed, one must be fixed: ['syscon'] is too short 'syscon' is not one of ['allwinner,sun8i-a83t-system-controller', 'allwinner,sun8i-h3-system-controller', 'allwinner,sun8i-v3s-system-controller', 'allwinner,sun50i-a64-system-controller', 'amd,pensando-elba-syscon', 'brcm,cru-clkset', 'freecom,fsg-cs2-system-controller', 'fsl,imx93-aonmix-ns-syscfg', 'fsl,imx93-wakeupmix-syscfg', 'hisilicon,dsa-subctrl', 'hisilicon,hi6220-sramctrl', 'hisilicon,pcie-sas-subctrl', 'hisilicon,peri-subctrl', 'hpe,gxp-sysreg', 'intel,lgm-syscon', 'loongson,ls1b-syscon', 'loongson,ls1c-syscon', 'marvell,armada-3700-usb2-host-misc', 'mediatek,mt8135-pctl-a-syscfg', 'mediatek,mt8135-pctl-b-syscfg', 'mediatek,mt8365-syscfg', 'microchip,lan966x-cpu-syscon', 'microchip,sparx5-cpu-syscon', 'mstar,msc313-pmsleep', 'nuvoton,ma35d1-sys', 'nuvoton,wpcm450-shm', 'rockchip,px30-qos', 'rockchip,rk3036-qos', 'rockchip,rk3066-qos', 'rockchip,rk3128-qos', 'rockchip,rk3228-qos', 'rockchip,rk3288-qos', 'rockchip,rk3368-qos', 'rockchip,rk3399-qos', 'rockchip,rk356 8-qos', 'rockchip,rk3588-qos', 'rockchip,rv1126-qos', 'starfive,jh7100-sysmain', 'ti,am62-usb-phy-ctrl', 'ti,am654-dss-oldi-io-ctrl', 'ti,am654-serdes-ctrl', 'ti,j784s4-pcie-ctrl'] from schema $id: http://devicetree.org/schemas/mfd/syscon.yaml# Signed-off-by: Luca Weiss --- arch/arm/boot/dts/qcom/qcom-msm8974.dtsi | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/arm/boot/dts/qcom/qcom-msm8974.dtsi b/arch/arm/boot/dts/qcom/qcom-msm8974.dtsi index 233d9bf42298..7e0224006b1f 100644 --- a/arch/arm/boot/dts/qcom/qcom-msm8974.dtsi +++ b/arch/arm/boot/dts/qcom/qcom-msm8974.dtsi @@ -341,9 +341,11 @@ intc: interrupt-controller@f900 { <0xf9002000 0x1000>; }; - apcs: syscon@f9011000 { - compatible = "syscon"; + apcs: mailbox@f9011000 { + compatible = "qcom,msm8974-apcs-kpss-global", +"qcom,msm8994-apcs-kpss-global", "syscon"; reg = <0xf9011000 0x1000>; + #mbox-cells = <1>; }; saw_l2: power-manager@f9012000 { -- 2.44.0
[PATCH 1/2] dt-bindings: mailbox: qcom: Add MSM8974 APCS compatible
Add compatible for the Qualcomm MSM8974 APCS block. Signed-off-by: Luca Weiss --- Documentation/devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml b/Documentation/devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml index 79eb523b8436..982c741e6225 100644 --- a/Documentation/devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml +++ b/Documentation/devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml @@ -30,6 +30,7 @@ properties: - const: syscon - items: - enum: + - qcom,msm8974-apcs-kpss-global - qcom,msm8976-apcs-kpss-global - const: qcom,msm8994-apcs-kpss-global - const: syscon -- 2.44.0
[PATCH 0/2] Fix msm8974 apcs syscon compatible
Finally fix a warning about the apcs-global syscon used on msm8974 that has been around forever. Signed-off-by: Luca Weiss --- Luca Weiss (2): dt-bindings: mailbox: qcom: Add MSM8974 APCS compatible ARM: dts: qcom: msm8974: Use proper compatible for APCS syscon .../devicetree/bindings/mailbox/qcom,apcs-kpss-global.yaml | 1 + arch/arm/boot/dts/qcom/qcom-msm8974.dtsi| 6 -- 2 files changed, 5 insertions(+), 2 deletions(-) --- base-commit: 8568bb2ccc278f344e6ac44af6ed010a90aa88dc change-id: 20240408-msm8974-apcs-b7765f6bab99 Best regards, -- Luca Weiss
Re: [PATCH fs/proc/bootconfig] remove redundant comments from /proc/bootconfig
On Sat, Apr 06, 2024 at 11:11:08AM +0900, Masami Hiramatsu wrote: > On Thu, 4 Apr 2024 21:25:41 -0700 > "Paul E. McKenney" wrote: > > > On Fri, Apr 05, 2024 at 11:57:45AM +0900, Masami Hiramatsu wrote: > > > On Fri, 5 Apr 2024 10:23:24 +0900 > > > Masami Hiramatsu (Google) wrote: > > > > > > > On Thu, 4 Apr 2024 10:43:14 -0700 > > > > "Paul E. McKenney" wrote: > > > > > > > > > On Thu, Apr 04, 2024 at 08:55:22AM +0900, Masami Hiramatsu wrote: > > > > > > On Wed, 3 Apr 2024 12:16:28 -0700 > > > > > > "Paul E. McKenney" wrote: > > > > > > > > > > > > > commit 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > comment to > > > > > > > /proc/bootconfig") adds bootloader argument comments into > > > > > > > /proc/bootconfig. > > > > > > > > > > > > > > /proc/bootconfig shows boot_command_line[] multiple times > > > > > > > following > > > > > > > every xbc key value pair, that's duplicated and not necessary. > > > > > > > Remove redundant ones. > > > > > > > > > > > > > > Output before and after the fix is like: > > > > > > > key1 = value1 > > > > > > > *bootloader argument comments* > > > > > > > key2 = value2 > > > > > > > *bootloader argument comments* > > > > > > > key3 = value3 > > > > > > > *bootloader argument comments* > > > > > > > ... > > > > > > > > > > > > > > key1 = value1 > > > > > > > key2 = value2 > > > > > > > key3 = value3 > > > > > > > *bootloader argument comments* > > > > > > > ... > > > > > > > > > > > > > > Fixes: 717c7c894d4b ("fs/proc: Add boot loader arguments as > > > > > > > comment to /proc/bootconfig") > > > > > > > Signed-off-by: Zhenhua Huang > > > > > > > Signed-off-by: Paul E. McKenney > > > > > > > Cc: Masami Hiramatsu > > > > > > > Cc: > > > > > > > Cc: > > > > > > > > > > > > OOps, good catch! Let me pick it. > > > > > > > > > > > > Acked-by: Masami Hiramatsu (Google) > > > > > > > > > > Thank you, and I have applied your ack and pulled this into its own > > > > > bootconfig.2024.04.04a. > > > > > > > > > > My guess is that you will push this via your own tree, and so I will > > > > > drop my copy as soon as yours hits -next. > > > > > > > > Thanks! I would like to make PR this soon as bootconfig fixes for > > > > v6.9-rc2. > > > > > > Hmm I found that this always shows the command line comment in > > > /proc/bootconfig even without "bootconfig" option. > > > I think that is easier for user-tools but changes the behavior and > > > a bit redundant. > > > > > > We should skip showing this original argument comment if bootconfig is > > > not initialized (no "bootconfig" in cmdline) as it is now. > > > > So something like this folded into that patch? > > Hm, I expected just checking it in the loop as below. > > > diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c > index e5635a6b127b..98e0780f7e07 100644 > --- a/fs/proc/bootconfig.c > +++ b/fs/proc/bootconfig.c > @@ -27,6 +27,7 @@ static int __init copy_xbc_key_value_list(char *dst, size_t > size) > { > struct xbc_node *leaf, *vnode; > char *key, *end = dst + size; > + bool empty = true; > const char *val; > char q; > int ret = 0; > @@ -62,8 +63,9 @@ static int __init copy_xbc_key_value_list(char *dst, size_t > size) > break; > dst += ret; > } > + empty = false; > } > - if (ret >= 0 && boot_command_line[0]) { > + if (!empty && ret >= 0 && boot_command_line[0]) { > ret = snprintf(dst, rest(dst, end), "# Parameters from > bootloader:\n# %s\n", > boot_command_line); > if (ret > 0) > > > > The difference is checking "bootconfig" cmdline option or checking > the "bootconfig" is actually empty. So the behaviors are different > when the "bootconfig" is specified but there is no bootconfig data. Ah, understood, the point is to avoid the comment in cases where its content would be identical to /proc/cmdline. > Another idea is to check whether the cmdline is actually updated by > bootconfig and show original one only if it is updated. > (I think this fits the purpose of the original patch better.) > > > diff --git a/fs/proc/bootconfig.c b/fs/proc/bootconfig.c > index e5635a6b127b..95d6a231210c 100644 > --- a/fs/proc/bootconfig.c > +++ b/fs/proc/bootconfig.c > @@ -10,6 +10,9 @@ > #include > #include > > +/* defined in main/init.c */ > +bool __init cmdline_has_extra_options(void); > + > static char *saved_boot_config; > > static int boot_config_proc_show(struct seq_file *m, void *v) > @@ -63,7 +66,7 @@ static int __init copy_xbc_key_value_list(char *dst, size_t > size) > dst += ret; > } > } > - if (ret >= 0 &&
Re: [PATCH] drivers/virtio: delayed configuration descriptor flags
On 4/8/24 10:02, ni.liqiang wrote: > In our testing of the virtio hardware accelerator, we found that > configuring the flags of the descriptor after addr and len, > as implemented in DPDK, seems to be more friendly to the hardware. please describe in detail "friendly to the hardware" means .. > In our Virtio hardware implementation tests, using the default > open-source code, the hardware's bulk reads ensure performance > but correctness is compromised. If we refer to the implementation code > of DPDK, placing the flags configuration of the descriptor > after addr and len, virtio backend can function properly based on > our hardware accelerator. you are not specifying how the correctness is compromised .. again what the "properly" mean here ? what is the exact failure that you are seeing ? please document .. > I am somewhat puzzled by this. From a software process perspective, > it seems that there should be no difference whether > the flags configuration of the descriptor is before or after addr and len. > However, this is not the case according to experimental test results. > We would like to know if such a change in the configuration order > is reasonable and acceptable? where are the experimental results ? any particular reason those results are not documented here ? > > Thanks. > > Signed-off-by: ni.liqiang > Reviewed-by: jin.qi > Tested-by: jin.qi > Cc: ni.liqiang > --- > drivers/virtio/virtio_ring.c | 9 + > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c > index 6f7e5010a673..bea2c2fb084e 100644 > --- a/drivers/virtio/virtio_ring.c > +++ b/drivers/virtio/virtio_ring.c > @@ -1472,15 +1472,16 @@ static inline int virtqueue_add_packed(struct > virtqueue *_vq, > flags = cpu_to_le16(vq->packed.avail_used_flags | > (++c == total_sg ? 0 : VRING_DESC_F_NEXT) | > (n < out_sgs ? 0 : VRING_DESC_F_WRITE)); > - if (i == head) > - head_flags = flags; > - else > - desc[i].flags = flags; > > desc[i].addr = cpu_to_le64(addr); > desc[i].len = cpu_to_le32(sg->length); > desc[i].id = cpu_to_le16(id); > > + if (i == head) > + head_flags = flags; > + else > + desc[i].flags = flags; > + > if (unlikely(vq->use_dma_api)) { > vq->packed.desc_extra[curr].addr = addr; > vq->packed.desc_extra[curr].len = sg->length; -ck
Re: [PATCH 1/2] dt-bindings: pinctrl: qcom,pmic-gpio: Allow gpio-hog nodes
On Montag, 8. April 2024 19:26:49 CEST Konrad Dybcio wrote: > > On 4/8/24 18:39, Luca Weiss wrote: > > Allow specifying a GPIO hog, as already used on > > qcom-msm8974-lge-nexus5-hammerhead.dts. > > > > Signed-off-by: Luca Weiss > > --- > > .../devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml | 12 > > > > 1 file changed, 12 insertions(+) > > > > diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml > > b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml > > index a786357ed1af..510a05369dbb 100644 > > --- a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml > > +++ b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml > > @@ -424,6 +424,10 @@ patternProperties: > > $ref: "#/$defs/qcom-pmic-gpio-state" > > additionalProperties: false > > > > + "^(hog-[0-9]+|.+-hog(-[0-9]+)?)$": > > I see a couple bindings do this, but I'm not sure if we want two > allow two styles for no reason.. Rob? This regex is actually from the gpio-hog.yaml base https://github.com/devicetree-org/dt-schema/blob/main/dtschema/schemas/gpio/gpio-hog.yaml#L23 Why it's made this way I cannot tell you, but I didn't want to 'artifically' restrict the pattern for qcom,pmic-gpio. > > Konrad >
Re: [PATCH v10 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
On Mon, 08 Apr 2024 07:20:23 -0500, Huang, Kai wrote: --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h @@ -28,6 +28,10 @@ static inline int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_recl static inline void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg) { } static inline void sgx_cgroup_init(void) { } + +static inline void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm) +{ +} #else struct sgx_cgroup { struct misc_cg *cg; @@ -65,6 +69,9 @@ static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg) int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim reclaim); void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg); +bool sgx_cgroup_lru_empty(struct misc_cg *root); +bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg); +void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct *charge_mm); Seems the decision to choose whether to implement a stub function for some function is quite random to me. My impression is people in general don't like #ifdef in the C file but prefer to implementing it in the header in some helper function. I guess you might want to just implement a stub function for each of the 3 functions exposed, so that we can eliminate some #ifdefs in the sgx/main.c (see below). void sgx_cgroup_init(void); #endif diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c index 7f92455d957d..68f28ff2d5ef 100644 --- a/arch/x86/kernel/cpu/sgx/main.c +++ b/arch/x86/kernel/cpu/sgx/main.c @@ -34,6 +34,16 @@ static struct sgx_epc_lru_list sgx_global_lru; static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_page) { +#ifdef CONFIG_CGROUP_SGX_EPC + if (epc_page->sgx_cg) + return _page->sgx_cg->lru; + + /* +* This should not happen when cgroup is enabled: Every page belongs +* to a cgroup, or the root by default. +*/ + WARN_ON_ONCE(1); In the case MISC cgroup is enabled in Kconfig but disabled by command line, I think this becomes legal now? I'm not sure actually. Never saw the warning even if I set "cgroup_disable=misc" in commandlibe. Tried both v1 and v2. Anyway, I think we can remove this warning and we handle the NULL sgx_cg now. +#endif return _global_lru; } @@ -42,7 +52,11 @@ static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page *epc_pag */ static inline bool sgx_can_reclaim(void) { +#ifdef CONFIG_CGROUP_SGX_EPC + return !sgx_cgroup_lru_empty(misc_cg_root()); +#else return !list_empty(_global_lru.reclaimable); +#endif } Here you are using #ifdef CONFIG_CGRUP_SGX_EPC, but ... static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); @@ -404,7 +418,10 @@ static bool sgx_should_reclaim(unsigned long watermark) static void sgx_reclaim_pages_global(struct mm_struct *charge_mm) { - sgx_reclaim_pages(_global_lru, charge_mm); + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) + sgx_cgroup_reclaim_pages(misc_cg_root(), charge_mm); + else + sgx_reclaim_pages(_global_lru, charge_mm); } ... here you are using IS_ENABLED(CONFIG_CGROUP_SGX_EPC). Any reason they are not consistent? I was hesitant to expose sgx_global_lru needed for implementing sgx_cgroup_lru_empty() stub which would also be a random decision and overkill to just remove couple of #ifdefs in short functions. Also, in the case where MISC cgroup is disabled via commandline, I think it won't work, because misc_cg_root() should be NULL in this case while IS_ENABLED(CONFIG_CGROUP_SGX_EPC) is true. The misc root cgroup is a static similar to sgx_cg_root. So misc_cg_root() won't be NULL However, based on how css_misc() was check NULL, I suppose sgx_get_current_cg() may be NULL when cgroup is disabled (again not 100% sure but we handle it anyway) /* @@ -414,6 +431,16 @@ static void sgx_reclaim_pages_global(struct mm_struct *charge_mm) */ void sgx_reclaim_direct(void) { +#ifdef CONFIG_CGROUP_SGX_EPC + struct sgx_cgroup *sgx_cg = sgx_get_current_cg(); + + /* Make sure there are some free pages at cgroup level */ + if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg)) { + sgx_cgroup_reclaim_pages(sgx_cg->cg, current->mm); + sgx_put_cg(sgx_cg); + } +#endif This #ifdef CONFIG_CGROUP_SGX_EPC can be removed if we implement a stub function for sgx_cgroup_should_reclaim(). Yes. + /* Make sure there are some free pages at global level */ if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) sgx_reclaim_pages_global(current->mm); } Thanks Haitao
[PATCH v2] rethook: Remove warning messages printed for finding return address of a frame.
The function rethook_find_ret_addr() prints a warning message and returns 0 when the target task is running and is not the "current" task in order to prevent the incorrect return address, although it still may return an incorrect address. However, the warning message turns into noise when BPF profiling programs call bpf_get_task_stack() on running tasks in a firm with a large number of hosts. The callers should be aware and willing to take the risk of receiving an incorrect return address from a task that is currently running other than the "current" one. A warning is not needed here as the callers are intent on it. Acked-by: Andrii Nakryiko Acked-by: John Fastabend Signed-off-by: Kui-Feng Lee --- Changes from v1: - Rephrased the commit log. - Removed the confusing last part of the first paragraph. - Removed "frequently" from the 2nd paragraph, replaced by "a firm with a large number of hosts". v1: https://lore.kernel.org/all/20240401191621.758056-1-thinker...@gmail.com/ --- kernel/trace/rethook.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c index fa03094e9e69..4297a132a7ae 100644 --- a/kernel/trace/rethook.c +++ b/kernel/trace/rethook.c @@ -248,7 +248,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame if (WARN_ON_ONCE(!cur)) return 0; - if (WARN_ON_ONCE(tsk != current && task_is_running(tsk))) + if (tsk != current && task_is_running(tsk)) return 0; do { -- 2.34.1
Re: [RFC PATCH v3 5/7] mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion
On Mon, 8 Apr 2024 21:06:44 +0900 Honggyu Kim wrote: > On Fri, 5 Apr 2024 12:24:30 -0700 SeongJae Park wrote: > > On Fri, 5 Apr 2024 15:08:54 +0900 Honggyu Kim wrote: [...] > > > Here is one of the example usage of this 'migrate_cold' action. > > > > > > $ cd /sys/kernel/mm/damon/admin/kdamonds/ > > > $ cat contexts//schemes//action > > > migrate_cold > > > $ echo 2 > contexts//schemes//target_nid > > > $ echo commit > state > > > $ numactl -p 0 ./hot_cold 500M 600M & > > > $ numastat -c -p hot_cold > > > > > > Per-node process memory usage (in MBs) > > > PID Node 0 Node 1 Node 2 Total > > > -- -- -- -- - > > > 701 (hot_cold) 501 0601 1101 > > > > > > Since there are some common routines with pageout, many functions have > > > similar logics between pageout and migrate cold. > > > > > > damon_pa_migrate_folio_list() is a minimized version of > > > shrink_folio_list(), but it's minified only for demotion. > > > > MIGRATE_COLD is not only for demotion, right? I think the last two words > > are > > better to be removed for reducing unnecessary confuses. > > You mean the last two sentences? I will remove them if you feel it's > confusing. Yes. My real intended suggestion was 's/only for demotion/only for migration/', but entirely removing the sentences is also ok for me. > > > > > > > Signed-off-by: Honggyu Kim > > > Signed-off-by: Hyeongtak Ji > > > --- > > > include/linux/damon.h| 2 + > > > mm/damon/paddr.c | 146 ++- > > > mm/damon/sysfs-schemes.c | 4 ++ > > > 3 files changed, 151 insertions(+), 1 deletion(-) [...] > > > --- a/mm/damon/paddr.c > > > +++ b/mm/damon/paddr.c [...] > > > +{ > > > + unsigned int nr_succeeded; > > > + nodemask_t allowed_mask = NODE_MASK_NONE; > > > + > > > > I personally prefer not having empty lines in the middle of variable > > declarations/definitions. Could we remove this empty line? > > I can remove it, but I would like to have more discussion about this > issue. The current implementation allows only a single migration > target with "target_nid", but users might want to provide fall back > migration target nids. > > For example, if more than two CXL nodes exist in the system, users might > want to migrate cold pages to any CXL nodes. In such cases, we might > have to make "target_nid" accept comma separated node IDs. nodemask can > be better but we should provide a way to change the scanning order. > > I would like to hear how you think about this. Good point. I think we could later extend the sysfs file to receive the comma-separated numbers, or even mask. For simplicity, adding sysfs files dedicated for the different format of inputs could also be an option (e.g., target_nids_list, target_nids_mask). But starting from this single node as is now looks ok to me. [...] > > > + /* 'folio_list' is always empty here */ > > > + > > > + /* Migrate folios selected for migration */ > > > + nr_migrated += migrate_folio_list(_folios, pgdat, target_nid); > > > + /* Folios that could not be migrated are still in @migrate_folios */ > > > + if (!list_empty(_folios)) { > > > + /* Folios which weren't migrated go back on @folio_list */ > > > + list_splice_init(_folios, folio_list); > > > + } > > > > Let's not use braces for single statement > > (https://docs.kernel.org/process/coding-style.html#placing-braces-and-spaces). > > Hmm.. I know the convention but left it as is because of the comment. > If I remove the braces, it would have a weird alignment for the two > lines for comment and statement lines. I don't really hate such alignment. But if you don't like it, how about moving the comment out of the if statement? Having one comment for one-line if statement looks not bad to me. > > > > + > > > + try_to_unmap_flush(); > > > + > > > + list_splice(_folios, folio_list); > > > > Can't we move remaining folios in migrate_folios to ret_folios at once? > > I will see if it's possible. Thank you. Not a strict request, though. [...] > > > + nid = folio_nid(lru_to_folio(folio_list)); > > > + do { > > > + struct folio *folio = lru_to_folio(folio_list); > > > + > > > + if (nid == folio_nid(folio)) { > > > + folio_clear_active(folio); > > > > I think this was necessary for demotion, but now this should be removed > > since > > this function is no more for demotion but for migrating random pages, right? > > Yeah, it can be removed because we do migration instead of demotion, > but I need to make sure if it doesn't change the performance evaluation > results. Yes, please ensure the test results are valid :) Thanks, SJ [...]
Re: [PATCH 1/2] dt-bindings: pinctrl: qcom,pmic-gpio: Allow gpio-hog nodes
On 4/8/24 18:39, Luca Weiss wrote: Allow specifying a GPIO hog, as already used on qcom-msm8974-lge-nexus5-hammerhead.dts. Signed-off-by: Luca Weiss --- .../devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml | 12 1 file changed, 12 insertions(+) diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml index a786357ed1af..510a05369dbb 100644 --- a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml +++ b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml @@ -424,6 +424,10 @@ patternProperties: $ref: "#/$defs/qcom-pmic-gpio-state" additionalProperties: false + "^(hog-[0-9]+|.+-hog(-[0-9]+)?)$": I see a couple bindings do this, but I'm not sure if we want two allow two styles for no reason.. Rob? Konrad
[PATCH] drivers/virtio: delayed configuration descriptor flags
In our testing of the virtio hardware accelerator, we found that configuring the flags of the descriptor after addr and len, as implemented in DPDK, seems to be more friendly to the hardware. In our Virtio hardware implementation tests, using the default open-source code, the hardware's bulk reads ensure performance but correctness is compromised. If we refer to the implementation code of DPDK, placing the flags configuration of the descriptor after addr and len, virtio backend can function properly based on our hardware accelerator. I am somewhat puzzled by this. From a software process perspective, it seems that there should be no difference whether the flags configuration of the descriptor is before or after addr and len. However, this is not the case according to experimental test results. We would like to know if such a change in the configuration order is reasonable and acceptable? Thanks. Signed-off-by: ni.liqiang Reviewed-by: jin.qi Tested-by: jin.qi Cc: ni.liqiang --- drivers/virtio/virtio_ring.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 6f7e5010a673..bea2c2fb084e 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -1472,15 +1472,16 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq, flags = cpu_to_le16(vq->packed.avail_used_flags | (++c == total_sg ? 0 : VRING_DESC_F_NEXT) | (n < out_sgs ? 0 : VRING_DESC_F_WRITE)); - if (i == head) - head_flags = flags; - else - desc[i].flags = flags; desc[i].addr = cpu_to_le64(addr); desc[i].len = cpu_to_le32(sg->length); desc[i].id = cpu_to_le16(id); + if (i == head) + head_flags = flags; + else + desc[i].flags = flags; + if (unlikely(vq->use_dma_api)) { vq->packed.desc_extra[curr].addr = addr; vq->packed.desc_extra[curr].len = sg->length; -- 2.34.1
Re: [PATCH bpf-next] rethook: Remove warning messages printed for finding return address of a frame.
On 4/7/24 18:13, Masami Hiramatsu (Google) wrote: On Wed, 3 Apr 2024 16:36:25 +0200 Daniel Borkmann wrote: On 4/2/24 6:58 PM, Andrii Nakryiko wrote: On Mon, Apr 1, 2024 at 12:16 PM Kui-Feng Lee wrote: rethook_find_ret_addr() prints a warning message and returns 0 when the target task is running and not the "current" task to prevent returning an incorrect return address. However, this check is incomplete as the target task can still transition to the running state when finding the return address, although it is safe with RCU. Could you tell me more about this last part? This change just remove WARN_ON_ONCE() which warns that the user tries to unwind stack of a running task. This means the task can change the stack in parallel if the task is running on other CPU. Does the BPF stop the task? or do you have any RCU magic to copy the stack? No, the BPF doesn't stop the task or copy the stack. The last part tries to explain that this function can still return an incorrect address even with this check. And calling this function on a target task that is not "current" is safe. Since you think it is confusing. I will remove this part. The issue we encounter is that the kernel frequently prints warning messages when BPF profiling programs call to bpf_get_task_stack() on running tasks. Hmm, WARN_ON_ONCE should print it once, not frequently. You are right! I should rephrase it. In a firm with a large number of hosts, this warning message become a noise. The callers should be aware and willing to take the risk of receiving an incorrect return address from a task that is currently running other than the "current" one. A warning is not needed here as the callers are intent on it. Signed-off-by: Kui-Feng Lee --- kernel/trace/rethook.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c index fa03094e9e69..4297a132a7ae 100644 --- a/kernel/trace/rethook.c +++ b/kernel/trace/rethook.c @@ -248,7 +248,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame if (WARN_ON_ONCE(!cur)) return 0; - if (WARN_ON_ONCE(tsk != current && task_is_running(tsk))) + if (tsk != current && task_is_running(tsk)) return 0; This should probably go through Masami's tree, but the change makes sense to me, given this is an expected condition. Acked-by: Andrii Nakryiko Masami, I assume you'll pick this up? OK, anyway it will just return 0 if this situation happens, and caller will get the trampoline address instead of correct return address in this case. I think it does not do any unsafe things. So I agree removing it. But I think the explanation is a bit confusing. Thank you, Thanks, Daniel
[PATCH 2/2] ARM: dts: qcom: msm8974-hammerhead: Update gpio hog node name
Follow the gpio-hog bindings and use otg-hog as node name. Signed-off-by: Luca Weiss --- arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts b/arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts index 4aaae8537a3f..06549051be50 100644 --- a/arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts +++ b/arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts @@ -328,7 +328,7 @@ wlan_regulator_pin: wl-reg-active-state { power-source = ; }; - otg { + otg-hog { gpio-hog; gpios = <35 GPIO_ACTIVE_HIGH>; output-high; -- 2.44.0
[PATCH 0/2] Allow gpio-hog nodes in qcom,pmic-gpio bindings (& dt fixup)
Resolve the dt validation failure on Nexus 5. Signed-off-by: Luca Weiss --- Luca Weiss (2): dt-bindings: pinctrl: qcom,pmic-gpio: Allow gpio-hog nodes ARM: dts: qcom: msm8974-hammerhead: Update gpio hog node name .../devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml | 12 .../arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dts | 2 +- 2 files changed, 13 insertions(+), 1 deletion(-) --- base-commit: 8568bb2ccc278f344e6ac44af6ed010a90aa88dc change-id: 20240408-qcom-pmic-gpio-hog-2b4c5f103126 Best regards, -- Luca Weiss
[PATCH 1/2] dt-bindings: pinctrl: qcom,pmic-gpio: Allow gpio-hog nodes
Allow specifying a GPIO hog, as already used on qcom-msm8974-lge-nexus5-hammerhead.dts. Signed-off-by: Luca Weiss --- .../devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml | 12 1 file changed, 12 insertions(+) diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml index a786357ed1af..510a05369dbb 100644 --- a/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml +++ b/Documentation/devicetree/bindings/pinctrl/qcom,pmic-gpio.yaml @@ -424,6 +424,10 @@ patternProperties: $ref: "#/$defs/qcom-pmic-gpio-state" additionalProperties: false + "^(hog-[0-9]+|.+-hog(-[0-9]+)?)$": +required: + - gpio-hog + $defs: qcom-pmic-gpio-state: type: object @@ -571,6 +575,7 @@ $defs: examples: - | +#include #include pm8921_gpio: gpio@150 { @@ -594,5 +599,12 @@ examples: power-source = ; }; }; + + otg-hog { +gpio-hog; +gpios = <35 GPIO_ACTIVE_HIGH>; +output-high; +line-name = "otg-gpio"; + }; }; ... -- 2.44.0
[PATCH] dt-bindings: iio: imu: mpu6050: Improve i2c-gate disallow list
Before all supported sensors except for MPU{9150,9250,9255} were not allowed to use i2c-gate in the bindings which excluded quite a few supported sensors where this functionality is supported. Switch the list of sensors to ones where the Linux driver explicitly disallows support for the auxiliary bus ("inv_mpu_i2c_aux_bus"). Since the driver is also based on "default: return true" this should scale better into the future. Signed-off-by: Luca Weiss --- This fixes dt validation error on qcom-msm8974-lge-nexus5-hammerhead which uses mpu6515 arch/arm/boot/dts/qcom/qcom-msm8974-lge-nexus5-hammerhead.dtb: mpu6515@68: i2c-gate: False schema does not allow {'#address-cells': [[1]], '#size-cells': [[0]], 'ak8963@f': {'compatible': ['asahi-kasei,ak8963'], 'reg': [[15]], 'gpios': [[40, 67, 0]], 'vid-supply': [[50]], 'vdd-supply': [[49]]}, 'bmp280@76': {'compatible': ['bosch,bmp280'], 'reg': [[118]], 'vdda-supply': [[50]], 'vddd-supply': [[49]]}} from schema $id: http://devicetree.org/schemas/iio/imu/invensense,mpu6050.yaml# --- .../devicetree/bindings/iio/imu/invensense,mpu6050.yaml | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml b/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml index 297b8a1a7ffb..587ff2bced2d 100644 --- a/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml +++ b/Documentation/devicetree/bindings/iio/imu/invensense,mpu6050.yaml @@ -62,14 +62,15 @@ properties: allOf: - $ref: /schemas/spi/spi-peripheral-props.yaml# - if: - not: -properties: - compatible: -contains: - enum: -- invensense,mpu9150 -- invensense,mpu9250 -- invensense,mpu9255 + properties: +compatible: + contains: +enum: + - invensense,iam20680 + - invensense,icm20602 + - invensense,icm20608 + - invensense,icm20609 + - invensense,icm20689 then: properties: i2c-gate: false --- base-commit: 8568bb2ccc278f344e6ac44af6ed010a90aa88dc change-id: 20240408-mpu6050-i2c-gate-4ea473e492f4 Best regards, -- Luca Weiss
Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe
On 04/08, Jiri Olsa wrote: > > On Fri, Apr 05, 2024 at 01:02:30PM +0200, Oleg Nesterov wrote: > > > > And what should sys_uretprobe() do if it is not called from the trampoline? > > I'd prefer force_sig(SIGILL) to punish the abuser ;) OK, OK, EINVAL. > > so the similar behaviour with int3 ends up with immediate SIGTRAP > and not invoking pending uretprobe consumers, like: > > - setup uretprobe for foo > - foo() { > executes int 3 -> sends SIGTRAP > } > > because the int3 handler checks if it got executed from the uretprobe's > trampoline. ... or the task has uprobe at this address > if not it treats that int3 as regular trap Yes this mimics the "default" behaviour without uprobes/uretprobes > so I think we should mimic int3 behaviour and: > > - setup uretprobe for foo > - foo() { > uretprobe_syscall -> check if we got executed from uretprobe's > trampoline and send SIGILL if that's not the case Agreed, > I think it's better to have the offending process killed right away, > rather than having more undefined behaviour, waiting for final 'ret' > instruction that jumps to uretprobe trampoline and causes SIGILL Agreed. In fact I think it should be also killed if copy_to/from_user() fails by the same reason. Oleg.
Re: [PATCHv2 1/3] uprobe: Add uretprobe syscall to speed up return probe
On Fri, Apr 05, 2024 at 01:02:30PM +0200, Oleg Nesterov wrote: > On 04/05, Jiri Olsa wrote: > > > > On Fri, Apr 05, 2024 at 10:22:03AM +0900, Masami Hiramatsu wrote: > > > > > > I think this expects setjmp/longjmp as below > > > > > > foo() { <- retprobe1 > > > setjmp() > > > bar() { <- retprobe2 > > > longjmp() > > > } > > > } <- return to trampoline > > > > > > In this case, we need to skip retprobe2's instance. > > Yes, > > > > My concern is, if we can not find appropriate return instance, what > > > happen? > > > e.g. > > > > > > foo() { <-- retprobe1 > > >bar() { # sp is decremented > > >sys_uretprobe() <-- ?? > > > } > > > } > > > > > > It seems sys_uretprobe() will handle retprobe1 at that point instead of > > > SIGILL. > > > > yes, and I think it's fine, you get the consumer called in wrong place, > > but it's your fault and kernel won't crash > > Agreed. > > With or without this patch userpace can also do > > foo() { <-- retprobe1 > bar() { > jump to xol_area > } > } > > handle_trampoline() will handle retprobe1. > > > this can be fixed by checking the syscall is called from the trampoline > > and prevent handle_trampoline call if it's not > > Yes, but I still do not think this makes a lot of sense. But I won't argue. > > And what should sys_uretprobe() do if it is not called from the trampoline? > I'd prefer force_sig(SIGILL) to punish the abuser ;) OK, OK, EINVAL. so the similar behaviour with int3 ends up with immediate SIGTRAP and not invoking pending uretprobe consumers, like: - setup uretprobe for foo - foo() { executes int 3 -> sends SIGTRAP } because the int3 handler checks if it got executed from the uretprobe's trampoline.. if not it treats that int3 as regular trap while for uretprobe syscall we have at the moment following behaviour: - setup uretprobe for foo - foo() { uretprobe_syscall -> executes foo's uretprobe consumers } - at some point we get to the 'ret' instruction that jump into uretprobe trampoline and the uretprobe_syscall won't find pending uretprobe and will send SIGILL so I think we should mimic int3 behaviour and: - setup uretprobe for foo - foo() { uretprobe_syscall -> check if we got executed from uretprobe's trampoline and send SIGILL if that's not the case I think it's better to have the offending process killed right away, rather than having more undefined behaviour, waiting for final 'ret' instruction that jumps to uretprobe trampoline and causes SIGILL > > I agree very much with Andrii, > >sigreturn() exists only to allow the implementation of signal > handlers. It should never be >called directly. Details of the arguments (if any) passed to > sigreturn() vary depending on >the architecture. > > this is how sys_uretprobe() should be treated/documented. yes, will include man page patch in new version jirka > > sigreturn() can be "improved" too. Say, it could validate sigcontext->ip > and return -EINVAL if this addr is not valid. But why? > > Oleg. >
Re: [PATCH] [v4] module: don't ignore sysfs_create_link() failures
On Mon, Apr 08, 2024 at 10:05:58AM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann > > The sysfs_create_link() return code is marked as __must_check, but the > module_add_driver() function tries hard to not care, by assigning the > return code to a variable. When building with 'make W=1', gcc still > warns because this variable is only assigned but not used: > > drivers/base/module.c: In function 'module_add_driver': > drivers/base/module.c:36:6: warning: variable 'no_warn' set but not used > [-Wunused-but-set-variable] > > Rework the code to properly unwind and return the error code to the > caller. My reading of the original code was that it tries to > not fail when the links already exist, so keep ignoring -EEXIST > errors. > > Fixes: e17e0f51aeea ("Driver core: show drivers in /sys/module/") > Reviewed-by: Greg Kroah-Hartman Reviewed-by: Luis Chamberlain Luis
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On Mon, Apr 08, 2024 at 03:27:00PM GMT, Krzysztof Kozlowski wrote: > On 08/04/2024 14:48, Ondřej Jirman wrote: > > Yeah, I understand where the confusion is. The driver is not for anx7688 > > chip > > really. The driver is named anx7688, but that's mostly a historical > > accident at > > this point. > > > > I guess there can be a driver for anx7688 chip that can directly use the > > chip's > > resources from the host by directly manipulating its registers and > > implementing > > type-c functionality via eg. Linux's TCPM or TCPCI stack, etc. (eg. like > > fusb302 driver, or various tcpci subdrivers). > > > > But in this case the chip is driven by an optional on-chip microcontroller's > > firmware and *this driver* is specifically for *the Type-C port on > > Pinephone* > > We do not talk here about the driver, but bindings, so hardware. Got it. Bindings should be the same regardless of what driver would be used, whether this OCM based one, or some future one based on the above mentioned TCPCI in-kernel implementation. Hardware is the same in both cases. Just trying to imagine how to actually solve the issues... Basic thing with the I2C regulator thing is that needs to be enabled as long as anx7688 needs to communicate over I2C. Other user of this power rail is touchscreen controller for its normal power supply, and it needs to be able to disable it during system suspend. Now for things to not fail during suspend/resume based on PM callbacks invocation order, anx7688 driver needs to enable this regulator too, as long as it needs it. I can put bus-supply to I2C controller node, and read it from the ANX7688 driver I guess, by going up a DT node. Whether that's going to be acceptable, I don't know. VCONN regulator I don't know where else to put either. It doesn't seem to belong anywhere. It's not something directly connected to Type-C connector, so not part of connector bindings, and there's nothing else I can see, other than anx7688 device which needs it for core functionality. ANX7688 chip desing doesn't have integrated VCONN mosfet switches so it always needs external supply + switches that are controlled by the chip itself. There's no sensible design where someone would not want this and the driver needs to get this regulator reference from somewhere. The switches are sort of an extension of the chip. kind regards, o. > > and serves as an integration driver for quite a bunch of things that need to > > work together on Pinephone for all of the Type-C port's features to operate > > reasonably well (and one of those is some communication with anx7688 > > firmware > > that we use, and enabling power to this chip and other things as > > appropriate, > > based on the communication from the firmware). > > That's still looking like putting board design into particular device > binding. > > > > > It handles the specific needs of the Pinephone's Type-C implementation, all > > of > > its quirks (of which there are many over several HW revisions) that can't be > > handled by the particular implementation of on-chip microcontroller firmware > > directly and need host side interaction. > > > > In an ideal world, many of the things this driver handles would be handled > > by > > embedded microcontroller on the board (like it is with some RK3399 based > > Google > > devices), but Pinephone has no such thing and this glue needs to be > > implemented > > somewhere in the kernel. > > You might need multiple schemas, because this is for anx7688, not for > Pinephone type-c implementation. > > However I still do not see yet a limitation of DTS requiring stuffing > some other properties into anx7688 or creating some other, virtual entity. > > > Best regards, > Krzysztof >
[PATCH 2/3] kernel/pid: Remove default pid_max value
pid_max is a per-pidns (thus global too) limit on a number of tasks the kernel admits. The knob can be configured by admin in the range between pid_max_min and pid_max_max (sic). The default value sits between those and it typically equals max(32k, 1k*nr_cpus). The nr_cpu scaling was introduced in commit 72680a191b93 ("pids: increase pid_max based on num_possible_cpus") to accommodate kernel's own helper tasks (before workqueues). Generally, 1024 tasks/cpu cap is too much if they were all running and it is also too little when they are idle (memory being bottleneck). The kernel also provides other mechanisms to restrict number of tasks -- threads-max sysctl and RLIMIT_NPROC with memory-scaled defaults and generic pids cgroup controller (the last one being the solution of fork-bombs, with qualified limits set up by admin). The kernel provides mechanisms, while it should not imply policies -- default pid_max seems to be an example of the policy that does not fit all. At the same time pid_max must have some value assigned, so use the end of the allowed range -- pid_max_max. This change thus increases initial pid_max from 32k to 4M (x86_64 defconfig). This has effect on size of structure that alloc_pid/idr_alloc_cyclic eventually uses and structure that kernel tracing uses with 'record-tgid' (~16 MiB). Signed-off-by: Michal Koutný --- include/linux/pid.h | 4 ++-- include/linux/threads.h | 15 --- kernel/pid.c| 8 +++- 3 files changed, 9 insertions(+), 18 deletions(-) diff --git a/include/linux/pid.h b/include/linux/pid.h index a3aad9b4074c..0d191ac02958 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -106,8 +106,8 @@ extern void exchange_tids(struct task_struct *task, struct task_struct *old); extern void transfer_pid(struct task_struct *old, struct task_struct *new, enum pid_type); -extern int pid_max; -extern int pid_max_min, pid_max_max; +extern int pid_max_min, pid_max; +extern const int pid_max_max; /* * look up a PID in the hash table. Must be called with the tasklist_lock diff --git a/include/linux/threads.h b/include/linux/threads.h index c34173e6c5f1..43f8f38a0c13 100644 --- a/include/linux/threads.h +++ b/include/linux/threads.h @@ -22,25 +22,18 @@ #define MIN_THREADS_LEFT_FOR_ROOT 4 -/* - * This controls the default maximum pid allocated to a process - */ -#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000) - /* * A maximum of 4 million PIDs should be enough for a while. * [NOTE: PID/TIDs are limited to 2^30 ~= 1 billion, see FUTEX_TID_MASK.] */ #define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \ - (sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT)) + (sizeof(long) > 4 ? 4 * 1024 * 1024 : 0x8000)) /* - * Define a minimum number of pids per cpu. Heuristically based - * on original pid max of 32k for 32 cpus. Also, increase the - * minimum settable value for pid_max on the running system based - * on similar defaults. See kernel/pid.c:pid_idr_init() for details. + * Define a minimum number of pids per cpu. Mainly to accommodate + * smpboot_register_percpu_thread() kernel threads. + * See kernel/pid.c:pid_idr_init() for details. */ -#define PIDS_PER_CPU_DEFAULT 1024 #define PIDS_PER_CPU_MIN 8 #endif diff --git a/kernel/pid.c b/kernel/pid.c index da76ed1873f7..24ae505ac3b0 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -60,10 +60,10 @@ struct pid init_struct_pid = { }, } }; -int pid_max = PID_MAX_DEFAULT; +int pid_max = PID_MAX_LIMIT; int pid_max_min = RESERVED_PIDS + 1; -int pid_max_max = PID_MAX_LIMIT; +const int pid_max_max = PID_MAX_LIMIT; /* * Pseudo filesystems start inode numbering after one. We use Reserved * PIDs as a natural offset. @@ -652,9 +652,7 @@ void __init pid_idr_init(void) /* Verify no one has done anything silly: */ BUILD_BUG_ON(PID_MAX_LIMIT >= PIDNS_ADDING); - /* bump default and minimum pid_max based on number of cpus */ - pid_max = min(pid_max_max, max_t(int, pid_max, - PIDS_PER_CPU_DEFAULT * num_possible_cpus())); + /* bump minimum pid_max based on number of cpus */ pid_max_min = max_t(int, pid_max_min, PIDS_PER_CPU_MIN * num_possible_cpus()); pr_info("pid_max: default: %u minimum: %u\n", pid_max, pid_max_min); -- 2.44.0
[PATCH 1/3] tracing: Remove dependency of saved_cmdlines_buffer on PID_MAX_DEFAULT
Calculations into map_pid_to_cmdline use PID_MAX_DEFAULT but they actually depend on the size of map_pid_to_cmdline. The size of the map may be arbitrary. First, refer to the map size where necessary, second, pick a good value for the size of the map. Since the buffer is allocated at boot (i.e. user cannot affect its size later), accounting for full PID_MAX_LIMIT would inflate map's size unnecessarily (4*4M) for all users. Stick to the original value of 4*32k, the commit 785e3c0a3a87 ("tracing: Map all PIDs to command lines") explains why it still works for higher pids. The point of this exercise is to remove dependency on PID_MAX_DEFAULT. Signed-off-by: Michal Koutný --- kernel/trace/trace_sched_switch.c | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/kernel/trace/trace_sched_switch.c b/kernel/trace/trace_sched_switch.c index 8a407adb0e1c..aca2dafdd97a 100644 --- a/kernel/trace/trace_sched_switch.c +++ b/kernel/trace/trace_sched_switch.c @@ -161,6 +161,7 @@ static size_t tgid_map_max; #define SAVED_CMDLINES_DEFAULT 128 #define NO_CMDLINE_MAP UINT_MAX +#define PID_MAP_SIZE (CONFIG_BASE_SMALL ? 0x1000 : 0x8000) /* * Preemption must be disabled before acquiring trace_cmdline_lock. * The various trace_arrays' max_lock must be acquired in a context @@ -168,7 +169,7 @@ static size_t tgid_map_max; */ static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED; struct saved_cmdlines_buffer { - unsigned map_pid_to_cmdline[PID_MAX_DEFAULT+1]; + unsigned map_pid_to_cmdline[PID_MAP_SIZE]; unsigned *map_cmdline_to_pid; unsigned cmdline_num; int cmdline_idx; @@ -248,7 +249,7 @@ int trace_save_cmdline(struct task_struct *tsk) if (!tsk->pid) return 1; - tpid = tsk->pid & (PID_MAX_DEFAULT - 1); + tpid = tsk->pid % PID_MAP_SIZE; /* * It's not the end of the world if we don't get @@ -294,7 +295,7 @@ static void __trace_find_cmdline(int pid, char comm[]) return; } - tpid = pid & (PID_MAX_DEFAULT - 1); + tpid = pid % PID_MAP_SIZE; map = savedcmd->map_pid_to_cmdline[tpid]; if (map != NO_CMDLINE_MAP) { tpid = savedcmd->map_cmdline_to_pid[map]; @@ -645,8 +646,8 @@ tracing_saved_cmdlines_size_write(struct file *filp, const char __user *ubuf, if (ret) return ret; - /* must have at least 1 entry or less than PID_MAX_DEFAULT */ - if (!val || val > PID_MAX_DEFAULT) + /* must have at least 1 entry or fit into map_pid_to_cmdline */ + if (!val || val >= PID_MAP_SIZE) return -EINVAL; ret = tracing_resize_saved_cmdlines((unsigned int)val); -- 2.44.0
[PATCH 3/3] tracing: Compare pid_max against pid_list capacity
trace_pid_list_alloc() checks pid_max against a magic number referencing an (obsolete) source file when it actually should check against the capacity of pid_list tree. Turn definition of MAX_PID around -- derive it from tree parameters and replace references to magic value and header files with so defined MAX_PID. Should PID_MAX_LIMIT change in future or pid_max escapes PID_MAX_LIMIT, appropriate checks remain in place. No functional change intended. Signed-off-by: Michal Koutný --- kernel/trace/pid_list.c | 6 +++--- kernel/trace/pid_list.h | 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/trace/pid_list.c b/kernel/trace/pid_list.c index 95106d02b32d..b968f0b65dc1 100644 --- a/kernel/trace/pid_list.c +++ b/kernel/trace/pid_list.c @@ -93,7 +93,7 @@ static inline bool upper_empty(union upper_chunk *chunk) static inline int pid_split(unsigned int pid, unsigned int *upper1, unsigned int *upper2, unsigned int *lower) { - /* MAX_PID should cover all pids */ + /* MAX_PID must cover all possible pids */ BUILD_BUG_ON(MAX_PID < PID_MAX_LIMIT); /* In case a bad pid is passed in, then fail */ @@ -413,8 +413,8 @@ struct trace_pid_list *trace_pid_list_alloc(void) struct trace_pid_list *pid_list; int i; - /* According to linux/thread.h, pids can be no bigger that 30 bits */ - WARN_ON_ONCE(pid_max > (1 << 30)); + /* See pid_split(), equal to pid_max > PID_MAX_LIMIT */ + WARN_ON_ONCE(pid_max > MAX_PID); pid_list = kzalloc(sizeof(*pid_list), GFP_KERNEL); if (!pid_list) diff --git a/kernel/trace/pid_list.h b/kernel/trace/pid_list.h index 62e73f1ac85f..28562a9a3d01 100644 --- a/kernel/trace/pid_list.h +++ b/kernel/trace/pid_list.h @@ -56,8 +56,8 @@ #define UPPER_MASK (UPPER_MAX - 1) -/* According to linux/thread.h pids can not be bigger than or equal to 1 << 30 */ -#define MAX_PID(1 << 30) +/* Structure can hold only pids strictly below this limit */ +#define MAX_PID(1 << (UPPER_BITS + UPPER_BITS + LOWER_BITS)) /* Just keep 6 chunks of both upper and lower in the cache on alloc */ #define CHUNK_ALLOC 6 -- 2.44.0
[PATCH 0/3] kernel/pid: Remove default pid_max value
TL;DR excerpt from commit 02/03: The kernel provides mechanisms, while it should not imply policies -- default pid_max seems to be an example of the policy that does not fit all. At the same time pid_max must have some value assigned, so use the end of the allowed range -- pid_max_max. More details are in that commit's message. The other two commits are related preparation and less related refresh in code that somewhat references pid_max. Michal Koutný (3): tracing: Remove dependency of saved_cmdlines_buffer on PID_MAX_DEFAULT kernel/pid: Remove default pid_max value tracing: Compare pid_max against pid_list capacity include/linux/pid.h | 4 ++-- include/linux/threads.h | 15 --- kernel/pid.c | 8 +++- kernel/trace/pid_list.c | 6 +++--- kernel/trace/pid_list.h | 4 ++-- kernel/trace/trace_sched_switch.c | 11 ++- 6 files changed, 20 insertions(+), 28 deletions(-) base-commit: fec50db7033ea478773b159e0e2efb135270e3b7 -- 2.44.0
Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback
On Fri, Apr 05, 2024 at 07:58:12AM -0400, Paolo Bonzini wrote: > The .change_pte() MMU notifier callback was intended as an > optimization. The original point of it was that KSM could tell KVM to flip > its secondary PTE to a new location without having to first zap it. At > the time there was also an .invalidate_page() callback; both of them were > *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(), > and .invalidate_page() also doubled as a fallback implementation of > .change_pte(). > > Later on, however, both callbacks were changed to occur within an > invalidate_range_start/end() block. > > In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to > set_pte_at_notify with invalidate_range_start and invalidate_range_end", > 2012-10-09) did so to remove the fallback from .invalidate_page() to > .change_pte() and allow sleepable .invalidate_page() hooks. > > This however made KVM's usage of the .change_pte() callback completely > moot, because KVM unmaps the sPTEs during .invalidate_range_start() > and therefore .change_pte() has no hope of finding a sPTE to change. > Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as > well as all the architecture specific implementations. Paolo, I may miss a bunch of details here (as I still remember some change_pte patches previously on the list..), however not sure whether we considered enable it? Asked because I remember Andrea used to have a custom tree maintaining that part: https://github.com/aagit/aa/commit/c761078df7a77d13ddfaeebe56a0f4bc128b1968 Maybe it can't be enabled for some reason that I overlooked in the current tree, or we just decided to not to? Thanks, -- Peter Xu
Re: [RFC PATCH v3 0/7] DAMON based tiered memory management for CXL
Hi Gregory, On Fri, 5 Apr 2024 12:56:14 -0400 Gregory Price wrote: > On Fri, Apr 05, 2024 at 03:08:49PM +0900, Honggyu Kim wrote: > > There was an RFC IDEA "DAMOS-based Tiered-Memory Management" previously > > posted at [1]. > > > > 1. YCSB zipfian distribution read only workload > > memory pressure with cold memory on node0 with 512GB of local DRAM. > > =++= > >| cold memory occupied by mmap and memset | > >| 0G 440G 450G 460G 470G 480G 490G 500G | > > =++= > > Execution time normalized to DRAM-only values | GEOMEAN > > -++- > > DRAM-only| 1.00 - - - - - - - | 1.00 > > CXL-only | 1.22 - - - - - - - | 1.22 > > default |- 1.12 1.13 1.14 1.16 1.19 1.21 1.21 | 1.17 > > DAMON tiered |- 1.04 1.03 1.04 1.06 1.05 1.05 1.05 | 1.05 > > =++= > > CXL usage of redis-server in GB | AVERAGE > > -++- > > DRAM-only| 0.0 - - - - - - - | 0.0 > > CXL-only | 52.6 - - - - - - - | 52.6 > > default |- 20.4 27.0 33.1 39.5 45.6 50.5 50.3 | 38.1 > > DAMON tiered |- 0.1 0.3 0.8 0.6 0.7 1.3 0.9 | 0.7 > > =++= > > > > Each test result is based on the exeuction environment as follows. > > > > DRAM-only : redis-server uses only local DRAM memory. > > CXL-only: redis-server uses only CXL memory. > > default : default memory policy(MPOL_DEFAULT). > > numa balancing disabled. > > DAMON tiered: DAMON enabled with DAMOS_MIGRATE_COLD for DRAM nodes and > > DAMOS_MIGRATE_HOT for CXL nodes. > > > > The above result shows the "default" execution time goes up as the size > > of cold memory is increased from 440G to 500G because the more cold > > memory used, the more CXL memory is used for the target redis workload > > and this makes the execution time increase. > > > > However, "DAMON tiered" result shows less slowdown because the > > DAMOS_MIGRATE_COLD action at DRAM node proactively demotes pre-allocated > > cold memory to CXL node and this free space at DRAM increases more > > chance to allocate hot or warm pages of redis-server to fast DRAM node. > > Moreover, DAMOS_MIGRATE_HOT action at CXL node also promotes hot pages > > of redis-server to DRAM node actively. > > > > As a result, it makes more memory of redis-server stay in DRAM node > > compared to "default" memory policy and this makes the performance > > improvement. > > > > The following result of latest distribution workload shows similar data. > > > > 2. YCSB latest distribution read only workload > > memory pressure with cold memory on node0 with 512GB of local DRAM. > > =++= > >| cold memory occupied by mmap and memset | > >| 0G 440G 450G 460G 470G 480G 490G 500G | > > =++= > > Execution time normalized to DRAM-only values | GEOMEAN > > -++- > > DRAM-only| 1.00 - - - - - - - | 1.00 > > CXL-only | 1.18 - - - - - - - | 1.18 > > default |- 1.18 1.19 1.18 1.18 1.17 1.19 1.18 | 1.18 > > DAMON tiered |- 1.04 1.04 1.04 1.05 1.04 1.05 1.05 | 1.04 > > =++= > > CXL usage of redis-server in GB | AVERAGE > > -++- > > DRAM-only| 0.0 - - - - - - - | 0.0 > > CXL-only | 52.6 - - - - - - - | 52.6 > > default |- 20.5 27.1 33.2 39.5 45.5 50.4 50.5 | 38.1 > > DAMON tiered |- 0.2 0.4 0.7 1.6 1.2 1.1 3.4 | 1.2 > > =++= > > > > In summary of both results, our evaluation shows that "DAMON tiered" > > memory management reduces the performance slowdown compared to the > > "default" memory policy from 17~18% to 4~5% when the system runs with > > high memory pressure on its fast tier DRAM nodes. > > > > Having these DAMOS_MIGRATE_HOT and DAMOS_MIGRATE_COLD actions can make > > tiered memory systems run more
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 14:48, Ondřej Jirman wrote: > On Mon, Apr 08, 2024 at 01:59:12PM GMT, Krzysztof Kozlowski wrote: >> On 08/04/2024 13:52, Ondřej Jirman wrote: >>> On Mon, Apr 08, 2024 at 01:24:03PM GMT, Krzysztof Kozlowski wrote: On 08/04/2024 13:21, Pavel Machek wrote: > Hi! > >>> Add binding for anx7688 usb type-c bridge. I don't have a datasheet, >>> but I did best I could. >>> >>> Signed-off-by: Pavel Machek >> >> ... >> >>> + cabledet-gpios: >>> +maxItems: 1 >>> +description: GPIO controlling CABLE_DET (C3) pin. >>> + >>> + avdd10-supply: >>> +description: 1.0V power supply going to AVDD10 (A4, ...) pins >>> + >>> + dvdd10-supply: >>> +description: 1.0V power supply going to DVDD10 (D6, ...) pins >>> + >>> + avdd18-supply: >>> +description: 1.8V power supply going to AVDD18 (E3, ...) pins >>> + >>> + dvdd18-supply: >>> +description: 1.8V power supply going to DVDD18 (G4, ...) pins >>> + >>> + avdd33-supply: >>> +description: 3.3V power supply going to AVDD33 (C4, ...) pins >>> + >>> + i2c-supply: true >>> + vconn-supply: true >> >> There are no such supplies like i2c and vconn on the schematics. >> >> I think this represents some other part of component which was added >> here only for convenience. > > Can you give me pointer to documentation you are looking at? The schematics you linked in the document at the beginning. Page 13. Do you see these pins there? I saw only VCONN1_EN, but that's not a supply. >>> >>> The supply is U1308. >> >> That's not a supply to anx7688. > > Yeah, I understand where the confusion is. The driver is not for anx7688 chip > really. The driver is named anx7688, but that's mostly a historical accident > at > this point. > > I guess there can be a driver for anx7688 chip that can directly use the > chip's > resources from the host by directly manipulating its registers and > implementing > type-c functionality via eg. Linux's TCPM or TCPCI stack, etc. (eg. like > fusb302 driver, or various tcpci subdrivers). > > But in this case the chip is driven by an optional on-chip microcontroller's > firmware and *this driver* is specifically for *the Type-C port on Pinephone* We do not talk here about the driver, but bindings, so hardware. > and serves as an integration driver for quite a bunch of things that need to > work together on Pinephone for all of the Type-C port's features to operate > reasonably well (and one of those is some communication with anx7688 firmware > that we use, and enabling power to this chip and other things as appropriate, > based on the communication from the firmware). That's still looking like putting board design into particular device binding. > > It handles the specific needs of the Pinephone's Type-C implementation, all of > its quirks (of which there are many over several HW revisions) that can't be > handled by the particular implementation of on-chip microcontroller firmware > directly and need host side interaction. > > In an ideal world, many of the things this driver handles would be handled by > embedded microcontroller on the board (like it is with some RK3399 based > Google > devices), but Pinephone has no such thing and this glue needs to be > implemented > somewhere in the kernel. You might need multiple schemas, because this is for anx7688, not for Pinephone type-c implementation. However I still do not see yet a limitation of DTS requiring stuffing some other properties into anx7688 or creating some other, virtual entity. Best regards, Krzysztof
Re: [PATCH v3] Documentation: Add reconnect process for VDUSE
On Mon, Apr 08, 2024 at 08:39:21PM +0800, Cindy Lu wrote: > On Mon, Apr 8, 2024 at 3:40 PM Michael S. Tsirkin wrote: > > > > On Thu, Apr 04, 2024 at 01:56:31PM +0800, Cindy Lu wrote: > > > Add a document explaining the reconnect process, including what the > > > Userspace App needs to do and how it works with the kernel. > > > > > > Signed-off-by: Cindy Lu > > > --- > > > Documentation/userspace-api/vduse.rst | 41 +++ > > > 1 file changed, 41 insertions(+) > > > > > > diff --git a/Documentation/userspace-api/vduse.rst > > > b/Documentation/userspace-api/vduse.rst > > > index bdb880e01132..7faa83462e78 100644 > > > --- a/Documentation/userspace-api/vduse.rst > > > +++ b/Documentation/userspace-api/vduse.rst > > > @@ -231,3 +231,44 @@ able to start the dataplane processing as follows: > > > after the used ring is filled. > > > > > > For more details on the uAPI, please see include/uapi/linux/vduse.h. > > > + > > > +HOW VDUSE devices reconnection works > > > + > > > +1. What is reconnection? > > > + > > > + When the userspace application loads, it should establish a connection > > > + to the vduse kernel device. Sometimes,the userspace application > > > exists, > > > + and we want to support its restart and connect to the kernel device > > > again > > > + > > > +2. How can I support reconnection in a userspace application? > > > + > > > +2.1 During initialization, the userspace application should first verify > > > the > > > +existence of the device "/dev/vduse/vduse_name". > > > +If it doesn't exist, it means this is the first-time for connection. > > > goto step 2.2 > > > +If it exists, it means this is a reconnection, and we should goto > > > step 2.3 > > > + > > > +2.2 Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on > > > +/dev/vduse/control. > > > +When ioctl(VDUSE_CREATE_DEV) is called, kernel allocates memory for > > > +the reconnect information. The total memory size is > > > PAGE_SIZE*vq_mumber. > > > > Confused. Where is that allocation, in code? > > > > Thanks! > > > this should allocated in function vduse_create_dev(), I mean, it's not allocated there ATM right? This is just doc patch to become part of a larger patchset? > I will rewrite > this part to make it more clearer > will send a new version soon > Thanks > cindy > > > > +2.3 Check if the information is suitable for reconnect > > > +If this is reconnection : > > > +Before attempting to reconnect, The userspace application needs to > > > use the > > > +ioctl(VDUSE_DEV_GET_CONFIG, VDUSE_DEV_GET_STATUS, > > > VDUSE_DEV_GET_FEATURES...) > > > +to get the information from kernel. > > > +Please review the information and confirm if it is suitable to > > > reconnect. > > > + > > > +2.4 Userspace application needs to mmap the memory to userspace > > > +The userspace application requires mapping one page for every vq. > > > These pages > > > +should be used to save vq-related information during system running. > > > Additionally, > > > +the application must define its own structure to store information > > > for reconnection. > > > + > > > +2.5 Completed the initialization and running the application. > > > +While the application is running, it is important to store relevant > > > information > > > +about reconnections in mapped pages. When calling the ioctl > > > VDUSE_VQ_GET_INFO to > > > +get vq information, it's necessary to check whether it's a > > > reconnection. If it is > > > +a reconnection, the vq-related information must be get from the > > > mapped pages. > > > + > > > +2.6 When the Userspace application exits, it is necessary to unmap all > > > the > > > +pages for reconnection > > > -- > > > 2.43.0 > >
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On Mon, Apr 08, 2024 at 01:59:12PM GMT, Krzysztof Kozlowski wrote: > On 08/04/2024 13:52, Ondřej Jirman wrote: > > On Mon, Apr 08, 2024 at 01:24:03PM GMT, Krzysztof Kozlowski wrote: > >> On 08/04/2024 13:21, Pavel Machek wrote: > >>> Hi! > >>> > > Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > > but I did best I could. > > > > Signed-off-by: Pavel Machek > > ... > > > + cabledet-gpios: > > +maxItems: 1 > > +description: GPIO controlling CABLE_DET (C3) pin. > > + > > + avdd10-supply: > > +description: 1.0V power supply going to AVDD10 (A4, ...) pins > > + > > + dvdd10-supply: > > +description: 1.0V power supply going to DVDD10 (D6, ...) pins > > + > > + avdd18-supply: > > +description: 1.8V power supply going to AVDD18 (E3, ...) pins > > + > > + dvdd18-supply: > > +description: 1.8V power supply going to DVDD18 (G4, ...) pins > > + > > + avdd33-supply: > > +description: 3.3V power supply going to AVDD33 (C4, ...) pins > > + > > + i2c-supply: true > > + vconn-supply: true > > There are no such supplies like i2c and vconn on the schematics. > > I think this represents some other part of component which was added > here only for convenience. > >>> > >>> Can you give me pointer to documentation you are looking at? > >> > >> The schematics you linked in the document at the beginning. Page 13. Do > >> you see these pins there? I saw only VCONN1_EN, but that's not a supply. > > > > The supply is U1308. > > That's not a supply to anx7688. Yeah, I understand where the confusion is. The driver is not for anx7688 chip really. The driver is named anx7688, but that's mostly a historical accident at this point. I guess there can be a driver for anx7688 chip that can directly use the chip's resources from the host by directly manipulating its registers and implementing type-c functionality via eg. Linux's TCPM or TCPCI stack, etc. (eg. like fusb302 driver, or various tcpci subdrivers). But in this case the chip is driven by an optional on-chip microcontroller's firmware and *this driver* is specifically for *the Type-C port on Pinephone* and serves as an integration driver for quite a bunch of things that need to work together on Pinephone for all of the Type-C port's features to operate reasonably well (and one of those is some communication with anx7688 firmware that we use, and enabling power to this chip and other things as appropriate, based on the communication from the firmware). It handles the specific needs of the Pinephone's Type-C implementation, all of its quirks (of which there are many over several HW revisions) that can't be handled by the particular implementation of on-chip microcontroller firmware directly and need host side interaction. In an ideal world, many of the things this driver handles would be handled by embedded microcontroller on the board (like it is with some RK3399 based Google devices), but Pinephone has no such thing and this glue needs to be implemented somewhere in the kernel. Kind regards, o. > Best regards, > Krzysztof >
Re: [PATCH v2] kprobes: Avoid possible warn in __arm_kprobe_ftrace()
Hi Zheng, On Mon, 8 Apr 2024 16:34:03 +0800 Zheng Yejian wrote: > There is once warn in __arm_kprobe_ftrace() on: > > ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0); > if (WARN_ONCE(..., "Failed to arm kprobe-ftrace at %pS (error %d)\n", ...) >return ret; > > This warning is generated because 'p->addr' is detected to be not a valid > ftrace location in ftrace_set_filter_ip(). The ftrace address check is done > by check_ftrace_location() at the beginning of check_kprobe_address_safe(). > At that point, ftrace_location(addr) == addr should return true if the > module is loaded. Then the module is searched twice: > 1. in is_module_text_address(), we find that 'p->addr' is in a module; > 2. in __module_text_address(), we find the module; > > If the module has just been unloaded before the second search, then > '*probed_mod' is NULL and we would not go to get the module refcount, > then the return value of check_kprobe_address_safe() would be 0, but > actually we need to return -EINVAL. OK, so you found a race window in check_kprobe_address_safe(). It does something like below. check_kprobe_address_safe() { ... /* Timing [A] */ if (!(core_kernel_text(p->addr) || is_module_text_address(p->addr)) || ...(other reserved address check)) { return -EINVAL; } /* Timing [B] */ *probed_mod = __module_text_address(p->addr): if (*probe_mod) { if (!try_module_get(*probed_mod)) { return -ENOENT; } ... } } So, if p->addr is in a module which is alive at the timing [A], but unloaded at timing [B], 'p->addr' is passed the 'is_module_text_address(p->addr)' check, but *probed_mod becomes NULL. Thus the corresponding module is not referenced and kprobe_arm(p) will access a wrong address (use after free). This happens either kprobe on ftrace is enabled or not. To fix this problem, we should move the mutex_lock(kprobe_mutex) before check_kprobe_address_safe() because kprobe_module_callback() also lock it so it can stop module unloading. Can you ensure this will fix your problem? I think your patch is just optimizing but not fixing the fundamental problem, which is we don't have an atomic search symbol and get module API. In that case, we should stop a whole module unloading system until registering a new kprobe on a module. (After registering the kprobe, the callback can mark it gone and disarm_kprobe does not work anymore.) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 9d9095e81792..94eaefd1bc51 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1633,11 +1633,11 @@ int register_kprobe(struct kprobe *p) p->nmissed = 0; INIT_LIST_HEAD(>list); + mutex_lock(_mutex); + ret = check_kprobe_address_safe(p, _mod); if (ret) - return ret; - - mutex_lock(_mutex); + goto out; if (on_func_entry) p->flags |= KPROBE_FLAG_ON_FUNC_ENTRY; Thank you, > > To fix it, originally we can simply check 'p->addr' is out of text again, > like below. But that would check twice respectively in kernel text and > module text, so finally I reduce them to be once. > > if (!(core_kernel_text((unsigned long) p->addr) || > is_module_text_address((unsigned long) p->addr)) || ...) { > ret = -EINVAL; > goto out; > } > ... > *probed_mod = __module_text_address((unsigned long) p->addr); > if (*probed_mod) { > ... > } else if (!core_kernel_text((unsigned long) p->addr)) { // check again! > ret = -EINVAL; > goto out; > } > > Signed-off-by: Zheng Yejian > --- > kernel/kprobes.c | 18 -- > 1 file changed, 12 insertions(+), 6 deletions(-) > > v2: > - Update commit messages and comments as suggested by Masami. >Link: > https://lore.kernel.org/all/20240408115038.b0c85767bf1f249eccc32...@kernel.org/ > > v1: > - Link: > https://lore.kernel.org/all/20240407035904.2556645-1-zhengyeji...@huawei.com/ > > diff --git a/kernel/kprobes.c b/kernel/kprobes.c > index 9d9095e81792..65adc815fc6e 100644 > --- a/kernel/kprobes.c > +++ b/kernel/kprobes.c > @@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p, > jump_label_lock(); > preempt_disable(); > > - /* Ensure it is not in reserved area nor out of text */ > - if (!(core_kernel_text((unsigned long) p->addr) || > - is_module_text_address((unsigned long) p->addr)) || > - in_gate_area_no_mm((unsigned long) p->addr) || > + /* Ensure the address is in a text area, and find a module if exists. */ > + *probed_mod = NULL; > + if (!core_kernel_text((unsigned long) p->addr)) { > + *probed_mod = __module_text_address((unsigned long) p->addr); > + if (!(*probed_mod)) { > + ret = -EINVAL; > +
Re: [PATCH v3] Documentation: Add reconnect process for VDUSE
On Mon, Apr 8, 2024 at 3:40 PM Michael S. Tsirkin wrote: > > On Thu, Apr 04, 2024 at 01:56:31PM +0800, Cindy Lu wrote: > > Add a document explaining the reconnect process, including what the > > Userspace App needs to do and how it works with the kernel. > > > > Signed-off-by: Cindy Lu > > --- > > Documentation/userspace-api/vduse.rst | 41 +++ > > 1 file changed, 41 insertions(+) > > > > diff --git a/Documentation/userspace-api/vduse.rst > > b/Documentation/userspace-api/vduse.rst > > index bdb880e01132..7faa83462e78 100644 > > --- a/Documentation/userspace-api/vduse.rst > > +++ b/Documentation/userspace-api/vduse.rst > > @@ -231,3 +231,44 @@ able to start the dataplane processing as follows: > > after the used ring is filled. > > > > For more details on the uAPI, please see include/uapi/linux/vduse.h. > > + > > +HOW VDUSE devices reconnection works > > + > > +1. What is reconnection? > > + > > + When the userspace application loads, it should establish a connection > > + to the vduse kernel device. Sometimes,the userspace application exists, > > + and we want to support its restart and connect to the kernel device > > again > > + > > +2. How can I support reconnection in a userspace application? > > + > > +2.1 During initialization, the userspace application should first verify > > the > > +existence of the device "/dev/vduse/vduse_name". > > +If it doesn't exist, it means this is the first-time for connection. > > goto step 2.2 > > +If it exists, it means this is a reconnection, and we should goto step > > 2.3 > > + > > +2.2 Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on > > +/dev/vduse/control. > > +When ioctl(VDUSE_CREATE_DEV) is called, kernel allocates memory for > > +the reconnect information. The total memory size is > > PAGE_SIZE*vq_mumber. > > Confused. Where is that allocation, in code? > > Thanks! > this should allocated in function vduse_create_dev(), I will rewrite this part to make it more clearer will send a new version soon Thanks cindy > > +2.3 Check if the information is suitable for reconnect > > +If this is reconnection : > > +Before attempting to reconnect, The userspace application needs to use > > the > > +ioctl(VDUSE_DEV_GET_CONFIG, VDUSE_DEV_GET_STATUS, > > VDUSE_DEV_GET_FEATURES...) > > +to get the information from kernel. > > +Please review the information and confirm if it is suitable to > > reconnect. > > + > > +2.4 Userspace application needs to mmap the memory to userspace > > +The userspace application requires mapping one page for every vq. > > These pages > > +should be used to save vq-related information during system running. > > Additionally, > > +the application must define its own structure to store information for > > reconnection. > > + > > +2.5 Completed the initialization and running the application. > > +While the application is running, it is important to store relevant > > information > > +about reconnections in mapped pages. When calling the ioctl > > VDUSE_VQ_GET_INFO to > > +get vq information, it's necessary to check whether it's a > > reconnection. If it is > > +a reconnection, the vq-related information must be get from the mapped > > pages. > > + > > +2.6 When the Userspace application exits, it is necessary to unmap all the > > +pages for reconnection > > -- > > 2.43.0 >
Re: [PATCH v10 12/14] x86/sgx: Turn on per-cgroup EPC reclamation
> --- a/arch/x86/kernel/cpu/sgx/epc_cgroup.h > +++ b/arch/x86/kernel/cpu/sgx/epc_cgroup.h > @@ -28,6 +28,10 @@ static inline int sgx_cgroup_try_charge(struct sgx_cgroup > *sgx_cg, enum sgx_recl > static inline void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg) { } > > static inline void sgx_cgroup_init(void) { } > + > +static inline void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct > mm_struct *charge_mm) > +{ > +} > #else > struct sgx_cgroup { > struct misc_cg *cg; > @@ -65,6 +69,9 @@ static inline void sgx_put_cg(struct sgx_cgroup *sgx_cg) > > int sgx_cgroup_try_charge(struct sgx_cgroup *sgx_cg, enum sgx_reclaim > reclaim); > void sgx_cgroup_uncharge(struct sgx_cgroup *sgx_cg); > +bool sgx_cgroup_lru_empty(struct misc_cg *root); > +bool sgx_cgroup_should_reclaim(struct sgx_cgroup *sgx_cg); > +void sgx_cgroup_reclaim_pages(struct misc_cg *root, struct mm_struct > *charge_mm); Seems the decision to choose whether to implement a stub function for some function is quite random to me. My impression is people in general don't like #ifdef in the C file but prefer to implementing it in the header in some helper function. I guess you might want to just implement a stub function for each of the 3 functions exposed, so that we can eliminate some #ifdefs in the sgx/main.c (see below). > void sgx_cgroup_init(void); > > #endif > diff --git a/arch/x86/kernel/cpu/sgx/main.c b/arch/x86/kernel/cpu/sgx/main.c > index 7f92455d957d..68f28ff2d5ef 100644 > --- a/arch/x86/kernel/cpu/sgx/main.c > +++ b/arch/x86/kernel/cpu/sgx/main.c > @@ -34,6 +34,16 @@ static struct sgx_epc_lru_list sgx_global_lru; > > static inline struct sgx_epc_lru_list *sgx_lru_list(struct sgx_epc_page > *epc_page) > { > +#ifdef CONFIG_CGROUP_SGX_EPC > + if (epc_page->sgx_cg) > + return _page->sgx_cg->lru; > + > + /* > + * This should not happen when cgroup is enabled: Every page belongs > + * to a cgroup, or the root by default. > + */ > + WARN_ON_ONCE(1); In the case MISC cgroup is enabled in Kconfig but disabled by command line, I think this becomes legal now? > +#endif > return _global_lru; > } > > @@ -42,7 +52,11 @@ static inline struct sgx_epc_lru_list *sgx_lru_list(struct > sgx_epc_page *epc_pag > */ > static inline bool sgx_can_reclaim(void) > { > +#ifdef CONFIG_CGROUP_SGX_EPC > + return !sgx_cgroup_lru_empty(misc_cg_root()); > +#else > return !list_empty(_global_lru.reclaimable); > +#endif > } > Here you are using #ifdef CONFIG_CGRUP_SGX_EPC, but ... > static atomic_long_t sgx_nr_free_pages = ATOMIC_LONG_INIT(0); > @@ -404,7 +418,10 @@ static bool sgx_should_reclaim(unsigned long watermark) > > static void sgx_reclaim_pages_global(struct mm_struct *charge_mm) > { > - sgx_reclaim_pages(_global_lru, charge_mm); > + if (IS_ENABLED(CONFIG_CGROUP_SGX_EPC)) > + sgx_cgroup_reclaim_pages(misc_cg_root(), charge_mm); > + else > + sgx_reclaim_pages(_global_lru, charge_mm); > } ... here you are using IS_ENABLED(CONFIG_CGROUP_SGX_EPC). Any reason they are not consistent? Also, in the case where MISC cgroup is disabled via commandline, I think it won't work, because misc_cg_root() should be NULL in this case while IS_ENABLED(CONFIG_CGROUP_SGX_EPC) is true. > > /* > @@ -414,6 +431,16 @@ static void sgx_reclaim_pages_global(struct mm_struct > *charge_mm) > */ > void sgx_reclaim_direct(void) > { > +#ifdef CONFIG_CGROUP_SGX_EPC > + struct sgx_cgroup *sgx_cg = sgx_get_current_cg(); > + > + /* Make sure there are some free pages at cgroup level */ > + if (sgx_cg && sgx_cgroup_should_reclaim(sgx_cg)) { > + sgx_cgroup_reclaim_pages(sgx_cg->cg, current->mm); > + sgx_put_cg(sgx_cg); > + } > +#endif This #ifdef CONFIG_CGROUP_SGX_EPC can be removed if we implement a stub function for sgx_cgroup_should_reclaim(). > + /* Make sure there are some free pages at global level */ > if (sgx_should_reclaim(SGX_NR_LOW_PAGES)) > sgx_reclaim_pages_global(current->mm); > }
Re: [RFC PATCH v3 5/7] mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion
On Fri, 5 Apr 2024 12:24:30 -0700 SeongJae Park wrote: > On Fri, 5 Apr 2024 15:08:54 +0900 Honggyu Kim wrote: > > > This patch introduces DAMOS_MIGRATE_COLD action, which is similar to > > DAMOS_PAGEOUT, but migrate folios to the given 'target_nid' in the sysfs > > instead of swapping them out. > > > > The 'target_nid' sysfs knob is created by this patch to inform the > > migration target node ID. > > Isn't it created by the previous patch? Right. I didn't fix the commit message after split this patch. I will fix it. > > > > Here is one of the example usage of this 'migrate_cold' action. > > > > $ cd /sys/kernel/mm/damon/admin/kdamonds/ > > $ cat contexts//schemes//action > > migrate_cold > > $ echo 2 > contexts//schemes//target_nid > > $ echo commit > state > > $ numactl -p 0 ./hot_cold 500M 600M & > > $ numastat -c -p hot_cold > > > > Per-node process memory usage (in MBs) > > PID Node 0 Node 1 Node 2 Total > > -- -- -- -- - > > 701 (hot_cold) 501 0601 1101 > > > > Since there are some common routines with pageout, many functions have > > similar logics between pageout and migrate cold. > > > > damon_pa_migrate_folio_list() is a minimized version of > > shrink_folio_list(), but it's minified only for demotion. > > MIGRATE_COLD is not only for demotion, right? I think the last two words are > better to be removed for reducing unnecessary confuses. You mean the last two sentences? I will remove them if you feel it's confusing. > > > > Signed-off-by: Honggyu Kim > > Signed-off-by: Hyeongtak Ji > > --- > > include/linux/damon.h| 2 + > > mm/damon/paddr.c | 146 ++- > > mm/damon/sysfs-schemes.c | 4 ++ > > 3 files changed, 151 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/damon.h b/include/linux/damon.h > > index 24ea33a03d5d..df8671e69a70 100644 > > --- a/include/linux/damon.h > > +++ b/include/linux/damon.h > > @@ -105,6 +105,7 @@ struct damon_target { > > * @DAMOS_NOHUGEPAGE: Call ``madvise()`` for the region with > > MADV_NOHUGEPAGE. > > * @DAMOS_LRU_PRIO:Prioritize the region on its LRU lists. > > * @DAMOS_LRU_DEPRIO: Deprioritize the region on its LRU lists. > > + * @DAMOS_MIGRATE_COLD: Migrate for the given cold region. > > Whether it will be for cold region or not is depending on the target access > pattern. What about 'Migrate the regions in coldest regions first manner.'? > Or, simply 'Migrate the regions (prioritize cold)' here, and explain about the > prioritization under quota on the detailed comments part? "Migrate the regions in coldest regions first manner under quota" sounds better. I will change it. > Also, let's use tab consistently. Yeah, it's a mistake. will fix it. > > * @DAMOS_STAT:Do nothing but count the stat. > > * @NR_DAMOS_ACTIONS: Total number of DAMOS actions > > * > > @@ -122,6 +123,7 @@ enum damos_action { > > DAMOS_NOHUGEPAGE, > > DAMOS_LRU_PRIO, > > DAMOS_LRU_DEPRIO, > > + DAMOS_MIGRATE_COLD, > > DAMOS_STAT, /* Do nothing but only record the stat */ > > NR_DAMOS_ACTIONS, > > }; > > diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c > > index 277a1c4d833c..fe217a26f788 100644 > > --- a/mm/damon/paddr.c > > +++ b/mm/damon/paddr.c > > @@ -12,6 +12,9 @@ > > #include > > #include > > #include > > +#include > > +#include > > +#include > > > > #include "../internal.h" > > #include "ops-common.h" > > @@ -226,8 +229,137 @@ static bool damos_pa_filter_out(struct damos *scheme, > > struct folio *folio) > > > > enum migration_mode { > > MIG_PAGEOUT, > > + MIG_MIGRATE_COLD, > > }; > > > > +static unsigned int migrate_folio_list(struct list_head *migrate_folios, > > + struct pglist_data *pgdat, > > + int target_nid) > > To avoid name collisions, I'd prefer having damon_pa_prefix. I show this > patch > is defining damon_pa_migrate_folio_list() below, though. What about > __damon_pa_migrate_folio_list()? Ack. I will change it to __damon_pa_migrate_folio_list(). > > +{ > > + unsigned int nr_succeeded; > > + nodemask_t allowed_mask = NODE_MASK_NONE; > > + > > I personally prefer not having empty lines in the middle of variable > declarations/definitions. Could we remove this empty line? I can remove it, but I would like to have more discussion about this issue. The current implementation allows only a single migration target with "target_nid", but users might want to provide fall back migration target nids. For example, if more than two CXL nodes exist in the system, users might want to migrate cold pages to any CXL nodes. In such cases, we might have to make "target_nid" accept comma separated node IDs. nodemask can be better but we should provide a way to change the scanning order. I would like to hear how you think about
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 13:51, Ondřej Jirman wrote: > Hi Krzysztof, > > On Mon, Apr 08, 2024 at 01:17:32PM GMT, Krzysztof Kozlowski wrote: >> On 08/04/2024 12:51, Pavel Machek wrote: >>> Add binding for anx7688 usb type-c bridge. I don't have a datasheet, >>> but I did best I could. >>> >>> Signed-off-by: Pavel Machek >> >> ... >> >>> + cabledet-gpios: >>> +maxItems: 1 >>> +description: GPIO controlling CABLE_DET (C3) pin. >>> + >>> + avdd10-supply: >>> +description: 1.0V power supply going to AVDD10 (A4, ...) pins >>> + >>> + dvdd10-supply: >>> +description: 1.0V power supply going to DVDD10 (D6, ...) pins >>> + >>> + avdd18-supply: >>> +description: 1.8V power supply going to AVDD18 (E3, ...) pins >>> + >>> + dvdd18-supply: >>> +description: 1.8V power supply going to DVDD18 (G4, ...) pins >>> + >>> + avdd33-supply: >>> +description: 3.3V power supply going to AVDD33 (C4, ...) pins >>> + >>> + i2c-supply: true >>> + vconn-supply: true >> >> There are no such supplies like i2c and vconn on the schematics. > > Which schematics? > > ANX7688 has VCONN1/2_EN GPIOs that control a switching of VCONN power supply > to resective CCx pins. That's just a switch signal. Power for VCONN needs > to come from somewhere and the driver needs to enable the regulator at > the appropriate time only. > > On Pinephone it can't be an always on power supply and needs to be enabled > only when used due to HW design of the circuit. (default without ANX driver > initialized would be to shove 5V to both CC pins, which breaks Type-C spec) > > I2C supply is needed for I2C bus to work, apparently. There's nothing > that says that I2C pull-ups supply has to come from supplies powering the > chip. I2C I/O is open drain and the device needs to enable a bus supply > in order to communicate. No, that's misunderstanding of DT. These are not supplies to anx7688. Bus supply is not related to anx7688. > > You can say that bus master should be managing the bus supply, but you'd still > have a problem that each device may be behind a voltage translator, and > logically, bus master driver should care only about its side of the bus then. > Both side of level shifter need the pull-up power enabled. Again, that's nothing related to anx7688. If you want to add it here, why not to every device everywhere? > > You can also make an argument that bus supply can be always on, but that > causes several other issues on Pinephone due to shared nature of most > resources like these. There are other devices on shared power rails, that > need to be turned off during sleep, etc. > No, do not add non-existing properties on this device as workaround for other issues. Please drop these two supplies *and all other which do not exist* on anx7688. Best regards, Krzysztof
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 13:52, Ondřej Jirman wrote: > On Mon, Apr 08, 2024 at 01:24:03PM GMT, Krzysztof Kozlowski wrote: >> On 08/04/2024 13:21, Pavel Machek wrote: >>> Hi! >>> > Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > but I did best I could. > > Signed-off-by: Pavel Machek ... > + cabledet-gpios: > +maxItems: 1 > +description: GPIO controlling CABLE_DET (C3) pin. > + > + avdd10-supply: > +description: 1.0V power supply going to AVDD10 (A4, ...) pins > + > + dvdd10-supply: > +description: 1.0V power supply going to DVDD10 (D6, ...) pins > + > + avdd18-supply: > +description: 1.8V power supply going to AVDD18 (E3, ...) pins > + > + dvdd18-supply: > +description: 1.8V power supply going to DVDD18 (G4, ...) pins > + > + avdd33-supply: > +description: 3.3V power supply going to AVDD33 (C4, ...) pins > + > + i2c-supply: true > + vconn-supply: true There are no such supplies like i2c and vconn on the schematics. I think this represents some other part of component which was added here only for convenience. >>> >>> Can you give me pointer to documentation you are looking at? >> >> The schematics you linked in the document at the beginning. Page 13. Do >> you see these pins there? I saw only VCONN1_EN, but that's not a supply. > > The supply is U1308. That's not a supply to anx7688. Best regards, Krzysztof
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On Mon, Apr 08, 2024 at 01:24:03PM GMT, Krzysztof Kozlowski wrote: > On 08/04/2024 13:21, Pavel Machek wrote: > > Hi! > > > >>> Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > >>> but I did best I could. > >>> > >>> Signed-off-by: Pavel Machek > >> > >> ... > >> > >>> + cabledet-gpios: > >>> +maxItems: 1 > >>> +description: GPIO controlling CABLE_DET (C3) pin. > >>> + > >>> + avdd10-supply: > >>> +description: 1.0V power supply going to AVDD10 (A4, ...) pins > >>> + > >>> + dvdd10-supply: > >>> +description: 1.0V power supply going to DVDD10 (D6, ...) pins > >>> + > >>> + avdd18-supply: > >>> +description: 1.8V power supply going to AVDD18 (E3, ...) pins > >>> + > >>> + dvdd18-supply: > >>> +description: 1.8V power supply going to DVDD18 (G4, ...) pins > >>> + > >>> + avdd33-supply: > >>> +description: 3.3V power supply going to AVDD33 (C4, ...) pins > >>> + > >>> + i2c-supply: true > >>> + vconn-supply: true > >> > >> There are no such supplies like i2c and vconn on the schematics. > >> > >> I think this represents some other part of component which was added > >> here only for convenience. > > > > Can you give me pointer to documentation you are looking at? > > The schematics you linked in the document at the beginning. Page 13. Do > you see these pins there? I saw only VCONN1_EN, but that's not a supply. The supply is U1308. regards, o. > Best regards, > Krzysztof >
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
Hi Krzysztof, On Mon, Apr 08, 2024 at 01:17:32PM GMT, Krzysztof Kozlowski wrote: > On 08/04/2024 12:51, Pavel Machek wrote: > > Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > > but I did best I could. > > > > Signed-off-by: Pavel Machek > > ... > > > + cabledet-gpios: > > +maxItems: 1 > > +description: GPIO controlling CABLE_DET (C3) pin. > > + > > + avdd10-supply: > > +description: 1.0V power supply going to AVDD10 (A4, ...) pins > > + > > + dvdd10-supply: > > +description: 1.0V power supply going to DVDD10 (D6, ...) pins > > + > > + avdd18-supply: > > +description: 1.8V power supply going to AVDD18 (E3, ...) pins > > + > > + dvdd18-supply: > > +description: 1.8V power supply going to DVDD18 (G4, ...) pins > > + > > + avdd33-supply: > > +description: 3.3V power supply going to AVDD33 (C4, ...) pins > > + > > + i2c-supply: true > > + vconn-supply: true > > There are no such supplies like i2c and vconn on the schematics. Which schematics? ANX7688 has VCONN1/2_EN GPIOs that control a switching of VCONN power supply to resective CCx pins. That's just a switch signal. Power for VCONN needs to come from somewhere and the driver needs to enable the regulator at the appropriate time only. On Pinephone it can't be an always on power supply and needs to be enabled only when used due to HW design of the circuit. (default without ANX driver initialized would be to shove 5V to both CC pins, which breaks Type-C spec) I2C supply is needed for I2C bus to work, apparently. There's nothing that says that I2C pull-ups supply has to come from supplies powering the chip. I2C I/O is open drain and the device needs to enable a bus supply in order to communicate. You can say that bus master should be managing the bus supply, but you'd still have a problem that each device may be behind a voltage translator, and logically, bus master driver should care only about its side of the bus then. Both side of level shifter need the pull-up power enabled. You can also make an argument that bus supply can be always on, but that causes several other issues on Pinephone due to shared nature of most resources like these. There are other devices on shared power rails, that need to be turned off during sleep, etc. Kind regards, o. > I think this represents some other part of component which was added > here only for convenience. > > > > Best regards, > Krzysztof >
Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback
Paolo Bonzini writes: > The .change_pte() MMU notifier callback was intended as an > optimization. The original point of it was that KSM could tell KVM to flip > its secondary PTE to a new location without having to first zap it. At > the time there was also an .invalidate_page() callback; both of them were > *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(), > and .invalidate_page() also doubled as a fallback implementation of > .change_pte(). > > Later on, however, both callbacks were changed to occur within an > invalidate_range_start/end() block. > > In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to > set_pte_at_notify with invalidate_range_start and invalidate_range_end", > 2012-10-09) did so to remove the fallback from .invalidate_page() to > .change_pte() and allow sleepable .invalidate_page() hooks. > > This however made KVM's usage of the .change_pte() callback completely > moot, because KVM unmaps the sPTEs during .invalidate_range_start() > and therefore .change_pte() has no hope of finding a sPTE to change. > Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as > well as all the architecture specific implementations. > > Signed-off-by: Paolo Bonzini > --- > arch/arm64/kvm/mmu.c | 34 - > arch/loongarch/include/asm/kvm_host.h | 1 - > arch/loongarch/kvm/mmu.c | 32 > arch/mips/kvm/mmu.c | 30 --- > arch/powerpc/include/asm/kvm_ppc.h| 1 - > arch/powerpc/kvm/book3s.c | 5 --- > arch/powerpc/kvm/book3s.h | 1 - > arch/powerpc/kvm/book3s_64_mmu_hv.c | 12 -- > arch/powerpc/kvm/book3s_hv.c | 1 - > arch/powerpc/kvm/book3s_pr.c | 7 > arch/powerpc/kvm/e500_mmu_host.c | 6 --- LGTM. Acked-by: Michael Ellerman (powerpc) cheers
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 13:21, Pavel Machek wrote: > Hi! > >>> Add binding for anx7688 usb type-c bridge. I don't have a datasheet, >>> but I did best I could. >>> >>> Signed-off-by: Pavel Machek >> >> ... >> >>> + cabledet-gpios: >>> +maxItems: 1 >>> +description: GPIO controlling CABLE_DET (C3) pin. >>> + >>> + avdd10-supply: >>> +description: 1.0V power supply going to AVDD10 (A4, ...) pins >>> + >>> + dvdd10-supply: >>> +description: 1.0V power supply going to DVDD10 (D6, ...) pins >>> + >>> + avdd18-supply: >>> +description: 1.8V power supply going to AVDD18 (E3, ...) pins >>> + >>> + dvdd18-supply: >>> +description: 1.8V power supply going to DVDD18 (G4, ...) pins >>> + >>> + avdd33-supply: >>> +description: 3.3V power supply going to AVDD33 (C4, ...) pins >>> + >>> + i2c-supply: true >>> + vconn-supply: true >> >> There are no such supplies like i2c and vconn on the schematics. >> >> I think this represents some other part of component which was added >> here only for convenience. > > Can you give me pointer to documentation you are looking at? The schematics you linked in the document at the beginning. Page 13. Do you see these pins there? I saw only VCONN1_EN, but that's not a supply. Best regards, Krzysztof
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
Hi! > > Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > > but I did best I could. > > > > Signed-off-by: Pavel Machek > > ... > > > + cabledet-gpios: > > +maxItems: 1 > > +description: GPIO controlling CABLE_DET (C3) pin. > > + > > + avdd10-supply: > > +description: 1.0V power supply going to AVDD10 (A4, ...) pins > > + > > + dvdd10-supply: > > +description: 1.0V power supply going to DVDD10 (D6, ...) pins > > + > > + avdd18-supply: > > +description: 1.8V power supply going to AVDD18 (E3, ...) pins > > + > > + dvdd18-supply: > > +description: 1.8V power supply going to DVDD18 (G4, ...) pins > > + > > + avdd33-supply: > > +description: 3.3V power supply going to AVDD33 (C4, ...) pins > > + > > + i2c-supply: true > > + vconn-supply: true > > There are no such supplies like i2c and vconn on the schematics. > > I think this represents some other part of component which was added > here only for convenience. Can you give me pointer to documentation you are looking at? Best regards, Pavel -- People of Russia, stop Putin before his war on Ukraine escalates. signature.asc Description: PGP signature
Re: [PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
On 08/04/2024 12:51, Pavel Machek wrote: > Add binding for anx7688 usb type-c bridge. I don't have a datasheet, > but I did best I could. > > Signed-off-by: Pavel Machek ... > + cabledet-gpios: > +maxItems: 1 > +description: GPIO controlling CABLE_DET (C3) pin. > + > + avdd10-supply: > +description: 1.0V power supply going to AVDD10 (A4, ...) pins > + > + dvdd10-supply: > +description: 1.0V power supply going to DVDD10 (D6, ...) pins > + > + avdd18-supply: > +description: 1.8V power supply going to AVDD18 (E3, ...) pins > + > + dvdd18-supply: > +description: 1.8V power supply going to DVDD18 (G4, ...) pins > + > + avdd33-supply: > +description: 3.3V power supply going to AVDD33 (C4, ...) pins > + > + i2c-supply: true > + vconn-supply: true There are no such supplies like i2c and vconn on the schematics. I think this represents some other part of component which was added here only for convenience. Best regards, Krzysztof
Re: [RFC PATCH v3 0/7] DAMON based tiered memory management for CXL memory
Hi SeongJae, On Fri, 5 Apr 2024 12:28:00 -0700 SeongJae Park wrote: > Hello Honggyu, > > On Fri, 5 Apr 2024 15:08:49 +0900 Honggyu Kim wrote: > > > There was an RFC IDEA "DAMOS-based Tiered-Memory Management" previously > > posted at [1]. > > > > It says there is no implementation of the demote/promote DAMOS action > > are made. This RFC is about its implementation for physical address > > space. > > > > > > Changes from RFC v2: > > 1. Rename DAMOS_{PROMOTE,DEMOTE} actions to DAMOS_MIGRATE_{HOT,COLD}. > > 2. Create 'target_nid' to set the migration target node instead of > > depending on node distance based information. > > 3. Instead of having page level access check in this patch series, > > delegate the job to a new DAMOS filter type YOUNG[2]. > > 4. Introduce vmstat counters "damon_migrate_{hot,cold}". > > 5. Rebase from v6.7 to v6.8. > > Thank you for patiently keeping discussion and making this great version! I > left comments on each patch, but found no special concerns. Per-page access > recheck for MIGRATE_HOT and vmstat change are taking my eyes, though. I doubt > if those really needed. It would be nice if you could answer to the comments. I will answer them where you made the comments. > Once my comments on this version are addressed, I would have no reason to > object at dropping the RFC tag from this patchset. Thanks. I will drop the RFC after addressing your comments. > Nonetheless, I show some warnings and errors from checkpatch.pl. I don't > really care about those for RFC patches, so no problem at all. But if you > agree to my opinion about RFC tag dropping, and therefore if you will send > next > version without RFC tag, please make sure you also run checkpatch.pl before > posting. Sure. I will run checkpatch.pl from the next revision. Thanks, Honggyu > > Thanks, > SJ > > [...]
[PATCHv3 2/2] usb: typec: anx7688: Add driver for ANX7688 USB-C HDMI bridge
From: Ondrej Jirman This is driver for ANX7688 USB-C HDMI, with flashing and debugging features removed. ANX7688 is rather criticial piece on PinePhone, there's no display and no battery charging without it. There's likely more work to be done here, but having basic support in mainline is needed to be able to work on the other stuff (networking, cameras, power management). Signed-off-by: Ondrej Jirman Co-developed-by: Martijn Braam Co-developed-by: Samuel Holland Signed-off-by: Pavel Machek --- v2: Fix checkpatch stuff. Some cleanups, adapt to dts format in 1/2. v3: Turn down debugging, as requested during review. diff --git a/drivers/usb/typec/Kconfig b/drivers/usb/typec/Kconfig index 2f80c2792dbd..c9043ae61546 100644 --- a/drivers/usb/typec/Kconfig +++ b/drivers/usb/typec/Kconfig @@ -64,6 +64,17 @@ config TYPEC_ANX7411 If you choose to build this driver as a dynamically linked module, the module will be called anx7411.ko. +config TYPEC_ANX7688 + tristate "Analogix ANX7688 Type-C DRP Port controller and mux driver" + depends on I2C + depends on USB_ROLE_SWITCH + help + Say Y or M here if your system has Analogix ANX7688 Type-C Bridge + controller driver. + + If you choose to build this driver as a dynamically linked module, the + module will be called anx7688.ko. + config TYPEC_RT1719 tristate "Richtek RT1719 Sink Only Type-C controller driver" depends on USB_ROLE_SWITCH || !USB_ROLE_SWITCH diff --git a/drivers/usb/typec/Makefile b/drivers/usb/typec/Makefile index 7a368fea61bc..3f8ff94ad294 100644 --- a/drivers/usb/typec/Makefile +++ b/drivers/usb/typec/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_TYPEC_TCPM)+= tcpm/ obj-$(CONFIG_TYPEC_UCSI) += ucsi/ obj-$(CONFIG_TYPEC_TPS6598X) += tipd/ obj-$(CONFIG_TYPEC_ANX7411)+= anx7411.o +obj-$(CONFIG_TYPEC_ANX7688)+= anx7688.o obj-$(CONFIG_TYPEC_HD3SS3220) += hd3ss3220.o obj-$(CONFIG_TYPEC_STUSB160X) += stusb160x.o obj-$(CONFIG_TYPEC_RT1719) += rt1719.o diff --git a/drivers/usb/typec/anx7688.c b/drivers/usb/typec/anx7688.c new file mode 100644 index ..407cbd5fedba --- /dev/null +++ b/drivers/usb/typec/anx7688.c @@ -0,0 +1,1830 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * ANX7688 USB-C HDMI bridge/PD driver + * + * Warning, this driver is somewhat PinePhone specific. + * + * How this works: + * - this driver allows to program firmware into ANX7688 EEPROM, and + * initialize it + * - it then communicates with the firmware running on the OCM (on-chip + * microcontroller) + * - it detects whether there is cable plugged in or not and powers + * up or down the ANX7688 based on that + * - when the cable is connected the firmware on the OCM will handle + * the detection of the nature of the device on the other end + * of the USB-C cable + * - this driver then communicates with the USB phy to let it swap + * data roles accordingly + * - it also enables VBUS and VCONN regulators as appropriate + * - USB phy driver (Allwinner) needs to know whether to switch to + * device or host mode, or whether to turn off + * - when the firmware detects SRC.1.5A or SRC.3.0A via CC pins + * or something else via PD, it notifies this driver via software + * interrupt and this driver will determine how to update the TypeC + * port status and what input current limit is appropriate + * - input current limit determination happens 500ms after cable + * insertion or hard reset (delay is necessary to determine whether + * the remote end is PD capable or not) + * - this driver tells to the PMIC driver that the input current limit + * needs to be changed + * - this driver also monitors PMIC status and re-sets the input current + * limit if it changes for some reason (due to PMIC internal decision + * making) (this is disabled for now) + * + * ANX7688 FW behavior as observed: + * + * - DO NOT SET MORE THAN 1 SINK CAPABILITY! Firmware will ignore what + * you set and send hardcoded PDO_BATT 5-21V 30W message! + * + * Product brief: + * https://www.analogix.com/en/system/files/AA-002281-PB-6-ANX7688_Product_Brief_0.pdf + * Development notes: + * https://xnux.eu/devices/feature/anx7688.html + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* firmware regs */ + +#define ANX7688_REG_VBUS_OFF_DELAY_TIME 0x22 +#define ANX7688_REG_FEATURE_CTRL0x27 +#define ANX7688_REG_EEPROM_LOAD_STATUS1 0x11 +#define ANX7688_REG_EEPROM_LOAD_STATUS0 0x12 +#define ANX7688_REG_FW_VERSION1 0x15 +#define ANX7688_REG_FW_VERSION0 0x16 + +#define ANX7688_EEPROM_FW_LOADED 0x01 + +#define ANX7688_REG_STATUS_INT_MASK 0x17 +#define ANX7688_REG_STATUS_INT 0x28 +#define ANX7688_IRQS_RECEIVED_MSG BIT(0) +#define ANX7688_IRQS_RECEIVED_ACK BIT(1) +#define
[PATCHv3 1/2] dt-bindings: usb: typec: anx7688: start a binding document
Add binding for anx7688 usb type-c bridge. I don't have a datasheet, but I did best I could. Signed-off-by: Pavel Machek --- v2: implement review feedback v3: fix single character pointed by robot diff --git a/Documentation/devicetree/bindings/usb/analogix,anx7688.yaml b/Documentation/devicetree/bindings/usb/analogix,anx7688.yaml new file mode 100644 index ..48b9ae936cb5 --- /dev/null +++ b/Documentation/devicetree/bindings/usb/analogix,anx7688.yaml @@ -0,0 +1,127 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/usb/analogix,anx7688.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +# Pin names can be deduced from +# https://files.pine64.org/doc/PinePhone/PinePhone%20v1.2b%20Released%20Schematic.pdf + +title: Analogix ANX7688 Type-C controller + +maintainers: + - Pavel Machek + +properties: + compatible: +enum: + - analogix,anx7688 + + reg: +maxItems: 1 + + interrupts: +maxItems: 1 + + reset-gpios: +maxItems: 1 +description: GPIO controlling RESET_N (B7) pin. + + enable-gpios: +maxItems: 1 +description: GPIO controlling POWER_EN (D2) pin. + + cabledet-gpios: +maxItems: 1 +description: GPIO controlling CABLE_DET (C3) pin. + + avdd10-supply: +description: 1.0V power supply going to AVDD10 (A4, ...) pins + + dvdd10-supply: +description: 1.0V power supply going to DVDD10 (D6, ...) pins + + avdd18-supply: +description: 1.8V power supply going to AVDD18 (E3, ...) pins + + dvdd18-supply: +description: 1.8V power supply going to DVDD18 (G4, ...) pins + + avdd33-supply: +description: 3.3V power supply going to AVDD33 (C4, ...) pins + + i2c-supply: true + vconn-supply: true + hdmi-vt-supply: true + vbus-supply: true + vbus-in-supply: true + + connector: +type: object +$ref: /schemas/connector/usb-connector.yaml + +description: + Properties for usb c connector. + +properties: + compatible: +const: usb-c-connector + +required: + - compatible + - reg + - connector + +additionalProperties: false + +examples: + - | +#include +#include + +i2c { +#address-cells = <1>; +#size-cells = <0>; + +typec@2c { +compatible = "analogix,anx7688"; +reg = <0x2c>; +interrupts = <8 IRQ_TYPE_EDGE_FALLING>; +interrupt-parent = <>; + +enable-gpios = < 3 10 GPIO_ACTIVE_LOW>; /* PD10 */ +reset-gpios = < 3 6 GPIO_ACTIVE_HIGH>; /* PD6 */ +cabledet-gpios = <_pio 0 8 GPIO_ACTIVE_HIGH>; /* PL8 */ + +avdd10-supply = <_anx1v0>; +dvdd10-supply = <_anx1v0>; +avdd18-supply = <_ldo_io1>; +dvdd18-supply = <_ldo_io1>; +avdd33-supply = <_dcdc1>; +i2c-supply = <_ldo_io0>; +vconn-supply = <_vconn5v0>; +hdmi-vt-supply = <_dldo1>; + +vbus-supply = <_usb_5v>; +vbus-in-supply = <_power_supply>; + +typec_con: connector { +compatible = "usb-c-connector"; +power-role = "dual"; +data-role = "dual"; +try-power-role = "source"; + +ports { +#address-cells = <1>; +#size-cells = <0>; +port@0 { +reg = <0>; +typec_con_ep: endpoint { +remote-endpoint = <_hs_ep>; +}; +}; +}; +}; +}; +}; +... -- People of Russia, stop Putin before his war on Ukraine escalates. signature.asc Description: PGP signature
[PATCH] tracing: Add new_exec tracepoint
Add "new_exec" tracepoint, which is run right after the point of no return but before the current task assumes its new exec identity. Unlike the tracepoint "sched_process_exec", the "new_exec" tracepoint runs before flushing the old exec, i.e. while the task still has the original state (such as original MM), but when the new exec either succeeds or crashes (but never returns to the original exec). Being able to trace this event can be helpful in a number of use cases: * allowing tracing eBPF programs access to the original MM on exec, before current->mm is replaced; * counting exec in the original task (via perf event); * profiling flush time ("new_exec" to "sched_process_exec"). Example of tracing output ("new_exec" and "sched_process_exec"): $ cat /sys/kernel/debug/tracing/trace_pipe <...>-379 [003] . 179.626921: new_exec: filename=/usr/bin/sshd pid=379 comm=sshd <...>-379 [003] . 179.629131: sched_process_exec: filename=/usr/bin/sshd pid=379 old_pid=379 <...>-381 [002] . 180.048580: new_exec: filename=/bin/bash pid=381 comm=sshd <...>-381 [002] . 180.053122: sched_process_exec: filename=/bin/bash pid=381 old_pid=381 <...>-385 [001] . 180.068277: new_exec: filename=/usr/bin/tty pid=385 comm=bash <...>-385 [001] . 180.069485: sched_process_exec: filename=/usr/bin/tty pid=385 old_pid=385 <...>-389 [006] . 192.020147: new_exec: filename=/usr/bin/dmesg pid=389 comm=bash bash-389 [006] . 192.021377: sched_process_exec: filename=/usr/bin/dmesg pid=389 old_pid=389 Signed-off-by: Marco Elver --- fs/exec.c | 2 ++ include/trace/events/task.h | 30 ++ 2 files changed, 32 insertions(+) diff --git a/fs/exec.c b/fs/exec.c index 38bf71cbdf5e..ab778ae1fc06 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1268,6 +1268,8 @@ int begin_new_exec(struct linux_binprm * bprm) if (retval) return retval; + trace_new_exec(current, bprm); + /* * Ensure all future errors are fatal. */ diff --git a/include/trace/events/task.h b/include/trace/events/task.h index 47b527464d1a..8853dc44783d 100644 --- a/include/trace/events/task.h +++ b/include/trace/events/task.h @@ -56,6 +56,36 @@ TRACE_EVENT(task_rename, __entry->newcomm, __entry->oom_score_adj) ); +/** + * new_exec - called before setting up new exec + * @task: pointer to the current task + * @bprm: pointer to linux_binprm used for new exec + * + * Called before flushing the old exec, but at the point of no return during + * switching to the new exec. + */ +TRACE_EVENT(new_exec, + + TP_PROTO(struct task_struct *task, struct linux_binprm *bprm), + + TP_ARGS(task, bprm), + + TP_STRUCT__entry( + __string( filename, bprm->filename ) + __field(pid_t, pid ) + __string( comm, task->comm ) + ), + + TP_fast_assign( + __assign_str(filename, bprm->filename); + __entry->pid = task->pid; + __assign_str(comm, task->comm); + ), + + TP_printk("filename=%s pid=%d comm=%s", + __get_str(filename), __entry->pid, __get_str(comm)) +); + #endif /* This part must be outside protection */ -- 2.44.0.478.gd926399ef9-goog
[PATCH v2] kprobes: Avoid possible warn in __arm_kprobe_ftrace()
There is once warn in __arm_kprobe_ftrace() on: ret = ftrace_set_filter_ip(ops, (unsigned long)p->addr, 0, 0); if (WARN_ONCE(..., "Failed to arm kprobe-ftrace at %pS (error %d)\n", ...) return ret; This warning is generated because 'p->addr' is detected to be not a valid ftrace location in ftrace_set_filter_ip(). The ftrace address check is done by check_ftrace_location() at the beginning of check_kprobe_address_safe(). At that point, ftrace_location(addr) == addr should return true if the module is loaded. Then the module is searched twice: 1. in is_module_text_address(), we find that 'p->addr' is in a module; 2. in __module_text_address(), we find the module; If the module has just been unloaded before the second search, then '*probed_mod' is NULL and we would not go to get the module refcount, then the return value of check_kprobe_address_safe() would be 0, but actually we need to return -EINVAL. To fix it, originally we can simply check 'p->addr' is out of text again, like below. But that would check twice respectively in kernel text and module text, so finally I reduce them to be once. if (!(core_kernel_text((unsigned long) p->addr) || is_module_text_address((unsigned long) p->addr)) || ...) { ret = -EINVAL; goto out; } ... *probed_mod = __module_text_address((unsigned long) p->addr); if (*probed_mod) { ... } else if (!core_kernel_text((unsigned long) p->addr)) { // check again! ret = -EINVAL; goto out; } Signed-off-by: Zheng Yejian --- kernel/kprobes.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) v2: - Update commit messages and comments as suggested by Masami. Link: https://lore.kernel.org/all/20240408115038.b0c85767bf1f249eccc32...@kernel.org/ v1: - Link: https://lore.kernel.org/all/20240407035904.2556645-1-zhengyeji...@huawei.com/ diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 9d9095e81792..65adc815fc6e 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1567,10 +1567,17 @@ static int check_kprobe_address_safe(struct kprobe *p, jump_label_lock(); preempt_disable(); - /* Ensure it is not in reserved area nor out of text */ - if (!(core_kernel_text((unsigned long) p->addr) || - is_module_text_address((unsigned long) p->addr)) || - in_gate_area_no_mm((unsigned long) p->addr) || + /* Ensure the address is in a text area, and find a module if exists. */ + *probed_mod = NULL; + if (!core_kernel_text((unsigned long) p->addr)) { + *probed_mod = __module_text_address((unsigned long) p->addr); + if (!(*probed_mod)) { + ret = -EINVAL; + goto out; + } + } + /* Ensure it is not in reserved area. */ + if (in_gate_area_no_mm((unsigned long) p->addr) || within_kprobe_blacklist((unsigned long) p->addr) || jump_label_text_reserved(p->addr, p->addr) || static_call_text_reserved(p->addr, p->addr) || @@ -1580,8 +1587,7 @@ static int check_kprobe_address_safe(struct kprobe *p, goto out; } - /* Check if 'p' is probing a module. */ - *probed_mod = __module_text_address((unsigned long) p->addr); + /* Get module refcount and reject __init functions for loaded modules. */ if (*probed_mod) { /* * We must hold a refcount of the probed module while updating -- 2.25.1
[PATCH] [v4] module: don't ignore sysfs_create_link() failures
From: Arnd Bergmann The sysfs_create_link() return code is marked as __must_check, but the module_add_driver() function tries hard to not care, by assigning the return code to a variable. When building with 'make W=1', gcc still warns because this variable is only assigned but not used: drivers/base/module.c: In function 'module_add_driver': drivers/base/module.c:36:6: warning: variable 'no_warn' set but not used [-Wunused-but-set-variable] Rework the code to properly unwind and return the error code to the caller. My reading of the original code was that it tries to not fail when the links already exist, so keep ignoring -EEXIST errors. Fixes: e17e0f51aeea ("Driver core: show drivers in /sys/module/") Reviewed-by: Greg Kroah-Hartman See-also: 4a7fb6363f2d ("add __must_check to device management code") Signed-off-by: Arnd Bergmann --- Luis, can you merge this through the modules-next tree? Cc: Luis Chamberlain Cc: linux-modu...@vger.kernel.org Cc: "Rafael J. Wysocki" --- v4: minor style changes, based on feedback from Andy Shevchenko v3: make error handling stricter, add unwinding, fix build fail with CONFIG_MODULES=n v2: rework to actually handle the error. I have not tested the error handling beyond build testing, so please review carefully. --- drivers/base/base.h | 9 ++--- drivers/base/bus.c| 9 - drivers/base/module.c | 42 +++--- 3 files changed, 45 insertions(+), 15 deletions(-) diff --git a/drivers/base/base.h b/drivers/base/base.h index 0738ccad08b2..db4f910e8e36 100644 --- a/drivers/base/base.h +++ b/drivers/base/base.h @@ -192,11 +192,14 @@ extern struct kset *devices_kset; void devices_kset_move_last(struct device *dev); #if defined(CONFIG_MODULES) && defined(CONFIG_SYSFS) -void module_add_driver(struct module *mod, struct device_driver *drv); +int module_add_driver(struct module *mod, struct device_driver *drv); void module_remove_driver(struct device_driver *drv); #else -static inline void module_add_driver(struct module *mod, -struct device_driver *drv) { } +static inline int module_add_driver(struct module *mod, + struct device_driver *drv) +{ + return 0; +} static inline void module_remove_driver(struct device_driver *drv) { } #endif diff --git a/drivers/base/bus.c b/drivers/base/bus.c index daee55c9b2d9..ffea0728b8b2 100644 --- a/drivers/base/bus.c +++ b/drivers/base/bus.c @@ -674,7 +674,12 @@ int bus_add_driver(struct device_driver *drv) if (error) goto out_del_list; } - module_add_driver(drv->owner, drv); + error = module_add_driver(drv->owner, drv); + if (error) { + printk(KERN_ERR "%s: failed to create module links for %s\n", + __func__, drv->name); + goto out_detach; + } error = driver_create_file(drv, _attr_uevent); if (error) { @@ -699,6 +704,8 @@ int bus_add_driver(struct device_driver *drv) return 0; +out_detach: + driver_detach(drv); out_del_list: klist_del(>knode_bus); out_unregister: diff --git a/drivers/base/module.c b/drivers/base/module.c index 46ad4d636731..a1b55da07127 100644 --- a/drivers/base/module.c +++ b/drivers/base/module.c @@ -30,14 +30,14 @@ static void module_create_drivers_dir(struct module_kobject *mk) mutex_unlock(_dir_mutex); } -void module_add_driver(struct module *mod, struct device_driver *drv) +int module_add_driver(struct module *mod, struct device_driver *drv) { char *driver_name; - int no_warn; struct module_kobject *mk = NULL; + int ret; if (!drv) - return; + return 0; if (mod) mk = >mkobj; @@ -56,17 +56,37 @@ void module_add_driver(struct module *mod, struct device_driver *drv) } if (!mk) - return; + return 0; + + ret = sysfs_create_link(>p->kobj, >kobj, "module"); + if (ret) + return ret; - /* Don't check return codes; these calls are idempotent */ - no_warn = sysfs_create_link(>p->kobj, >kobj, "module"); driver_name = make_driver_name(drv); - if (driver_name) { - module_create_drivers_dir(mk); - no_warn = sysfs_create_link(mk->drivers_dir, >p->kobj, - driver_name); - kfree(driver_name); + if (!driver_name) { + ret = -ENOMEM; + goto out; + } + + module_create_drivers_dir(mk); + if (!mk->drivers_dir) { + ret = -EINVAL; + goto out; } + + ret = sysfs_create_link(mk->drivers_dir, >p->kobj, driver_name); + if (ret) + goto out; + + kfree(driver_name); + + return 0; +out: + sysfs_remove_link(>p->kobj, "module"); +
Re: [PATCH] [v3] module: don't ignore sysfs_create_link() failures
On Tue, Mar 26, 2024, at 16:29, Andy Shevchenko wrote: > On Tue, Mar 26, 2024 at 03:57:18PM +0100, Arnd Bergmann wrote: >> From: Arnd Bergmann >> >> The sysfs_create_link() return code is marked as __must_check, but the >> module_add_driver() function tries hard to not care, by assigning the >> return code to a variable. When building with 'make W=1', gcc still >> warns because this variable is only assigned but not used: >> >> drivers/base/module.c: In function 'module_add_driver': >> drivers/base/module.c:36:6: warning: variable 'no_warn' set but not used >> [-Wunused-but-set-variable] >> >> Rework the code to properly unwind and return the error code to the >> caller. My reading of the original code was that it tries to >> not fail when the links already exist, so keep ignoring -EEXIST >> errors. > >> Cc: Luis Chamberlain >> Cc: linux-modu...@vger.kernel.org >> Cc: Greg Kroah-Hartman >> Cc: "Rafael J. Wysocki" > > Wondering if you can move these to be after --- to avoid polluting commit > message. This will have the same effect and be archived on lore. But on > pros side it will unload the commit message(s) from unneeded noise. Done > >> +error = module_add_driver(drv->owner, drv); >> +if (error) { >> +printk(KERN_ERR "%s: failed to create module links for %s\n", >> +__func__, drv->name); > > What's wrong with pr_err()? Even if it's not a style used, in a new pieces of > code this can be improved beforehand. So, we will reduce a technical debt, and > not adding to it. I think that would be more confusing, and would rather keep the style consistent. There is no practical difference here. >> +int module_add_driver(struct module *mod, struct device_driver *drv) >> { >> char *driver_name; >> -int no_warn; >> +int ret; > > I would move it... > >> struct module_kobject *mk = NULL; > > ...to be here. Done Arnd
Re: [PATCH v2 0/6] virtiofs: fix the warning for ITER_KVEC dio
On Wed, Feb 28, 2024 at 10:41:20PM +0800, Hou Tao wrote: > From: Hou Tao > > Hi, > > The patch set aims to fix the warning related to an abnormal size > parameter of kmalloc() in virtiofs. The warning occurred when attempting > to insert a 10MB sized kernel module kept in a virtiofs with cache > disabled. As analyzed in patch #1, the root cause is that the length of > the read buffer is no limited, and the read buffer is passed directly to > virtiofs through out_args[0].value. Therefore patch #1 limits the > length of the read buffer passed to virtiofs by using max_pages. However > it is not enough, because now the maximal value of max_pages is 256. > Consequently, when reading a 10MB-sized kernel module, the length of the > bounce buffer in virtiofs will be 40 + (256 * 4096), and kmalloc will > try to allocate 2MB from memory subsystem. The request for 2MB of > physically contiguous memory significantly stress the memory subsystem > and may fail indefinitely on hosts with fragmented memory. To address > this, patch #2~#5 use scattered pages in a bio_vec to replace the > kmalloc-allocated bounce buffer when the length of the bounce buffer for > KVEC_ITER dio is larger than PAGE_SIZE. The final issue with the > allocation of the bounce buffer and sg array in virtiofs is that > GFP_ATOMIC is used even when the allocation occurs in a kworker context. > Therefore the last patch uses GFP_NOFS for the allocation of both sg > array and bounce buffer when initiated by the kworker. For more details, > please check the individual patches. > > As usual, comments are always welcome. > > Change Log: Bernd should I just merge the patchset as is? It seems to fix a real problem and no one has the time to work on a better fix WDYT? > v2: > * limit the length of ITER_KVEC dio by max_pages instead of the > newly-introduced max_nopage_rw. Using max_pages make the ITER_KVEC > dio being consistent with other rw operations. > * replace kmalloc-allocated bounce buffer by using a bounce buffer > backed by scattered pages when the length of the bounce buffer for > KVEC_ITER dio is larger than PAG_SIZE, so even on hosts with > fragmented memory, the KVEC_ITER dio can be handled normally by > virtiofs. (Bernd Schubert) > * merge the GFP_NOFS patch [1] into this patch-set and use > memalloc_nofs_{save|restore}+GFP_KERNEL instead of GFP_NOFS > (Benjamin Coddington) > > v1: > https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-hou...@huaweicloud.com/ > > [1]: > https://lore.kernel.org/linux-fsdevel/20240105105305.4052672-1-hou...@huaweicloud.com/ > > Hou Tao (6): > fuse: limit the length of ITER_KVEC dio by max_pages > virtiofs: move alloc/free of argbuf into separated helpers > virtiofs: factor out more common methods for argbuf > virtiofs: support bounce buffer backed by scattered pages > virtiofs: use scattered bounce buffer for ITER_KVEC dio > virtiofs: use GFP_NOFS when enqueuing request through kworker > > fs/fuse/file.c | 12 +- > fs/fuse/virtio_fs.c | 336 +--- > 2 files changed, 296 insertions(+), 52 deletions(-) > > -- > 2.29.2
Re: [PATCH v3] Documentation: Add reconnect process for VDUSE
On Thu, Apr 04, 2024 at 01:56:31PM +0800, Cindy Lu wrote: > Add a document explaining the reconnect process, including what the > Userspace App needs to do and how it works with the kernel. > > Signed-off-by: Cindy Lu > --- > Documentation/userspace-api/vduse.rst | 41 +++ > 1 file changed, 41 insertions(+) > > diff --git a/Documentation/userspace-api/vduse.rst > b/Documentation/userspace-api/vduse.rst > index bdb880e01132..7faa83462e78 100644 > --- a/Documentation/userspace-api/vduse.rst > +++ b/Documentation/userspace-api/vduse.rst > @@ -231,3 +231,44 @@ able to start the dataplane processing as follows: > after the used ring is filled. > > For more details on the uAPI, please see include/uapi/linux/vduse.h. > + > +HOW VDUSE devices reconnection works > + > +1. What is reconnection? > + > + When the userspace application loads, it should establish a connection > + to the vduse kernel device. Sometimes,the userspace application exists, > + and we want to support its restart and connect to the kernel device again > + > +2. How can I support reconnection in a userspace application? > + > +2.1 During initialization, the userspace application should first verify the > +existence of the device "/dev/vduse/vduse_name". > +If it doesn't exist, it means this is the first-time for connection. > goto step 2.2 > +If it exists, it means this is a reconnection, and we should goto step > 2.3 > + > +2.2 Create a new VDUSE instance with ioctl(VDUSE_CREATE_DEV) on > +/dev/vduse/control. > +When ioctl(VDUSE_CREATE_DEV) is called, kernel allocates memory for > +the reconnect information. The total memory size is PAGE_SIZE*vq_mumber. Confused. Where is that allocation, in code? Thanks! > +2.3 Check if the information is suitable for reconnect > +If this is reconnection : > +Before attempting to reconnect, The userspace application needs to use > the > +ioctl(VDUSE_DEV_GET_CONFIG, VDUSE_DEV_GET_STATUS, > VDUSE_DEV_GET_FEATURES...) > +to get the information from kernel. > +Please review the information and confirm if it is suitable to reconnect. > + > +2.4 Userspace application needs to mmap the memory to userspace > +The userspace application requires mapping one page for every vq. These > pages > +should be used to save vq-related information during system running. > Additionally, > +the application must define its own structure to store information for > reconnection. > + > +2.5 Completed the initialization and running the application. > +While the application is running, it is important to store relevant > information > +about reconnections in mapped pages. When calling the ioctl > VDUSE_VQ_GET_INFO to > +get vq information, it's necessary to check whether it's a reconnection. > If it is > +a reconnection, the vq-related information must be get from the mapped > pages. > + > +2.6 When the Userspace application exits, it is necessary to unmap all the > +pages for reconnection > -- > 2.43.0
Re: [PATCH 4/4] mm: replace set_pte_at_notify() with just set_pte_at()
On 05.04.24 13:58, Paolo Bonzini wrote: With the demise of the .change_pte() MMU notifier callback, there is no notification happening in set_pte_at_notify(). It is a synonym of set_pte_at() and can be replaced with it. Signed-off-by: Paolo Bonzini --- A real joy seeing that gone Reviewed-by: David Hildenbrand -- Cheers, David / dhildenb
Re: [PATCH 3/4] mmu_notifier: remove the .change_pte() callback
On 05.04.24 13:58, Paolo Bonzini wrote: The scope of set_pte_at_notify() has reduced more and more through the years. Initially, it was meant for when the change to the PTE was not bracketed by mmu_notifier_invalidate_range_{start,end}(). However, that has not been so for over ten years. During all this period the only implementation of .change_pte() was KVM and it had no actual functionality, because it was called after mmu_notifier_invalidate_range_start() zapped the secondary PTE. Now that this (nonfunctional) user of the .change_pte() callback is gone, the whole callback can be removed. For now, leave in place set_pte_at_notify() even though it is just a synonym for set_pte_at(). Signed-off-by: Paolo Bonzini --- Reviewed-by: David Hildenbrand -- Cheers, David / dhildenb
Re: [PATCH v3 3/3] vhost: Improve vhost_get_avail_idx() with smp_rmb()
On Mon, Apr 08, 2024 at 02:15:24PM +1000, Gavin Shan wrote: > Hi Michael, > > On 3/30/24 19:02, Gavin Shan wrote: > > On 3/28/24 19:31, Michael S. Tsirkin wrote: > > > On Thu, Mar 28, 2024 at 10:21:49AM +1000, Gavin Shan wrote: > > > > All the callers of vhost_get_avail_idx() are concerned to the memory > > > > barrier, imposed by smp_rmb() to ensure the order of the available > > > > ring entry read and avail_idx read. > > > > > > > > Improve vhost_get_avail_idx() so that smp_rmb() is executed when > > > > the avail_idx is advanced. With it, the callers needn't to worry > > > > about the memory barrier. > > > > > > > > Suggested-by: Michael S. Tsirkin > > > > Signed-off-by: Gavin Shan > > > > > > Previous patches are ok. This one I feel needs more work - > > > first more code such as sanity checking should go into > > > this function, second there's actually a difference > > > between comparing to last_avail_idx and just comparing > > > to the previous value of avail_idx. > > > I will pick patches 1-2 and post a cleanup on top so you can > > > take a look, ok? > > > > > > > Thanks, Michael. It's fine to me. > > > > A kindly ping. > > If it's ok to you, could you please merge PATCH[1-2]? Our downstream > 9.4 need the fixes, especially for NVidia's grace-hopper and grace-grace > platforms. > > For PATCH[3], I also can help with the improvement if you don't have time > for it. Please let me know. > > Thanks, > Gavin The thing to do is basically diff with the patch I wrote :) We can also do a bit more cleanups on top of *that*, like unifying error handling. -- MST
Re: [PATCH 1/4] KVM: delete .change_pte MMU notifier callback
On 2024/4/5 下午7:58, Paolo Bonzini wrote: The .change_pte() MMU notifier callback was intended as an optimization. The original point of it was that KSM could tell KVM to flip its secondary PTE to a new location without having to first zap it. At the time there was also an .invalidate_page() callback; both of them were *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(), and .invalidate_page() also doubled as a fallback implementation of change_pte(). Later on, however, both callbacks were changed to occur within an invalidate_range_start/end() block. In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end", 2012-10-09) did so to remove the fallback from .invalidate_page() to change_pte() and allow sleepable .invalidate_page() hooks. This however made KVM's usage of the .change_pte() callback completely moot, because KVM unmaps the sPTEs during .invalidate_range_start() and therefore .change_pte() has no hope of finding a sPTE to change. Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as well as all the architecture specific implementations. Signed-off-by: Paolo Bonzini --- arch/arm64/kvm/mmu.c | 34 - arch/loongarch/include/asm/kvm_host.h | 1 - arch/loongarch/kvm/mmu.c | 32 arch/mips/kvm/mmu.c | 30 --- arch/powerpc/include/asm/kvm_ppc.h| 1 - arch/powerpc/kvm/book3s.c | 5 --- arch/powerpc/kvm/book3s.h | 1 - arch/powerpc/kvm/book3s_64_mmu_hv.c | 12 -- arch/powerpc/kvm/book3s_hv.c | 1 - arch/powerpc/kvm/book3s_pr.c | 7 arch/powerpc/kvm/e500_mmu_host.c | 6 --- arch/riscv/kvm/mmu.c | 20 -- arch/x86/kvm/mmu/mmu.c| 54 +-- arch/x86/kvm/mmu/spte.c | 16 arch/x86/kvm/mmu/spte.h | 2 - arch/x86/kvm/mmu/tdp_mmu.c| 46 --- arch/x86/kvm/mmu/tdp_mmu.h| 1 - include/linux/kvm_host.h | 2 - include/trace/events/kvm.h| 15 virt/kvm/kvm_main.c | 43 - 20 files changed, 2 insertions(+), 327 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index dc04bc767865..ff17849be9f4 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1768,40 +1768,6 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) return false; } -bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range) -{ - kvm_pfn_t pfn = pte_pfn(range->arg.pte); - - if (!kvm->arch.mmu.pgt) - return false; - - WARN_ON(range->end - range->start != 1); - - /* -* If the page isn't tagged, defer to user_mem_abort() for sanitising -* the MTE tags. The S2 pte should have been unmapped by -* mmu_notifier_invalidate_range_end(). -*/ - if (kvm_has_mte(kvm) && !page_mte_tagged(pfn_to_page(pfn))) - return false; - - /* -* We've moved a page around, probably through CoW, so let's treat -* it just like a translation fault and the map handler will clean -* the cache to the PoC. -* -* The MMU notifiers will have unmapped a huge PMD before calling -* ->change_pte() (which in turn calls kvm_set_spte_gfn()) and -* therefore we never need to clear out a huge PMD through this -* calling path and a memcache is not required. -*/ - kvm_pgtable_stage2_map(kvm->arch.mmu.pgt, range->start << PAGE_SHIFT, - PAGE_SIZE, __pfn_to_phys(pfn), - KVM_PGTABLE_PROT_R, NULL, 0); - - return false; -} - bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { u64 size = (range->end - range->start) << PAGE_SHIFT; diff --git a/arch/loongarch/include/asm/kvm_host.h b/arch/loongarch/include/asm/kvm_host.h index 2d62f7b0d377..69305441f40d 100644 --- a/arch/loongarch/include/asm/kvm_host.h +++ b/arch/loongarch/include/asm/kvm_host.h @@ -203,7 +203,6 @@ void kvm_flush_tlb_all(void); void kvm_flush_tlb_gpa(struct kvm_vcpu *vcpu, unsigned long gpa); int kvm_handle_mm_fault(struct kvm_vcpu *vcpu, unsigned long badv, bool write); -void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte); int kvm_unmap_hva_range(struct kvm *kvm, unsigned long start, unsigned long end, bool blockable); int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end); int kvm_test_age_hva(struct kvm *kvm, unsigned long hva); diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c index a556cff35740..98883aa23ab8 100644 --- a/arch/loongarch/kvm/mmu.c +++ b/arch/loongarch/kvm/mmu.c @@ -494,38 +494,6 @@ bool kvm_unmap_gfn_range(struct
Re: [PATCH v3 3/3] vhost: Improve vhost_get_avail_idx() with smp_rmb()
On Mon, Apr 08, 2024 at 02:15:24PM +1000, Gavin Shan wrote: > Hi Michael, > > On 3/30/24 19:02, Gavin Shan wrote: > > On 3/28/24 19:31, Michael S. Tsirkin wrote: > > > On Thu, Mar 28, 2024 at 10:21:49AM +1000, Gavin Shan wrote: > > > > All the callers of vhost_get_avail_idx() are concerned to the memory > > > > barrier, imposed by smp_rmb() to ensure the order of the available > > > > ring entry read and avail_idx read. > > > > > > > > Improve vhost_get_avail_idx() so that smp_rmb() is executed when > > > > the avail_idx is advanced. With it, the callers needn't to worry > > > > about the memory barrier. > > > > > > > > Suggested-by: Michael S. Tsirkin > > > > Signed-off-by: Gavin Shan > > > > > > Previous patches are ok. This one I feel needs more work - > > > first more code such as sanity checking should go into > > > this function, second there's actually a difference > > > between comparing to last_avail_idx and just comparing > > > to the previous value of avail_idx. > > > I will pick patches 1-2 and post a cleanup on top so you can > > > take a look, ok? > > > > > > > Thanks, Michael. It's fine to me. > > > > A kindly ping. > > If it's ok to you, could you please merge PATCH[1-2]? Our downstream > 9.4 need the fixes, especially for NVidia's grace-hopper and grace-grace > platforms. Yes - in the next rc hopefully. > For PATCH[3], I also can help with the improvement if you don't have time > for it. Please let me know. > > Thanks, > Gavin That would be great. -- MST
Re: [PATCH net-next v5] virtio_net: Support RX hash XDP hint
On Mon, Apr 1, 2024 at 11:38 AM Liang Chen wrote: > > On Thu, Feb 29, 2024 at 4:37 PM Liang Chen wrote: > > > > On Tue, Feb 27, 2024 at 4:42 AM John Fastabend > > wrote: > > > > > > Jason Wang wrote: > > > > On Fri, Feb 23, 2024 at 9:42 AM Xuan Zhuo > > > > wrote: > > > > > > > > > > On Fri, 09 Feb 2024 13:57:25 +0100, Paolo Abeni > > > > > wrote: > > > > > > On Fri, 2024-02-09 at 18:39 +0800, Liang Chen wrote: > > > > > > > On Wed, Feb 7, 2024 at 10:27 PM Paolo Abeni > > > > > > > wrote: > > > > > > > > > > > > > > > > On Wed, 2024-02-07 at 10:54 +0800, Liang Chen wrote: > > > > > > > > > On Tue, Feb 6, 2024 at 6:44 PM Paolo Abeni > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > On Sat, 2024-02-03 at 10:56 +0800, Liang Chen wrote: > > > > > > > > > > > On Sat, Feb 3, 2024 at 12:20 AM Jesper Dangaard Brouer > > > > > > > > > > > wrote: > > > > > > > > > > > > On 02/02/2024 13.11, Liang Chen wrote: > > > > > > > > > > [...] > > > > > > > > > > > > > @@ -1033,6 +1039,16 @@ static void > > > > > > > > > > > > > put_xdp_frags(struct xdp_buff *xdp) > > > > > > > > > > > > > } > > > > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > > > > > > > +static void virtnet_xdp_save_rx_hash(struct > > > > > > > > > > > > > virtnet_xdp_buff *virtnet_xdp, > > > > > > > > > > > > > + struct net_device > > > > > > > > > > > > > *dev, > > > > > > > > > > > > > + struct > > > > > > > > > > > > > virtio_net_hdr_v1_hash *hdr_hash) > > > > > > > > > > > > > +{ > > > > > > > > > > > > > + if (dev->features & NETIF_F_RXHASH) { > > > > > > > > > > > > > + virtnet_xdp->hash_value = > > > > > > > > > > > > > hdr_hash->hash_value; > > > > > > > > > > > > > + virtnet_xdp->hash_report = > > > > > > > > > > > > > hdr_hash->hash_report; > > > > > > > > > > > > > + } > > > > > > > > > > > > > +} > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > > > > > > > Would it be possible to store a pointer to hdr_hash in > > > > > > > > > > > > virtnet_xdp_buff, > > > > > > > > > > > > with the purpose of delaying extracting this, until and > > > > > > > > > > > > only if XDP > > > > > > > > > > > > bpf_prog calls the kfunc? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That seems to be the way v1 works, > > > > > > > > > > > https://lore.kernel.org/all/20240122102256.261374-1-liangchen.li...@gmail.com/ > > > > > > > > > > > . But it was pointed out that the inline header may be > > > > > > > > > > > overwritten by > > > > > > > > > > > the xdp prog, so the hash is copied out to maintain its > > > > > > > > > > > integrity. > > > > > > > > > > > > > > > > > > > > Why? isn't XDP supposed to get write access only to the pkt > > > > > > > > > > contents/buffer? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Normally, an XDP program accesses only the packet data. > > > > > > > > > However, > > > > > > > > > there's also an XDP RX Metadata area, referenced by the > > > > > > > > > data_meta > > > > > > > > > pointer. This pointer can be adjusted with > > > > > > > > > bpf_xdp_adjust_meta to > > > > > > > > > point somewhere ahead of the data buffer, thereby granting > > > > > > > > > the XDP > > > > > > > > > program access to the virtio header located immediately > > > > > > > > > before the > > > > > > > > > > > > > > > > AFAICS bpf_xdp_adjust_meta() does not allow moving the > > > > > > > > meta_data before > > > > > > > > xdp->data_hard_start: > > > > > > > > > > > > > > > > https://elixir.bootlin.com/linux/latest/source/net/core/filter.c#L4210 > > > > > > > > > > > > > > > > and virtio net set such field after the virtio_net_hdr: > > > > > > > > > > > > > > > > https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L1218 > > > > > > > > https://elixir.bootlin.com/linux/latest/source/drivers/net/virtio_net.c#L1420 > > > > > > > > > > > > > > > > I don't see how the virtio hdr could be touched? Possibly even > > > > > > > > more > > > > > > > > important: if such thing is possible, I think is should be > > > > > > > > somewhat > > > > > > > > denied (for the same reason an H/W nic should prevent XDP from > > > > > > > > modifying its own buffer descriptor). > > > > > > > > > > > > > > Thank you for highlighting this concern. The header layout differs > > > > > > > slightly between small and mergeable mode. Taking 'mergeable > > > > > > > mode' as > > > > > > > an example, after calling xdp_prepare_buff the layout of xdp_buff > > > > > > > would be as depicted in the diagram below, > > > > > > > > > > > > > > buf > > > > > > >| > > > > > > >v > > > > > > > +--+--+-+ > > > > > > > | xdp headroom | virtio header| packet | > > > > > > > | (256 bytes) | (20 bytes) | content | > > > > > > >
Re: Re: [PATCH v3] vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI
On Wed, Apr 3, 2024 at 10:47 AM tab wrote: > > > > > > > On Fri, Mar 29, 2024 at 11:55:50AM +0800, Jason Wang wrote: > > > > On Wed, Mar 27, 2024 at 5:08 PM Jason Wang wrote: > > > > > > > > > > On Thu, Mar 21, 2024 at 3:00 PM Michael S. Tsirkin > > > > > wrote: > > > > > > > > > > > > On Wed, Mar 20, 2024 at 06:19:12PM +0800, Wang Rong wrote: > > > > > > > From: Rong Wang > > > > > > > > > > > > > > Once enable iommu domain for one device, the MSI > > > > > > > translation tables have to be there for software-managed MSI. > > > > > > > Otherwise, platform with software-managed MSI without an > > > > > > > irq bypass function, can not get a correct memory write event > > > > > > > from pcie, will not get irqs. > > > > > > > The solution is to obtain the MSI phy base address from > > > > > > > iommu reserved region, and set it to iommu MSI cookie, > > > > > > > then translation tables will be created while request irq. > > > > > > > > > > > > > > Change log > > > > > > > -- > > > > > > > > > > > > > > v1->v2: > > > > > > > - add resv iotlb to avoid overlap mapping. > > > > > > > v2->v3: > > > > > > > - there is no need to export the iommu symbol anymore. > > > > > > > > > > > > > > Signed-off-by: Rong Wang > > > > > > > > > > > > There's in interest to keep extending vhost iotlb - > > > > > > we should just switch over to iommufd which supports > > > > > > this already. > > > > > > > > > > IOMMUFD is good but VFIO supports this before IOMMUFD. This patch > > > > > makes vDPA run without a backporting of full IOMMUFD in the production > > > > > environment. I think it's worth. > > > > > > > > > > If you worry about the extension, we can just use the vhost iotlb > > > > > existing facility to do this. > > > > > > > > > > Thanks > > > > > > > > Btw, Wang Rong, > > > > > > > > It looks that Cindy does have the bandwidth in working for IOMMUFD > > > > support. > > > > > > I think you mean she does not. > > > > Yes, you are right. > > > > Thanks > > I need to discuss internally, and there may be someone else will do that. > > Thanks. Ok, please let us know if you have a conclusion. Thanks > > > > > > > > > > Do you have the will to do that? > > > > > > > > Thanks > > > > > > > > -- > 发自我的网易邮箱平板适配版 > > > > - Original Message - > From: "Jason Wang" > To: "Michael S. Tsirkin" > Cc: "Wang Rong" , k...@vger.kernel.org, > virtualizat...@lists.linux.dev, net...@vger.kernel.org, > linux-kernel@vger.kernel.org, "Cindy Lu" > Sent: Fri, 29 Mar 2024 18:39:54 +0800 > Subject: Re: [PATCH v3] vhost/vdpa: Add MSI translation tables to iommu for > software-managed MSI > > On Fri, Mar 29, 2024 at 5:13 PM Michael S. Tsirkin wrote: > > > > On Fri, Mar 29, 2024 at 11:55:50AM +0800, Jason Wang wrote: > > > On Wed, Mar 27, 2024 at 5:08 PM Jason Wang wrote: > > > > > > > > On Thu, Mar 21, 2024 at 3:00 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Wed, Mar 20, 2024 at 06:19:12PM +0800, Wang Rong wrote: > > > > > > From: Rong Wang > > > > > > > > > > > > Once enable iommu domain for one device, the MSI > > > > > > translation tables have to be there for software-managed MSI. > > > > > > Otherwise, platform with software-managed MSI without an > > > > > > irq bypass function, can not get a correct memory write event > > > > > > from pcie, will not get irqs. > > > > > > The solution is to obtain the MSI phy base address from > > > > > > iommu reserved region, and set it to iommu MSI cookie, > > > > > > then translation tables will be created while request irq. > > > > > > > > > > > > Change log > > > > > > -- > > > > > > > > > > > > v1->v2: > > > > > > - add resv iotlb to avoid overlap mapping. > > > > > > v2->v3: > > > > > > - there is no need to export the iommu symbol anymore. > > > > > > > > > > > > Signed-off-by: Rong Wang > > > > > > > > > > There's in interest to keep extending vhost iotlb - > > > > > we should just switch over to iommufd which supports > > > > > this already. > > > > > > > > IOMMUFD is good but VFIO supports this before IOMMUFD. This patch > > > > makes vDPA run without a backporting of full IOMMUFD in the production > > > > environment. I think it's worth. > > > > > > > > If you worry about the extension, we can just use the vhost iotlb > > > > existing facility to do this. > > > > > > > > Thanks > > > > > > Btw, Wang Rong, > > > > > > It looks that Cindy does have the bandwidth in working for IOMMUFD > > > support. > > > > I think you mean she does not. > > Yes, you are right. > > Thanks > > > > > > Do you have the will to do that? > > > > > > Thanks > >
Re: [PATCH 2/4] KVM: remove unused argument of kvm_handle_hva_range()
On 5/4/24 13:58, Paolo Bonzini wrote: The only user was kvm_mmu_notifier_change_pte(), which is now gone. Signed-off-by: Paolo Bonzini --- virt/kvm/kvm_main.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 4/4] mm: replace set_pte_at_notify() with just set_pte_at()
On 5/4/24 13:58, Paolo Bonzini wrote: With the demise of the .change_pte() MMU notifier callback, there is no notification happening in set_pte_at_notify(). It is a synonym of set_pte_at() and can be replaced with it. Signed-off-by: Paolo Bonzini --- include/linux/mmu_notifier.h | 2 -- kernel/events/uprobes.c | 5 ++--- mm/ksm.c | 4 ++-- mm/memory.c | 7 +-- mm/migrate_device.c | 8 ++-- 5 files changed, 7 insertions(+), 19 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 3/3] dt-bindings: remoteproc: qcom,sdm845-adsp-pil: Fix qcom,halt-regs definition
On 07/04/2024 11:58, Luca Weiss wrote: > Set the 'items' correctly for the qcom,halt-regs property and update the > description to match what it should be. > > Signed-off-by: Luca Weiss > --- > .../devicetree/bindings/remoteproc/qcom,sdm845-adsp-pil.yaml| 6 > +- > 1 file changed, 5 insertions(+), 1 deletion(-) Reviewed-by: Krzysztof Kozlowski Best regards, Krzysztof