date:20210303

[PATCH AUTOSEL 5.10 24/47] sparc64: Use arch_validate_flags() to validate ADI flag

2021-03-03 Thread Sasha Levin

From: Khalid Aziz 

[ Upstream commit 147d8622f2a26ef34beacc60e1ed8b66c2fa457f ]

When userspace calls mprotect() to enable ADI on an address range,
do_mprotect_pkey() calls arch_validate_prot() to validate new
protection flags. arch_validate_prot() for sparc looks at the first
VMA associated with address range to verify if ADI can indeed be
enabled on this address range. This has two issues - (1) Address
range might cover multiple VMAs while arch_validate_prot() looks at
only the first VMA, (2) arch_validate_prot() peeks at VMA without
holding mmap lock which can result in race condition.

arch_validate_flags() from commit c462ac288f2c ("mm: Introduce
arch_validate_flags()") allows for VMA flags to be validated for all
VMAs that cover the address range given by user while holding mmap
lock. This patch updates sparc code to move the VMA check from
arch_validate_prot() to arch_validate_flags() to fix above two
issues.
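The difference between the two checks can be modelled in user space. This is only a sketch under stated assumptions: struct vma_model, the flag values and both helper functions are hypothetical stand-ins, not kernel definitions, and the VMA array is assumed to be sorted by address.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's. */
#define VM_PFNMAP	0x1u
#define VM_MIXEDMAP	0x2u
#define VM_MERGEABLE	0x4u

/* Simplified stand-in for struct vm_area_struct. */
struct vma_model {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	unsigned int flags;
};

/* Per-VMA conditions, mirroring the ones in the patch. */
static bool adi_flags_ok(unsigned int flags)
{
	if (flags & (VM_PFNMAP | VM_MIXEDMAP))
		return false;
	if (flags & VM_MERGEABLE)
		return false;
	return true;
}

/* Old behaviour: only the first VMA ending above addr is inspected. */
static bool check_first_vma_only(const struct vma_model *v, size_t n,
				 unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr < v[i].end)
			return adi_flags_ok(v[i].flags);
	return true;	/* no VMA found: the old code let the request through */
}

/* New behaviour: every VMA overlapping [addr, addr + len) is inspected. */
static bool check_all_vmas(const struct vma_model *v, size_t n,
			   unsigned long addr, unsigned long len)
{
	for (size_t i = 0; i < n; i++) {
		if (v[i].end <= addr || v[i].start >= addr + len)
			continue;	/* no overlap with the requested range */
		if (!adi_flags_ok(v[i].flags))
			return false;
	}
	return true;
}
```

With two adjacent VMAs where only the second is mergeable, the first-VMA check accepts a range spanning both, while the per-VMA check rejects it. The model does not cover the locking issue.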

Suggested-by: Jann Horn 
Suggested-by: Christoph Hellwig 
Suggested-by: Catalin Marinas 
Signed-off-by: Khalid Aziz 
Reviewed-by: Catalin Marinas 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 arch/sparc/include/asm/mman.h | 54 +++
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/arch/sparc/include/asm/mman.h b/arch/sparc/include/asm/mman.h
index f94532f25db1..274217e7ed70 100644
--- a/arch/sparc/include/asm/mman.h
+++ b/arch/sparc/include/asm/mman.h
@@ -57,35 +57,39 @@ static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
 {
if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_ADI))
return 0;
-   if (prot & PROT_ADI) {
-   if (!adi_capable())
-   return 0;
+   return 1;
+}
 
-   if (addr) {
-   struct vm_area_struct *vma;
+#define arch_validate_flags(vm_flags) arch_validate_flags(vm_flags)
+/* arch_validate_flags() - Ensure combination of flags is valid for a
+ * VMA.
+ */
+static inline bool arch_validate_flags(unsigned long vm_flags)
+{
+   /* If ADI is being enabled on this VMA, check for ADI
+* capability on the platform and ensure VMA is suitable
+* for ADI
+*/
+   if (vm_flags & VM_SPARC_ADI) {
+   if (!adi_capable())
+   return false;
 
-   vma = find_vma(current->mm, addr);
-   if (vma) {
-   /* ADI can not be enabled on PFN
-* mapped pages
-*/
-   if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
-   return 0;
+   /* ADI can not be enabled on PFN mapped pages */
+   if (vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
+   return false;
 
-   /* Mergeable pages can become unmergeable
-* if ADI is enabled on them even if they
-* have identical data on them. This can be
-* because ADI enabled pages with identical
-* data may still not have identical ADI
-* tags on them. Disallow ADI on mergeable
-* pages.
-*/
-   if (vma->vm_flags & VM_MERGEABLE)
-   return 0;
-   }
-   }
+   /* Mergeable pages can become unmergeable
+* if ADI is enabled on them even if they
+* have identical data on them. This can be
+* because ADI enabled pages with identical
+* data may still not have identical ADI
+* tags on them. Disallow ADI on mergeable
+* pages.
+*/
+   if (vm_flags & VM_MERGEABLE)
+   return false;
}
-   return 1;
+   return true;
 }
 #endif /* CONFIG_SPARC64 */
 
-- 
2.30.1

[PATCH AUTOSEL 4.14 02/13] mmc: mxs-mmc: Fix a resource leak in an error handling path in 'mxs_mmc_probe()'

2021-03-03 Thread Sasha Levin

From: Christophe JAILLET 

[ Upstream commit 0bb7e560f821c7770973a94e346654c4bdccd42c ]

If 'mmc_of_parse()' fails, we must undo the previous 'dma_request_chan()'
call.
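The effect of jumping to the wrong unwind label can be shown with a small user-space model. It is a sketch, not the driver code: the counters stand in for the clock and the DMA channel, and probe_model() and its parameters are hypothetical.

```c
#include <stdbool.h>

/* Counters standing in for real resources (a clock and a DMA channel). */
static int clk_enabled;
static int dma_held;

/*
 * Hypothetical probe. 'parse_fails' simulates mmc_of_parse() failing and
 * 'use_fixed_label' selects between the old and the new goto target.
 */
static int probe_model(bool parse_fails, bool use_fixed_label)
{
	int ret;

	clk_enabled++;			/* step 1: enable the clock */
	dma_held++;			/* step 2: request the DMA channel */

	ret = parse_fails ? -22 : 0;	/* step 3: parse; -22 stands in for -EINVAL */
	if (ret) {
		if (use_fixed_label)
			goto out_free_dma;	/* unwinds steps 2 and 1 */
		goto out_clk_disable;		/* unwinds step 1 only: leak */
	}
	return 0;

out_free_dma:
	dma_held--;
out_clk_disable:
	clk_enabled--;
	return ret;
}
```

Labels are unwound in reverse order of acquisition, so an error after step 2 has to enter the chain at the label that releases step 2.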

Signed-off-by: Christophe JAILLET 
Link: https://lore.kernel.org/r/20201208203527.49262-1-christophe.jail...@wanadoo.fr
Signed-off-by: Ulf Hansson 
Signed-off-by: Sasha Levin 
---
 drivers/mmc/host/mxs-mmc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mmc/host/mxs-mmc.c b/drivers/mmc/host/mxs-mmc.c
index add1e70195ea..7125687faf76 100644
--- a/drivers/mmc/host/mxs-mmc.c
+++ b/drivers/mmc/host/mxs-mmc.c
@@ -659,7 +659,7 @@ static int mxs_mmc_probe(struct platform_device *pdev)
 
ret = mmc_of_parse(mmc);
if (ret)
-   goto out_clk_disable;
+   goto out_free_dma;
 
mmc->ocr_avail = MMC_VDD_32_33 | MMC_VDD_33_34;
 
-- 
2.30.1

Re: [PATCH v1] microblaze: tag highmem_setup() with __meminit

2021-03-03 Thread Michal Simek




On 3/2/21 10:04 AM, David Hildenbrand wrote:
> On 01.03.21 23:18, Oscar Salvador wrote:
>> On Mon, Mar 01, 2021 at 12:47:49PM +0100, David Hildenbrand wrote:
>>> With commit a0cd7a7c4bc0 ("mm: simplify free_highmem_page() and
>>> free_reserved_page()") the kernel test robot complains about a warning:
>>>
>>> WARNING: modpost: vmlinux.o(.text.unlikely+0x23ac): Section mismatch in
>>>    reference from the function highmem_setup() to the function
>>>    .meminit.text:memblock_is_reserved()
>>>
>>> This has been broken ever since microblaze added highmem support,
>>> because memblock_is_reserved() was already tagged with "__init" back
>>> then -
>>> most probably the function always got inlined, so we never stumbled over
>>> it.
>>
>> It might be good to point out that we need __meminit instead of __init
>> because microblaze platform does not define CONFIG_ARCH_KEEP_MEMBLOCK,
> >> and __init_memblock falls back to that.
>>
>> (I had to go and look as I was puzzled :-) )
>>
>> Reviewed-by: Oscar Salvador 
> 
> Thanks!
> 
> Whoever feels like picking this up (@Andrew?) can you add
> 
> "We need __meminit because __init_memblock defaults to that without
> CONFIG_ARCH_KEEP_MEMBLOCK" and __init_memblock is not used outside
> memblock code.
> 

Applied with this line added.

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs

[PATCH v3 04/32] KVM: arm64: Initialize kvm_nvhe_init_params early

2021-03-03 Thread Quentin Perret

Move the initialization of kvm_nvhe_init_params in a dedicated function
that is run early, and only once during KVM init, rather than every time
the KVM vectors are set and reset.

This also opens the opportunity for the hypervisor to change the init
structs during boot, hence simplifying the replacement of host-provided
page-table by the one the hypervisor will create for itself.

Signed-off-by: Quentin Perret 
---
 arch/arm64/kvm/arm.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fc4c95dd2d26..2d1e7ef69c04 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1383,22 +1383,18 @@ static int kvm_init_vector_slots(void)
return 0;
 }
 
-static void cpu_init_hyp_mode(void)
+static void cpu_prepare_hyp_mode(int cpu)
 {
-   struct kvm_nvhe_init_params *params = this_cpu_ptr_nvhe_sym(kvm_init_params);
-   struct arm_smccc_res res;
+   struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
unsigned long tcr;
 
-   /* Switch from the HYP stub to our own HYP init vector */
-   __hyp_set_vectors(kvm_get_idmap_vector());
-
/*
 * Calculate the raw per-cpu offset without a translation from the
 * kernel's mapping to the linear mapping, and store it in tpidr_el2
 * so that we can use adr_l to access per-cpu variables in EL2.
 * Also drop the KASAN tag which gets in the way...
 */
-   params->tpidr_el2 = (unsigned long)kasan_reset_tag(this_cpu_ptr_nvhe_sym(__per_cpu_start)) -
+   params->tpidr_el2 = (unsigned long)kasan_reset_tag(per_cpu_ptr_nvhe_sym(__per_cpu_start, cpu)) -
(unsigned long)kvm_ksym_ref(CHOOSE_NVHE_SYM(__per_cpu_start));
 
params->mair_el2 = read_sysreg(mair_el1);
@@ -1422,7 +1418,7 @@ static void cpu_init_hyp_mode(void)
tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
params->tcr_el2 = tcr;
 
-   params->stack_hyp_va = kern_hyp_va(__this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE);
+   params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
params->pgd_pa = kvm_mmu_get_httbr();
 
/*
@@ -1430,6 +1426,15 @@ static void cpu_init_hyp_mode(void)
 * be read while the MMU is off.
 */
kvm_flush_dcache_to_poc(params, sizeof(*params));
+}
+
+static void cpu_init_hyp_mode(void)
+{
+   struct kvm_nvhe_init_params *params;
+   struct arm_smccc_res res;
+
+   /* Switch from the HYP stub to our own HYP init vector */
+   __hyp_set_vectors(kvm_get_idmap_vector());
 
/*
 * Call initialization code, and switch to the full blown HYP code.
@@ -1438,6 +1443,7 @@ static void cpu_init_hyp_mode(void)
 * cpus_have_const_cap() wrapper.
 */
BUG_ON(!system_capabilities_finalized());
+   params = this_cpu_ptr_nvhe_sym(kvm_init_params);
arm_smccc_1_1_hvc(KVM_HOST_SMCCC_FUNC(__kvm_hyp_init), virt_to_phys(params), &res);
WARN_ON(res.a0 != SMCCC_RET_SUCCESS);
 
@@ -1785,19 +1791,19 @@ static int init_hyp_mode(void)
}
}
 
-   /*
-* Map Hyp percpu pages
-*/
for_each_possible_cpu(cpu) {
char *percpu_begin = (char *)kvm_arm_hyp_percpu_base[cpu];
char *percpu_end = percpu_begin + nvhe_percpu_size();
 
+   /* Map Hyp percpu pages */
err = create_hyp_mappings(percpu_begin, percpu_end, PAGE_HYP);
-
if (err) {
kvm_err("Cannot map hyp percpu region\n");
goto out_err;
}
+
+   /* Prepare the CPU initialization parameters */
+   cpu_prepare_hyp_mode(cpu);
}
 
if (is_protected_kvm_enabled()) {
-- 
2.30.1.766.gb4fecdf3b7-goog

Re: [PATCH 01/12] Documentation: add BCM6328 pincontroller binding documentation

2021-03-03 Thread Linus Walleij

On Thu, Feb 25, 2021 at 5:42 PM Álvaro Fernández Rojas
 wrote:

> Add binding documentation for the pincontrol core found in BCM6328 SoCs.
>
> Signed-off-by: Álvaro Fernández Rojas 
> Signed-off-by: Jonas Gorski 
(...)
> +  interrupts-extended:
> +description:
> +  One interrupt per each of the 4 GPIO ports supported by the controller,
> +  sorted by port number ascending order.
> +minItems: 4
> +maxItems: 4

I don't know if this is advisable, there are different ways
of specifying interrupts so this may become ambiguous,
I think Rob will know how/if to do this though.

Yours,
Linus Walleij

Re: [PATCH V3 XRT Alveo 11/18] fpga: xrt: UCS platform driver

2021-03-03 Thread Tom Rix



On 2/17/21 10:40 PM, Lizhi Hou wrote:
> Add UCS driver. UCS is a hardware function discovered by walking xclbin
What does UCS stand for ? add to commit log
> metadata. A platform device node will be created for it.
> UCS enables/disables the dynamic region clocks.
>
> Signed-off-by: Sonal Santan 
> Signed-off-by: Max Zhen 
> Signed-off-by: Lizhi Hou 
> ---
>  drivers/fpga/xrt/include/xleaf/ucs.h |  24 +++
>  drivers/fpga/xrt/lib/xleaf/ucs.c | 235 +++
>  2 files changed, 259 insertions(+)
>  create mode 100644 drivers/fpga/xrt/include/xleaf/ucs.h
>  create mode 100644 drivers/fpga/xrt/lib/xleaf/ucs.c
>
> diff --git a/drivers/fpga/xrt/include/xleaf/ucs.h 
> b/drivers/fpga/xrt/include/xleaf/ucs.h
> new file mode 100644
> index ..a5ef0e100e12
> --- /dev/null
> +++ b/drivers/fpga/xrt/include/xleaf/ucs.h

This header is only used by ucs.c, so is it needed ?

could the enum be defined in ucs.c ?

> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Header file for XRT UCS Leaf Driver
> + *
> + * Copyright (C) 2020-2021 Xilinx, Inc.
> + *
> + * Authors:
> + *   Lizhi Hou 
> + */
> +
> +#ifndef _XRT_UCS_H_
> +#define _XRT_UCS_H_
> +
> +#include "xleaf.h"
> +
> +/*
> + * UCS driver IOCTL calls.
> + */
> +enum xrt_ucs_ioctl_cmd {
> + XRT_UCS_CHECK = XRT_XLEAF_CUSTOM_BASE, /* See comments in xleaf.h */
> + XRT_UCS_ENABLE,
no disable ?
> +};
> +
> +#endif   /* _XRT_UCS_H_ */
> diff --git a/drivers/fpga/xrt/lib/xleaf/ucs.c 
> b/drivers/fpga/xrt/lib/xleaf/ucs.c
> new file mode 100644
> index ..ae762c8fddbb
> --- /dev/null
> +++ b/drivers/fpga/xrt/lib/xleaf/ucs.c
> @@ -0,0 +1,235 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Xilinx Alveo FPGA UCS Driver
> + *
> + * Copyright (C) 2020-2021 Xilinx, Inc.
> + *
> + * Authors:
> + *  Lizhi Hou
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "metadata.h"
> +#include "xleaf.h"
> +#include "xleaf/ucs.h"
> +#include "xleaf/clock.h"
> +
> +#define UCS_ERR(ucs, fmt, arg...)   \
> + xrt_err((ucs)->pdev, fmt "\n", ##arg)
> +#define UCS_WARN(ucs, fmt, arg...)  \
> + xrt_warn((ucs)->pdev, fmt "\n", ##arg)
> +#define UCS_INFO(ucs, fmt, arg...)  \
> + xrt_info((ucs)->pdev, fmt "\n", ##arg)
> +#define UCS_DBG(ucs, fmt, arg...)   \
> + xrt_dbg((ucs)->pdev, fmt "\n", ##arg)
> +
> +#define XRT_UCS  "xrt_ucs"
> +
> +#define CHANNEL1_OFFSET  0
> +#define CHANNEL2_OFFSET  8
> +
> +#define CLK_MAX_VALUE6400
> +
> +struct ucs_control_status_ch1 {
> + unsigned int shutdown_clocks_latched:1;
> + unsigned int reserved1:15;
> + unsigned int clock_throttling_average:14;
> + unsigned int reserved2:2;
> +};
Likely needs to be packed and/or the unsigned int changed to u32
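One way to avoid depending on compiler bit-field layout is to decode the 32-bit register value with explicit masks and shifts. The sketch below assumes the bit positions implied by the member order with LSB-first allocation; the macro and function names are hypothetical, and the hardware documentation would be the authority on the real layout.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed layout (member order of the bit-field, LSB first):
 *   bit  0       shutdown_clocks_latched
 *   bits 1..15   reserved
 *   bits 16..29  clock_throttling_average
 *   bits 30..31  reserved
 */
#define UCS_CH1_SHUTDOWN_LATCHED	(UINT32_C(1) << 0)
#define UCS_CH1_THROTTLE_AVG_SHIFT	16
#define UCS_CH1_THROTTLE_AVG_MASK	(UINT32_C(0x3fff) << UCS_CH1_THROTTLE_AVG_SHIFT)

/* True when the latched-shutdown bit is set in the raw register value. */
static inline bool ucs_ch1_shutdown_latched(uint32_t status)
{
	return (status & UCS_CH1_SHUTDOWN_LATCHED) != 0;
}

/* Extract the 14-bit throttling average from the raw register value. */
static inline uint32_t ucs_ch1_throttle_avg(uint32_t status)
{
	return (status & UCS_CH1_THROTTLE_AVG_MASK) >> UCS_CH1_THROTTLE_AVG_SHIFT;
}
```

This form works directly on the value returned by a 32-bit register read, so no pointer cast to a struct is needed.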
> +
> +struct xrt_ucs {
> + struct platform_device  *pdev;
> + void __iomem*ucs_base;
> + struct mutexucs_lock; /* ucs dev lock */
> +};
> +
> +static inline u32 reg_rd(struct xrt_ucs *ucs, u32 offset)
> +{
> + return ioread32(ucs->ucs_base + offset);
> +}
> +
> +static inline void reg_wr(struct xrt_ucs *ucs, u32 val, u32 offset)
> +{
> + iowrite32(val, ucs->ucs_base + offset);
> +}
> +
> +static void xrt_ucs_event_cb(struct platform_device *pdev, void *arg)
> +{
> + struct platform_device  *leaf;
> + struct xrt_event *evt = (struct xrt_event *)arg;
> + enum xrt_events e = evt->xe_evt;
> + enum xrt_subdev_id id = evt->xe_subdev.xevt_subdev_id;
> + int instance = evt->xe_subdev.xevt_subdev_instance;
> +
> + switch (e) {
> + case XRT_EVENT_POST_CREATION:
> + break;
> + default:
> + xrt_dbg(pdev, "ignored event %d", e);
> + return;
> + }
this switch is a noop, remove
> +
> + if (id != XRT_SUBDEV_CLOCK)
> + return;
> +
> + leaf = xleaf_get_leaf_by_id(pdev, XRT_SUBDEV_CLOCK, instance);
> + if (!leaf) {
> + xrt_err(pdev, "does not get clock subdev");
> + return;
> + }
> +
> + xleaf_ioctl(leaf, XRT_CLOCK_VERIFY, NULL);
> + xleaf_put_leaf(pdev, leaf);
> +}
> +
> +static void ucs_check(struct xrt_ucs *ucs, bool *latched)
> +{

checking but not returning status, change to returning int.

this function is called by xrt_ucs_leaf_ioctl, which does return status.

> + struct ucs_control_status_ch1 *ucs_status_ch1;
> + u32 status;
> +
> + mutex_lock(&ucs->ucs_lock);
> + status = reg_rd(ucs, CHANNEL1_OFFSET);
> + ucs_status_ch1 = (struct ucs_control_status_ch1 *)&status;
> + if (ucs_status_ch1->shutdown_clocks_latched) {
> + UCS_ERR(ucs,
> + "Critical temperature or power event, kernel clocks 
> have been stopped.");
> + UCS_ERR(ucs,
> + "run 'xbutil valiate -q' to continue. See AR 73398 for 
> more details.");
This error message does not seem like it w

Re: [PATCH v2 1/3] mtd: partitions: ofpart: skip subnodes parse with compatible

2021-03-03 Thread Ansuel Smith

On Tue, Mar 02, 2021 at 05:53:54PM +0100, Rafał Miłecki wrote:
> On 16.02.2021 22:26, Ansuel Smith wrote:
> > If a partitions structure is not used, parse direct subnodes as
> > fixed-partitions only if a compatible is not found or is of type
> > fixed-partition. A parser can be used directly on the subnode and
> > subnodes should not be parsed as fixed-partitions by default.
> > 
> > Signed-off-by: Ansuel Smith 
> > ---
> >   drivers/mtd/parsers/ofpart.c | 5 +
> >   1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/mtd/parsers/ofpart.c b/drivers/mtd/parsers/ofpart.c
> > index daf507c123e6..4b363dd0311c 100644
> > --- a/drivers/mtd/parsers/ofpart.c
> > +++ b/drivers/mtd/parsers/ofpart.c
> > @@ -50,6 +50,11 @@ static int parse_fixed_partitions(struct mtd_info 
> > *master,
> >  master->name, mtd_node);
> > ofpart_node = mtd_node;
> > dedicated = false;
> > +
> > +   /* Skip parsing direct subnodes if a compatible is found and is 
> > not fixed-partitions */
> > +   if (node_has_compatible(ofpart_node) &&
> > +   !of_device_is_compatible(ofpart_node, "fixed-partitions"))
> > +   return 0;
> > } else if (!of_device_is_compatible(ofpart_node, "fixed-partitions")) {
> > /* The 'partitions' subnode might be used by another parser */
> > return 0;
> 
> I admit I'm not familiar with the old binding, so let me know if my
> understanding is incorrect.
> 
> It seems to me however that your change will break parsing in cases
> like:
> 
> spi-flash@0 {
>   compatible = "jedec,spi-nor";
>   reg = <0x0>;
> 
>   partition@0 {
>   label = "bootloader";
>   reg = <0x0 0x10>;
>   };
> };
> 
> nandcs@0 {
>   compatible = "brcm,nandcs";
>   reg = <0>;
> 
>   partition@0 {
>   label = "bootloader";
>   reg = <0x000 0x1>;
>   };
> };
> 
> Did we ever use "fixed-partitions" without partitions { } subnode?

Hi, very good point. You are right and I didn't think about this case.
I don't want to state something false, but since the ofpart parser and
the partitions structure should have been pushed at the same time, there
shouldn't be any use of "fixed-partitions" without a partitions { }
subnode. With this assumption, the current implementation looks to be the
cleanest way to skip parsing. (The idea was: if the parsing is dubious,
don't parse at all.)
The hacky and IMHO dirty way to solve this is to add a bool to directly
skip the subnode parsing and check for that. Something like
"no-fixed-partition", which would disable the ofnode parser for nodes
with no partitions { } subnode, would accomplish the same result as this
patch and keep compatibility with the node scheme you pointed out.
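The two options discussed above can be compared with a small model. This is a sketch only: struct node_model and both functions are hypothetical, and "no_fixed_partition" represents the opt-out bool floated in this mail, not an existing binding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a device-tree flash node. */
struct node_model {
	const char *compatible;		/* NULL when the node has no compatible */
	bool has_partitions_subnode;	/* a 'partitions' child exists */
	bool no_fixed_partition;	/* proposed opt-out; not an existing binding */
};

static bool is_fixed_partitions(const struct node_model *n)
{
	return n->compatible && strcmp(n->compatible, "fixed-partitions") == 0;
}

/* Patch as posted: skip direct subnodes when a foreign compatible is set. */
static bool parse_direct_subnodes_patch(const struct node_model *n)
{
	if (n->has_partitions_subnode)
		return false;	/* handled through the 'partitions' subnode path */
	return !n->compatible || is_fixed_partitions(n);
}

/* Alternative: keep legacy parsing unless the node opts out explicitly. */
static bool parse_direct_subnodes_optout(const struct node_model *n)
{
	if (n->has_partitions_subnode)
		return false;
	return !n->no_fixed_partition;
}
```

For a node like the quoted spi-flash@0 (compatible "jedec,spi-nor", partition@0 as a direct child), the first function returns false and the partitions are no longer parsed, while the opt-out variant keeps parsing them.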

Re: [PATCH 2/2] staging: rtl8192e: Change state information from u16 to u8

2021-03-03 Thread Atul Gopinathan

On Tue, Mar 02, 2021 at 03:38:52PM +0100, Greg KH wrote:
> On Mon, Feb 22, 2021 at 10:53:30PM +0530, Atul Gopinathan wrote:
> > On Mon, Feb 22, 2021 at 04:26:33PM +0100, Greg KH wrote:
> > > On Sun, Feb 21, 2021 at 10:27:21PM +0530, Atul Gopinathan wrote:
> > > > On Sun, Feb 21, 2021 at 02:08:26PM +0100, Greg KH wrote:
> > > > > On Sat, Feb 20, 2021 at 11:51:55PM +0530, Atul Gopinathan wrote:
> > > > > > The "CcxRmState" field in struct "rtllib_network" is defined
> > > > > > as a u16 array of size 2 (so, 4 bytes in total).
> > > > > > 
> > > > > > But the operations performed on this array throughout the code
> > > > > > base (in rtl8192e/) are all in byte size 2 indicating that this
> > > > > > array's type was defined wrongly.
> > > > > > 
> > > > > > There are two situations where the u16 type of this field could yield
> > > > > > incorrect behaviour:
> > > > > > 
> > > > > > 1. In rtllib_rx.c:1970:
> > > > > > memcpy(network->CcxRmState, &info_element->data[4], 2);
> > > > > > 
> > > > > > Here last 2 bytes (index 4 and 5) from the info_element->data[]
> > > > > > array are meant to be copied into CcxRmState[].
> > > > > > Note that "data" array here is an array of type u8.
> > > > > > 
> > > > > > 2. In function "update_network()" in staging/rtl8192e/rtllib_rx.c:
> > > > > > memcpy(dst->CcxRmState, src->CcxRmState, 2);
> > > > > > 
> > > > > > Here again, only 2 bytes are copied from the source state to
> > > > > > destination state.
> > > > > > 
> > > > > > There are no instances of "CcxRmState" requiring u16 data type.
> > > > > > Here is the output of "grep -IRn 'CcxRmState'" on the rtl8192e/
> > > > > > directory for reviewing:
> > > > > > 
> > > > > > rtllib_rx.c:1970:   memcpy(network->CcxRmState, 
> > > > > > &info_element->data[4], 2);
> > > > > > rtllib_rx.c:1971:   if (network->CcxRmState[0] != 0)
> > > > > > rtllib_rx.c:1975:   network->MBssidMask = 
> > > > > > network->CcxRmState[1] & 0x07;
> > > > > > rtllib_rx.c:2520:   memcpy(dst->CcxRmState, src->CcxRmState, 2);
> > > > > > rtllib.h:1108:  u8  CcxRmState[2];
> > > > > 
> > > > > You just changed the logic in line 1975 in that file, right?  Are you
> > > > > _SURE_ that is ok?  Do you have a device to test this on?
> > > > 
> > > > I'm sorry, I didn't quite get you. By line 1975 in rtllib_rx.c, did you 
> > > > mean
> > > > the following line?:
> > > > 
> > > > network->MBssidMask = network->CcxRmState[1] & 0x07;
> > > 
> > > Yes.
> > > 
> > > > network->CcxRmState is being fed with 2 bytes of u8 data, in line 1970 
> > > > (as
> > > > seen above). I believe my patch doesn't change the logic of an "&" 
> > > > operation
> > > > being performed on it with 0x07, right?
> > > 
> > > It changes the location of the [1] operation to point to a different
> > > place in memory from what I can tell, as you changed the type of that
> > > array.
> > 
> > Oh yes, earlier, the network->CcxRmState[] array had memory locations as:
> > [x, x+16]. With this patch, its locations are [x, x+8].
> > 
> > And I strongly believe this is how it should be based on how the original
> > author is using the CcxRmState[] array throughout the codebase:
> 
> Ok, but this has changed the way memory is addressed, which is what I
> was trying to point out when you said that nothing had changed.

Ah, sorry about that. It just wasn't clear to me what you meant, and my
mind was too fixated on the "&" operation.

> 
> > Allow me to explain (Based on the output of "grep -IRn 'CcxRmState'" that
> > I sent previously):
> > 1. At line 1970:
> > 
> > memcpy(network->CcxRmState, &info_element->data[4], 2);
> > 
> > this is where the array CcxRmState[] is being fed with
> > data. And one can see the source is an array named "data" which itself
> > has type u8. The third argument is "2", meaning 2 bytes of data should
> > be written from "data" array to "CcxRmState".
> > 
> > Also note that, the array CcxRmState has a size 2, as defined in
> > rtllib.h, in struct "rtllib_network":
> > 
> > u16 CcxRmState[2];
> > 
> > Say if CcxRmState[] _was_ supposed to be u16 and not u8, then both elements
> > of the source "data" array will only be written into the first element of
> > "CcxRmState", i.e, "CcxRmState[0]". The 2nd element, "CcxRmState[1]" will
> > never be fed with any data. The resultant CcxRmState[] array would look
> > something like this:
> > 
> > [(u8-data and u8-data squashed), 0].
> > 
> > The 2 u8-data here refers to info_element->data[4] and
> > info_element->data[5].
> > 
> > Instead, if "CcxRmState" was of type u8, then both elements of source
> > "data" array will be written into the 2 elements of "CcxRmState"
> > respectively:
> > 
> > [u8 data, u8 data]
> > 
> > This makes a lot more sense.
> > 
> > 2. Line 1975:
> > network->MBssidMask = network->CcxRmState[1] & 0x07;
> > 
> > With point 1 clear, it should now be easy to understand that
> > the "&" operation in line 1975, will _always_ yield
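Point 1 above can be reproduced in user space. This is a minimal sketch: the helper names are hypothetical, and the u16 destination is zeroed first to model an element that is never written.

```c
#include <stdint.h>
#include <string.h>

/* Copy two bytes into a u16[2] destination, as with the original field type. */
static void fill_u16(uint16_t dst[2], const uint8_t *src)
{
	dst[0] = 0;
	dst[1] = 0;
	memcpy(dst, src, 2);	/* both bytes land in dst[0]; dst[1] stays 0 */
}

/* Copy the same two bytes into a u8[2] destination, as after the patch. */
static void fill_u8(uint8_t dst[2], const uint8_t *src)
{
	memcpy(dst, src, 2);	/* one byte per element */
}
```

With the u16 layout, element [1] never receives data from the copy, so masking it with 0x07 cannot reflect data[5]; with the u8 layout, element [1] is exactly data[5].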

[PATCH 12/23] KVM: SVM: merge update_cr0_intercept into svm_set_cr0

2021-03-03 Thread Paolo Bonzini

The logic of update_cr0_intercept is pointlessly complicated.
All svm_set_cr0 does is compute the effective cr0 and compare it with
the guest value.

Inlining the function and simplifying the condition
clarifies what it is doing.
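The simplified condition can be modelled as a pure function. This is a sketch under stated assumptions: struct cr0_result and compute_cr0() are hypothetical, and the SEV-ES early return and the EFER handling are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define X86_CR0_WP	(UINT64_C(1) << 16)
#define X86_CR0_NW	(UINT64_C(1) << 29)
#define X86_CR0_CD	(UINT64_C(1) << 30)
#define X86_CR0_PG	(UINT64_C(1) << 31)

struct cr0_result {
	uint64_t hcr0;		/* value that would be written to the VMCB */
	bool intercept;		/* whether CR0 read/write intercepts are set */
};

/* Derive the effective CR0 from the guest value, then compare the two. */
static struct cr0_result compute_cr0(uint64_t guest_cr0, bool npt_enabled,
				     bool cd_nw_quirk)
{
	struct cr0_result r = { .hcr0 = guest_cr0, .intercept = false };

	if (!npt_enabled)
		r.hcr0 |= X86_CR0_PG | X86_CR0_WP;
	if (cd_nw_quirk)
		r.hcr0 &= ~(X86_CR0_CD | X86_CR0_NW);

	/* Intercept only when the effective value differs from the guest's. */
	r.intercept = (r.hcr0 != guest_cr0);
	return r;
}
```

Keeping the guest value in cr0 and the effective value in a separate variable is what makes the final comparison a single equality test.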

Signed-off-by: Paolo Bonzini 
---
 arch/x86/kvm/svm/svm.c | 54 +-
 1 file changed, 22 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e7fcd92551e5..968d1a1f2927 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1718,37 +1718,10 @@ static void svm_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
vmcb_mark_dirty(svm->vmcb, VMCB_DT);
 }
 
-static void update_cr0_intercept(struct vcpu_svm *svm)
-{
-   ulong gcr0;
-   u64 *hcr0;
-
-   /*
-* SEV-ES guests must always keep the CR intercepts cleared. CR
-* tracking is done using the CR write traps.
-*/
-   if (sev_es_guest(svm->vcpu.kvm))
-   return;
-
-   gcr0 = svm->vcpu.arch.cr0;
-   hcr0 = &svm->vmcb->save.cr0;
-   *hcr0 = (*hcr0 & ~SVM_CR0_SELECTIVE_MASK)
-   | (gcr0 & SVM_CR0_SELECTIVE_MASK);
-
-   vmcb_mark_dirty(svm->vmcb, VMCB_CR);
-
-   if (gcr0 == *hcr0) {
-   svm_clr_intercept(svm, INTERCEPT_CR0_READ);
-   svm_clr_intercept(svm, INTERCEPT_CR0_WRITE);
-   } else {
-   svm_set_intercept(svm, INTERCEPT_CR0_READ);
-   svm_set_intercept(svm, INTERCEPT_CR0_WRITE);
-   }
-}
-
 void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 {
struct vcpu_svm *svm = to_svm(vcpu);
+   u64 hcr0 = cr0;
 
 #ifdef CONFIG_X86_64
if (vcpu->arch.efer & EFER_LME && !vcpu->arch.guest_state_protected) {
@@ -1766,7 +1739,7 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
vcpu->arch.cr0 = cr0;
 
if (!npt_enabled)
-   cr0 |= X86_CR0_PG | X86_CR0_WP;
+   hcr0 |= X86_CR0_PG | X86_CR0_WP;
 
/*
 * re-enable caching here because the QEMU bios
@@ -1774,10 +1747,27 @@ void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
 * reboot
 */
if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
-   cr0 &= ~(X86_CR0_CD | X86_CR0_NW);
-   svm->vmcb->save.cr0 = cr0;
+   hcr0 &= ~(X86_CR0_CD | X86_CR0_NW);
+
+   svm->vmcb->save.cr0 = hcr0;
vmcb_mark_dirty(svm->vmcb, VMCB_CR);
-   update_cr0_intercept(svm);
+
+   /*
+* SEV-ES guests must always keep the CR intercepts cleared. CR
+* tracking is done using the CR write traps.
+*/
+   if (sev_es_guest(svm->vcpu.kvm))
+   return;
+
+   if (hcr0 == cr0) {
+   /* Selective CR0 write remains on.  */
+   svm_clr_intercept(svm, INTERCEPT_CR0_READ);
+   svm_clr_intercept(svm, INTERCEPT_CR0_WRITE);
+   } else {
+   svm_set_intercept(svm, INTERCEPT_CR0_READ);
+   svm_set_intercept(svm, INTERCEPT_CR0_WRITE);
+   }
+
 }
 
 static bool svm_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
-- 
2.26.2

[PATCH v10 1/2] scsi: ufs: Enable power management for wlun

2021-03-03 Thread Asutosh Das

During runtime-suspend of the ufs host, the scsi devices are
already suspended and so are the queues associated with them.
But the ufs host sends SSU to the wlun during its runtime-suspend.
In the process, blk_queue_enter() checks whether the queue is
suspended. If it is, it waits for the queue to resume, and never
returns.
The commit
(d55d15a33: scsi: block: Do not accept any requests while suspended)
adds the check for a suspended queue in blk_queue_enter().

Call trace:
 __switch_to+0x174/0x2c4
 __schedule+0x478/0x764
 schedule+0x9c/0xe0
 blk_queue_enter+0x158/0x228
 blk_mq_alloc_request+0x40/0xa4
 blk_get_request+0x2c/0x70
 __scsi_execute+0x60/0x1c4
 ufshcd_set_dev_pwr_mode+0x124/0x1e4
 ufshcd_suspend+0x208/0x83c
 ufshcd_runtime_suspend+0x40/0x154
 ufshcd_pltfrm_runtime_suspend+0x14/0x20
 pm_generic_runtime_suspend+0x28/0x3c
 __rpm_callback+0x80/0x2a4
 rpm_suspend+0x308/0x614
 rpm_idle+0x158/0x228
 pm_runtime_work+0x84/0xac
 process_one_work+0x1f0/0x470
 worker_thread+0x26c/0x4c8
 kthread+0x13c/0x320
 ret_from_fork+0x10/0x18

Fix this by registering the ufs device wlun as a scsi driver and
registering it for block runtime-pm. Also make it a
supplier for all other luns. That way, this device wlun
suspends after all the consumers and resumes after the
hba resumes.

Co-developed-by: Can Guo 
Signed-off-by: Can Guo 
Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/cdns-pltfrm.c |   2 +
 drivers/scsi/ufs/tc-dwc-g210-pci.c |   2 +
 drivers/scsi/ufs/ufs-exynos.c  |   2 +
 drivers/scsi/ufs/ufs-hisi.c|   2 +
 drivers/scsi/ufs/ufs-mediatek.c|   2 +
 drivers/scsi/ufs/ufs-qcom.c|   2 +
 drivers/scsi/ufs/ufs_bsg.c |   6 +-
 drivers/scsi/ufs/ufshcd-pci.c  |  32 +-
 drivers/scsi/ufs/ufshcd.c  | 583 +++--
 drivers/scsi/ufs/ufshcd.h  |   7 +
 include/trace/events/ufs.h |  20 ++
 11 files changed, 478 insertions(+), 182 deletions(-)

diff --git a/drivers/scsi/ufs/cdns-pltfrm.c b/drivers/scsi/ufs/cdns-pltfrm.c
index 149391f..3e70c23 100644
--- a/drivers/scsi/ufs/cdns-pltfrm.c
+++ b/drivers/scsi/ufs/cdns-pltfrm.c
@@ -319,6 +319,8 @@ static const struct dev_pm_ops cdns_ufs_dev_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver cdns_ufs_pltfrm_driver = {
diff --git a/drivers/scsi/ufs/tc-dwc-g210-pci.c b/drivers/scsi/ufs/tc-dwc-g210-pci.c
index 67a6a61..b01db12 100644
--- a/drivers/scsi/ufs/tc-dwc-g210-pci.c
+++ b/drivers/scsi/ufs/tc-dwc-g210-pci.c
@@ -148,6 +148,8 @@ static const struct dev_pm_ops tc_dwc_g210_pci_pm_ops = {
.runtime_suspend = tc_dwc_g210_pci_runtime_suspend,
.runtime_resume  = tc_dwc_g210_pci_runtime_resume,
.runtime_idle= tc_dwc_g210_pci_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static const struct pci_device_id tc_dwc_g210_pci_tbl[] = {
diff --git a/drivers/scsi/ufs/ufs-exynos.c b/drivers/scsi/ufs/ufs-exynos.c
index 267943a1..45c0b02 100644
--- a/drivers/scsi/ufs/ufs-exynos.c
+++ b/drivers/scsi/ufs/ufs-exynos.c
@@ -1268,6 +1268,8 @@ static const struct dev_pm_ops exynos_ufs_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver exynos_ufs_pltform = {
diff --git a/drivers/scsi/ufs/ufs-hisi.c b/drivers/scsi/ufs/ufs-hisi.c
index 0aa5813..d463b44 100644
--- a/drivers/scsi/ufs/ufs-hisi.c
+++ b/drivers/scsi/ufs/ufs-hisi.c
@@ -574,6 +574,8 @@ static const struct dev_pm_ops ufs_hisi_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver ufs_hisi_pltform = {
diff --git a/drivers/scsi/ufs/ufs-mediatek.c b/drivers/scsi/ufs/ufs-mediatek.c
index c55202b..df1eabb 100644
--- a/drivers/scsi/ufs/ufs-mediatek.c
+++ b/drivers/scsi/ufs/ufs-mediatek.c
@@ -1097,6 +1097,8 @@ static const struct dev_pm_ops ufs_mtk_pm_ops = {
.runtime_suspend = ufshcd_pltfrm_runtime_suspend,
.runtime_resume  = ufshcd_pltfrm_runtime_resume,
.runtime_idle= ufshcd_pltfrm_runtime_idle,
+   .prepare = ufshcd_suspend_prepare,
+   .complete   = ufshcd_resume_complete,
 };
 
 static struct platform_driver ufs_mtk_pltform = {
diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index f97d7b0.

Re: [PATCH] gcc-plugins: latent_entropy: remove unneeded semicolon

2021-03-03 Thread Kees Cook

On Sat, 18 Apr 2020 15:05:21 +0800, Jason Yan wrote:
> Fix the following coccicheck warning:
> 
> scripts/gcc-plugins/latent_entropy_plugin.c:539:2-3: Unneeded semicolon

Applied to for-next/gcc-plugins, thanks!

[1/1] gcc-plugins: latent_entropy: remove unneeded semicolon
  https://git.kernel.org/kees/c/5477edcacaac

-- 
Kees Cook

Re: [PATCH] gcc-plugins: structleak: remove unneeded variable 'ret'

2021-03-03 Thread Kees Cook

On Sat, 18 Apr 2020 15:05:05 +0800, Jason Yan wrote:
> Fix the following coccicheck warning:
> 
> scripts/gcc-plugins/structleak_plugin.c:177:14-17: Unneeded variable:
> "ret". Return "0" on line 207

Applied to for-next/gcc-plugins, thanks!

[1/1] gcc-plugins: structleak: remove unneeded variable 'ret'
  https://git.kernel.org/kees/c/b924a8197ac7

-- 
Kees Cook

Re: [PATCH AUTOSEL 5.11 26/52] clk: qcom: gdsc: Implement NO_RET_PERIPH flag

2021-03-03 Thread Stephen Boyd

Quoting Sasha Levin (2021-03-02 03:55:07)
> From: AngeloGioacchino Del Regno 
> 
> [ Upstream commit 785c02eb35009a4be6dbc68f4f7d916e90b7177d ]
> 
> In some rare occasions, we want to only set the RETAIN_MEM bit, but
> not the RETAIN_PERIPH one: this is seen on at least SDM630/636/660's
> GPU-GX GDSC, where unsetting and setting back the RETAIN_PERIPH bit
> will generate chaos and panics during GPU suspend time (mainly, the
> chaos is unaligned access).
> 
> For this reason, introduce a new NO_RET_PERIPH flag to the GDSC
> driver to address this corner case.
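The mask selection described above, and the reason the flags field has to grow from u8 to u16, can be sketched as follows. The RETAIN_MEM and RETAIN_PERIPH bit positions are placeholders for this example (gdsc.c holds the real ones), BIT() is redefined locally, and gdsc_mem_mask() is a hypothetical helper.

```c
#include <stdint.h>

#define BIT(n)			(1u << (n))

/* Placeholder bit positions for this sketch; see gdsc.c for the real ones. */
#define RETAIN_MEM		BIT(14)
#define RETAIN_PERIPH		BIT(13)

/* Flag value taken from the quoted gdsc.h hunk. */
#define NO_RET_PERIPH		BIT(8)

/* Build the retention mask, leaving RETAIN_PERIPH out when the flag is set. */
static uint32_t gdsc_mem_mask(uint16_t flags)
{
	uint32_t mask = RETAIN_MEM;

	if (!(flags & NO_RET_PERIPH))
		mask |= RETAIN_PERIPH;
	return mask;
}
```

BIT(8) is 0x100, which truncates to 0 when stored in an 8-bit field, so the flag would be silently lost without the change to u16.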

Is there a patch that's going to use this in stable trees? On its own
this patch doesn't make sense to backport.

> 
> Signed-off-by: AngeloGioacchino Del Regno 
> 
> Link: 
> https://lore.kernel.org/r/20210113183817.447866-8-angelogioacchino.delre...@somainline.org
> Signed-off-by: Stephen Boyd 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/clk/qcom/gdsc.c | 10 --
>  drivers/clk/qcom/gdsc.h |  3 ++-
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
> index af26e0695b86..51ed640e527b 100644
> --- a/drivers/clk/qcom/gdsc.c
> +++ b/drivers/clk/qcom/gdsc.c
> @@ -183,7 +183,10 @@ static inline int gdsc_assert_reset(struct gdsc *sc)
>  static inline void gdsc_force_mem_on(struct gdsc *sc)
>  {
> int i;
> -   u32 mask = RETAIN_MEM | RETAIN_PERIPH;
> +   u32 mask = RETAIN_MEM;
> +
> +   if (!(sc->flags & NO_RET_PERIPH))
> +   mask |= RETAIN_PERIPH;
>  
> for (i = 0; i < sc->cxc_count; i++)
> regmap_update_bits(sc->regmap, sc->cxcs[i], mask, mask);
> @@ -192,7 +195,10 @@ static inline void gdsc_force_mem_on(struct gdsc *sc)
>  static inline void gdsc_clear_mem_on(struct gdsc *sc)
>  {
> int i;
> -   u32 mask = RETAIN_MEM | RETAIN_PERIPH;
> +   u32 mask = RETAIN_MEM;
> +
> +   if (!(sc->flags & NO_RET_PERIPH))
> +   mask |= RETAIN_PERIPH;
>  
> for (i = 0; i < sc->cxc_count; i++)
> regmap_update_bits(sc->regmap, sc->cxcs[i], mask, 0);
> diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
> index bd537438c793..5bb396b344d1 100644
> --- a/drivers/clk/qcom/gdsc.h
> +++ b/drivers/clk/qcom/gdsc.h
> @@ -42,7 +42,7 @@ struct gdsc {
>  #define PWRSTS_ON  BIT(2)
>  #define PWRSTS_OFF_ON  (PWRSTS_OFF | PWRSTS_ON)
>  #define PWRSTS_RET_ON  (PWRSTS_RET | PWRSTS_ON)
> -   const u8flags;
> +   const u16   flags;
>  #define VOTABLEBIT(0)
>  #define CLAMP_IO   BIT(1)
>  #define HW_CTRLBIT(2)
> @@ -51,6 +51,7 @@ struct gdsc {
>  #define POLL_CFG_GDSCR BIT(5)
>  #define ALWAYS_ON  BIT(6)
>  #define RETAIN_FF_ENABLE   BIT(7)
> +#define NO_RET_PERIPH  BIT(8)
> struct reset_controller_dev *rcdev;
> unsigned int*resets;
> unsigned intreset_count;
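
The mask selection in the quoted hunks can be modelled in user space. This is a minimal sketch, assuming placeholder bit positions for RETAIN_MEM and RETAIN_PERIPH (the register layout is not shown in this thread); retain_mask() and truncate_to_u8() are hypothetical helpers, not functions in gdsc.c. It also illustrates why the patch widens flags from u8 to u16: BIT(8) does not fit in eight bits.

```c
#include <stdint.h>

#define BIT(n)          (1u << (n))

/* Placeholder register bits: assumed values, for illustration only. */
#define RETAIN_MEM      BIT(14)
#define RETAIN_PERIPH   BIT(13)

/* Flag bit as introduced by the patch. */
#define NO_RET_PERIPH   BIT(8)

/*
 * Hypothetical helper: build the retention mask the same way the patched
 * gdsc_force_mem_on() and gdsc_clear_mem_on() do.
 */
static uint32_t retain_mask(uint16_t flags)
{
	uint32_t mask = RETAIN_MEM;

	if (!(flags & NO_RET_PERIPH))
		mask |= RETAIN_PERIPH;

	return mask;
}

/* Storing the flags in an 8-bit field silently drops bit 8. */
static uint8_t truncate_to_u8(uint32_t flags)
{
	return (uint8_t)flags;
}
```

With no flags set the mask covers both retention bits; with NO_RET_PERIPH set only RETAIN_MEM is touched, which is the corner case the commit message describes.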

[PATCH] ASoC: cs42l42: fix semicolon.cocci warnings

2021-03-03 Thread kernel test robot

From: kernel test robot 

sound/soc/codecs/cs42l42.c:811:2-3: Unneeded semicolon


 Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

CC: Richard Fitzgerald 
Reported-by: kernel test robot 
Signed-off-by: kernel test robot 
---

url:
https://github.com/0day-ci/linux/commits/Lucas-Tanure/Report-jack-and-button-detection-Capture-Support/20210303-012348
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 
for-next

 cs42l42.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/sound/soc/codecs/cs42l42.c
+++ b/sound/soc/codecs/cs42l42.c
@@ -808,7 +808,7 @@ static int cs42l42_pcm_hw_params(struct
break;
default:
break;
-   };
+   }
 
return cs42l42_pll_config(component);
 }

Re: [PATCH AUTOSEL 5.10 22/47] clk: qcom: gdsc: Implement NO_RET_PERIPH flag

2021-03-03 Thread Stephen Boyd

Quoting Sasha Levin (2021-03-02 03:56:21)
> From: AngeloGioacchino Del Regno 
> 
> [ Upstream commit 785c02eb35009a4be6dbc68f4f7d916e90b7177d ]
> 
> In some rare occasions, we want to only set the RETAIN_MEM bit, but
> not the RETAIN_PERIPH one: this is seen on at least SDM630/636/660's
> GPU-GX GDSC, where unsetting and setting back the RETAIN_PERIPH bit
> will generate chaos and panics during GPU suspend time (mainly, the
> chaos is unaligned access).
> 
> For this reason, introduce a new NO_RET_PERIPH flag to the GDSC
> driver to address this corner case.
> 

Same comment as on 5.11

> Signed-off-by: AngeloGioacchino Del Regno 
> 
> Link: https://lore.kernel.org/r/20210113183817.447866-8-angelogioacchino.delre...@somainline.org
> Signed-off-by: Stephen Boyd 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/clk/qcom/gdsc.c | 10 --
>  drivers/clk/qcom/gdsc.h |  3 ++-
>  2 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clk/qcom/gdsc.c b/drivers/clk/qcom/gdsc.c
> index af26e0695b86..51ed640e527b 100644
> --- a/drivers/clk/qcom/gdsc.c
> +++ b/drivers/clk/qcom/gdsc.c
> @@ -183,7 +183,10 @@ static inline int gdsc_assert_reset(struct gdsc *sc)
>  static inline void gdsc_force_mem_on(struct gdsc *sc)
>  {
> int i;
> -   u32 mask = RETAIN_MEM | RETAIN_PERIPH;
> +   u32 mask = RETAIN_MEM;
> +
> +   if (!(sc->flags & NO_RET_PERIPH))
> +   mask |= RETAIN_PERIPH;
>  
> for (i = 0; i < sc->cxc_count; i++)
> regmap_update_bits(sc->regmap, sc->cxcs[i], mask, mask);
> @@ -192,7 +195,10 @@ static inline void gdsc_force_mem_on(struct gdsc *sc)
>  static inline void gdsc_clear_mem_on(struct gdsc *sc)
>  {
> int i;
> -   u32 mask = RETAIN_MEM | RETAIN_PERIPH;
> +   u32 mask = RETAIN_MEM;
> +
> +   if (!(sc->flags & NO_RET_PERIPH))
> +   mask |= RETAIN_PERIPH;
>  
> for (i = 0; i < sc->cxc_count; i++)
> regmap_update_bits(sc->regmap, sc->cxcs[i], mask, 0);
> diff --git a/drivers/clk/qcom/gdsc.h b/drivers/clk/qcom/gdsc.h
> index bd537438c793..5bb396b344d1 100644
> --- a/drivers/clk/qcom/gdsc.h
> +++ b/drivers/clk/qcom/gdsc.h
> @@ -42,7 +42,7 @@ struct gdsc {
>  #define PWRSTS_ON  BIT(2)
>  #define PWRSTS_OFF_ON  (PWRSTS_OFF | PWRSTS_ON)
>  #define PWRSTS_RET_ON  (PWRSTS_RET | PWRSTS_ON)
> -   const u8flags;
> +   const u16   flags;
>  #define VOTABLEBIT(0)
>  #define CLAMP_IO   BIT(1)
>  #define HW_CTRLBIT(2)
> @@ -51,6 +51,7 @@ struct gdsc {
>  #define POLL_CFG_GDSCR BIT(5)
>  #define ALWAYS_ON  BIT(6)
>  #define RETAIN_FF_ENABLE   BIT(7)
> +#define NO_RET_PERIPH  BIT(8)
> struct reset_controller_dev *rcdev;
> unsigned int*resets;
> unsigned intreset_count;
> -- 
> 2.30.1
>

Re: [PATCH 12/15] ASoC: cs42l42: Wait at least 150us after writing SCLK_PRESENT

2021-03-03 Thread kernel test robot

Hi Lucas,

I love your patch! Perhaps something to improve:

[auto build test WARNING on asoc/for-next]
[also build test WARNING on v5.12-rc1 next-20210302]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Lucas-Tanure/Report-jack-and-button-detection-Capture-Support/20210303-012348
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 
for-next
config: microblaze-randconfig-c003-20210303 (attached as .config)
compiler: microblaze-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"coccinelle warnings: (new ones prefixed by >>)"
>> sound/soc/codecs/cs42l42.c:811:2-3: Unneeded semicolon

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org



Re: [PATCH] mm/memcg: set memcg when split pages

2021-03-03 Thread Johannes Weiner

On Tue, Mar 02, 2021 at 12:24:41PM -0800, Hugh Dickins wrote:
> On Tue, 2 Mar 2021, Michal Hocko wrote:
> > [Cc Johannes for awareness and fixup Nick's email]
> > 
> > On Tue 02-03-21 01:34:51, Zhou Guanghui wrote:
> > > > When a page is split, the memory cgroup info recorded in the first
> > > > page is not copied to the tail pages. As a result, when the tail
> > > > pages are freed, the uncharge operation is not performed, the usage
> > > > of this memcg keeps increasing, and an OOM may occur.
> > > > 
> > > > So the first page's memory cgroup info needs to be copied to the
> > > > tail pages when a page is split.
> > 
> > I was not aware that alloc_pages_exact is used for accounted allocations
> > but git grep told me otherwise so this is not a theoretical one. Both
> > users (arm64 and s390 kvm) are quite recent AFAICS. split_page is also
> > used in dma allocator but I got lost in indirection so I have no idea
> > whether there are any users there.
> 
> Yes, it's a bit worrying that such a low-level thing as split_page()
> can now get caught up in memcg accounting, but I suppose that's okay.
> 
> I feel rather strongly that whichever way it is done, THP splitting
> and split_page() should use the same interface to memcg.
> 
> And a look at mem_cgroup_split_huge_fixup() suggests that nowadays
> there need to be css_get()s too - or better, a css_get_many().
> 
> Its #ifdef CONFIG_TRANSPARENT_HUGEPAGE should be removed, rename
> it mem_cgroup_split_page_fixup(), and take order from caller.

+1

There is already a split_page_owner() in both these places as well
which does a similar thing. Maybe we can match that by calling it
split_page_memcg() and having it take a number of pages?

> Though I've never much liked that separate pass: would it be
> better page by page, like this copy_page_memcg() does?  Though
> mem_cgroup_disabled() and css_getting make that less appealing.

Agreed on both counts. mem_cgroup_disabled() is a jump label and would
be okay, IMO, but the refcounting - though it is (usually) per-cpu -
adds at least two branches and rcu read locking.
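
The idea discussed in this thread can be modelled in user space. This is a rough sketch, assuming a toy struct page and struct memcg (neither resembles the kernel definitions); the name split_page_memcg() is borrowed from the suggestion above, but the body is illustrative only and is not the eventual kernel implementation. Each tail page inherits the head's memcg, and the references for the tails are taken in one batch, which is the role a css_get_many() call would play.

```c
#include <stddef.h>

/* Toy stand-ins; the kernel's structures look nothing like this. */
struct memcg {
	long refcount;	/* references held by charged pages */
};

struct page {
	struct memcg *memcg;
};

/*
 * Copy the head page's memcg to the nr - 1 tail pages and take one
 * reference per tail page in a single batch.
 */
static void split_page_memcg(struct page *head, unsigned int nr)
{
	unsigned int i;

	if (!head->memcg || nr < 2)
		return;

	for (i = 1; i < nr; i++)
		head[i].memcg = head->memcg;

	head->memcg->refcount += nr - 1;
}

/* Freeing a page can only uncharge if the page knows its memcg. */
static void free_page_uncharge(struct page *page)
{
	if (page->memcg) {
		page->memcg->refcount--;
		page->memcg = NULL;
	}
}
```

Without the copy step, free_page_uncharge() would find a NULL memcg on every tail page and skip the uncharge, which is the leak the original report describes.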

[PATCH] Leds: made enum led_brightness into typedef

2021-03-03 Thread Antoni Przybylik

In TODO it was written:
* Get rid of enum led_brightness

It is really an integer, as maximum is configurable. Get rid of it, or
make it into typedef or something.

So I made it into typedef.

Signed-off-by: Antoni Przybylik 
---
 drivers/leds/TODO |  5 
 drivers/leds/blink/leds-lgm-sso.c |  6 ++---
 drivers/leds/flash/leds-rt8515.c  |  2 +-
 drivers/leds/led-class-multicolor.c   |  2 +-
 drivers/leds/led-triggers.c   |  2 +-
 drivers/leds/leds-88pm860x.c  |  2 +-
 drivers/leds/leds-aat1290.c   |  8 +++
 drivers/leds/leds-acer-a500.c |  2 +-
 drivers/leds/leds-adp5520.c   |  2 +-
 drivers/leds/leds-an30259a.c  |  2 +-
 drivers/leds/leds-apu.c   |  4 ++--
 drivers/leds/leds-ariel.c |  4 ++--
 drivers/leds/leds-as3645a.c   |  4 ++--
 drivers/leds/leds-asic3.c |  2 +-
 drivers/leds/leds-aw2013.c|  2 +-
 drivers/leds/leds-bcm6328.c   |  2 +-
 drivers/leds/leds-bcm6358.c   |  2 +-
 drivers/leds/leds-bd2802.c|  2 +-
 drivers/leds/leds-blinkm.c|  8 +++
 drivers/leds/leds-clevo-mail.c|  2 +-
 drivers/leds/leds-cobalt-qube.c   |  2 +-
 drivers/leds/leds-cobalt-raq.c|  4 ++--
 drivers/leds/leds-cpcap.c |  2 +-
 drivers/leds/leds-cr0014114.c |  2 +-
 drivers/leds/leds-da903x.c|  2 +-
 drivers/leds/leds-da9052.c|  4 ++--
 drivers/leds/leds-dac124s085.c|  2 +-
 drivers/leds/leds-el15203000.c|  2 +-
 drivers/leds/leds-fsg.c   | 12 +-
 drivers/leds/leds-gpio.c  |  4 ++--
 drivers/leds/leds-hp6xx.c |  4 ++--
 drivers/leds/leds-ip30.c  |  2 +-
 drivers/leds/leds-ipaq-micro.c|  2 +-
 drivers/leds/leds-is31fl319x.c|  2 +-
 drivers/leds/leds-is31fl32xx.c|  2 +-
 drivers/leds/leds-ktd2692.c   |  6 ++---
 drivers/leds/leds-lm3530.c|  4 ++--
 drivers/leds/leds-lm3532.c|  2 +-
 drivers/leds/leds-lm3533.c|  4 ++--
 drivers/leds/leds-lm355x.c|  6 ++---
 drivers/leds/leds-lm3601x.c   |  2 +-
 drivers/leds/leds-lm36274.c   |  2 +-
 drivers/leds/leds-lm3642.c|  6 ++---
 drivers/leds/leds-lm3692x.c   |  4 ++--
 drivers/leds/leds-lm3697.c|  2 +-
 drivers/leds/leds-locomo.c|  6 ++---
 drivers/leds/leds-lp3944.c|  4 ++--
 drivers/leds/leds-lp3952.c|  2 +-
 drivers/leds/leds-lp50xx.c|  2 +-
 drivers/leds/leds-lp55xx-common.c |  4 ++--
 drivers/leds/leds-lp8788.c|  2 +-
 drivers/leds/leds-lp8860.c|  2 +-
 drivers/leds/leds-lt3593.c|  2 +-
 drivers/leds/leds-max77650.c  |  2 +-
 drivers/leds/leds-max77693.c  |  2 +-
 drivers/leds/leds-max8997.c   |  4 ++--
 drivers/leds/leds-mc13783.c   |  2 +-
 drivers/leds/leds-menf21bmc.c |  2 +-
 drivers/leds/leds-mlxcpld.c   |  4 ++--
 drivers/leds/leds-mlxreg.c|  8 +++
 drivers/leds/leds-mt6323.c| 10 
 drivers/leds/leds-net48xx.c   |  2 +-
 drivers/leds/leds-netxbig.c   |  2 +-
 drivers/leds/leds-nic78bx.c   |  4 ++--
 drivers/leds/leds-ns2.c   |  4 ++--
 drivers/leds/leds-ot200.c |  2 +-
 drivers/leds/leds-pca9532.c   |  4 ++--
 drivers/leds/leds-pca955x.c   |  2 +-
 drivers/leds/leds-pca963x.c   |  4 ++--
 drivers/leds/leds-pm8058.c|  6 ++---
 drivers/leds/leds-powernv.c   |  8 +++
 drivers/leds/leds-pwm.c   |  2 +-
 drivers/leds/leds-rb532.c |  4 ++--
 drivers/leds/leds-regulator.c |  4 ++--
 drivers/leds/leds-s3c24xx.c   |  2 +-
 drivers/leds/leds-sc27xx-bltc.c   |  4 ++--
 drivers/leds/leds-sgm3140.c   |  2 +-
 drivers/leds/leds-spi-byte.c  |  2 +-
 drivers/leds/leds-ss4200.c|  2 +-
 drivers/leds/leds-sunfire.c   | 18 +++
 drivers/leds/leds-syscon.c|  2 +-
 drivers/leds/leds-tca6507.c   |  2 +-
 drivers/leds/leds-tlc591xx.c  |  2 +-
 drivers/leds/leds-tps6105x.c  |  2 +-
 drivers/leds/leds-turris-omnia.c  |  2 +-
 drivers/leds/leds-wm831x-status.c |  4 ++--
 drivers/leds/leds-wm8350.c|  2 +-
 drivers/leds/leds-wrap.c  |  6 ++---
 drivers/leds/trigger/ledtrig-audio.c  |  6 ++---
 drivers/leds/trigger/ledtrig-camera.c |  4 ++--
 drivers/leds/uleds.c  |  2 +-
 include/linux/leds.h  | 33 +--
 92 files changed, 174 insertions(+), 180 deletions(-)

diff --git a/drivers/leds/TODO b/drivers/leds/TODO
index bfa60fa1d812..7ca785d0ff77 100644
--- a/drivers/leds/TODO
+++ b/drivers/leds/TODO
@@ -1,11 +1,6 @@
 -*- org -*-
 
 * On/off LEDs should have max_brightness of 1
-* Get rid of enum led_brig

[PATCH v5 3/5] ibmvfc: treat H_CLOSED as success during sub-CRQ registration

2021-03-03 Thread Tyrel Datwyler

A non-zero return code for H_REG_SUB_CRQ is currently treated as a
failure resulting in failing sub-CRQ setup. The case of H_CLOSED should
not be treated as a failure. This return code translates to a successful
sub-CRQ registration by the hypervisor, and is meant to communicate back
that there is currently no partner VIOS CRQ connection established as of
yet. This is a common occurrence during a disconnect where the client
adapter can possibly come back up prior to the partner adapter.

When H_REG_SUB_CRQ returns a non-zero code, treat H_CLOSED as success
so that sub-CRQs are set up successfully.

Fixes: 3034ebe26389 ("ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index d34e1a4f74d9..1d9f961715ca 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5636,7 +5636,8 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
rc = h_reg_sub_crq(vdev->unit_address, scrq->msg_token, PAGE_SIZE,
   &scrq->cookie, &scrq->hw_irq);
 
-   if (rc) {
+   /* H_CLOSED indicates successful register, but no CRQ partner */
+   if (rc && rc != H_CLOSED) {
dev_warn(dev, "Error registering sub-crq: %d\n", rc);
if (rc == H_PARAMETER)
dev_warn_once(dev, "Firmware may not support MQ\n");
-- 
2.27.0
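
The changed condition can be expressed as a small predicate. This is a minimal sketch, assuming illustrative numeric values for the hypervisor return codes; sub_crq_reg_failed() is a hypothetical name, not a function in ibmvfc.c.

```c
#include <stdbool.h>

/* Hypervisor return codes; numeric values are assumed for illustration. */
#define H_SUCCESS	0
#define H_CLOSED	2
#define H_PARAMETER	(-4)

/*
 * Hypothetical predicate: true when sub-CRQ registration should be
 * treated as failed. H_CLOSED means the registration itself worked but
 * no partner CRQ is connected yet, so it is not an error.
 */
static bool sub_crq_reg_failed(long rc)
{
	return rc != H_SUCCESS && rc != H_CLOSED;
}
```

Only codes other than success and H_CLOSED take the warning path, matching the `rc && rc != H_CLOSED` test in the hunk.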

[PATCH v5 2/5] ibmvfc: fix invalid sub-CRQ handles after hard reset

2021-03-03 Thread Tyrel Datwyler

A hard reset results in a complete transport disconnect such that the
CRQ connection with the partner VIOS is broken. This has the side effect
of also invalidating the associated sub-CRQs. The current code assumes
that the sub-CRQs are preserved resulting in a protocol violation after
trying to reconnect them with the VIOS. This introduces an infinite loop
such that the VIOS forces a disconnect after each subsequent attempt to
re-register with invalid handles.

Avoid the aforementioned issue by releasing the sub-CRQs prior to CRQ
disconnect, and driving a reinitialization of the sub-CRQs once a new
CRQ is registered with the hypervisor.

Fixes: 3034ebe26389 ("ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 384960036f8b..d34e1a4f74d9 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -158,6 +158,9 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *);
 static void ibmvfc_tgt_implicit_logout_and_del(struct ibmvfc_target *);
 static void ibmvfc_tgt_move_login(struct ibmvfc_target *);
 
+static void ibmvfc_release_sub_crqs(struct ibmvfc_host *);
+static void ibmvfc_init_sub_crqs(struct ibmvfc_host *);
+
 static const char *unknown_error = "unknown error";
 
 static long h_reg_sub_crq(unsigned long unit_address, unsigned long ioba,
@@ -926,8 +929,8 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
unsigned long flags;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
struct ibmvfc_queue *crq = &vhost->crq;
-   struct ibmvfc_queue *scrq;
-   int i;
+
+   ibmvfc_release_sub_crqs(vhost);
 
/* Close the CRQ */
do {
@@ -947,16 +950,6 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
memset(crq->msgs.crq, 0, PAGE_SIZE);
crq->cur = 0;
 
-   if (vhost->scsi_scrqs.scrqs) {
-   for (i = 0; i < nr_scsi_hw_queues; i++) {
-   scrq = &vhost->scsi_scrqs.scrqs[i];
-   spin_lock(scrq->q_lock);
-   memset(scrq->msgs.scrq, 0, PAGE_SIZE);
-   scrq->cur = 0;
-   spin_unlock(scrq->q_lock);
-   }
-   }
-
/* And re-open it again */
rc = plpar_hcall_norets(H_REG_CRQ, vdev->unit_address,
crq->msg_token, PAGE_SIZE);
@@ -966,9 +959,12 @@ static int ibmvfc_reset_crq(struct ibmvfc_host *vhost)
dev_warn(vhost->dev, "Partner adapter not ready\n");
else if (rc != 0)
dev_warn(vhost->dev, "Couldn't register crq (rc=%d)\n", rc);
+
spin_unlock(vhost->crq.q_lock);
spin_unlock_irqrestore(vhost->host->host_lock, flags);
 
+   ibmvfc_init_sub_crqs(vhost);
+
return rc;
 }
 
@@ -5692,6 +5688,7 @@ static void ibmvfc_deregister_scsi_channel(struct ibmvfc_host *vhost, int index)
 
free_irq(scrq->irq, scrq);
irq_dispose_mapping(scrq->irq);
+   scrq->irq = 0;
 
do {
rc = plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address,
-- 
2.27.0

[PATCH v5 1/5] ibmvfc: simplify handling of sub-CRQ initialization

2021-03-03 Thread Tyrel Datwyler

If ibmvfc_init_sub_crqs() fails, ibmvfc_probe() simply parrots the
registration failure reported elsewhere; further,
vhost->scsi_scrqs.scrqs == NULL is indication enough to the driver that it
has no sub-CRQs available. The mq_enabled check can also be moved into
ibmvfc_init_sub_crqs() so that callers don't have to gate the call
themselves. Finally, when sub-CRQ setup fails, do_enquiry can be turned
off to put the driver into single-queue fallback mode.

The aforementioned changes also simplify the next patch in the series
that fixes a hard reset issue, by tying a sub-CRQ setup failure and
do_enquiry logic into ibmvfc_init_sub_crqs().

Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 7097028d4cb6..384960036f8b 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -5705,17 +5705,21 @@ static void ibmvfc_deregister_scsi_channel(struct ibmvfc_host *vhost, int index)
LEAVE;
 }
 
-static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
+static void ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
 {
int i, j;
 
ENTER;
+   if (!vhost->mq_enabled)
+   return;
 
vhost->scsi_scrqs.scrqs = kcalloc(nr_scsi_hw_queues,
  sizeof(*vhost->scsi_scrqs.scrqs),
  GFP_KERNEL);
-   if (!vhost->scsi_scrqs.scrqs)
-   return -1;
+   if (!vhost->scsi_scrqs.scrqs) {
+   vhost->do_enquiry = 0;
+   return;
+   }
 
for (i = 0; i < nr_scsi_hw_queues; i++) {
if (ibmvfc_register_scsi_channel(vhost, i)) {
@@ -5724,13 +5728,12 @@ static int ibmvfc_init_sub_crqs(struct ibmvfc_host *vhost)
kfree(vhost->scsi_scrqs.scrqs);
vhost->scsi_scrqs.scrqs = NULL;
vhost->scsi_scrqs.active_queues = 0;
-   LEAVE;
-   return -1;
+   vhost->do_enquiry = 0;
+   break;
}
}
 
LEAVE;
-   return 0;
 }
 
 static void ibmvfc_release_sub_crqs(struct ibmvfc_host *vhost)
@@ -5997,11 +6000,7 @@ static int ibmvfc_probe(struct vio_dev *vdev, const struct vio_device_id *id)
goto remove_shost;
}
 
-   if (vhost->mq_enabled) {
-   rc = ibmvfc_init_sub_crqs(vhost);
-   if (rc)
-   dev_warn(dev, "Failed to allocate Sub-CRQs. rc=%d\n", rc);
-   }
+   ibmvfc_init_sub_crqs(vhost);
 
if (shost_to_fc_host(shost)->rqst_q)
blk_queue_max_segments(shost_to_fc_host(shost)->rqst_q, 1);
-- 
2.27.0
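
The control flow after this patch can be sketched in user space. This is a toy model under stated assumptions: struct host, NR_QUEUES, register_channel() and the fail_at knob are all hypothetical stand-ins, and cleanup of already-registered channels is reduced to a single free(). The point is only that the init routine gates itself on mq_enabled and records failure by clearing do_enquiry instead of returning an error.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_QUEUES 4	/* placeholder for nr_scsi_hw_queues */

struct host {
	bool mq_enabled;
	bool do_enquiry;
	int *scrqs;	/* NULL means "no sub-CRQs available" */
	int fail_at;	/* test knob: index whose registration fails, -1 for none */
};

/* Hypothetical registration step; fails for the index in fail_at. */
static int register_channel(struct host *h, int i)
{
	if (i == h->fail_at)
		return -1;
	h->scrqs[i] = 1;
	return 0;
}

/* No return value: failure is visible as scrqs == NULL and do_enquiry == false. */
static void init_sub_crqs(struct host *h)
{
	int i;

	if (!h->mq_enabled)
		return;

	h->scrqs = calloc(NR_QUEUES, sizeof(*h->scrqs));
	if (!h->scrqs) {
		h->do_enquiry = false;
		return;
	}

	for (i = 0; i < NR_QUEUES; i++) {
		if (register_channel(h, i)) {
			free(h->scrqs);
			h->scrqs = NULL;
			h->do_enquiry = false;
			break;
		}
	}
}
```

Callers can then invoke init_sub_crqs() unconditionally, as the ibmvfc_probe() hunk does.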

[PATCH v5 0/5] ibmvfc: hard reset fixes

2021-03-03 Thread Tyrel Datwyler

This series contains a minor simplification of ibmvfc_init_sub_crqs() followed
by a couple of fixes for sub-CRQ handling which affect hard reset of the
client/host adapter CRQ pair.

changes in v5:
* Patches 2-5: Corrected upstream commit ids for Fixes: tags

changes in v4:
* Patch 2: dropped Reviewed-by tag and moved sub-crq init to after locked region
* Patch 5: moved sub-crq init to after locked region

changes in v3:
* Patch 1 & 5: moved ibmvfc_init_sub_crqs out of locked path

changes in v2:
* added Reviewed-by tags for patches 1-3
* Patch 4: use rtas_busy_delay to test rc and delay correct amount of time
* Patch 5: (new) similar fix for LPM case where CRQ pair needs re-enablement

Tyrel Datwyler (5):
  powerpc/pseries: extract host bridge from pci_bus prior to bus removal
  ibmvfc: simplify handling of sub-CRQ initialization
  ibmvfc: fix invalid sub-CRQ handles after hard reset
  ibmvfc: treat H_CLOSED as success during sub-CRQ registration
  ibmvfc: store return code of H_FREE_SUB_CRQ during cleanup

 arch/powerpc/platforms/pseries/pci_dlpar.c |  4 +-
 drivers/scsi/ibmvscsi/ibmvfc.c | 49 ++
 2 files changed, 26 insertions(+), 27 deletions(-)

-- 
2.27.0

RE: [PATCH v3 07/10] clocksource/drivers/hyper-v: Handle vDSO differences inline

2021-03-03 Thread Michael Kelley

From: Daniel Lezcano Sent: Tuesday, March 2, 2021 2:14 PM
> 
> On 02/03/2021 22:38, Michael Kelley wrote:
> > While the driver for the Hyper-V Reference TSC and STIMERs is architecture
> > neutral, vDSO is implemented for x86/x64, but not for ARM64.  Current code
> > calls into utility functions under arch/x86 (and coming, under arch/arm64)
> > to handle the difference.
> >
> > Change this approach to handle the difference inline based on whether
> > VDSO_CLOCKMODE_HVCLOCK is present.  The new approach removes code under
> > arch/* since the difference is tied more to the specifics of the Linux
> > implementation than to the architecture.
> >
> > No functional change.
> >
> > Signed-off-by: Michael Kelley 
> > Reviewed-by: Boqun Feng 
> > ---
> >  arch/x86/include/asm/mshyperv.h|  4 
> >  drivers/clocksource/hyperv_timer.c | 10 --
> >  2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/mshyperv.h 
> > b/arch/x86/include/asm/mshyperv.h
> > index c10dd1c..4f566db 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -27,10 +27,6 @@ static inline u64 hv_get_register(unsigned int reg)
> > return value;
> >  }
> >
> > -#define hv_set_clocksource_vdso(val) \
> > -   ((val).vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK)
> > -#define hv_enable_vdso_clocksource() \
> > -   vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> >  #define hv_get_raw_timer() rdtsc_ordered()
> >
> >  /*
> > diff --git a/drivers/clocksource/hyperv_timer.c 
> > b/drivers/clocksource/hyperv_timer.c
> > index c73c127..06984fa 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -370,11 +370,13 @@ static void resume_hv_clock_tsc(struct clocksource 
> > *arg)
> > hv_set_register(HV_REGISTER_REFERENCE_TSC, tsc_msr);
> >  }
> >
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> >  static int hv_cs_enable(struct clocksource *cs)
> >  {
> > -   hv_enable_vdso_clocksource();
> > +   vclocks_set_used(VDSO_CLOCKMODE_HVCLOCK);
> > return 0;
> >  }
> > +#endif
> 
> We had a confusion here. The suggestion was to remove the #ifdef here
> and add the __maybe_unused annotation to the function.

I wondered if maybe that's what you were getting at with your
most recent comments.  But the code doesn't compile on ARM64
with __maybe_unused instead of the #ifdef because
VDSO_CLOCKMODE_HVCLOCK is undefined.

Michael

> 
> >  static struct clocksource hyperv_cs_tsc = {
> > .name   = "hyperv_clocksource_tsc_page",
> > @@ -384,7 +386,12 @@ static int hv_cs_enable(struct clocksource *cs)
> > .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
> > .suspend= suspend_hv_clock_tsc,
> > .resume = resume_hv_clock_tsc,
> > +#ifdef VDSO_CLOCKMODE_HVCLOCK
> > .enable = hv_cs_enable,
> > +   .vdso_clock_mode = VDSO_CLOCKMODE_HVCLOCK,
> > +#else
> > +   .vdso_clock_mode = VDSO_CLOCKMODE_NONE,
> > +#endif
> >  };
> >

[PATCH v5 4/5] ibmvfc: store return code of H_FREE_SUB_CRQ during cleanup

2021-03-03 Thread Tyrel Datwyler

The H_FREE_SUB_CRQ hypercall can return a retry delay return code that
indicates the call needs to be retried after a specific amount of time
delay. The error path to free a sub-CRQ in case of a failure during
channel registration fails to capture the return code of H_FREE_SUB_CRQ
which will result in the delay loop being skipped in the case of a retry
delay return code.

Store the return code result of the H_FREE_SUB_CRQ call such that the
return code check in the delay loop evaluates a meaningful value. Also,
use the rtas_busy_delay() to check the rc value and delay for the
appropriate amount of time.

Fixes: 39e461fddff0 ("ibmvfc: map/request irq and register Sub-CRQ interrupt handler")
Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index 1d9f961715ca..ef03fa559433 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -5670,8 +5671,8 @@ static int ibmvfc_register_scsi_channel(struct ibmvfc_host *vhost,
 
 irq_failed:
do {
-   plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address, scrq->cookie);
-   } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
+   rc = plpar_hcall_norets(H_FREE_SUB_CRQ, vdev->unit_address, scrq->cookie);
+   } while (rtas_busy_delay(rc));
+   } while (rtas_busy_delay(rc));
 reg_failed:
ibmvfc_free_queue(vhost, scrq);
LEAVE;
-- 
2.27.0
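
The bug and its fix come down to whether the loop condition sees a fresh return code. This is a minimal user-space sketch, assuming illustrative return-code values; fake_free_sub_crq() and busy_delay() are hypothetical stand-ins for the hypercall and for rtas_busy_delay(), and the stand-in does not actually sleep.

```c
#include <stdbool.h>

/* Return codes; numeric values are assumed for illustration. */
#define H_SUCCESS	0
#define H_BUSY		1

static int calls;		/* how many times the fake hcall ran */
static int busy_replies;	/* how many H_BUSY replies remain */

/* Fake hypercall: reports H_BUSY until busy_replies runs out. */
static long fake_free_sub_crq(void)
{
	calls++;
	if (busy_replies > 0) {
		busy_replies--;
		return H_BUSY;
	}
	return H_SUCCESS;
}

/* Stand-in for the busy check: true means "retry". */
static bool busy_delay(long rc)
{
	return rc == H_BUSY;
}

/* Fixed loop: rc is refreshed on every iteration. */
static long free_sub_crq_retrying(void)
{
	long rc;

	do {
		rc = fake_free_sub_crq();
	} while (busy_delay(rc));

	return rc;
}
```

If the assignment to rc were dropped, as in the code being fixed, the condition would test whatever value rc held before the loop and the retries would never depend on the hypercall's answer.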

[PATCH v5 5/5] ibmvfc: reinitialize sub-CRQs and perform channel enquiry after LPM

2021-03-03 Thread Tyrel Datwyler

A live partition migration (LPM) results in a CRQ disconnect similar to
a hard reset. In this LPM case the hypervisor mostly preserves the CRQ
transport such that it simply needs to be reenabled. However, the
capabilities may have changed such as fewer channels, or no channels at
all. Further, it's possible that there may be sub-CRQ support, but no
channel support. The CRQ reenable path currently doesn't take any of
this into consideration.

For simplicity, release and reinitialize sub-CRQs during reenable, and
set do_enquiry and using_channels with the appropriate values to trigger
channel renegotiation.

Fixes: 3034ebe26389 ("ibmvfc: add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Signed-off-by: Tyrel Datwyler 
Reviewed-by: Brian King 
---
 drivers/scsi/ibmvscsi/ibmvfc.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
index ef03fa559433..1e2ea21713ad 100644
--- a/drivers/scsi/ibmvscsi/ibmvfc.c
+++ b/drivers/scsi/ibmvscsi/ibmvfc.c
@@ -903,6 +903,9 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost)
 {
int rc = 0;
struct vio_dev *vdev = to_vio_dev(vhost->dev);
+   unsigned long flags;
+
+   ibmvfc_release_sub_crqs(vhost);
 
/* Re-enable the CRQ */
do {
@@ -914,6 +917,15 @@ static int ibmvfc_reenable_crq_queue(struct ibmvfc_host *vhost)
if (rc)
dev_err(vhost->dev, "Error enabling adapter (rc=%d)\n", rc);
 
+   spin_lock_irqsave(vhost->host->host_lock, flags);
+   spin_lock(vhost->crq.q_lock);
+   vhost->do_enquiry = 1;
+   vhost->using_channels = 0;
+   spin_unlock(vhost->crq.q_lock);
+   spin_unlock_irqrestore(vhost->host->host_lock, flags);
+
+   ibmvfc_init_sub_crqs(vhost);
+
return rc;
 }
 
-- 
2.27.0

Re: [PATCH v4 1/1] kernel/crash_core: Add crashkernel=auto for vmcore creation

2021-03-03 Thread john . p . donnelly


On 2/25/21 6:38 PM, Dave Young wrote:

On 02/23/21 at 09:41am, Saeed Mirzamohammadi wrote:

This adds crashkernel=auto feature to configure reserved memory for
vmcore creation. CONFIG_CRASH_AUTO_STR is defined to be set for
different kernel distributions and different archs based on their
needs.

Signed-off-by: Saeed Mirzamohammadi 
Signed-off-by: John Donnelly 
Tested-by: John Donnelly 
---
  Documentation/admin-guide/kdump/kdump.rst |  3 ++-
  .../admin-guide/kernel-parameters.txt |  6 ++
  arch/Kconfig  | 20 +++
  kernel/crash_core.c   |  7 +++
  4 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst 
b/Documentation/admin-guide/kdump/kdump.rst
index 75a9dd98e76e..ae030111e22a 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -285,7 +285,8 @@ This would mean:
  2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
  3) if the RAM size is larger than 2G, then reserve 128M
  
-

+Or you can use crashkernel=auto to choose the crash kernel memory size
+based on the recommended configuration set for each arch.
  
  Boot into System Kernel

  ===
diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 9e3cdb271d06..a5deda5c85fe 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -747,6 +747,12 @@
a memory unit (amount[KMG]). See also
Documentation/admin-guide/kdump/kdump.rst for an 
example.
  
+	crashkernel=auto

+   [KNL] This parameter will set the reserved memory for
+   the crash kernel based on the value of the 
CRASH_AUTO_STR
+   that is the best effort estimation for each arch. See 
also
+   arch/Kconfig for further details.
+
crashkernel=size[KMG],high
[KNL, X86-64] range could be above 4G. Allow kernel
to allocate physical memory region from top, so could
diff --git a/arch/Kconfig b/arch/Kconfig
index 24862d15f3a3..23d047548772 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -14,6 +14,26 @@ menu "General architecture-dependent options"
  config CRASH_CORE
bool
  
+config CRASH_AUTO_STR

+   string "Memory reserved for crash kernel"
+   depends on CRASH_CORE
+   default "1G-64G:128M,64G-1T:256M,1T-:512M"
+   help
+ This configures the reserved memory dependent
+ on the value of System RAM. The syntax is:
+ crashkernel=<range1>:<size1>[,<range2>:<size2>,...][@offset]
+ range=start-[end]
+
+ For example:
+ crashkernel=512M-2G:64M,2G-:128M
+
+ This would mean:
+
+ 1) if the RAM is smaller than 512M, then don't reserve anything
+(this is the "rescue" case)
+ 2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
+ 3) if the RAM size is larger than 2G, then reserve 128M
+
  config KEXEC_CORE
select CRASH_CORE
bool
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 825284baaf46..90f9e4bb6704 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -7,6 +7,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include 

  #include 
@@ -250,6 +251,12 @@ static int __init __parse_crashkernel(char *cmdline,
if (suffix)
return parse_crashkernel_suffix(ck_cmdline, crash_size,
suffix);
+#ifdef CONFIG_CRASH_AUTO_STR
+   if (strncmp(ck_cmdline, "auto", 4) == 0) {
+   ck_cmdline = CONFIG_CRASH_AUTO_STR;
+   pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+   }
+#endif
/*
 * if the commandline contains a ':', then that's the extended
 * syntax -- if not, it must be the classic syntax
--
2.27.0




Acked-by: Dave Young 

Thanks
Dave


Hi,

  Thank you.

  When can we expect this to be applied in a future build?
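
For readers following the range syntax used by the CRASH_AUTO_STR default in the quoted patch, the lookup can be sketched in user space. This is a simplified illustration, not the kernel's parser: parse_size() and crash_size_for() are hypothetical names, only upper-case K/M/G/T suffixes are handled, and the optional @offset part is ignored.

```c
#include <stdlib.h>

/* Parse a size such as "512M" or "2G"; advances *s past what was read. */
static unsigned long long parse_size(const char **s)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 10);

	switch (*end) {
	case 'T':
		v <<= 10;
		/* fall through */
	case 'G':
		v <<= 10;
		/* fall through */
	case 'M':
		v <<= 10;
		/* fall through */
	case 'K':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*s = end;
	return v;
}

/*
 * Return the reservation for `ram` bytes of system RAM given a spec such
 * as "512M-2G:64M,2G-:128M", or 0 if no range matches or the spec is
 * malformed. A range with no end is open-ended.
 */
static unsigned long long crash_size_for(const char *spec, unsigned long long ram)
{
	const char *p = spec;

	while (*p) {
		unsigned long long start, end = ~0ULL, size;

		start = parse_size(&p);
		if (*p++ != '-')
			return 0;
		if (*p != ':')
			end = parse_size(&p);
		if (*p++ != ':')
			return 0;
		size = parse_size(&p);

		if (ram >= start && ram < end)
			return size;

		if (*p == ',')
			p++;
		else
			break;
	}
	return 0;
}
```

With the example from the help text, a 1G machine falls in the 512M-2G range and gets 64M, while a 256M machine matches no range and gets nothing reserved.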

Re: [PATCH v2] mm: vmstat: add cma statistics

2021-03-03 Thread John Hubbard


On 3/2/21 10:33, Minchan Kim wrote:

Since CMA is used more widely, it is worth adding CMA
allocation statistics to vmstat. With them, we can see
how aggressively the system uses CMA allocation and
how often it fails.

Signed-off-by: Minchan Kim 
---
* from v1 - https://lore.kernel.org/linux-mm/20210217170025.512704-1-minc...@kernel.org/
   * change alloc_attempt with alloc_success - jhubbard
   * item per line for vm_event_item - jhubbard

  include/linux/vm_event_item.h |  4 
  mm/cma.c  | 12 +---
  mm/vmstat.c   |  4 
  3 files changed, 17 insertions(+), 3 deletions(-)



Seems reasonable, and the diffs look good.

Reviewed-by: John Hubbard 


thanks,
--
John Hubbard
NVIDIA


diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 18e75974d4e3..21d7c7f72f1c 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -70,6 +70,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
  #endif
  #ifdef CONFIG_HUGETLB_PAGE
HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
+#endif
+#ifdef CONFIG_CMA
+   CMA_ALLOC_SUCCESS,
+   CMA_ALLOC_FAIL,
  #endif
UNEVICTABLE_PGCULLED,   /* culled to noreclaim list */
UNEVICTABLE_PGSCANNED,  /* scanned for reclaimability */
diff --git a/mm/cma.c b/mm/cma.c
index 23d4a97c834a..04ca863d1807 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -435,13 +435,13 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	int ret = -ENOMEM;
 
 	if (!cma || !cma->count || !cma->bitmap)
-		return NULL;
+		goto out;
 
 	pr_debug("%s(cma %p, count %zu, align %d)\n", __func__, (void *)cma,
 		 count, align);
 
 	if (!count)
-		return NULL;
+		goto out;
 
 	mask = cma_bitmap_aligned_mask(cma, align);
 	offset = cma_bitmap_aligned_offset(cma, align);
@@ -449,7 +449,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	bitmap_count = cma_bitmap_pages_to_bits(cma, count);
 
 	if (bitmap_count > bitmap_maxno)
-		return NULL;
+		goto out;
 
 	for (;;) {
 		mutex_lock(&cma->lock);
@@ -506,6 +506,12 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	}
 
 	pr_debug("%s(): returned %p\n", __func__, page);
+out:
+	if (page)
+		count_vm_event(CMA_ALLOC_SUCCESS);
+	else
+		count_vm_event(CMA_ALLOC_FAIL);
+
 	return page;
 }
  
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 97fc32a53320..d8c32a33208d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1305,6 +1305,10 @@ const char * const vmstat_text[] = {
  #ifdef CONFIG_HUGETLB_PAGE
"htlb_buddy_alloc_success",
"htlb_buddy_alloc_fail",
+#endif
+#ifdef CONFIG_CMA
+   "cma_alloc_success",
+   "cma_alloc_fail",
  #endif
"unevictable_pgs_culled",
"unevictable_pgs_scanned",

Re: [PATCH] netdevsim: init u64 stats for 32bit hardware

2021-03-03 Thread Jakub Kicinski

On Tue, 2 Mar 2021 12:55:47 +0100 Dmitry Vyukov wrote:
> On Tue, Mar 2, 2021 at 10:06 AM Hillf Danton  wrote:
> > On Mar 2, 2021 at 16:40 Dmitry Vyukov wrote:
> > >I hoped this would get at least into 5.12. syzbot can't start testing  
> > >arm32 because of this.  

FWIW the submission never got into patchwork or lore so we had 
no public source to apply from.

> > Or what is more feasible is you send a fix to Jakub today.  
> 
> So far I can't figure out how to make git work with my Gmail account
> with 1.5-factor auth enabled, neither password nor asp work...

LMK if you get overly frustrated, I can get the patch from my inbox and
resend it for you :)

Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan

2021-03-03 Thread Nicolin Chen

On Sat, Feb 27, 2021 at 12:59:17PM +0300, Dmitry Osipenko wrote:
> 25.02.2021 09:27, Nicolin Chen wrote:
> ...
> >> The partially revert should be okay, but it's not clear to me what makes
> >> difference for T124 since I don't see that problem on T30, which also
> >> has active display at a boot time.
> > 
> > Hmm..do you see ->attach_dev() is called from host1x_client_iommu_attach
> > or from of_dma_configure_id/arch_setup_dma_ops?
> > 
> 
> I applied yours debug-patch, please see dmesg.txt attached to the email.
> Seems probe-defer of the tegra-dc driver prevents the implicit
> tegra_smmu_attach_dev, so it happens to work by accident.

> [0.327826] tegra-dc 5420.dc: ---tegra_smmu_of_xlate: id 1
> [0.328641] [] (tegra_smmu_of_xlate) from [] (of_iommu_xlate+0x51/0x70)
> [0.328740] [] (of_iommu_xlate) from [] (of_iommu_configure+0x127/0x150)
> [0.328896] [] (of_iommu_configure) from [] (of_dma_configure_id+0x1fb/0x2ec)
> [0.329060] [] (of_dma_configure_id) from [] (really_probe+0x7b/0x2a0)
> [0.331438] tegra-dc 5420.dc: tegra_smmu_probe_device, 822
> [0.332234] [] (tegra_smmu_probe_device) from [] (__iommu_probe_device+0x35/0x1c4)
> [0.332391] [] (__iommu_probe_device) from [] (iommu_probe_device+0x19/0xec)
> [0.332545] [] (iommu_probe_device) from [] (of_iommu_configure+0xfb/0x150)
> [0.332701] [] (of_iommu_configure) from [] (of_dma_configure_id+0x1fb/0x2ec)
> [0.332804] [] (of_dma_configure_id) from [] (really_probe+0x7b/0x2a0)
> [0.335202] tegra-dc 5420.dc: -iommu_group_get_for_dev, 1572
> [0.335292] tegra-dc 5420.dc: -tegra_smmu_device_group, 862
> [0.335474] tegra-dc 5420.dc: -tegra_smmu_device_group, 909: 1: drm
> [0.335566] tegra-dc 5420.dc: -iommu_group_get_for_dev, 1574
> [0.335718] tegra-dc 5420.dc: -iommu_group_add_device, 858
> [0.335862] tegra-dc 5420.dc: Adding to iommu group 1
> [0.335955] tegra-dc 5420.dc: -iommu_alloc_default_domain, 1543: type 3
> [0.336101] iommu: --iommu_group_alloc_default_domain: platform, (null), drm
> [0.336187] -tegra_smmu_domain_alloc, 284: type 3
> [0.336968] [] (tegra_smmu_domain_alloc) from [] (iommu_group_alloc_default_domain+0x4b/0xfa)
> [0.337127] [] (iommu_group_alloc_default_domain) from [] (iommu_probe_device+0x69/0xec)
> [0.337285] [] (iommu_probe_device) from [] (of_iommu_configure+0xfb/0x150)
> [0.337441] [] (of_iommu_configure) from [] (of_dma_configure_id+0x1fb/0x2ec)
> [0.337599] [] (of_dma_configure_id) from [] (really_probe+0x7b/0x2a0)
> [0.339913] tegra-dc 5420.dc: -iommu_probe_device, 272
> [0.348144] tegra-dc 5420.dc: failed to probe RGB output: -517

Hmm, I'm not sure where this EPROBE_DEFER comes from. But you are right:
because of it, of_dma_configure_id() returns early and never reaches the
arch_setup_dma_ops() call, which allocates an UNMANAGED iommu domain
and attaches the DC to it on Tegra124.

By the way, can anyone accept this change? It doesn't feel right to
leave a regression in the newer release...

[PATCH v5 0/2] add IRQF_NO_AUTOEN for request_irq

2021-03-03 Thread Barry Song

-v5:
  * add the same check for IRQF_NO_AUTOEN in request_nmi()
  * combine a dozen separate input patches into one (hopefully
    this eases the life of the maintainers)

-v4:
  * remove the irq_settings magic for NOAUTOEN with respect to
Thomas's comment

Barry Song (2):
  genirq: add IRQF_NO_AUTOEN for request_irq
  Input: move to use request_irq by IRQF_NO_AUTOEN flag

 drivers/input/keyboard/tca6416-keypad.c  |  3 +--
 drivers/input/keyboard/tegra-kbc.c   |  5 ++---
 drivers/input/touchscreen/ar1021_i2c.c   |  5 +
 drivers/input/touchscreen/atmel_mxt_ts.c |  5 ++---
 drivers/input/touchscreen/bu21029_ts.c   |  4 ++--
 drivers/input/touchscreen/cyttsp_core.c  |  5 ++---
 drivers/input/touchscreen/melfas_mip4.c  |  5 ++---
 drivers/input/touchscreen/mms114.c   |  4 ++--
 drivers/input/touchscreen/stmfts.c   |  3 +--
 drivers/input/touchscreen/wm831x-ts.c|  3 +--
 drivers/input/touchscreen/zinitix.c  |  4 ++--
 include/linux/interrupt.h|  4 
 kernel/irq/manage.c  | 11 +--
 13 files changed, 31 insertions(+), 30 deletions(-)

-- 
2.25.1

Re: [PATCH v1] pstore/ram: Rate-limit "uncorrectable error in header" message

2021-03-03 Thread Kees Cook

On Tue, 2 Mar 2021 12:58:50 +0300, Dmitry Osipenko wrote:
> There is a quite huge "uncorrectable error in header" flood in KMSG
> on a clean system boot since there is no pstore buffer saved in RAM.
> Let's silence the redundant noisy messages by rate-limiting the printk
> message. Now there are maximum 10 messages printed repeatedly instead
> of 35+.

Applied to for-next/pstore, thanks!

[1/1] pstore/ram: Rate-limit "uncorrectable error in header" message
  https://git.kernel.org/kees/c/7db688e99c0f

-- 
Kees Cook

[PATCH v5 1/2] genirq: add IRQF_NO_AUTOEN for request_irq

2021-03-03 Thread Barry Song

Many drivers don't want interrupts to be enabled automatically by
request_irq(). They currently handle this in one of the following
two ways:
(1)
irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_irq(dev, irq...);
(2)
request_irq(dev, irq...);
disable_irq(irq);

The second approach is unsafe: in the small window between
request_irq() and disable_irq(), interrupts can still arrive.
The first approach is safe, but it can be handled in the generic
irq code instead.

With this patch, drivers can call request_irq() with the
IRQF_NO_AUTOEN flag. They then need neither irq_set_status_flags()
nor disable_irq().

Likewise, drivers using the following pattern for NMIs
irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_nmi(dev, irq...);

can move to request_nmi() with the IRQF_NO_AUTOEN flag.
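
As a rough illustration of the behaviour described above, the following user-space model mirrors the two decisions the patch touches: the flag sanity check in request_threaded_irq() and the auto-enable decision in __setup_irq(). The function names check_irqflags() and starts_enabled() are hypothetical, and can_autoenable stands in for irq_settings_can_autoenable(desc); the flag values are written out in full as found in include/linux/interrupt.h.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in include/linux/interrupt.h. */
#define IRQF_SHARED		0x00000080
#define IRQF_NO_SUSPEND		0x00004000
#define IRQF_COND_SUSPEND	0x00040000
#define IRQF_NO_AUTOEN		0x00080000

/* Model of the sanity check at the top of request_threaded_irq(). */
static int check_irqflags(unsigned long irqflags, const void *dev_id)
{
	if (((irqflags & IRQF_SHARED) && !dev_id) ||
	    /* new: a shared line must not be left disabled by one requester */
	    ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) ||
	    (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
	    ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
		return -EINVAL;
	return 0;
}

/*
 * Model of the decision in __setup_irq(): the line is started right away
 * only if neither the per-request flag nor the descriptor setting forbids it.
 */
static bool starts_enabled(unsigned long irqflags, bool can_autoenable)
{
	return !(irqflags & IRQF_NO_AUTOEN) && can_autoenable;
}
```

In a driver, the intended usage would be to pass IRQF_NO_AUTOEN in the flags argument of request_irq() and to call enable_irq() later, once the device is ready to take interrupts.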

Cc: Dmitry Torokhov 
Signed-off-by: Barry Song 
---
-v5:
  * add the same check for IRQF_NO_AUTOEN in request_nmi()

 include/linux/interrupt.h |  4 
 kernel/irq/manage.c   | 11 +--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 967e25767153..76f1161a441a 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -61,6 +61,9 @@
  *interrupt handler after suspending interrupts. For system
  *wakeup devices users need to implement wakeup detection in
  *their interrupt handlers.
+ * IRQF_NO_AUTOEN - Don't enable IRQ or NMI automatically when users request it.
+ *                Users will enable it explicitly by enable_irq() or enable_nmi()
+ *                later.
  */
 #define IRQF_SHARED		0x00000080
 #define IRQF_PROBE_SHARED	0x00000100
@@ -74,6 +77,7 @@
 #define IRQF_NO_THREAD		0x00010000
 #define IRQF_EARLY_RESUME	0x00020000
 #define IRQF_COND_SUSPEND	0x00040000
+#define IRQF_NO_AUTOEN		0x00080000
 
 #define IRQF_TIMER		(__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)
 
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index dec3f73e8db9..97c231a5644c 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1693,7 +1693,8 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
}
 
-   if (irq_settings_can_autoenable(desc)) {
+   if (!(new->flags & IRQF_NO_AUTOEN) &&
+   irq_settings_can_autoenable(desc)) {
irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
} else {
/*
@@ -2086,10 +2087,15 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
 * which interrupt is which (messes up the interrupt freeing
 * logic etc).
 *
+* Also shared interrupts do not go well with disabling auto enable.
+* The sharing interrupt might request it while it's still disabled
+* and then wait for interrupts forever.
+*
 * Also IRQF_COND_SUSPEND only makes sense for shared interrupts and
 * it cannot be set along with IRQF_NO_SUSPEND.
 */
if (((irqflags & IRQF_SHARED) && !dev_id) ||
+   ((irqflags & IRQF_SHARED) && (irqflags & IRQF_NO_AUTOEN)) ||
(!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
return -EINVAL;
@@ -2245,7 +2251,8 @@ int request_nmi(unsigned int irq, irq_handler_t handler,
 
desc = irq_to_desc(irq);
 
-   if (!desc || irq_settings_can_autoenable(desc) ||
+   if (!desc || (irq_settings_can_autoenable(desc) &&
+   !(irqflags & IRQF_NO_AUTOEN)) ||
!irq_settings_can_request(desc) ||
WARN_ON(irq_settings_is_per_cpu_devid(desc)) ||
!irq_supports_nmi(desc))
-- 
2.25.1

[PATCH v10 2/2] ufs: sysfs: Resume the proper scsi device

2021-03-03 Thread Asutosh Das

Resume the actual SCSI device whose unit descriptor is being
accessed, instead of the hba alone.

Signed-off-by: Asutosh Das 
---
 drivers/scsi/ufs/ufs-sysfs.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index acc54f5..3fc182b 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -245,9 +245,9 @@ static ssize_t wb_on_store(struct device *dev, struct 
device_attribute *attr,
goto out;
}
 
-   pm_runtime_get_sync(hba->dev);
+   scsi_autopm_get_device(hba->sdev_ufs_device);
res = ufshcd_wb_ctrl(hba, wb_enable);
-   pm_runtime_put_sync(hba->dev);
+   scsi_autopm_put_device(hba->sdev_ufs_device);
 out:
up(&hba->host_sem);
return res < 0 ? res : count;
@@ -297,10 +297,10 @@ static ssize_t ufs_sysfs_read_desc_param(struct ufs_hba 
*hba,
goto out;
}
 
-   pm_runtime_get_sync(hba->dev);
+   scsi_autopm_get_device(hba->sdev_ufs_device);
ret = ufshcd_read_desc_param(hba, desc_id, desc_index,
param_offset, desc_buf, param_size);
-   pm_runtime_put_sync(hba->dev);
+   scsi_autopm_put_device(hba->sdev_ufs_device);
if (ret) {
ret = -EINVAL;
goto out;
@@ -678,7 +678,7 @@ static ssize_t _name##_show(struct device *dev, 
\
up(&hba->host_sem); \
return -ENOMEM; \
}   \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_descriptor_retry(hba,\
UPIU_QUERY_OPCODE_READ_DESC, QUERY_DESC_IDN_DEVICE, \
0, 0, desc_buf, &desc_len); \
@@ -695,7 +695,7 @@ static ssize_t _name##_show(struct device *dev, 
\
goto out;   \
ret = sysfs_emit(buf, "%s\n", desc_buf);\
 out:   \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
kfree(desc_buf);\
up(&hba->host_sem); \
return ret; \
@@ -744,10 +744,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_flags(QUERY_FLAG_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_flag(hba, UPIU_QUERY_OPCODE_READ_FLAG,   \
QUERY_FLAG_IDN##_uname, index, &flag);  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -813,10 +813,10 @@ static ssize_t _name##_show(struct device *dev,   
\
}   \
if (ufshcd_is_wb_attrs(QUERY_ATTR_IDN##_uname)) \
index = ufshcd_wb_get_query_index(hba); \
-   pm_runtime_get_sync(hba->dev);  \
+   scsi_autopm_get_device(hba->sdev_ufs_device);   \
ret = ufshcd_query_attr(hba, UPIU_QUERY_OPCODE_READ_ATTR,   \
QUERY_ATTR_IDN##_uname, index, 0, &value);  \
-   pm_runtime_put_sync(hba->dev);  \
+   scsi_autopm_put_device(hba->sdev_ufs_device);   \
if (ret) {  \
ret = -EINVAL;  \
goto out;   \
@@ -899,11 +899,15 @@ static ssize_t _pname##_show(struct device *dev,  
\
struct scsi_device *sdev = to_scsi_device(dev); \
struct ufs_hba *hba = shost_priv(sdev->host);   \
u8 lun = u

Re: [PATCH 7/7] kdump: Use vmlinux_build_id() to simplify

2021-03-03 Thread Stephen Boyd

Quoting Baoquan He (2021-03-02 00:19:09)
> On 03/01/21 at 09:47am, Stephen Boyd wrote:
> > - note_sec->n_hdr.n_descsz,
> > - BUILD_ID_MAX);
> > - return;
> > - }
> > - n_remain -= sizeof(struct elf_note) +
> > - ALIGN(note_sec->n_hdr.n_namesz, 4) +
> > - ALIGN(note_sec->n_hdr.n_descsz, 4);
> > + const char *build_id = vmlinux_build_id();
> 
> It's strange that I can only see the cover letter and this patch 7,
> couldn't find the patch where vmlinux_build_id() is introduced in lkml.
> 

Hmm not sure. Maybe spam stuff? I will Cc you on the first patch for v2.

Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules

2021-03-03 Thread Josh Poimboeuf

On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel.  That's probably [1] fine.
> 
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> it's probably more common.
> 
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
> 
> But with GCC plugins enabled, that's no longer allowed:
> 
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> 
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
> 
> But failing the build isn't necessary.  For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
> 
> When a GCC version mismatch is detected, print a warning and disable the
> plugin.  The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts.  In that case print an error.

Hi Masahiro,

This problem is becoming more prevalent.  We will need to fix it one way
or another, if we want to support distro adoption of these GCC
plugin-based features.

Frank suggested a possibly better idea: always rebuild the plugins when
the GCC version changes.  What do you think?  Any suggestions on how to
implement that?  Otherwise I can try to hack something together.

-- 
Josh

Re: [PATCH 5/7] printk: Make %pS and friends print module build ID

2021-03-03 Thread Stephen Boyd

Quoting Steven Rostedt (2021-03-01 18:43:19)
> On Mon,  1 Mar 2021 09:47:47 -0800
> Stephen Boyd  wrote:
> 
> > The %pS printk format (among some others) is used to print kernel
> > addresses symbolically. When the kernel prints an address inside of a
> > module, the kernel prints the addresses' symbol name along with the
> > module's name that contains the address. Let's make kernel stacktraces
> > easier to identify on KALLSYMS builds by including the build ID of a
> > module when we print the address.
> 
> Please no!
> 
> This kills the output of tracing with offset, and can possibly break
> scripts. I don't want to look at traces like this!
> 
>   -0   [004] ..s1   353.842577: ipv4_conntrack_in+0x0/0x10 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0
>   -0   [004] ..s1   353.842577: nf_conntrack_in+0x0/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0
>   -0   [004] ..s1   353.842577: get_l4proto+0x0/0x190 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x92/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s1   353.842577: nf_ct_get_tuple+0x0/0x240 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0xec/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s1   353.842577: hash_conntrack_raw+0x0/0x170 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x28c/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s1   353.842578: __nf_conntrack_find_get.isra.0+0x0/0x2f0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x29d/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s1   353.842578: nf_conntrack_tcp_packet+0x0/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_in+0x3c8/0x5c0 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s2   353.842578: nf_ct_seq_offset+0x0/0x40 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x26d/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
>   -0   [004] ..s1   353.842578: __nf_ct_refresh_acct+0x0/0x50 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_conntrack_tcp_packet+0x558/0x1760 [nf_conntrack] (3b39eb771b2566331887f671c741f90bfba0b051)
> 
>  NACK!
> 

Heh, OK. Would adding a new printk format work then? From after the cut:

> Yet another alternative is to add a printk format like %pSB for
> backtrace prints. This would require a handful of architecture updates
> and I'm not sure it's worth the effort for that.
>

Or maybe take the approach of putting all the linked in module build IDs
in the "Modules linked in:" section? But as I said in the commit text
that can become quite verbose. Looking forward to your suggestions here.

Re: [PATCH] Revert "arm64: dts: amlogic: add missing ethernet reset ID"

2021-03-03 Thread Kevin Hilman

Hi Neil,

Neil Armstrong  writes:

> It has been reported on IRC and in KernelCI boot tests, this change breaks
> internal PHY support on the Amlogic G12A/SM1 Based boards.
>
> We suspect the added signal to reset more than the Ethernet MAC but also
> the MDIO/(RG)MII mux used to redirect the MAC signals to the internal PHY.
>
> This reverts commit f3362f0c18174a1f334a419ab7d567a36bd1b3f3 while we find
> an acceptable solution to cleanly reset the Ethernet MAC.
>
> Reported-by: Corentin Labbe 
> Acked-by: Jérôme Brunet 
> Signed-off-by: Neil Armstrong 
> ---
> Hi Kevin,
>
> This has been reported to also break on 5.10, when this lands on 5.11, I'll send another
> patch for 5.10 because meson-axg.dtsi needs a conflict resolution on 5.11.

Looks like this never got submitted to v5.10.  I just discovered that
v5.10 on SEI610 (internal PHY) is broken.

Could you submit this to v5.10 stable also please?

Thanks,

Kevin

Re: [PATCH net-next RFC v4] net: hdlc_x25: Queue outgoing LAPB frames

2021-03-03 Thread Jakub Kicinski

On Tue, 02 Mar 2021 08:04:20 +0100 Martin Schiller wrote:
> On 2021-03-01 09:56, Xie He wrote:
> > On Sun, Feb 28, 2021 at 10:56 PM Martin Schiller  wrote:  
> >> I mean the change from only one hdlc interface to both hdlc and
> >> hdlc_x25.
> >> 
> >> I can't estimate how many users are out there and how their setup looks
> >> like.
> > 
> > I'm also thinking about solving this issue by adding new APIs to the
> > HDLC subsystem (hdlc_stop_queue / hdlc_wake_queue) for hardware
> > drivers to call instead of netif_stop_queue / netif_wake_queue. This
> > way we can preserve backward compatibility.
> > 
> > However I'm reluctant to change the code of all the hardware drivers
> > because I'm afraid of introducing bugs, etc. When I look at the code
> > of "wan/lmc/lmc_main.c", I feel I'm not able to make sure there are no
> > bugs (related to stop_queue / wake_queue) after my change (and even
> > before my change, actually). There are even serious style problems:
> > the majority of its lines are indented by spaces.
> > 
> > So I don't want to mess with all the hardware drivers. Hardware driver
> > developers (if they wish to properly support hdlc_x25) should do the
> > change themselves. This is not a problem for me, because I use my own
> > out-of-tree hardware driver. However if I add APIs with no user code
> > in the kernel, other developers may think these APIs are not
> > necessary.  
> 
> I don't think a change that affects the entire HDLC subsystem is
> justified, since the actual problem only affects the hdlc_x25 area.
> 
> The approach with the additional hdlc_x25 is clean and purposeful and
> I personally could live with it.
> 
> I just don't see myself in the position to decide such a change at the
> moment.
> 
> @Jakub: What is your opinion on this.

Hard question to answer, existing users seem happy and Xie's driver
isn't upstream, so the justification for potentially breaking backward
compatibility isn't exactly "strong".

Can we cop out and add a knob somewhere to control spawning the extra
netdev? Let people who just want a newer kernel carry on without
distractions and those who want the extra layer can flip the switch?

Re: [PATCH v21 08/26] x86/mm: Introduce _PAGE_COW

2021-03-03 Thread Yu, Yu-cheng


On 3/1/2021 7:52 AM, Borislav Petkov wrote:
> On Wed, Feb 17, 2021 at 02:27:12PM -0800, Yu-cheng Yu wrote:

[...]

>>   static inline pmd_t pmd_mkdirty(pmd_t pmd)
>>   {
>> -   return pmd_set_flags(pmd, _PAGE_DIRTY | _PAGE_SOFT_DIRTY);
>> +   pmdval_t dirty = _PAGE_DIRTY;
>> +
>> +   /* Avoid creating (HW)Dirty=1, Write=0 PMDs */
>> +   if (cpu_feature_enabled(X86_FEATURE_SHSTK) && !(pmd_flags(pmd) & _PAGE_RW))
>
>   !(pmd_write(pmd))

There is a problem with doing that: pmd_write() is defined after this
function.  Maybe we can declare it first, or leave this as-is?
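
For reference, the "declare it first" option could look roughly like the stand-alone sketch below. It is a simplified model rather than the actual patch: pmd_t is reduced to a bare integer wrapper, the X86_FEATURE_SHSTK check is left out, and SKETCH_PAGE_COW is an arbitrary placeholder bit (the real _PAGE_COW definition is not shown in this thread). _PAGE_RW and _PAGE_DIRTY use the usual x86 bit positions.

```c
#include <stdbool.h>

#define _PAGE_RW	(1ULL << 1)	/* x86 Write bit */
#define _PAGE_DIRTY	(1ULL << 6)	/* x86 hardware Dirty bit */
#define SKETCH_PAGE_COW	(1ULL << 58)	/* placeholder value, an assumption */

typedef struct { unsigned long long pmd; } pmd_t;

/* Forward declaration: lets pmd_mkdirty() call pmd_write() before its body. */
static inline bool pmd_write(pmd_t pmd);

static inline pmd_t pmd_mkdirty(pmd_t pmd)
{
	unsigned long long dirty = _PAGE_DIRTY;

	/* Avoid creating (HW)Dirty=1, Write=0 entries in this model. */
	if (!pmd_write(pmd))
		dirty = SKETCH_PAGE_COW;

	pmd.pmd |= dirty;
	return pmd;
}

/* The definition can stay where it is, after its first user. */
static inline bool pmd_write(pmd_t pmd)
{
	return (pmd.pmd & _PAGE_RW) != 0;
}
```

The trade-off is one extra declaration line against an open-coded flag test; both compile to the same thing once inlined.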


--
Yu-cheng

Re: XDP socket rings, and LKMM litmus tests

2021-03-03 Thread Paul E. McKenney

On Tue, Mar 02, 2021 at 04:14:46PM -0500, Alan Stern wrote:
> On Tue, Mar 02, 2021 at 07:46:27PM +0100, Björn Töpel wrote:
> > Hi!
> > 
> > Firstly; The long Cc-list is to reach the LKMM-folks.
> > 
> > Some background; the XDP sockets use a ring-buffer to communicate
> > between the kernel and userland. It's a
> > single-consumer/single-producer ring, and described in
> > net/xdp/xsk_queue.h.
> > 
> > --8<---
> > /* The structure of the shared state of the rings are the same as the
> >  * ring buffer in kernel/events/ring_buffer.c. For the Rx and completion
> >  * ring, the kernel is the producer and user space is the consumer. For
> >  * the Tx and fill rings, the kernel is the consumer and user space is
> >  * the producer.
> >  *
> >  * producer consumer
> >  *
> >  * if (LOAD ->consumer) {   LOAD ->producer
> >  *(A)   smp_rmb()   (C)
> >  *STORE $data   LOAD $data
> >  *smp_wmb()   (B)   smp_mb()(D)
> >  *STORE ->producer  STORE ->consumer
> >  * }
> >  *
> >  * (A) pairs with (D), and (B) pairs with (C).
> > ...
> > -->8---
> > 
> > I'd like to replace the smp_{r,w,}mb() barriers with acquire-release
> > semantics [1], without breaking existing userspace applications.
> > 
> > So, I figured I'd use herd7 and the LKMM model to build a litmus test
> > for the barrier version, then for the acquire-release version, and
> > finally permutations of both.
> > 
> > The idea is to use a one element ring, with a state machine outlined
> > in the litmus test.
> > 
> > The basic test for the existing smp_{r,w,}mb() barriers looks like:
> > 
> > $ cat spsc-rb+1p1c.litmus
> > C spsc-rb+1p1c
> > 
> > // Stupid one entry ring:
> > // prod cons allowed action   prod cons
> > //00 =>   prod  =>   10
> > //01 =>   cons  =>   00
> > //10 =>   cons  =>   11
> > //11 =>   prod  =>   01
> > 
> > { prod = 1; }
> > 
> > // Here, we start at prod==1,cons==0, data==0, i.e. producer has
> > // written data=0, so from here only the consumer can start, and should
> > // consume data==0. Afterwards, producer can continue and write 1 to
> > > // data. Can we enter state prod==0, cons==1, but consumer observed
> > // the write of 1?
> > 
> > P0(int *prod, int *cons, int *data)
> > {
> > int p;
> > int c;
> > int cond = 0;
> > 
> > p = READ_ONCE(*prod);
> > c = READ_ONCE(*cons);
> > if (p == 0)
> > if (c == 0)
> > cond = 1;
> > if (p == 1)
> > if (c == 1)
> > cond = 1;
> > 
> > if (cond) {
> > smp_mb();
> > WRITE_ONCE(*data, 1);
> > smp_wmb();
> > WRITE_ONCE(*prod, p ^ 1);
> > }
> > }
> > 
> > P1(int *prod, int *cons, int *data)
> > {
> > int p;
> > int c;
> > int d = -1;
> > int cond = 0;
> > 
> > p = READ_ONCE(*prod);
> > c = READ_ONCE(*cons);
> > if (p == 1)
> > if (c == 0)
> > cond = 1;
> > if (p == 0)
> > if (c == 1)
> > cond = 1;
> > 
> > if (cond == 1) {
> > smp_rmb();
> > d = READ_ONCE(*data);
> > smp_mb();
> > WRITE_ONCE(*cons, c ^ 1);
> > }
> > }
> > 
> > exists( 1:d=1 /\ prod=0 /\ cons=1 );
> > 
> > --
> > 
> > The weird state-changing if-statements are there because I didn't get
> > '&&' and '||' to work with herd.
> > 
> > When this is run:
> > 
> > $ herd7 -conf linux-kernel.cfg litmus-tests/spsc-rb+1p1c.litmus
> > Test spsc-rb+1p1c Allowed
> > States 2
> > 1:d=0; cons=1; prod=0;
> > 1:d=0; cons=1; prod=1;
> > No
> > Witnesses
> > Positive: 0 Negative: 2
> > Condition exists (1:d=1 /\ prod=0 /\ cons=1)
> > Observation spsc-rb+1p1c Never 0 2
> > Time spsc-rb+1p1c 0.04
> > Hash=b399756d6a1301ca5bda042f32130791
> > 
> > Now to my question; In P0 there's an smp_mb(). Without that, the d==1
> > can be observed from P1 (consumer):
> > 
> > $ herd7 -conf linux-kernel.cfg litmus-tests/spsc-rb+1p1c.litmus
> > Test spsc-rb+1p1c Allowed
> > States 3
> > 1:d=0; cons=1; prod=0;
> > 1:d=0; cons=1; prod=1;
> > 1:d=1; cons=1; prod=0;
> > Ok
> > Witnesses
> > Positive: 1 Negative: 2
> > Condition exists (1:d=1 /\ prod=0 /\ cons=1)
> > Observation spsc-rb+1p1c Sometimes 1 2
> > Time spsc-rb+1p1c 0.04
> > Hash=0047fc21fa77da9a9aee15e35ec367ef
> 
> This result is wrong, apparently because of a bug in herd7.  There 
> should be control dependencies from each of the two loads in P0 to each 
> of the two stores, but herd7 doesn't detect them.
> 
> Maybe Luc can find some time to check whether this really is a bug and 
> if it is, fix it.

I agree that herd7's control dependency tracking could be improved.
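
As background for the discussion above, here is a minimal single-producer/single-consumer ring written with C11 acquire/release atomics. It is only a sketch of the general technique Björn wants to move to, expressed in C11 rather than LKMM primitives: the struct layout, the names, and the ring size are assumptions, it is not the xsk_queue.h code, and it says nothing about what herd7 or the LKMM would conclude for the kernel version.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 4u	/* must be a power of two */

struct spsc_ring {
	_Atomic uint32_t producer;	/* written only by the producer */
	_Atomic uint32_t consumer;	/* written only by the consumer */
	int data[RING_SIZE];
};

/* Producer side: returns false if the ring is full. */
static bool ring_produce(struct spsc_ring *r, int value)
{
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store to ->consumer. */
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_acquire);

	if (prod - cons == RING_SIZE)
		return false;

	r->data[prod & (RING_SIZE - 1)] = value;
	/* Release makes the data visible before the new ->producer value. */
	atomic_store_explicit(&r->producer, prod + 1, memory_order_release);
	return true;
}

/* Consumer side: returns false if the ring is empty. */
static bool ring_consume(struct spsc_ring *r, int *value)
{
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_relaxed);
	/* Acquire pairs with the producer's release store to ->producer. */
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_acquire);

	if (prod == cons)
		return false;

	*value = r->data[cons & (RING_SIZE - 1)];
	/* Release ensures the slot was read before it is handed back. */
	atomic_store_explicit(&r->consumer, cons + 1, memory_order_release);
	return true;
}
```

In this form each side needs one acquire load and one release store, which corresponds to the (A)/(D) and (B)/(C) pairings in the quoted comment.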

But sadly, it is currently doing exactly what I asked Luc to make it do,
which is to confine the control dependency to its "if" statement.  But as
usual I wasn't thinking globally enough.  And I am not

Re: [PATCH v2 1/1] mm/madvise: replace ptrace attach requirement for process_madvise

2021-03-03 Thread Suren Baghdasaryan

On Mon, Feb 1, 2021 at 9:34 PM Suren Baghdasaryan  wrote:
>
> On Thu, Jan 28, 2021 at 11:08 PM Suren Baghdasaryan  wrote:
> >
> > > On Thu, Jan 28, 2021 at 11:51 AM Suren Baghdasaryan wrote:
> > >
> > > On Tue, Jan 26, 2021 at 5:52 AM 'Michal Hocko' via kernel-team
> > >  wrote:
> > > >
> > > > On Wed 20-01-21 14:17:39, Jann Horn wrote:
> > > > > On Wed, Jan 13, 2021 at 3:22 PM Michal Hocko  wrote:
> > > > > > On Tue 12-01-21 09:51:24, Suren Baghdasaryan wrote:
> > > > > > > > On Tue, Jan 12, 2021 at 9:45 AM Oleg Nesterov wrote:
> > > > > > > >
> > > > > > > > On 01/12, Michal Hocko wrote:
> > > > > > > > >
> > > > > > > > > On Mon 11-01-21 09:06:22, Suren Baghdasaryan wrote:
> > > > > > > > >
> > > > > > > > > > > What we want is the ability for one process to influence another process
> > > > > > > > > > > in order to optimize performance across the entire system while leaving
> > > > > > > > > > > the security boundary intact.
> > > > > > > > > > > Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ
> > > > > > > > > > > and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata
> > > > > > > > > > > and CAP_SYS_NICE for influencing process performance.
> > > > > > > > > >
> > > > > > > > > > I have to say that ptrace modes are rather obscure to me. So I cannot
> > > > > > > > > > really judge whether MODE_READ is sufficient. My understanding has
> > > > > > > > > > always been that this is required for RO access to the address space. But
> > > > > > > > > > this operation clearly has a visible side effect. Do we have any actual
> > > > > > > > > > documentation for the existing modes?
> > > > > > > > > >
> > > > > > > > > > I would be really curious to hear from Jann and Oleg (now Cced).
> > > > > > > > >
> > > > > > > > > Can't comment, sorry. I never understood these security checks and never tried.
> > > > > > > > > IIUC only selinux/etc can treat ATTACH/READ differently and I have no idea what
> > > > > > > > > is the difference.
> > > > >
> > > > > Yama in particular only does its checks on ATTACH and ignores READ,
> > > > > that's the difference you're probably most likely to encounter on a
> > > > > normal desktop system, since some distros turn Yama on by default.
> > > > > Basically the idea there is that running "gdb -p $pid" or "strace -p
> > > > > $pid" as a normal user will usually fail, but reading /proc/$pid/maps
> > > > > still works; so you can see things like detailed memory usage
> > > > > information and such, but you're not supposed to be able to directly
> > > > > peek into a running SSH client and inject data into the existing SSH
> > > > > connection, or steal the cryptographic keys for the current
> > > > > connection, or something like that.
> > > > >
> > > > > > > I haven't seen a written explanation on ptrace modes but when I
> > > > > > > consulted Jann his explanation was:
> > > > > > >
> > > > > > > PTRACE_MODE_READ means you can inspect metadata about processes 
> > > > > > > with
> > > > > > > the specified domain, across UID boundaries.
> > > > > > > PTRACE_MODE_ATTACH means you can fully impersonate processes with 
> > > > > > > the
> > > > > > > specified domain, across UID boundaries.
> > > > > >
> > > > > > Maybe this would be a good start to document expectations. Some more
> > > > > > practical examples where the difference is visible would be great as
> > > > > > well.
> > > > >
> > > > > Before documenting the behavior, it would be a good idea to figure out
> > > > > what to do with perf_event_open(). That one's weird in that it only
> > > > > requires PTRACE_MODE_READ, but actually allows you to sample stuff
> > > > > like userspace stack and register contents (if perf_event_paranoid is
> > > > > 1 or 2). Maybe for SELinux things (and maybe also for Yama), there
> > > > > should be a level in between that allows fully inspecting the process
> > > > > (for purposes like profiling) but without the ability to corrupt its
> > > > > memory or registers or things like that. Or maybe perf_event_open()
> > > > > should just use the ATTACH mode.
> > > >
> > > > Thanks for the clarification. I still cannot say I would have a good
> > > > mental picture. Having something in Documentation/core-api/ sounds
> > > > > really needed. Wrt perf_event_open it sounds really odd it can do
> > > > more than other places restrict indeed. Something for the respective
> > > > maintainer but I strongly suspect people simply copy the pattern from
> > > > other places because the expected semantic is not really clear.
> > > >
> > >
> > > Sorry, back to the matters of this patch. Are there any actionable
> > > items for me to take care of before it can be accepted? The only
> > > request from Andrew to write a man page is being worked on at
> > > https://lore.kernel.org/linux

[PATCH v5 2/2] Input: move to use request_irq by IRQF_NO_AUTOEN flag

2021-03-03 Thread Barry Song

Calling disable_irq() after request_irq() still leaves a time gap in
which interrupts can arrive. request_irq() with the IRQF_NO_AUTOEN flag
keeps the IRQ from being auto-enabled when it is requested.

On the other hand, request_irq() after setting IRQ_NOAUTOEN as
below
irq_set_status_flags(irq, IRQ_NOAUTOEN);
request_irq(dev, irq...);
can also be replaced by request_irq() with IRQF_NO_AUTOEN flag.

Signed-off-by: Barry Song 
---
 drivers/input/keyboard/tca6416-keypad.c  | 3 +--
 drivers/input/keyboard/tegra-kbc.c   | 5 ++---
 drivers/input/touchscreen/ar1021_i2c.c   | 5 +
 drivers/input/touchscreen/atmel_mxt_ts.c | 5 ++---
 drivers/input/touchscreen/bu21029_ts.c   | 4 ++--
 drivers/input/touchscreen/cyttsp_core.c  | 5 ++---
 drivers/input/touchscreen/melfas_mip4.c  | 5 ++---
 drivers/input/touchscreen/mms114.c   | 4 ++--
 drivers/input/touchscreen/stmfts.c   | 3 +--
 drivers/input/touchscreen/wm831x-ts.c| 3 +--
 drivers/input/touchscreen/zinitix.c  | 4 ++--
 11 files changed, 18 insertions(+), 28 deletions(-)

diff --git a/drivers/input/keyboard/tca6416-keypad.c 
b/drivers/input/keyboard/tca6416-keypad.c
index 9b0f9665dcb0..2a9755910065 100644
--- a/drivers/input/keyboard/tca6416-keypad.c
+++ b/drivers/input/keyboard/tca6416-keypad.c
@@ -274,7 +274,7 @@ static int tca6416_keypad_probe(struct i2c_client *client,
error = request_threaded_irq(chip->irqnum, NULL,
 tca6416_keys_isr,
 IRQF_TRIGGER_FALLING |
-   IRQF_ONESHOT,
+IRQF_ONESHOT | IRQF_NO_AUTOEN,
 "tca6416-keypad", chip);
if (error) {
dev_dbg(&client->dev,
@@ -282,7 +282,6 @@ static int tca6416_keypad_probe(struct i2c_client *client,
chip->irqnum, error);
goto fail1;
}
-   disable_irq(chip->irqnum);
}
 
error = input_register_device(input);
diff --git a/drivers/input/keyboard/tegra-kbc.c 
b/drivers/input/keyboard/tegra-kbc.c
index 9671842a082a..570fe18c0ce9 100644
--- a/drivers/input/keyboard/tegra-kbc.c
+++ b/drivers/input/keyboard/tegra-kbc.c
@@ -694,14 +694,13 @@ static int tegra_kbc_probe(struct platform_device *pdev)
input_set_drvdata(kbc->idev, kbc);
 
err = devm_request_irq(&pdev->dev, kbc->irq, tegra_kbc_isr,
-  IRQF_TRIGGER_HIGH, pdev->name, kbc);
+  IRQF_TRIGGER_HIGH | IRQF_NO_AUTOEN,
+  pdev->name, kbc);
if (err) {
dev_err(&pdev->dev, "failed to request keyboard IRQ\n");
return err;
}
 
-   disable_irq(kbc->irq);
-
err = input_register_device(kbc->idev);
if (err) {
dev_err(&pdev->dev, "failed to register input device\n");
diff --git a/drivers/input/touchscreen/ar1021_i2c.c 
b/drivers/input/touchscreen/ar1021_i2c.c
index c0d5c2413356..dc6a85362a40 100644
--- a/drivers/input/touchscreen/ar1021_i2c.c
+++ b/drivers/input/touchscreen/ar1021_i2c.c
@@ -125,7 +125,7 @@ static int ar1021_i2c_probe(struct i2c_client *client,
 
error = devm_request_threaded_irq(&client->dev, client->irq,
  NULL, ar1021_i2c_irq,
- IRQF_ONESHOT,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
  "ar1021_i2c", ar1021);
if (error) {
dev_err(&client->dev,
@@ -133,9 +133,6 @@ static int ar1021_i2c_probe(struct i2c_client *client,
return error;
}
 
-   /* Disable the IRQ, we'll enable it in ar1021_i2c_open() */
-   disable_irq(client->irq);
-
error = input_register_device(ar1021->input);
if (error) {
dev_err(&client->dev,
diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c 
b/drivers/input/touchscreen/atmel_mxt_ts.c
index 383a848eb601..3c837c7b24b3 100644
--- a/drivers/input/touchscreen/atmel_mxt_ts.c
+++ b/drivers/input/touchscreen/atmel_mxt_ts.c
@@ -3156,15 +3156,14 @@ static int mxt_probe(struct i2c_client *client, const 
struct i2c_device_id *id)
}
 
error = devm_request_threaded_irq(&client->dev, client->irq,
- NULL, mxt_interrupt, IRQF_ONESHOT,
+ NULL, mxt_interrupt,
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
  client->name, data);
if (error) {
dev_err(&client->dev, "Failed to register interrupt\n");
return error;
}
 
-   disable_irq(client->irq);
-
error = regulator_bulk_enable(ARRAY_SIZE(data->regulators),
  data->regulators

Re: [PATCH] mm/memcg: set memcg when split pages

2021-03-03 Thread Hugh Dickins

On Tue, 2 Mar 2021, Johannes Weiner wrote:
> On Tue, Mar 02, 2021 at 12:24:41PM -0800, Hugh Dickins wrote:
> > On Tue, 2 Mar 2021, Michal Hocko wrote:
> > > [Cc Johannes for awareness and fixup Nick's email]
> > > 
> > > On Tue 02-03-21 01:34:51, Zhou Guanghui wrote:
> > > > When a page is split, the memory cgroup info recorded in the first
> > > > page is not copied to the tail pages. In this case, when the tail
> > > > pages are freed, the uncharge operation is not performed. As a result,
> > > > the usage of this memcg keeps increasing, and an OOM may occur.
> > > > 
> > > > So the first page's memory cgroup info needs to be copied to the
> > > > tail pages when a page is split.
> > > 
> > > I was not aware that alloc_pages_exact is used for accounted allocations
> > > but git grep told me otherwise so this is not a theoretical one. Both
> > > users (arm64 and s390 kvm) are quite recent AFAICS. split_page is also
> > > used in dma allocator but I got lost in indirection so I have no idea
> > > whether there are any users there.
> > 
> > Yes, it's a bit worrying that such a low-level thing as split_page()
> > can now get caught up in memcg accounting, but I suppose that's okay.
> > 
> > I feel rather strongly that whichever way it is done, THP splitting
> > and split_page() should use the same interface to memcg.
> > 
> > And a look at mem_cgroup_split_huge_fixup() suggests that nowadays
> > there need to be css_get()s too - or better, a css_get_many().
> > 
> > Its #ifdef CONFIG_TRANSPARENT_HUGEPAGE should be removed, rename
> > it mem_cgroup_split_page_fixup(), and take order from caller.
> 
> +1
> 
> There is already a split_page_owner() in both these places as well
> which does a similar thing. Maybe we can match that by calling it
> split_page_memcg() and having it take a nr of pages?

Agreed on both counts :) "fixup" was not an inspiring name.

> 
> > Though I've never much liked that separate pass: would it be
> > better page by page, like this copy_page_memcg() does?  Though
> > mem_cgroup_disabled() and css_getting make that less appealing.
> 
> Agreed on both counts. mem_cgroup_disabled() is a jump label and would
> be okay, IMO, but the refcounting - though it is (usually) per-cpu -
> adds at least two branches and rcu read locking.

Re: [PATCH net 1/1] stmmac: intel: Fix mdio bus registration issue for TGL-H/ADL-S

2021-03-03 Thread patchwork-bot+netdevbpf

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Tue,  2 Mar 2021 16:57:21 +0800 you wrote:
> On Intel platforms which consist of two Ethernet Controllers such as
> TGL-H and ADL-S, a unique MDIO bus id is required for MDIO bus to be
> successfully registered:
> 
> [   13.076133] sysfs: cannot create duplicate filename 
> '/class/mdio_bus/stmmac-1'
> [   13.083404] CPU: 8 PID: 1898 Comm: systemd-udevd Tainted: G U  
>   5.11.0-net-next #106
> [   13.092410] Hardware name: Intel Corporation Alder Lake Client 
> Platform/AlderLake-S ADP-S DRR4 CRB, BIOS ADLIFSI1.R00.1494.B00.2012031421 
> 12/03/2020
> [   13.105709] Call Trace:
> [   13.108176]  dump_stack+0x64/0x7c
> [   13.111553]  sysfs_warn_dup+0x56/0x70
> [   13.115273]  sysfs_do_create_link_sd.isra.2+0xbd/0xd0
> [   13.120371]  device_add+0x4df/0x840
> [   13.123917]  ? complete_all+0x2a/0x40
> [   13.127636]  __mdiobus_register+0x98/0x310 [libphy]
> [   13.132572]  stmmac_mdio_register+0x1c5/0x3f0 [stmmac]
> [   13.137771]  ? stmmac_napi_add+0xa5/0xf0 [stmmac]
> [   13.142493]  stmmac_dvr_probe+0x806/0xee0 [stmmac]
> [   13.147341]  intel_eth_pci_probe+0x1cb/0x250 [dwmac_intel]
> [   13.152884]  pci_device_probe+0xd2/0x150
> [   13.156897]  really_probe+0xf7/0x4d0
> [   13.160527]  driver_probe_device+0x5d/0x140
> [   13.164761]  device_driver_attach+0x4f/0x60
> [   13.168996]  __driver_attach+0xa2/0x140
> [   13.172891]  ? device_driver_attach+0x60/0x60
> [   13.177300]  bus_for_each_dev+0x76/0xc0
> [   13.181188]  bus_add_driver+0x189/0x230
> [   13.185083]  ? 0xc0795000
> [   13.188446]  driver_register+0x5b/0xf0
> [   13.192249]  ? 0xc0795000
> [   13.195577]  do_one_initcall+0x4d/0x210
> [   13.199467]  ? kmem_cache_alloc_trace+0x2ff/0x490
> [   13.204228]  do_init_module+0x5b/0x21c
> [   13.208031]  load_module+0x2a0c/0x2de0
> [   13.211838]  ? __do_sys_finit_module+0xb1/0x110
> [   13.216420]  __do_sys_finit_module+0xb1/0x110
> [   13.220825]  do_syscall_64+0x33/0x40
> [   13.224451]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [   13.229515] RIP: 0033:0x7fc2b1919ccd
> [   13.233113] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 
> f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 
> 01 f0 ff ff 73 01 c3 48 8b 0d 93 31 0c 00 f7 d8 64 89 01 48
> [   13.251912] RSP: 002b:7ffcea2e5b98 EFLAGS: 0246 ORIG_RAX: 
> 0139
> [   13.259527] RAX: ffda RBX: 560558920f10 RCX: 
> 7fc2b1919ccd
> [   13.266706] RDX:  RSI: 7fc2b1a881e3 RDI: 
> 0012
> [   13.273887] RBP: 0002 R08:  R09: 
> 
> [   13.281036] R10: 0012 R11: 0246 R12: 
> 7fc2b1a881e3
> [   13.288183] R13:  R14:  R15: 
> 7ffcea2e5d58
> [   13.295389] libphy: mii_bus stmmac-1 failed to register
> 
> [...]

Here is the summary with links:
  - [net,1/1] stmmac: intel: Fix mdio bus registration issue for TGL-H/ADL-S
https://git.kernel.org/netdev/net/c/fa706dce2f2d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

[PATCH] clk: qcom: rcg2: Rectify clk_gfx3d rate rounding without mux division

2021-03-03 Thread Marijn Suijten

In case the mux is not divided, parent_req.rate was mistakenly not
assigned, leading __clk_determine_rate to determine the best frequency
setting for a requested rate of 0, resulting in the msm8996 platform not
booting. Rectify this by refactoring the logic to unconditionally assign
parent_req.rate the clock rate the caller is expecting.

Fixes: 7cbb78a99db6 ("clk: qcom: rcg2: Stop hardcoding gfx3d pingpong parent 
numbers")
Reported-by: Konrad Dybcio 
Tested-by: Konrad Dybcio 
Reviewed-By: AngeloGioacchino Del Regno 

Signed-off-by: Marijn Suijten 
---
 drivers/clk/qcom/clk-rcg2.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/clk/qcom/clk-rcg2.c b/drivers/clk/qcom/clk-rcg2.c
index 42f13a2d1cc1..05ff3b0d233e 100644
--- a/drivers/clk/qcom/clk-rcg2.c
+++ b/drivers/clk/qcom/clk-rcg2.c
@@ -730,7 +730,8 @@ static int clk_gfx3d_determine_rate(struct clk_hw *hw,
struct clk_rate_request parent_req = { };
struct clk_rcg2_gfx3d *cgfx = to_clk_rcg2_gfx3d(hw);
struct clk_hw *xo, *p0, *p1, *p2;
-   unsigned long request, p0_rate;
+   unsigned long p0_rate;
+   u8 mux_div = cgfx->div;
int ret;
 
p0 = cgfx->hws[0];
@@ -750,14 +751,15 @@ static int clk_gfx3d_determine_rate(struct clk_hw *hw,
return 0;
}
 
-   request = req->rate;
-   if (cgfx->div > 1)
-   parent_req.rate = request = request * cgfx->div;
+   if (mux_div == 0)
+   mux_div = 1;
+
+   parent_req.rate = req->rate * mux_div;
 
/* This has to be a fixed rate PLL */
p0_rate = clk_hw_get_rate(p0);
 
-   if (request == p0_rate) {
+   if (parent_req.rate == p0_rate) {
req->rate = req->best_parent_rate = p0_rate;
req->best_parent_hw = p0;
return 0;
@@ -765,7 +767,7 @@ static int clk_gfx3d_determine_rate(struct clk_hw *hw,
 
if (req->best_parent_hw == p0) {
/* Are we going back to a previously used rate? */
-   if (clk_hw_get_rate(p2) == request)
+   if (clk_hw_get_rate(p2) == parent_req.rate)
req->best_parent_hw = p2;
else
req->best_parent_hw = p1;
@@ -780,8 +782,7 @@ static int clk_gfx3d_determine_rate(struct clk_hw *hw,
return ret;
 
req->rate = req->best_parent_rate = parent_req.rate;
-   if (cgfx->div > 1)
-   req->rate /= cgfx->div;
+   req->rate /= mux_div;
 
return 0;
 }
-- 
2.30.1
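
The rate arithmetic described in the commit message above can be modelled
in plain userspace C. This is only a sketch: the function names are made up
for illustration, nothing here touches the clk framework, and the "old"
variant merely mirrors the removed lines of the diff (parent_req starts
zeroed and is only written when the divider is greater than 1).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the rate arithmetic in
 * clk_gfx3d_determine_rate(); this is not kernel code.
 */

/* Patched logic: a divider of 0 is treated as "not divided". */
static unsigned long gfx3d_parent_rate(unsigned long rate, uint8_t div)
{
	uint8_t mux_div = div ? div : 1;

	return rate * mux_div;
}

/* Rate reported back to the caller once the parent rate is known. */
static unsigned long gfx3d_child_rate(unsigned long parent_rate, uint8_t div)
{
	uint8_t mux_div = div ? div : 1;

	return parent_rate / mux_div;
}

/* Removed logic: the parent rate stays 0 unless the divider is > 1. */
static unsigned long gfx3d_parent_rate_old(unsigned long rate, uint8_t div)
{
	unsigned long parent_rate = 0;	/* models "parent_req = { }" */

	if (div > 1)
		parent_rate = rate * div;
	return parent_rate;
}

/* Small usage example; the 500 MHz figure is arbitrary. */
void gfx3d_rate_selfcheck(void)
{
	/* Undivided mux: the old code asked the parent for 0 Hz. */
	assert(gfx3d_parent_rate_old(500000000UL, 1) == 0);
	assert(gfx3d_parent_rate(500000000UL, 1) == 500000000UL);

	/* Divided mux: both variants agree. */
	assert(gfx3d_parent_rate_old(250000000UL, 2) == 500000000UL);
	assert(gfx3d_parent_rate(250000000UL, 2) == 500000000UL);
}
```

Normalising the divider once and then multiplying and dividing
unconditionally removes the two "div > 1" branches, which is where the
missing assignment was hiding.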

Re: [PATCH v2] MIPS: select CPU_MIPS64 for remaining MIPS64 CPUs

2021-03-03 Thread Thomas Bogendoerfer

On Mon, Mar 01, 2021 at 05:27:46PM +0100, Jason A. Donenfeld wrote:
> Hey Thomas,
> 
> Would you mind sending this for 5.12 in an rc at some point, rather
> than waiting for 5.13? I'd like to see this backported to 5.10 and 5.4
> for OpenWRT.

Why is this so important for OpenWRT? Just to select CRYPTO_POLY1305_MIPS?

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]

Re: [PATCH] MIPS: BMIPS: Reserve exception base to prevent corruption

2021-03-03 Thread Thomas Bogendoerfer

On Mon, Mar 01, 2021 at 08:19:38PM -0800, Florian Fainelli wrote:
> BMIPS is one of the few platforms that do change the exception base.
> After commit 2dcb39645441 ("memblock: do not start bottom-up allocations
> with kernel_end") we started seeing BMIPS boards fail to boot with the
> built-in FDT being corrupted.
> 
> Before the cited commit, early allocations would be in the [kernel_end,
> RAM_END] range, but after the commit they would be within [RAM_START +
> PAGE_SIZE, RAM_END].
> 
> The custom exception base handler that is installed by
> bmips_ebase_setup() done for BMIPS5000 CPUs ends up trampling on the
> memory region allocated by unflatten_and_copy_device_tree() thus
> corrupting the FDT used by the kernel.
> 
> To fix this, we need to perform an early reservation of the custom
> exception that is going to be installed and this needs to happen at
> plat_mem_setup() time to ensure that unflatten_and_copy_device_tree()
> finds a space that is suitable, away from reserved memory.
> 
> Huge thanks to Serge for analysing and proposing a solution to this
> issue.
> 
> Fixes: 2dcb39645441 ("memblock: do not start bottom-up allocations 
> with kernel_end")
> Debugged-by: Serge Semin 
> Reported-by: Kamal Dasu 
> Signed-off-by: Florian Fainelli 
> ---
> Thomas,
> 
> This is intended as a stop-gap solution for 5.12-rc1 and to be picked up
> by the stable team for 5.11. We should find a safer way to avoid these
> problems for 5.13 maybe.

Let's try to make it in one go. How about reserving vector space in
cpu_probe, if it's known there, and leaving the rest to trap_init()?

The patch below got a quick test on IP22 (real hardware) and malta (qemu).
Not sure if I got all BMIPS parts correct, so please check/test.
BTW, do we really need to EXPORT_SYMBOL ebase?

Thomas,


diff --git a/arch/mips/include/asm/setup.h b/arch/mips/include/asm/setup.h
index bb36a400203d..3ef62c23c34f 100644
--- a/arch/mips/include/asm/setup.h
+++ b/arch/mips/include/asm/setup.h
@@ -23,7 +23,7 @@ typedef void (*vi_handler_t)(void);
 extern void *set_vi_handler(int n, vi_handler_t addr);
 
 extern void *set_except_vector(int n, void *addr);
-extern unsigned long ebase;
+extern unsigned long ebase, ebase_size;
 extern unsigned int hwrena;
 extern void per_cpu_trap_init(bool);
 extern void cpu_cache_init(void);
diff --git a/arch/mips/include/asm/traps.h b/arch/mips/include/asm/traps.h
index 6aa8f126a43d..f7d59831aae3 100644
--- a/arch/mips/include/asm/traps.h
+++ b/arch/mips/include/asm/traps.h
@@ -26,6 +26,8 @@ extern void (*board_cache_error_setup)(void);
 extern int register_nmi_notifier(struct notifier_block *nb);
 extern char except_vec_nmi[];
 
+#define VECTORSPACING 0x100/* for EI/VI mode */
+
 #define nmi_notifier(fn, pri)  \
 ({ \
static struct notifier_block fn##_nb = {\
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index 9a89637b4ecf..eef1a4e304da 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -25,7 +26,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 
 #include "fpu-probe.h"
@@ -1628,6 +1631,8 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips 
*c, unsigned int cpu)
c->cputype = CPU_BMIPS3300;
__cpu_name[cpu] = "Broadcom BMIPS3300";
set_elf_platform(cpu, "bmips3300");
+   ebase = 0x8400;
+   ebase_size = VECTORSPACING * 64;
break;
case PRID_IMP_BMIPS43XX: {
int rev = c->processor_id & PRID_REV_MASK;
@@ -1638,6 +1643,8 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips 
*c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS4380";
set_elf_platform(cpu, "bmips4380");
c->options |= MIPS_CPU_RIXI;
+   ebase = 0x8400;
+   ebase_size = VECTORSPACING * 64;
} else {
c->cputype = CPU_BMIPS4350;
__cpu_name[cpu] = "Broadcom BMIPS4350";
@@ -1654,6 +1661,8 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips 
*c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS5000";
set_elf_platform(cpu, "bmips5000");
c->options |= MIPS_CPU_ULRI | MIPS_CPU_RIXI;
+   ebase = 0x80001000;
+   ebase_size = VECTORSPACING * 64;
break;
}
 }
@@ -2133,6 +2142,13 @@ void cpu_probe(void)
if (cpu == 0)
__ua_limit = ~((1ull << cpu_vmbits) - 1);
 #endif
+
+   if (ebase_size == 0 && !cpu_has_mips_r2_r6) {
+   ebase = CAC_BASE;
+   ebase_size = 0x400;
+   }
+   if (eb

[PATCH] mfd: stmpe: Revert "Constify static struct resource"

2021-03-03 Thread Rikard Falkeborn

Andy noted that the constification of some static resource structs in
intel_quark_i2c_gpio.c was incorrect. It turns out there is another
change from the same series that is also incorrect, in stmpe.c.
These structures are modified at init and cannot be made const.

This reverts commit 8d7b3a6dac4eae22c58b0853696cbd256966741b.

Fixes: 8d7b3a6dac4e ("mfd: stmpe: Constify static struct resource")
Cc: Andy Shevchenko 
Signed-off-by: Rikard Falkeborn 
---
I went through the series and this was the only additional issue I
found. Sorry about that.

 drivers/mfd/stmpe.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/mfd/stmpe.c b/drivers/mfd/stmpe.c
index 90f3292230c9..1aee3b3253fc 100644
--- a/drivers/mfd/stmpe.c
+++ b/drivers/mfd/stmpe.c
@@ -312,7 +312,7 @@ EXPORT_SYMBOL_GPL(stmpe_set_altfunc);
  * GPIO (all variants)
  */
 
-static const struct resource stmpe_gpio_resources[] = {
+static struct resource stmpe_gpio_resources[] = {
/* Start and end filled dynamically */
{
.flags  = IORESOURCE_IRQ,
@@ -336,7 +336,7 @@ static const struct mfd_cell stmpe_gpio_cell_noirq = {
  * Keypad (1601, 2401, 2403)
  */
 
-static const struct resource stmpe_keypad_resources[] = {
+static struct resource stmpe_keypad_resources[] = {
{
.name   = "KEYPAD",
.flags  = IORESOURCE_IRQ,
@@ -357,7 +357,7 @@ static const struct mfd_cell stmpe_keypad_cell = {
 /*
  * PWM (1601, 2401, 2403)
  */
-static const struct resource stmpe_pwm_resources[] = {
+static struct resource stmpe_pwm_resources[] = {
{
.name   = "PWM0",
.flags  = IORESOURCE_IRQ,
@@ -445,7 +445,7 @@ static struct stmpe_variant_info stmpe801_noirq = {
  * Touchscreen (STMPE811 or STMPE610)
  */
 
-static const struct resource stmpe_ts_resources[] = {
+static struct resource stmpe_ts_resources[] = {
{
.name   = "TOUCH_DET",
.flags  = IORESOURCE_IRQ,
@@ -467,7 +467,7 @@ static const struct mfd_cell stmpe_ts_cell = {
  * ADC (STMPE811)
  */
 
-static const struct resource stmpe_adc_resources[] = {
+static struct resource stmpe_adc_resources[] = {
{
.name   = "STMPE_TEMP_SENS",
.flags  = IORESOURCE_IRQ,
-- 
2.30.1

Re: [PATCH] ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls

2021-03-03 Thread Sergei Trofimovich

On Sun, 21 Feb 2021 00:25:53 +
Sergei Trofimovich  wrote:

> In https://bugs.gentoo.org/769614 Dmitry noticed that
> `ptrace(PTRACE_GET_SYSCALL_INFO)` does not work for syscalls called
> via glibc's syscall() wrapper.
> 
> ia64 has two ways to call syscalls from userspace: via `break` and via
> `eps` instructions.
> 
> The difference is in stack layout:
> 
> 1. `eps` creates simple stack frame: no locals, in{0..7} == out{0..8}
> 2. `break` uses userspace stack frame: may be locals (glibc provides
>one), in{0..7} == out{0..8}.
> 
> Both work fine in syscall handling code itself.
> 
> But `ptrace(PTRACE_GET_SYSCALL_INFO)` uses unwind mechanism to
> re-extract syscall arguments but it does not account for locals.
> 
> The change always skips locals registers. It should not change `eps`
> path as kernel's handler already enforces locals=0 and fixes `break`.
> 
> Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
> 
> CC: Oleg Nesterov 
> CC: linux-i...@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: Andrew Morton 
> Reported-by: Dmitry V. Levin 
> Bug: https://bugs.gentoo.org/769614
> Signed-off-by: Sergei Trofimovich 
> ---
>  arch/ia64/kernel/ptrace.c | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/ia64/kernel/ptrace.c b/arch/ia64/kernel/ptrace.c
> index c3490ee2daa5..e14f5653393a 100644
> --- a/arch/ia64/kernel/ptrace.c
> +++ b/arch/ia64/kernel/ptrace.c
> @@ -2013,27 +2013,39 @@ static void syscall_get_set_args_cb(struct 
> unw_frame_info *info, void *data)
>  {
>   struct syscall_get_set_args *args = data;
>   struct pt_regs *pt = args->regs;
> - unsigned long *krbs, cfm, ndirty;
> + unsigned long *krbs, cfm, ndirty, nlocals, nouts;
>   int i, count;
>  
>   if (unw_unwind_to_user(info) < 0)
>   return;
>  
> + /*
> +  * We get here via a few paths:
> +  * - break instruction: cfm is shared with caller.
> +  *   syscall args are in out= regs, locals are non-empty.
> +  * - epsinstruction: cfm is set by br.call
> +  *   locals don't exist.
> +  *
> +  * For both cases argguments are reachable in cfm.sof - cfm.sol.
> +  * CFM: [ ... | sor: 17..14 | sol : 13..7 | sof : 6..0 ]
> +  */
>   cfm = pt->cr_ifs;
> + nlocals = (cfm >> 7) & 0x7f; /* aka sol */
> + nouts = (cfm & 0x7f) - nlocals; /* aka sof - sol */
>   krbs = (unsigned long *)info->task + IA64_RBS_OFFSET/8;
>   ndirty = ia64_rse_num_regs(krbs, krbs + (pt->loadrs >> 19));
>  
>   count = 0;
>   if (in_syscall(pt))
> - count = min_t(int, args->n, cfm & 0x7f);
> + count = min_t(int, args->n, nouts);
>  
> + /* Iterate over outs. */
>   for (i = 0; i < count; i++) {
> + int j = ndirty + nlocals + i + args->i;
>   if (args->rw)
> - *ia64_rse_skip_regs(krbs, ndirty + i + args->i) =
> - args->args[i];
> + *ia64_rse_skip_regs(krbs, j) = args->args[i];
>   else
> - args->args[i] = *ia64_rse_skip_regs(krbs,
> - ndirty + i + args->i);
> + args->args[i] = *ia64_rse_skip_regs(krbs, j);
>   }
>  
>   if (!args->rw) {
> -- 
> 2.30.1
> 

Andrew, would it be fine to pass it through misc tree?
Or should it go through Oleg as it's about ptrace?

-- 

  Sergei
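
For readers who do not have the CFM layout in their head, the field
extraction the patch relies on can be written out as a small standalone
sketch. The struct and helper names are hypothetical, and the "slot" helper
models only the index arithmetic (ndirty + nlocals + i) with args->i taken
as 0; it does not model ia64_rse_skip_regs() or the register backing store.

```c
#include <assert.h>

/*
 * Field layout as quoted in the patch comment:
 * CFM: [ ... | sor: 17..14 | sol : 13..7 | sof : 6..0 ]
 */
struct cfm_fields {
	unsigned long sof;	/* size of frame, bits 6..0 */
	unsigned long sol;	/* what the patch calls nlocals, bits 13..7 */
	unsigned long nouts;	/* sof - sol */
};

static struct cfm_fields cfm_decode(unsigned long cfm)
{
	struct cfm_fields f;

	f.sof = cfm & 0x7f;
	f.sol = (cfm >> 7) & 0x7f;
	f.nouts = f.sof - f.sol;
	return f;
}

/* Register index of syscall argument i after the fix (args->i == 0). */
static unsigned long arg_slot(unsigned long cfm, unsigned long ndirty,
			      unsigned long i)
{
	return ndirty + cfm_decode(cfm).sol + i;
}

/* Usage example with made-up frame sizes. */
void cfm_selfcheck(void)
{
	/* break-style frame: 3 locals, 8 outs, so sof = 11. */
	unsigned long with_locals = (3UL << 7) | 11UL;
	/* Frame without locals: sol = 0, sof = 8. */
	unsigned long no_locals = 8UL;

	assert(cfm_decode(with_locals).nouts == 8);
	assert(arg_slot(with_locals, 5, 2) == 10);	/* 5 + 3 + 2 */

	/* With sol == 0 the index is ndirty + i, same as before the patch. */
	assert(arg_slot(no_locals, 5, 2) == 7);
}
```

The second case is why the change is expected to leave the path without
locals untouched: adding a zero-sized locals area does not move the index.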

Re: [PATCH] counter: stm32-timer-cnt: fix ceiling write max value

2021-03-03 Thread William Breathitt Gray

On Tue, Mar 02, 2021 at 06:03:25PM +0100, Fabrice Gasnier wrote:
> On 3/2/21 3:56 PM, William Breathitt Gray wrote:
> > Side question: if priv->ceiling is tracking the current ceiling
> > configuration, would it make sense to change stm32_count_ceiling_read()
> > to print the value of priv->ceiling instead of doing a regmap_read()
> > call?
> 
> Hi William,
> 
> Thanks for reviewing.
> 
> I'd be fine either way. So no objection to move to the priv->ceiling
> (cached) value. It could also here here.
> By looking at this, I figured out there's probably another thing to fix
> here, for initial conditions.
> 
> At probe time priv->ceiling is initialized to max value (ex 65535 for a
> 16-bit counter). But the register content is 0 (cleared by the mfd driver at
> probe time).
> 
> - So, reading ceiling from sysfs currently reports 0 (regmap_read())
> after booting and probing.
> 
> I see two cases at this point:
> - In case the counter gets enabled without any prior configuration, it
> won't count: ceiling value (e.g. 65535) should be written to register
> before it is enabled, so the counter will actually count. So there's
> room for a fix here.
> 
> - In case function gets set (ex: quadrature x4), priv->ceiling (e.g.
> 65535) gets written to the register (although it's been read earlier as
> 0 from sysfs).
> This could be fixed by reading the priv->ceiling in
> stm32_count_ceiling_read() as you're asking (provided 1st case has been
> fixed as well)
> 
> I'll probably prepare one or two patches for the above cases, if you agree ?
> 
> Best Regards,
> Fabrice

Looking through the driver, it doesn't seem like priv->ceiling is used
in any performance critical code, just the callbacks for count_write()
and function_set(). It might make more sense to remove priv->ceiling and
replace it with the regmap_read() calls where necessary so that we
always get the most current ceiling value from the device when needed.

As for the default ceiling value for the device at probe time, this
should probably be set to the max value because that is what a normal
user would expect when loading a Counter driver (a ceiling value of 0 at
startup is somewhat unintuitive).

If you prepare those two patches, then that should resolve this.

William Breathitt Gray



Re: [PATCH] ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

2021-03-03 Thread Sergei Trofimovich

On Sun, 21 Feb 2021 00:25:54 +
Sergei Trofimovich  wrote:

> In https://bugs.gentoo.org/769614 Dmitry noticed that
> `ptrace(PTRACE_GET_SYSCALL_INFO)` does not return error sign properly.
> 
> The bug is in mismatch between get/set errors:
> 
> static inline long syscall_get_error(struct task_struct *task,
>  struct pt_regs *regs)
> {
> return regs->r10 == -1 ? regs->r8:0;
> }
> 
> static inline long syscall_get_return_value(struct task_struct *task,
> struct pt_regs *regs)
> {
> return regs->r8;
> }
> 
> static inline void syscall_set_return_value(struct task_struct *task,
> struct pt_regs *regs,
> int error, long val)
> {
> if (error) {
> /* error < 0, but ia64 uses > 0 return value */
> regs->r8 = -error;
> regs->r10 = -1;
> } else {
> regs->r8 = val;
> regs->r10 = 0;
> }
> }
> 
> Tested on v5.10 on rx3600 machine (ia64 9040 CPU).
> 
> CC: linux-i...@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> CC: Andrew Morton 
> Reported-by: Dmitry V. Levin 
> Bug: https://bugs.gentoo.org/769614
> Signed-off-by: Sergei Trofimovich 
> ---
>  arch/ia64/include/asm/syscall.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/ia64/include/asm/syscall.h b/arch/ia64/include/asm/syscall.h
> index 6c6f16e409a8..0d23c0049301 100644
> --- a/arch/ia64/include/asm/syscall.h
> +++ b/arch/ia64/include/asm/syscall.h
> @@ -32,7 +32,7 @@ static inline void syscall_rollback(struct task_struct 
> *task,
>  static inline long syscall_get_error(struct task_struct *task,
>struct pt_regs *regs)
>  {
> - return regs->r10 == -1 ? regs->r8:0;
> + return regs->r10 == -1 ? -regs->r8:0;
>  }
>  
>  static inline long syscall_get_return_value(struct task_struct *task,
> -- 
> 2.30.1
> 

Andrew, would it be fine to pass it through misc tree?
Or should it go through Oleg as it's mostly about ptrace?

-- 

  Sergei
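
The mismatch is easy to reproduce outside the kernel. The sketch below uses
a hypothetical two-field stand-in for pt_regs and copies the set/get logic
quoted above; the "old" and "new" getters correspond to the removed and
added lines of the diff.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the two pt_regs fields involved. */
struct regs_model {
	long r8;
	long r10;
};

/* Same logic as the quoted syscall_set_return_value(). */
static void model_set_return_value(struct regs_model *regs, int error,
				   long val)
{
	if (error) {
		/* error < 0 on entry; r8 holds the positive errno */
		regs->r8 = -error;
		regs->r10 = -1;
	} else {
		regs->r8 = val;
		regs->r10 = 0;
	}
}

/* Getter before the patch: hands back the positive value stored in r8. */
static long model_get_error_old(const struct regs_model *regs)
{
	return regs->r10 == -1 ? regs->r8 : 0;
}

/* Getter after the patch: negates r8 so it round-trips with the setter. */
static long model_get_error_new(const struct regs_model *regs)
{
	return regs->r10 == -1 ? -regs->r8 : 0;
}

/* Usage example: store -ENOENT, then read it back both ways. */
void syscall_error_selfcheck(void)
{
	struct regs_model regs;

	model_set_return_value(&regs, -ENOENT, 0);
	assert(model_get_error_old(&regs) == ENOENT);	/* wrong sign */
	assert(model_get_error_new(&regs) == -ENOENT);	/* round-trips */

	model_set_return_value(&regs, 0, 42);
	assert(model_get_error_new(&regs) == 0);
}
```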

Re: [PATCH] mm: delete bool "migrated"

2021-03-03 Thread Andrew Morton

On Mon,  1 Mar 2021 20:57:01 +0800 Wang Qing  wrote:

> Smatch gives the warning:
>   do_numa_page() warn: assigning (-11) to unsigned variable 'migrated'
> 
> ...
>
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4102,7 +4102,6 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>   int page_nid = NUMA_NO_NODE;
>   int last_cpupid;
>   int target_nid;
> - bool migrated = false;
>   pte_t pte, old_pte;
>   bool was_writable = pte_savedwrite(vmf->orig_pte);
>   int flags = 0;
> @@ -4172,8 +4171,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
>   }
>  
>   /* Migrate to the requested node */
> - migrated = migrate_misplaced_page(page, vma, target_nid);
> - if (migrated) {
> + if (migrate_misplaced_page(page, vma, target_nid)) {
>   page_nid = target_nid;
>   flags |= TNF_MIGRATED;
>   } else

Looks right.

Methinks both migrate_misplaced_page() and numamigrate_isolate_page()
should return bools.  (And that their return values should be documented!)
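
As a standalone illustration of what the Smatch warning is about: in C, any
nonzero value converts to true when stored in a bool, so a negative error
code would be indistinguishable from success. The helper below is
hypothetical and says nothing about which values migrate_misplaced_page()
can actually return; it only shows the conversion rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Stores an int result in a bool, as the removed "migrated" variable did. */
static bool as_migrated_flag(int ret)
{
	bool migrated = ret;	/* nonzero (even negative) becomes true */

	return migrated;
}

/* Usage example with the value named in the warning. */
void migrated_flag_selfcheck(void)
{
	assert(as_migrated_flag(1) == true);
	assert(as_migrated_flag(0) == false);
	assert(as_migrated_flag(-11) == true);	/* the -11 from the warning */
}
```

Note that testing the int directly in an if () applies the same
nonzero-is-true rule, so dropping the variable silences the warning without
changing which branch is taken. Making the callee return bool, as suggested
above, is what would remove the ambiguity.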

Re: [PATCH] [v2] Input: Add "Share" button to Microsoft Xbox One controller.

2021-03-03 Thread Chris Ye

Hi Bastien,
"Share button" is the name Microsoft uses for it. It actually has a
HID descriptor defined in the Bluetooth interface, where the HID usage
is:
consumer 0xB2:
0x05, 0x0C,//   Usage Page (Consumer)
0x0A, 0xB2, 0x00,  //   Usage (Record)
Microsoft wants the same key code to be generated consistently for USB
and bluetooth.
Thanks!
Chris
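
For reference, the five descriptor bytes quoted above can be decoded with a
few lines of C. This is a rough sketch of HID short-item parsing (prefix
byte: tag in bits 7..4, type in bits 3..2, size code in bits 1..0); the
struct and function names are invented for the example, and long items
(prefix 0xFE) are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct hid_short_item {
	uint8_t tag;	/* bits 7..4 of the prefix byte */
	uint8_t type;	/* bits 3..2: 0 = Main, 1 = Global, 2 = Local */
	uint8_t size;	/* payload length in bytes: 0, 1, 2 or 4 */
	uint32_t data;	/* payload, stored little-endian in the descriptor */
};

/* Decodes one short item; returns bytes consumed, or 0 if truncated. */
static size_t hid_parse_short_item(const uint8_t *buf, size_t len,
				   struct hid_short_item *out)
{
	static const uint8_t sizes[4] = { 0, 1, 2, 4 };
	size_t i;

	if (len < 1)
		return 0;
	out->tag = buf[0] >> 4;
	out->type = (buf[0] >> 2) & 0x3;
	out->size = sizes[buf[0] & 0x3];
	if (len < 1u + out->size)
		return 0;
	out->data = 0;
	for (i = 0; i < out->size; i++)
		out->data |= (uint32_t)buf[1 + i] << (8 * i);
	return 1u + out->size;
}

/* Usage example on the bytes from the mail. */
void hid_share_button_selfcheck(void)
{
	static const uint8_t desc[] = { 0x05, 0x0C, 0x0A, 0xB2, 0x00 };
	struct hid_short_item item;
	size_t n;

	/* 0x05: Global item, tag 0 (Usage Page), 1 byte: 0x0C = Consumer */
	n = hid_parse_short_item(desc, sizeof(desc), &item);
	assert(n == 2 && item.type == 1 && item.tag == 0 && item.data == 0x0C);

	/* 0x0A: Local item, tag 0 (Usage), 2 bytes: 0x00B2 = Record */
	n = hid_parse_short_item(desc + 2, sizeof(desc) - 2, &item);
	assert(n == 3 && item.type == 2 && item.tag == 0 && item.data == 0xB2);
}
```

So the Bluetooth descriptor reports the button as Consumer page usage 0xB2
(Record), which is the usage the mail refers to when it argues for
KEY_RECORD on the USB side.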


On Tue, Mar 2, 2021 at 1:50 AM Bastien Nocera  wrote:
>
> On Thu, 2021-02-25 at 05:32 +, Chris Ye wrote:
> > Add "Share" button input capability and input event mapping for
> > Microsoft Xbox One controller.
> > Fixed Microsoft Xbox One controller share button not working under USB
> > connection.
> >
> > Signed-off-by: Chris Ye 
> > ---
> >  drivers/input/joystick/xpad.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/input/joystick/xpad.c
> > b/drivers/input/joystick/xpad.c
> > index 9f0d07dcbf06..0c3374091aff 100644
> > --- a/drivers/input/joystick/xpad.c
> > +++ b/drivers/input/joystick/xpad.c
> > @@ -79,6 +79,7 @@
> >  #define MAP_DPAD_TO_BUTTONS(1 << 0)
> >  #define MAP_TRIGGERS_TO_BUTTONS(1 << 1)
> >  #define MAP_STICKS_TO_NULL (1 << 2)
> > +#define MAP_SHARE_BUTTON   (1 << 3)
> >  #define DANCEPAD_MAP_CONFIG(MAP_DPAD_TO_BUTTONS
> > |  \
> > MAP_TRIGGERS_TO_BUTTONS |
> > MAP_STICKS_TO_NULL)
> >
> > @@ -130,6 +131,7 @@ static const struct xpad_device {
> > { 0x045e, 0x02e3, "Microsoft X-Box One Elite pad", 0,
> > XTYPE_XBOXONE },
> > { 0x045e, 0x02ea, "Microsoft X-Box One S pad", 0, XTYPE_XBOXONE
> > },
> > { 0x045e, 0x0719, "Xbox 360 Wireless Receiver",
> > MAP_DPAD_TO_BUTTONS, XTYPE_XBOX360W },
> > +   { 0x045e, 0x0b12, "Microsoft X-Box One X pad",
> > MAP_SHARE_BUTTON, XTYPE_XBOXONE },
> > { 0x046d, 0xc21d, "Logitech Gamepad F310", 0, XTYPE_XBOX360 },
> > { 0x046d, 0xc21e, "Logitech Gamepad F510", 0, XTYPE_XBOX360 },
> > { 0x046d, 0xc21f, "Logitech Gamepad F710", 0, XTYPE_XBOX360 },
> > @@ -862,6 +864,8 @@ static void xpadone_process_packet(struct usb_xpad
> > *xpad, u16 cmd, unsigned char
> > /* menu/view buttons */
> > input_report_key(dev, BTN_START,  data[4] & 0x04);
> > input_report_key(dev, BTN_SELECT, data[4] & 0x08);
> > +   if (xpad->mapping & MAP_SHARE_BUTTON)
> > +   input_report_key(dev, KEY_RECORD, data[22] & 0x01);
> >
> > /* buttons A,B,X,Y */
> > input_report_key(dev, BTN_A,data[4] & 0x10);
> > @@ -1669,9 +1673,12 @@ static int xpad_init_input(struct usb_xpad
> > *xpad)
> >
> > /* set up model-specific ones */
> > if (xpad->xtype == XTYPE_XBOX360 || xpad->xtype ==
> > XTYPE_XBOX360W ||
> > -   xpad->xtype == XTYPE_XBOXONE) {
> > +   xpad->xtype == XTYPE_XBOXONE) {
> > for (i = 0; xpad360_btn[i] >= 0; i++)
> > input_set_capability(input_dev, EV_KEY,
> > xpad360_btn[i]);
> > +   if (xpad->mapping & MAP_SHARE_BUTTON) {
> > +   input_set_capability(input_dev, EV_KEY,
> > KEY_RECORD);
>
> Is there not a better keycode to use than "Record"? Should a "share"
> keycode be added?
>
> I couldn't find a share button in the most recent USB HID Usage Tables:
> https://www.usb.org/document-library/hid-usage-tables-121
>
> > +   }
> > } else {
> > for (i = 0; xpad_btn[i] >= 0; i++)
> > input_set_capability(input_dev, EV_KEY,
> > xpad_btn[i]);
>
>
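
The flag-gated button report in the quoted patch can be sketched in standalone C as below. This is only an illustration under assumptions: `share_button_state`, `SHARE_BYTE`, and `SHARE_BIT` are hypothetical names, and the length check is an addition that the quoted patch does not contain. The byte offset (22), the bit (0x01), and the `MAP_SHARE_BUTTON` value are taken from the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_SHARE_BUTTON (1u << 3)  /* mapping flag value from the patch */
#define SHARE_BYTE       22         /* packet byte carrying the button */
#define SHARE_BIT        0x01       /* bit within that byte */

/*
 * Hypothetical helper: returns 1 if the share button is pressed,
 * 0 if it is released, and -1 if the device mapping does not
 * advertise the button or the packet is too short to contain it.
 */
static int share_button_state(unsigned int mapping,
                              const uint8_t *data, size_t len)
{
    if (!(mapping & MAP_SHARE_BUTTON))
        return -1;              /* device has no share button mapping */
    if (len <= SHARE_BYTE)
        return -1;              /* assumption: guard against short packets */
    return (data[SHARE_BYTE] & SHARE_BIT) ? 1 : 0;
}
```

Keeping the check behind a per-device mapping flag means controllers without the button never report the key, which is the same choice the patch makes for both the capability setup and the packet handler.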

Re: [PATCH 07/15] ASoC: cs42l42: Set clock source for both ways of stream

2021-03-03 Thread kernel test robot

Hi Lucas,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on asoc/for-next]
[also build test ERROR on v5.12-rc1 next-20210302]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Lucas-Tanure/Report-jack-and-button-detection-Capture-Support/20210303-012348
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 
for-next
config: arc-allyesconfig (attached as .config)
compiler: arceb-elf-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://github.com/0day-ci/linux/commit/d004d17fc0cf6b114d467e0b352fe619c2d653a4
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Lucas-Tanure/Report-jack-and-button-detection-Capture-Support/20210303-012348
git checkout d004d17fc0cf6b114d467e0b352fe619c2d653a4
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

Note: the 
linux-review/Lucas-Tanure/Report-jack-and-button-detection-Capture-Support/20210303-012348
 HEAD 844324ce32306dc48f88e1a9fb44f51783b3942d builds fine.
  It only hurts bisectibility.

All errors (new ones prefixed by >>):

   sound/soc/codecs/cs42l42.c: In function 'cs42l42_mute_stream':
>> sound/soc/codecs/cs42l42.c:804:3: error: 'cs42l42' undeclared (first use in 
>> this function)
 804 |   cs42l42->stream_use &= ~(1 << stream);
 |   ^~~
   sound/soc/codecs/cs42l42.c:804:3: note: each undeclared identifier is 
reported only once for each function it appears in


vim +/cs42l42 +804 sound/soc/codecs/cs42l42.c

   788  
   789  static int cs42l42_mute_stream(struct snd_soc_dai *dai, int mute, int 
stream)
   790  {
   791  struct snd_soc_component *component = dai->component;
   792  unsigned int regval;
   793  u8 fullScaleVol;
   794  
   795  if (mute) {
   796  /* Mute the headphone */
   797  if (stream == SNDRV_PCM_STREAM_PLAYBACK)
   798  snd_soc_component_update_bits(component, 
CS42L42_HP_CTL,
   799
CS42L42_HP_ANA_AMUTE_MASK |
   800
CS42L42_HP_ANA_BMUTE_MASK,
   801
CS42L42_HP_ANA_AMUTE_MASK |
   802
CS42L42_HP_ANA_BMUTE_MASK);
   803  
 > 804  cs42l42->stream_use &= ~(1 << stream);
   805  if(!cs42l42->stream_use) {
   806  /*
   807   * Switch to the internal oscillator.
   808   * SCLK must remain running until after this 
clock switch.
   809   * Without a source of clock the I2C bus 
doesn't work.
   810   */
   811  snd_soc_component_update_bits(component, 
CS42L42_OSC_SWITCH,
   812
CS42L42_SCLK_PRESENT_MASK, 0);
   813  snd_soc_component_update_bits(component, 
CS42L42_PLL_CTL1,
   814
CS42L42_PLL_START_MASK, 0);
   815  }
   816  } else {
   817  if (!cs42l42->stream_use) {
   818  /* SCLK must be running before codec unmute */
   819  snd_soc_component_update_bits(component, 
CS42L42_PLL_CTL1,
   820
CS42L42_PLL_START_MASK, 1);
   821  
   822  /* Mark SCLK as present, turn off internal 
oscillator */
   823  snd_soc_component_update_bits(component, 
CS42L42_OSC_SWITCH,
   824
CS42L42_SCLK_PRESENT_MASK,
   825
CS42L42_SCLK_PRESENT_MASK);
   826  }
   827  cs42l42->stream_use |= 1 << stream;
   828  
   829  if (stream == SNDRV_PCM_STREAM_PLAYBACK) {
   830  /* Read the headphone load */
   831  regval = snd_soc_component_read(component, 
CS42L42_LOAD_DET_RCSTAT);
   832  if (((regval & CS42L42_RLA_STAT_MASK) >> 
CS42L42_RLA_STAT_SHIFT) ==
   83

Re: [PATCH v4 0/3] Cleanup and fixups for vmemmap handling

2021-03-03 Thread Andrew Morton

On Mon,  1 Mar 2021 09:32:27 +0100 Oscar Salvador  wrote:

> Hi Andrew,
> 
> Now that 5.12-rc1 is out, and as discussed, here there is a new version on top
> of it.
> Please, consider picking up the series.
>

I grabbed them, but...

> 
>  arch/x86/mm/init_64.c | 189 
> +++---

Perhaps a better route would be via an x86 tree.

Re: [PATCH v2] MIPS: select CPU_MIPS64 for remaining MIPS64 CPUs

2021-03-03 Thread Jason A. Donenfeld

On 3/3/21, Thomas Bogendoerfer  wrote:
> On Mon, Mar 01, 2021 at 05:27:46PM +0100, Jason A. Donenfeld wrote:
>> Hey Thomas,
>>
>> Would you mind sending this for 5.12 in an rc at some point, rather
>> than waiting for 5.13? I'd like to see this backported to 5.10 and 5.4
>> for OpenWRT.
>
> why is this so important for OpenWRT? Just to select CRYPTO_POLY1305_MIPS?

Yes. The performance boost on Octeon is significant for WireGuard users.

And it becomes incredibly frustrating to bifurcate packaging into
"mips64 where the config is right" vs "mips64 where the config is
borked." This saves us a lot of trouble.

Plus the patch is trivial.

Re: [PATCH v2 1/1] mm/madvise: replace ptrace attach requirement for process_madvise

2021-03-03 Thread Suren Baghdasaryan

On Tue, Mar 2, 2021 at 4:17 PM Andrew Morton  wrote:
>
> On Tue, 2 Mar 2021 15:53:39 -0800 Suren Baghdasaryan  
> wrote:
>
> > Hi Andrew,
> > A friendly reminder to please include this patch into mm tree.
> > There seem to be no more questions or objections.
> > The man page you requested is accepted here:
> > https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=a144f458bad476a3358e3a45023789cb7bb9f993
> > stable is CC'ed and this patch should go into 5.10 and later kernels
> > The patch has been:
> > Acked-by: Minchan Kim 
> > Acked-by: David Rientjes 
> > Reviewed-by: Kees Cook 
> >
> > If you want me to resend it, please let me know.
>
> This patch was tough.  I think it would be best to resend please, being
> sure to cc everyone who commented.  To give everyone another chance to
> get their heads around it.  If necessary, please update the changelog
> to address any confusion/questions which have arisen thus far.

Sure, will do. Thanks!

>
> Thanks.

Re: [PATCH v2 1/1] mm/madvise: replace ptrace attach requirement for process_madvise

2021-03-03 Thread Andrew Morton

On Tue, 2 Mar 2021 15:53:39 -0800 Suren Baghdasaryan  wrote:

> Hi Andrew,
> A friendly reminder to please include this patch into mm tree.
> There seem to be no more questions or objections.
> The man page you requested is accepted here:
> https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=a144f458bad476a3358e3a45023789cb7bb9f993
> stable is CC'ed and this patch should go into 5.10 and later kernels
> The patch has been:
> Acked-by: Minchan Kim 
> Acked-by: David Rientjes 
> Reviewed-by: Kees Cook 
> 
> If you want me to resend it, please let me know.

This patch was tough.  I think it would be best to resend please, being
sure to cc everyone who commented.  To give everyone another chance to
get their heads around it.  If necessary, please update the changelog
to address any confusion/questions which have arisen thus far.

Thanks.

Re: KASAN: use-after-free Read in cipso_v4_genopt

2021-03-03 Thread Paul Moore

On Tue, Mar 2, 2021 at 2:15 PM Dmitry Vyukov  wrote:

...

> Not sure if it's the root cause or not, but I am looking at this
> reference drop in cipso_v4_doi_remove:
> https://elixir.bootlin.com/linux/v5.12-rc1/source/net/ipv4/cipso_ipv4.c#L522
> The thing is that it does not remove from the list if reference is not
> 0, right? So what if I send 1000 of netlink remove messages? Will it
> drain refcount to 0?
> I did not read all involved code, but the typical pattern is to drop
> refcount and always remove from the list. Then the last use will
> delete the object.
> Does it make any sense?

Looking at it quickly, the logic above seems sane.  I wrote this code
a *long* time ago, so let me get my head back into it and make sure
that still holds.

-- 
paul moore
www.paul-moore.com
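
The "drop the refcount and always remove from the list" pattern described in the quoted text can be sketched as follows. This is a single-threaded model under assumptions, not the code in net/ipv4/cipso_ipv4.c: `struct doi_def`, `doi_put`, and `doi_remove` are hypothetical names, a `freed` flag stands in for actually releasing memory, and real kernel code would need atomic reference counts and list locking or RCU.

```c
#include <stdbool.h>

/* Hypothetical object that is reachable through a list and refcounted. */
struct doi_def {
    int  refcount;  /* one reference is held by the list itself */
    bool on_list;   /* true while lookups can still find the object */
    bool freed;     /* stand-in for releasing the memory */
};

/* Drop one reference; the last user "frees" the object. */
static void doi_put(struct doi_def *d)
{
    if (--d->refcount == 0)
        d->freed = true;
}

/*
 * Remove request: always unlink first, then drop only the list's
 * reference. A repeated request finds nothing on the list and so
 * cannot drain references that belong to other users.
 */
static int doi_remove(struct doi_def *d)
{
    if (!d->on_list)
        return -1;      /* already removed, nothing to drop */
    d->on_list = false;
    doi_put(d);
    return 0;
}
```

With this shape, sending the remove request many times drops exactly one reference, and the object is released only when the last remaining user calls `doi_put`.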

[PATCH v12 02/17] arm64: hyp-stub: Move invalid vector entries into the vectors

2021-03-03 Thread Pavel Tatashin

From: James Morse 

Most of the hyp-stub's vector entries are invalid. These are each
a unique function that branches to itself. To move these into the
vectors, merge the ventry and invalid_vector macros and give each
one a unique name.

This means we can copy the hyp-stub as it is self contained within
its vectors.

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 56 +++-
 1 file changed, 23 insertions(+), 33 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 572b28646005..ff329c5c074d 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -16,31 +16,38 @@
 #include 
 #include 
 
+.macro invalid_vector  label
+SYM_CODE_START_LOCAL(\label)
+   .align 7
+   b   \label
+SYM_CODE_END(\label)
+.endm
+
.text
.pushsection.hyp.text, "ax"
 
.align 11
 
 SYM_CODE_START(__hyp_stub_vectors)
-   ventry  el2_sync_invalid// Synchronous EL2t
-   ventry  el2_irq_invalid // IRQ EL2t
-   ventry  el2_fiq_invalid // FIQ EL2t
-   ventry  el2_error_invalid   // Error EL2t
+   invalid_vector  hyp_stub_el2t_sync_invalid  // Synchronous EL2t
+   invalid_vector  hyp_stub_el2t_irq_invalid   // IRQ EL2t
+   invalid_vector  hyp_stub_el2t_fiq_invalid   // FIQ EL2t
+   invalid_vector  hyp_stub_el2t_error_invalid // Error EL2t
 
-   ventry  el2_sync_invalid// Synchronous EL2h
-   ventry  el2_irq_invalid // IRQ EL2h
-   ventry  el2_fiq_invalid // FIQ EL2h
-   ventry  el2_error_invalid   // Error EL2h
+   invalid_vector  hyp_stub_el2h_sync_invalid  // Synchronous EL2h
+   invalid_vector  hyp_stub_el2h_irq_invalid   // IRQ EL2h
+   invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
+   invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
ventry  el1_sync// Synchronous 64-bit EL1
-   ventry  el1_irq_invalid // IRQ 64-bit EL1
-   ventry  el1_fiq_invalid // FIQ 64-bit EL1
-   ventry  el1_error_invalid   // Error 64-bit EL1
-
-   ventry  el1_sync_invalid// Synchronous 32-bit EL1
-   ventry  el1_irq_invalid // IRQ 32-bit EL1
-   ventry  el1_fiq_invalid // FIQ 32-bit EL1
-   ventry  el1_error_invalid   // Error 32-bit EL1
+   invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
+   invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
+
+   invalid_vector  hyp_stub_32b_el1_sync_invalid   // Synchronous 32-bit 
EL1
+   invalid_vector  hyp_stub_32b_el1_irq_invalid// IRQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_fiq_invalid// FIQ 32-bit EL1
+   invalid_vector  hyp_stub_32b_el1_error_invalid  // Error 32-bit EL1
.align 11
 SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
@@ -173,23 +180,6 @@ SYM_CODE_END(enter_vhe)
 
.popsection
 
-.macro invalid_vector  label
-SYM_CODE_START_LOCAL(\label)
-   b \label
-SYM_CODE_END(\label)
-.endm
-
-   invalid_vector  el2_sync_invalid
-   invalid_vector  el2_irq_invalid
-   invalid_vector  el2_fiq_invalid
-   invalid_vector  el2_error_invalid
-   invalid_vector  el1_sync_invalid
-   invalid_vector  el1_irq_invalid
-   invalid_vector  el1_fiq_invalid
-   invalid_vector  el1_error_invalid
-
-   .popsection
-
 /*
  * __hyp_set_vectors: Call this after boot to set the initial hypervisor
  * vectors as part of hypervisor installation.  On an SMP system, this should
-- 
2.25.1

[PATCH v12 00/17] arm64: MMU enabled kexec relocation

2021-03-03 Thread Pavel Tatashin

Changelog:
v12:
- A major change compared to previous version. Instead of using
  contiguous VA range a copy of linear map is now used to perform
  copying of segments during relocation as it was agreed in the
  discussion of version 11 of this project.
- In addition to using linear map, I also took several ideas from
  James Morse to better organize the kexec relocation:
1. skip relocation function entirely if that is not needed
2. remove the PoC flushing function since it is not needed
   anymore with MMU enabled.
v11:
- Fixed missing KEXEC_CORE dependency for trans_pgd.c
- Removed useless "if(rc) return rc" statement (thank you Tyler Hicks)
- Another 12 patches were accepted into the maintainer's git tree.
  Re-based patches against:
  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git
  Branch: for-next/kexec
v10:
- Addressed a lot of comments from James Morse and Marc Zyngier
- Added review-by's
- Synchronized with mainline

v9:
- 9 patches from previous series landed in upstream, so now series
  is smaller
- Added two patches from James Morse to address idmap issues for
  machines with high physical addresses.
- Addressed comments from Selin Dag about compiling issues. He also
  tested my series and got similar performance results: ~60 ms instead
  of ~580 ms with an initramfs size of ~120MB.
v8:
- Synced with mainline to keep series up-to-date
v7:
-- Addressed comments from James Morse
- arm64: hibernate: pass the allocated pgdp to ttbr0
  Removed "Fixes" tag, and added Reviewed-by: James Morse
- arm64: hibernate: check pgd table allocation
  Sent out as a standalone patch so it can be sent to stable
  Series applies on mainline + this patch
- arm64: hibernate: add trans_pgd public functions
  Remove second allocation of tmp_pg_dir in swsusp_arch_resume
  Added Reviewed-by: James Morse 
- arm64: kexec: move relocation function setup and clean up
  Fixed typo in commit log
  Changed kern_reloc to phys_addr_t types.
  Added explanation why kern_reloc is needed.
  Split into four patches:
  arm64: kexec: make dtb_mem always enabled
  arm64: kexec: remove unnecessary debug prints
  arm64: kexec: call kexec_image_info only once
  arm64: kexec: move relocation function setup
- arm64: kexec: add expandable argument to relocation function
  Changed types of new arguments from unsigned long to phys_addr_t.
  Changed offset prefix to KEXEC_*
  Split into four patches:
  arm64: kexec: cpu_soft_restart change argument types
  arm64: kexec: arm64_relocate_new_kernel clean-ups
  arm64: kexec: arm64_relocate_new_kernel don't use x0 as temp
  arm64: kexec: add expandable argument to relocation function
- arm64: kexec: configure trans_pgd page table for kexec
  Added invalid entries into EL2 vector table
  Removed KEXEC_EL2_VECTOR_TABLE_SIZE and KEXEC_EL2_VECTOR_TABLE_OFFSET
  Copy relocation functions and table into separate pages
  Changed types in kern_reloc_arg.
  Split into three patches:
  arm64: kexec: offset for relocation function
  arm64: kexec: kexec EL2 vectors
  arm64: kexec: configure trans_pgd page table for kexec
- arm64: kexec: enable MMU during kexec relocation
  Split into two patches:
  arm64: kexec: enable MMU during kexec relocation
  arm64: kexec: remove head from relocation argument
v6:
- Sync with mainline tip
- Added Acked's from Dave Young
v5:
- Addressed comments from Matthias Brugger: added review-by's, improved
  comments, and made cleanups to swsusp_arch_resume() in addition to
  create_safe_exec_page().
- Synced with mainline tip.
v4:
- Addressed comments from James Morse.
- Split "check pgd table allocation" into two patches, and moved to
  the beginning of series  for simpler backport of the fixes.
  Added "Fixes:" tags to commit logs.
- Changed "arm64, hibernate:" to "arm64: hibernate:"
- Added Reviewed-by's
- Moved "add PUD_SECT_RDONLY" earlier in series to be with other
  clean-ups
- Added "Derived from:" to arch/arm64/mm/trans_pgd.c
- Removed "flags" from trans_info
- Changed .trans_alloc_page assumption to return zeroed page.
- Simplify changes to trans_pgd_map_page(), by keeping the old
  code.
- Simplify changes to trans_pgd_create_copy, by keeping the old
  code.
- Removed: "add trans_pgd_create_empty"
- replace init_mm with NULL, and keep using

[PATCH v12 01/17] arm64: hyp-stub: Check the size of the HYP stub's vectors

2021-03-03 Thread Pavel Tatashin

From: James Morse 

Hibernate contains a set of temporary EL2 vectors used to 'park'
EL2 somewhere safe while all the memory is thrown in the air.
Making kexec do its relocations with the MMU on means they have to
be done at EL1, so EL2 has to be parked. This means yet another
set of vectors.

All these things do is HVC_SET_VECTORS and HVC_SOFT_RESTART, both
of which are implemented by the hyp-stub. Let's copy it instead
of re-inventing it.

To do this the hyp-stub's entrails need to be packed neatly inside
its 2K vectors.

Start by moving the final 2K alignment inside the end marker, and
add a build check that we didn't overflow 2K.

Signed-off-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 5eccbd62fec8..572b28646005 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -41,9 +41,13 @@ SYM_CODE_START(__hyp_stub_vectors)
ventry  el1_irq_invalid // IRQ 32-bit EL1
ventry  el1_fiq_invalid // FIQ 32-bit EL1
ventry  el1_error_invalid   // Error 32-bit EL1
+   .align 11
+SYM_INNER_LABEL(__hyp_stub_vectors_end, SYM_L_LOCAL)
 SYM_CODE_END(__hyp_stub_vectors)
 
-   .align 11
+# Check the __hyp_stub_vectors didn't overflow
+.org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
+
 
 SYM_CODE_START_LOCAL(el1_sync)
cmp x0, #HVC_SET_VECTORS
-- 
2.25.1
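
The `.org` build check in the patch above relies on the assembler refusing to move the location counter backwards. A rough C model of that arithmetic is sketched below; `org_target_is_valid` is a hypothetical name and the function only mirrors the expression `. - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K`, it is not how the kernel performs the check.

```c
#include <stddef.h>

#define SZ_2K 2048u

/*
 * Model of ".org . - (end - start) + SZ_2K", evaluated at the point
 * where the location counter equals 'end'. The directive is only
 * accepted if the target is not behind the current location, which
 * holds exactly when (end - start) <= SZ_2K.
 */
static int org_target_is_valid(size_t start, size_t end)
{
    size_t here = end;                            /* current location counter */
    size_t target = here - (end - start) + SZ_2K; /* same as start + SZ_2K */

    return target >= here;
}
```

For example, a table spanning 0x1000 to 0x1800 (exactly 2K) passes, while one spanning 0x1000 to 0x2000 (4K) yields a target behind the current location and would fail the build.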

linux-next: build failure after merge of the tip tree

2021-03-03 Thread Stephen Rothwell

Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

In file included from drivers/usb/usbip/usbip_common.c:18:
drivers/usb/usbip/usbip_common.h: In function 'usbip_kcov_handle_init':
drivers/usb/usbip/usbip_common.h:348:20: error: implicit declaration of 
function 'kcov_common_handle' [-Werror=implicit-function-declaration]
  348 |  ud->kcov_handle = kcov_common_handle();
  |^~
drivers/usb/usbip/usbip_common.h: In function 'usbip_kcov_remote_start':
drivers/usb/usbip/usbip_common.h:353:2: error: implicit declaration of function 
'kcov_remote_start_common' [-Werror=implicit-function-declaration]
  353 |  kcov_remote_start_common(ud->kcov_handle);
  |  ^~~~
drivers/usb/usbip/usbip_common.h: In function 'usbip_kcov_remote_stop':
drivers/usb/usbip/usbip_common.h:358:2: error: implicit declaration of function 
'kcov_remote_stop'; did you mean 'usbip_kcov_remote_stop'? 
[-Werror=implicit-function-declaration]
  358 |  kcov_remote_stop();
  |  ^~~~
  |  usbip_kcov_remote_stop

Caused by commit

  eae7a59d5a1e ("kcov: Remove kcov include from sched.h and move it to its 
users.")

I have used the tip tree from next-20210302 for today.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH 03/15] KVM: x86/mmu: Ensure MMU pages are available when allocating roots

2021-03-03 Thread Ben Gardon

On Tue, Mar 2, 2021 at 10:46 AM Sean Christopherson  wrote:
>
> Hold the mmu_lock for write for the entire duration of allocating and
> initializing an MMU's roots.  This ensures there are MMU pages available
> and thus prevents root allocations from failing.  That in turn fixes a
> bug where KVM would fail to free valid PAE roots if one of the later
> roots failed to allocate.
>
> Note, KVM still leaks the PAE roots if the lm_root allocation fails.
> This will be addressed in a future commit.
>
> Cc: Ben Gardon 
> Signed-off-by: Sean Christopherson 

Reviewed-by: Ben Gardon 

Very tidy cleanup!

> ---
>  arch/x86/kvm/mmu/mmu.c | 41 --
>  arch/x86/kvm/mmu/tdp_mmu.c | 23 +
>  2 files changed, 18 insertions(+), 46 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 2ed3fac1244e..1f129001a30c 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2398,6 +2398,9 @@ static int make_mmu_pages_available(struct kvm_vcpu 
> *vcpu)
>  {
> unsigned long avail = kvm_mmu_available_pages(vcpu->kvm);
>
> +   /* Ensure all four PAE roots can be allocated in a single pass. */
> +   BUILD_BUG_ON(KVM_MIN_FREE_MMU_PAGES < 4);
> +

For a second I thought that this should be 5 since a page is needed to
hold the 4 PAE roots, but that page is allocated at vCPU creation and
reused, so no need to check for it here.

> if (likely(avail >= KVM_MIN_FREE_MMU_PAGES))
> return 0;
>
> @@ -3220,16 +3223,9 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, 
> gfn_t gfn, gva_t gva,
>  {
> struct kvm_mmu_page *sp;
>
> -   write_lock(&vcpu->kvm->mmu_lock);
> -
> -   if (make_mmu_pages_available(vcpu)) {
> -   write_unlock(&vcpu->kvm->mmu_lock);
> -   return INVALID_PAGE;
> -   }
> sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL);
> ++sp->root_count;
>
> -   write_unlock(&vcpu->kvm->mmu_lock);
> return __pa(sp->spt);
>  }
>
> @@ -3241,16 +3237,10 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu 
> *vcpu)
>
> if (is_tdp_mmu_enabled(vcpu->kvm)) {
> root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
> -
> -   if (!VALID_PAGE(root))
> -   return -ENOSPC;
> vcpu->arch.mmu->root_hpa = root;
> } else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
> root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level,
>   true);
> -
> -   if (!VALID_PAGE(root))
> -   return -ENOSPC;

There's so much going on in mmu_alloc_root that removing this check
makes me nervous, but I think it should be safe.
I checked through the function because I was worried it might yield
somewhere in there, which could result in the page cache being emptied
and the allocation failing, but I don't think mmu_alloc_root will
yield.

> vcpu->arch.mmu->root_hpa = root;
> } else if (shadow_root_level == PT32E_ROOT_LEVEL) {
> for (i = 0; i < 4; ++i) {
> @@ -3258,8 +3248,6 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>
> root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
>   i << 30, PT32_ROOT_LEVEL, true);
> -   if (!VALID_PAGE(root))
> -   return -ENOSPC;
> vcpu->arch.mmu->pae_root[i] = root | PT_PRESENT_MASK;
> }
> vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
> @@ -3294,8 +3282,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>
> root = mmu_alloc_root(vcpu, root_gfn, 0,
>   vcpu->arch.mmu->shadow_root_level, 
> false);
> -   if (!VALID_PAGE(root))
> -   return -ENOSPC;
> vcpu->arch.mmu->root_hpa = root;
> goto set_root_pgd;
> }
> @@ -3325,6 +3311,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>
> for (i = 0; i < 4; ++i) {
> MMU_WARN_ON(VALID_PAGE(vcpu->arch.mmu->pae_root[i]));
> +
> if (vcpu->arch.mmu->root_level == PT32E_ROOT_LEVEL) {
> pdptr = vcpu->arch.mmu->get_pdptr(vcpu, i);
> if (!(pdptr & PT_PRESENT_MASK)) {
> @@ -3338,8 +3325,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
>
> root = mmu_alloc_root(vcpu, root_gfn, i << 30,
>   PT32_ROOT_LEVEL, false);
> -   if (!VALID_PAGE(root))
> -   return -ENOSPC;
> vcpu->arch.mmu->pae_root[i] = root | pm_mask;
> }
> vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
> @@ -3373,14 +3358,6 @@ static int mmu_alloc_shadow_roots(struct kv

[PATCH v12 04/17] arm64: kernel: add helper for booted at EL2 and not VHE

2021-03-03 Thread Pavel Tatashin

Replace places that contain logic like this:
is_hyp_mode_available() && !is_kernel_in_hyp_mode()

With a dedicated boolean function, is_hyp_callable(). This will be needed
later in kexec in order to switch back to EL2 sooner.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/virt.h | 5 +
 arch/arm64/kernel/cpu-reset.h | 3 +--
 arch/arm64/kernel/hibernate.c | 9 +++--
 arch/arm64/kernel/sdei.c  | 2 +-
 4 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7379f35ae2c6..4216c8623538 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -128,6 +128,11 @@ static __always_inline bool is_protected_kvm_enabled(void)
return cpus_have_final_cap(ARM64_KVM_PROTECTED_MODE);
 }
 
+static inline bool is_hyp_callable(void)
+{
+   return is_hyp_mode_available() && !is_kernel_in_hyp_mode();
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* ! __ASM__VIRT_H */
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index ed50e9587ad8..1922e7a690f8 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,8 +20,7 @@ static inline void __noreturn cpu_soft_restart(unsigned long 
entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = !is_kernel_in_hyp_mode() &&
-   is_hyp_mode_available();
+   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index b1cef371df2b..c764574a1acb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,9 +48,6 @@
  */
 extern int in_suspend;
 
-/* Do we need to reset el2? */
-#define el2_reset_needed() (is_hyp_mode_available() && 
!is_kernel_in_hyp_mode())
-
 /* temporary el2 vectors in the __hibernate_exit_text section. */
 extern char hibernate_el2_vectors[];
 
@@ -125,7 +122,7 @@ int arch_hibernation_header_save(void *addr, unsigned int 
max_size)
hdr->reenter_kernel = _cpu_resume;
 
/* We can't use __hyp_get_vectors() because kvm may still be loaded */
-   if (el2_reset_needed())
+   if (is_hyp_callable())
hdr->__hyp_stub_vectors = __pa_symbol(__hyp_stub_vectors);
else
hdr->__hyp_stub_vectors = 0;
@@ -387,7 +384,7 @@ int swsusp_arch_suspend(void)
dcache_clean_range(__idmap_text_start, __idmap_text_end);
 
/* Clean kvm setup code to PoC? */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
dcache_clean_range(__hyp_idmap_text_start, 
__hyp_idmap_text_end);
dcache_clean_range(__hyp_text_start, __hyp_text_end);
}
@@ -482,7 +479,7 @@ int swsusp_arch_resume(void)
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (el2_reset_needed()) {
+   if (is_hyp_callable()) {
phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
el2_vectors += hibernate_el2_vectors -
   __hibernate_exit_text_start; /* offset */
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
index 2c7ca449dd51..af0ac2f920cf 100644
--- a/arch/arm64/kernel/sdei.c
+++ b/arch/arm64/kernel/sdei.c
@@ -200,7 +200,7 @@ unsigned long sdei_arch_get_entry_point(int conduit)
 * dropped to EL1 because we don't support VHE, then we can't support
 * SDEI.
 */
-   if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
+   if (is_hyp_callable()) {
pr_err("Not supported on this hardware/boot configuration\n");
goto out_err;
}
-- 
2.25.1
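
The condition that the new helper names can be modelled in userspace C as below. This is a sketch under assumptions: `struct boot_state` and its fields are hypothetical stand-ins for the kernel's boot-mode state, and the real helpers take no arguments.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's boot-mode bookkeeping. */
struct boot_state {
    bool booted_at_el2;   /* models is_hyp_mode_available() */
    bool running_at_el2;  /* models is_kernel_in_hyp_mode(), i.e. VHE */
};

static bool is_hyp_mode_available(const struct boot_state *s)
{
    return s->booted_at_el2;
}

static bool is_kernel_in_hyp_mode(const struct boot_state *s)
{
    return s->running_at_el2;
}

/* True only when EL2 exists but the kernel itself runs at EL1. */
static bool is_hyp_callable(const struct boot_state *s)
{
    return is_hyp_mode_available(s) && !is_kernel_in_hyp_mode(s);
}
```

Of the possible states, only "booted at EL2, running at EL1" makes the helper return true, which is the case where the hyp-stub can be reached with an HVC.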

[PATCH v12 05/17] arm64: trans_pgd: hibernate: Add trans_pgd_copy_el2_vectors

2021-03-03 Thread Pavel Tatashin

Users of trans_pgd may also need a copy of the vector table, because it
may also be overwritten if the linear map can be overwritten.

Move setup of EL2 vectors from hibernate to trans_pgd, so it can be
later shared with kexec as well.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/trans_pgd.h |  3 +++
 arch/arm64/include/asm/virt.h  |  3 +++
 arch/arm64/kernel/hibernate.c  | 28 ++--
 arch/arm64/mm/trans_pgd.c  | 20 
 4 files changed, 36 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/trans_pgd.h 
b/arch/arm64/include/asm/trans_pgd.h
index 5d08e5adf3d5..e0760e52d36d 100644
--- a/arch/arm64/include/asm/trans_pgd.h
+++ b/arch/arm64/include/asm/trans_pgd.h
@@ -36,4 +36,7 @@ int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t 
*trans_pgd,
 int trans_pgd_idmap_page(struct trans_pgd_info *info, phys_addr_t *trans_ttbr0,
 unsigned long *t0sz, void *page);
 
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors);
+
 #endif /* _ASM_TRANS_TABLE_H */
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 4216c8623538..bfbb66018114 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -67,6 +67,9 @@
  */
 extern u32 __boot_cpu_mode[2];
 
+extern char __hyp_stub_vectors[];
+#define ARM64_VECTOR_TABLE_LEN SZ_2K
+
 void __hyp_set_vectors(phys_addr_t phys_vector_base);
 void __hyp_reset_vectors(void);
 
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index c764574a1acb..0b8bad8bb6eb 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -48,12 +48,6 @@
  */
 extern int in_suspend;
 
-/* temporary el2 vectors in the __hibernate_exit_text section. */
-extern char hibernate_el2_vectors[];
-
-/* hyp-stub vectors, used to restore el2 during resume from hibernate. */
-extern char __hyp_stub_vectors[];
-
 /*
  * The logical cpu number we should resume on, initialised to a non-cpu
  * number.
@@ -428,6 +422,7 @@ int swsusp_arch_resume(void)
void *zero_page;
size_t exit_size;
pgd_t *tmp_pg_dir;
+   phys_addr_t el2_vectors;
void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *,
  void *, phys_addr_t, phys_addr_t);
struct trans_pgd_info trans_info = {
@@ -455,6 +450,14 @@ int swsusp_arch_resume(void)
return -ENOMEM;
}
 
+   if (is_hyp_callable()) {
+   rc = trans_pgd_copy_el2_vectors(&trans_info, &el2_vectors);
+   if (rc) {
+   pr_err("Failed to setup el2 vectors\n");
+   return rc;
+   }
+   }
+
exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
/*
 * Copy swsusp_arch_suspend_exit() to a safe page. This will generate
@@ -467,25 +470,14 @@ int swsusp_arch_resume(void)
return rc;
}
 
-   /*
-* The hibernate exit text contains a set of el2 vectors, that will
-* be executed at el2 with the mmu off in order to reload hyp-stub.
-*/
-   __flush_dcache_area(hibernate_exit, exit_size);
-
/*
 * KASLR will cause the el2 vectors to be in a different location in
 * the resumed kernel. Load hibernate's temporary copy into el2.
 *
 * We can skip this step if we booted at EL1, or are running with VHE.
 */
-   if (is_hyp_callable()) {
-   phys_addr_t el2_vectors = (phys_addr_t)hibernate_exit;
-   el2_vectors += hibernate_el2_vectors -
-  __hibernate_exit_text_start; /* offset */
-
+   if (is_hyp_callable())
__hyp_set_vectors(el2_vectors);
-   }
 
hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1,
   resume_hdr.reenter_kernel, restore_pblist,
diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
index 527f0a39c3da..61549451ed3a 100644
--- a/arch/arm64/mm/trans_pgd.c
+++ b/arch/arm64/mm/trans_pgd.c
@@ -322,3 +322,23 @@ int trans_pgd_idmap_page(struct trans_pgd_info *info, 
phys_addr_t *trans_ttbr0,
 
return 0;
 }
+
+/*
+ * Create a copy of the vector table so we can call HVC_SET_VECTORS or
+ * HVC_SOFT_RESTART from contexts where the table may be overwritten.
+ */
+int trans_pgd_copy_el2_vectors(struct trans_pgd_info *info,
+  phys_addr_t *el2_vectors)
+{
+   void *hyp_stub = trans_alloc(info);
+
+   if (!hyp_stub)
+   return -ENOMEM;
+   *el2_vectors = virt_to_phys(hyp_stub);
+   memcpy(hyp_stub, &__hyp_stub_vectors, ARM64_VECTOR_TABLE_LEN);
+   __flush_icache_range((unsigned long)hyp_stub,
+(unsigned long)hyp_stub + ARM64_VECTOR_TABLE_LEN);
+   __flush_dcache_

[PATCH v12 08/17] arm64: kexec: skip relocation code for inplace kexec

2021-03-03 Thread Pavel Tatashin

In the case of kdump, or when segments are already in place, relocation
is not needed; therefore the setup of the relocation function and the
call to it can be skipped.

Signed-off-by: Pavel Tatashin 
Suggested-by: James Morse 
---
 arch/arm64/kernel/machine_kexec.c   | 34 ++---
 arch/arm64/kernel/relocate_kernel.S |  3 ---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 3a034bc25709..b150b65f0b84 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -139,21 +139,23 @@ int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
 
-   /* If in place flush new kernel image, else flush lists and buffers */
-   if (kimage->head & IND_DONE)
+   /* If in place, relocation is not used, only flush next kernel */
+   if (kimage->head & IND_DONE) {
kexec_segment_flush(kimage);
-   else
-   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
+   return 0;
+   }
 
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   arm64_relocate_new_kernel_size);
+   kexec_list_flush(kimage);
+   kexec_image_info(kimage);
 
return 0;
 }
@@ -180,19 +182,25 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* cpu_soft_restart will shutdown the MMU, disable data caches, then
-* transfer control to the kern_reloc which contains a copy of
-* the arm64_relocate_new_kernel routine.  arm64_relocate_new_kernel
-* uses physical addressing to relocate the new image to its final
-* position and transfers control to the image entry point when the
-* relocation is complete.
+* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* caches. However, restart will start new kernel or purgatory directly,
+* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
 * In kexec_file case, the kernel starts directly without purgatory.
 */
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head, kimage->start,
-kimage->arch.dtb_mem);
+   if (kimage->head & IND_DONE) {
+   typeof(__cpu_soft_restart) *restart;
+
+   cpu_install_idmap();
+   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
+   0, 0);
+   } else {
+   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
+kimage->start, kimage->arch.dtb_mem);
+   }
 
BUG(); /* Should never get here. */
 }
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index b78ea5de97a4..8058fabe0a76 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -32,8 +32,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x16, x0 /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
-   /* Check if the new image needs relocation. */
-   tbnzx16, IND_DONE_BIT, .Ldone
raw_dcache_line_size x15, x1/* x15 = dcache line size */
 .Lloop:
and x12, x16, PAGE_MASK /* x12 = addr */
@@ -65,7 +63,6 @@ SYM_CODE_START(arm64_relocate_new_kernel)
 .Lnext:
ldr x16, [x14], #8  /* entry = *ptr++ */
tbz x16, IND_DONE_BIT, .Lloop   /* while (!(entry & DONE)) */
-.Ldone:
/* wait for writes from copy_page to finish */
dsb nsh
ic  iallu
-- 
2.25.1

[PATCH v12 06/17] arm64: hibernate: abstract ttrb0 setup function

2021-03-03 Thread Pavel Tatashin

Currently, only hibernate sets a custom ttbr0 with a safe idmapped function.
Kexec is also going to use this functionality when the relocation code
is idmapped.

Move the setup sequence to a dedicated cpu_install_ttbr0() for custom
ttbr0.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/mmu_context.h | 24 
 arch/arm64/kernel/hibernate.c| 21 +
 2 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 70ce8c1d2b07..c6521c8c06ac 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -132,6 +132,30 @@ static inline void cpu_install_idmap(void)
cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm);
 }
 
+/*
+ * Load our new page tables. A strict BBM approach requires that we ensure that
+ * TLBs are free of any entries that may overlap with the global mappings we are
+ * about to install.
+ *
+ * For a real hibernate/resume/kexec cycle TTBR0 currently points to a zero
+ * page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI runtime
+ * services), while for a userspace-driven test_resume cycle it points to
+ * userspace page tables (and we must point it at a zero page ourselves).
+ *
+ * We change T0SZ as part of installing the idmap. This is undone by
+ * cpu_uninstall_idmap() in __cpu_suspend_exit().
+ */
+static inline void cpu_install_ttbr0(phys_addr_t ttbr0, unsigned long t0sz)
+{
+   cpu_set_reserved_ttbr0();
+   local_flush_tlb_all();
+   __cpu_set_tcr_t0sz(t0sz);
+
+   /* avoid cpu_switch_mm() and its SW-PAN and CNP interactions */
+   write_sysreg(ttbr0, ttbr0_el1);
+   isb();
+}
+
 /*
  * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
  * avoiding the possibility of conflicting TLB entries being allocated.
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 0b8bad8bb6eb..ded5115bcb63 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -206,26 +206,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
if (rc)
return rc;
 
-   /*
-* Load our new page tables. A strict BBM approach requires that we
-* ensure that TLBs are free of any entries that may overlap with the
-* global mappings we are about to install.
-*
-* For a real hibernate/resume cycle TTBR0 currently points to a zero
-* page, but TLBs may contain stale ASID-tagged entries (e.g. for EFI
-* runtime services), while for a userspace-driven test_resume cycle it
-* points to userspace page tables (and we must point it at a zero page
-* ourselves).
-*
-* We change T0SZ as part of installing the idmap. This is undone by
-* cpu_uninstall_idmap() in __cpu_suspend_exit().
-*/
-   cpu_set_reserved_ttbr0();
-   local_flush_tlb_all();
-   __cpu_set_tcr_t0sz(t0sz);
-   write_sysreg(trans_ttbr0, ttbr0_el1);
-   isb();
-
+   cpu_install_ttbr0(trans_ttbr0, t0sz);
*phys_dst_addr = virt_to_phys(page);
 
return 0;
-- 
2.25.1

[bisected] 5.12-rc1 hpsa regression: "scsi: hpsa: Correct dev cmds outstanding for retried cmds" breaks hpsa P600

2021-03-03 Thread Sergei Trofimovich

On Tue, 2 Mar 2021 23:31:32 +0100
John Paul Adrian Glaubitz  wrote:

> Hi Sergei!
> 
> On 3/2/21 11:26 PM, Sergei Trofimovich wrote:
> > Gave v5.12-rc1 a try today and got a similar boot failure around
> > hpsa queue initialization, but my failure is later:
> > https://dev.gentoo.org/~slyfox/configs/guppy-dmesg-5.12-rc1
> > Maybe I get different error because I flipped on most debugging
> > kernel options :)
> > 
> > Looks like 'ERROR: Invalid distance value range' while being
> > very scary are harmless. It's just a new spammy way for kernel
> > to report lack of NUMA config on the machine (no SRAT and SLIT
> > ACPI tables).
> > 
> > At least I get hpsa detected on PCI bus. But I guess it's discovered
> > configuration is very wrong as I get unaligned accesses:
> > [   19.811570] kernel unaligned access to 0xe00105dd8295, ip=0xa00100b874d1
> > 
> > Bisecting now.  
> 
> Sounds good. I guess we should get Jens' fix for the signal regression
> merged as well as your two fixes for strace.

"bisected" (cheated halfway through) and verified that reverting
f749d8b7a9896bc6e5ffe104cc64345037e0b152 makes rx3600 boot again.

CCing authors who might be able to help us here.

commit f749d8b7a9896bc6e5ffe104cc64345037e0b152
Author: Don Brace 
Date:   Mon Feb 15 16:26:57 2021 -0600

scsi: hpsa: Correct dev cmds outstanding for retried cmds

Prevent incrementing device->commands_outstanding for ioaccel command
retries that are driver initiated.  If the command goes through the retry
path, the device->commands_outstanding counter has already accounted for
the number of commands outstanding to the device.  Only commands going
through function hpsa_cmd_resolve_events decrement this counter.

 - ioaccel commands go to either HBA disks or to logical volumes comprised
   of SSDs.

The extra increment is causing device resets to hang.

 - Resets wait for all device outstanding commands to complete before
   returning.

Replace unused field abort_pending with retry_pending. This is a
maintenance driver so these changes have the least impact/risk.

Link: https://lore.kernel.org/r/161342801747.29388.13045495968308188518.stgit@brunhilda
Tested-by: Joe Szczypek 
Reviewed-by: Scott Benesh 
Reviewed-by: Scott Teel 
Reviewed-by: Tomas Henzl 
Signed-off-by: Don Brace 
Signed-off-by: Martin K. Petersen 

Don, do you happen to know why this patch caused some controller init failure
for device
14:01.0 RAID bus controller: Hewlett-Packard Company Smart Array P600
?

Boot failure: https://dev.gentoo.org/~slyfox/configs/guppy-dmesg-5.12-rc1
Boot success: https://dev.gentoo.org/~slyfox/configs/guppy-dmesg-5.12-rc1-good

The difference between the two boots is 
f749d8b7a9896bc6e5ffe104cc64345037e0b152 reverted on top of 5.12-rc1
in -good case.

Looks like hpsa controller fails to initialize in bad case (could be a race?).

-- 

  Sergei

[PATCH v12 12/17] arm64: kexec: relocate in EL1 mode

2021-03-03 Thread Pavel Tatashin

Since we are going to keep the MMU enabled during relocation, we need to
stay in EL1 mode throughout the relocation.

Keep EL1 enabled, and switch to EL2 only just before entering the new world.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/cpu-reset.h   |  3 +--
 arch/arm64/kernel/machine_kexec.c   |  4 ++--
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
index 1922e7a690f8..f6d95512fec6 100644
--- a/arch/arm64/kernel/cpu-reset.h
+++ b/arch/arm64/kernel/cpu-reset.h
@@ -20,11 +20,10 @@ static inline void __noreturn cpu_soft_restart(unsigned long entry,
 {
typeof(__cpu_soft_restart) *restart;
 
-   unsigned long el2_switch = is_hyp_callable();
restart = (void *)__pa_symbol(__cpu_soft_restart);
 
cpu_install_idmap();
-   restart(el2_switch, entry, arg0, arg1, arg2);
+   restart(0, entry, arg0, arg1, arg2);
unreachable();
 }
 
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index fb03b6676fb9..d5940b7889f8 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -231,8 +231,8 @@ void machine_kexec(struct kimage *kimage)
} else {
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
-0, 0);
+   cpu_soft_restart(kimage->arch.kern_reloc,
+virt_to_phys(kimage), 0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 36b4496524c3..df023b82544b 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
@@ -61,12 +62,20 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
+   ldr x1, [x0, #KIMAGE_ARCH_EL2_VECTORS]  /* relocation start */
+   cbz x1, .Lel1
+   ldr x1, [x0, #KIMAGE_START] /* relocation start */
+   ldr x2, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
+   mov x3, xzr
+   mov x4, xzr
+   mov x0, #HVC_SOFT_RESTART
+   hvc #0  /* Jumps from el2 */
+.Lel1:
ldr x4, [x0, #KIMAGE_START] /* relocation start */
ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
-   mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x4
+   br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1

[PATCH v12 09/17] arm64: kexec: Use dcache ops macros instead of open-coding

2021-03-03 Thread Pavel Tatashin

From: James Morse 

kexec does dcache maintenance when it re-writes all memory. Our
dcache_by_line_op macro depends on reading the sanitised DminLine
from memory. Kexec may have overwritten this, so open-codes the
sequence.

dcache_by_line_op is a whole set of macros: it uses dcache_line_size,
which uses read_ctr for the sanitised DminLine. Reading the DminLine
is the first thing dcache_by_line_op does.

Rename dcache_by_line_op to dcache_by_myline_op and take DminLine as
an argument. Kexec can now use the slightly smaller macro.

This makes up-coming changes to the dcache maintenance easier on
the eye.

Code generated by the existing callers is unchanged.
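
For readers less used to the assembler macro, the loop that
dcache_by_myline_op performs can be modelled in C roughly as below. This
is only an illustrative sketch: for_each_cache_line() and line_op_fn are
hypothetical names, and the callback stands in for the "dc <op>"
instruction. The point is that the line size is now a parameter rather
than something the loop reads for itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for one "dc <op>, <addr>" instruction. */
typedef void (*line_op_fn)(uintptr_t line_addr, void *ctx);

/*
 * Model of the macro's loop: the caller supplies the cache line size
 * (a power of two) instead of the loop looking it up.
 * Returns the number of lines visited.
 */
static size_t for_each_cache_line(uintptr_t kaddr, size_t size,
				  size_t linesz, line_op_fn op, void *ctx)
{
	uintptr_t end = kaddr + size;		/* add \size, \kaddr, \size */
	size_t visited = 0;

	kaddr &= ~(uintptr_t)(linesz - 1);	/* sub + bic: align down */
	do {
		if (op)
			op(kaddr, ctx);		/* the maintenance op itself */
		kaddr += linesz;		/* add \kaddr, \kaddr, \linesz */
		visited++;
	} while (kaddr < end);			/* cmp + b.lo */

	return visited;
}
```

Like the assembler version, the model always touches at least one line,
and an unaligned start address is rounded down to its line boundary.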

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 12 
 arch/arm64/kernel/relocate_kernel.S | 13 +++--
 2 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..29061b76aab6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -371,10 +371,9 @@ alternative_else
 alternative_endif
.endm
 
-   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
-   dcache_line_size \tmp1, \tmp2
+   .macro dcache_by_myline_op op, domain, kaddr, size, linesz, tmp2
add \size, \kaddr, \size
-   sub \tmp2, \tmp1, #1
+   sub \tmp2, \linesz, #1
bic \kaddr, \kaddr, \tmp2
 9998:
.ifc\op, cvau
@@ -394,12 +393,17 @@ alternative_endif
.endif
.endif
.endif
-   add \kaddr, \kaddr, \tmp1
+   add \kaddr, \kaddr, \linesz
cmp \kaddr, \size
b.lo9998b
dsb \domain
.endm
 
+   .macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+   dcache_line_size \tmp1, \tmp2
+   dcache_by_myline_op \op, \domain, \kaddr, \size, \tmp1, \tmp2
+   .endm
+
 /*
  * Macro to perform an instruction cache maintenance for the interval
  * [start, end)
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 8058fabe0a76..718037bef560 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -41,16 +41,9 @@ SYM_CODE_START(arm64_relocate_new_kernel)
tbz x16, IND_SOURCE_BIT, .Ltest_indirection
 
/* Invalidate dest page to PoC. */
-   mov x2, x13
-   add x20, x2, #PAGE_SIZE
-   sub x1, x15, #1
-   bic x2, x2, x1
-2: dc  ivac, x2
-   add x2, x2, x15
-   cmp x2, x20
-   b.lo2b
-   dsb sy
-
+   mov x2, x13
+   mov x1, #PAGE_SIZE
+   dcache_by_myline_op ivac, sy, x2, x1, x15, x20
copy_page x13, x12, x1, x2, x3, x4, x5, x6, x7, x8
b   .Lnext
 .Ltest_indirection:
-- 
2.25.1

[PATCH v12 11/17] arm64: kexec: kexec may require EL2 vectors

2021-03-03 Thread Pavel Tatashin

If we have an EL2 mode without VHE, the EL2 vectors are needed in order
to switch to EL2 and jump to the new world with hypervisor privileges.

In preparation for MMU-enabled relocation, configure our EL2 table now.

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/Kconfig|  2 +-
 arch/arm64/include/asm/kexec.h|  1 +
 arch/arm64/kernel/asm-offsets.c   |  1 +
 arch/arm64/kernel/machine_kexec.c | 31 +++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1f212b47a48a..825fe88b7c08 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1141,7 +1141,7 @@ config CRASH_DUMP
 
 config TRANS_TABLE
def_bool y
-   depends on HIBERNATION
+   depends on HIBERNATION || KEXEC_CORE
 
 config XEN_DOM0
def_bool y
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 9befcd87e9a8..305cf0840ed3 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -96,6 +96,7 @@ struct kimage_arch {
void *dtb;
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
+   phys_addr_t el2_vectors;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 0c92e193f866..2e3278df1fc3 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -157,6 +157,7 @@ int main(void)
 #endif
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, arch.el2_vectors));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 2e734e4ae12e..fb03b6676fb9 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu-reset.h"
 
@@ -42,7 +43,9 @@ static void _kexec_image_info(const char *func, int line,
pr_debug("start:   %lx\n", kimage->start);
pr_debug("head:%lx\n", kimage->head);
pr_debug("nr_segments: %lu\n", kimage->nr_segments);
+   pr_debug("dtb_mem: %pa\n", &kimage->arch.dtb_mem);
pr_debug("kern_reloc: %pa\n", &kimage->arch.kern_reloc);
+   pr_debug("el2_vectors: %pa\n", &kimage->arch.el2_vectors);
 
for (i = 0; i < kimage->nr_segments; i++) {
pr_debug("  segment[%lu]: %016lx - %016lx, 0x%lx bytes, %lu pages\n",
@@ -137,9 +140,27 @@ static void kexec_segment_flush(const struct kimage *kimage)
}
 }
 
+/* Allocates pages for kexec page table */
+static void *kexec_page_alloc(void *arg)
+{
+   struct kimage *kimage = (struct kimage *)arg;
+   struct page *page = kimage_alloc_control_pages(kimage, 0);
+
+   if (!page)
+   return NULL;
+
+   memset(page_address(page), 0, PAGE_SIZE);
+
+   return page_address(page);
+}
+
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   struct trans_pgd_info info = {
+   .trans_alloc_page   = kexec_page_alloc,
+   .trans_alloc_arg= kimage,
+   };
 
/* If in place, relocation is not used, only flush next kernel */
if (kimage->head & IND_DONE) {
@@ -148,6 +169,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return 0;
}
 
+   kimage->arch.el2_vectors = 0;
+   if (is_hyp_callable()) {
+   int rc = trans_pgd_copy_el2_vectors(&info,
+   &kimage->arch.el2_vectors);
+   if (rc)
+   return rc;
+   }
+
memcpy(reloc_code, arm64_relocate_new_kernel,
   arm64_relocate_new_kernel_size);
kimage->arch.kern_reloc = __pa(reloc_code);
@@ -200,6 +229,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   if (is_hyp_callable())
+   __hyp_set_vectors(kimage->arch.el2_vectors);
cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
 0, 0);
}
-- 
2.25.1

[PATCH v12 15/17] arm64: kexec: keep MMU enabled during kexec relocation

2021-03-03 Thread Pavel Tatashin

Now that we have linear map page tables configured, keep the MMU enabled
to allow faster relocation of segments to their final destination.

Performance data: for a moderate-size kernel + initramfs (25M), the
relocation was taking 0.382s; with the MMU enabled it now takes only
0.019s, a 20x improvement.

The time is proportional to the size of the relocation, so with a larger
initramfs (100M) it could take over a second with the MMU disabled.
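
The figures above can be checked with a few lines of arithmetic. The
sketch below is illustrative only (speedup() and scaled_time() are
hypothetical helpers, not part of this series) and assumes the linear
scaling described in the paragraph above.

```c
/* Ratio between the old and the new timing, both in seconds. */
static double speedup(double before_s, double after_s)
{
	return before_s / after_s;
}

/*
 * Estimated relocation time for size_mb, scaled linearly from one
 * measured sample (sample_mb relocated in sample_s seconds).
 */
static double scaled_time(double sample_mb, double sample_s, double size_mb)
{
	return sample_s * (size_mb / sample_mb);
}
```

With the numbers quoted here, speedup(0.382, 0.019) is roughly 20, and
scaled_time(25, 0.382, 100) is roughly 1.5s for a 100M relocation with
the MMU off, against well under 0.1s with it on.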

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h  |  3 +++
 arch/arm64/kernel/asm-offsets.c |  1 +
 arch/arm64/kernel/machine_kexec.c   | 16 ++
 arch/arm64/kernel/relocate_kernel.S | 33 +++--
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 59ac166daf53..5fc87b51f8a9 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,8 +97,11 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr0;
phys_addr_t ttbr1;
phys_addr_t zero_page;
+   unsigned long phys_offset;
+   unsigned long t0sz;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 609362b5aa76..ec7bb80aedc8 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -159,6 +159,7 @@ int main(void)
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, arch.el2_vectors));
   DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, arch.zero_page));
+  DEFINE(KIMAGE_ARCH_PHYS_OFFSET,  offsetof(struct kimage, arch.phys_offset));
   DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index c875ef522e53..d5c8aefc66f3 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -190,6 +190,11 @@ int machine_kexec_post_load(struct kimage *kimage)
reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
+   rc = trans_pgd_idmap_page(&info, &kimage->arch.ttbr0,
+ &kimage->arch.t0sz, reloc_code);
+   if (rc)
+   return rc;
+   kimage->arch.phys_offset = virt_to_phys(kimage) - (long)kimage;
 
/* Flush the reloc_code in preparation for its execution. */
__flush_dcache_area(reloc_code, reloc_size);
@@ -223,9 +228,9 @@ void machine_kexec(struct kimage *kimage)
local_daif_mask();
 
/*
-* Both restart and cpu_soft_restart will shutdown the MMU, disable data
+* Both restart and kernel_reloc will shutdown the MMU, disable data
 * caches. However, restart will start new kernel or purgatory directly,
-* cpu_soft_restart will transfer control to arm64_relocate_new_kernel
+* kernel_reloc contains the body of arm64_relocate_new_kernel
 * In kexec case, kimage->start points to purgatory assuming that
 * kernel entry and dtb address are embedded in purgatory by
 * userspace (kexec-tools).
@@ -239,10 +244,13 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
+   void (*kernel_reloc)(struct kimage *kimage);
+
if (is_hyp_callable())
__hyp_set_vectors(kimage->arch.el2_vectors);
-   cpu_soft_restart(kimage->arch.kern_reloc,
-virt_to_phys(kimage), 0, 0);
+   cpu_install_ttbr0(kimage->arch.ttbr0, kimage->arch.t0sz);
+   kernel_reloc = (void *)kimage->arch.kern_reloc;
+   kernel_reloc(kimage);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index e83b6380907d..8ac4b2d7f5e8 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -4,6 +4,8 @@
  *
  * Copyright (C) Linaro.
  * Copyright (C) Huawei Futurewei Technologies.
+ * Copyright (C) 2020, Microsoft Corporation.
+ * Pavel Tatashin 
  */
 
 #include 
@@ -15,6 +17,15 @@
 #include 
 #include 
 
+.macro turn_off_mmu tmp1, tmp2
+   mrs \tmp1, sctlr_el1
+   mov_q   \tmp2, SCTLR_ELx_FLAGS
+   bic \tmp1, \tmp1, \tmp2
+   pre_disable_mmu_workaround
+   msr sctlr_el1, \tmp1
+   isb
+.endm
+
 .pushsection

[PATCH v12 10/17] arm64: kexec: pass kimage as the only argument to relocation function

2021-03-03 Thread Pavel Tatashin

Currently, kexec relocation function (arm64_relocate_new_kernel) accepts
the following arguments:

head:   start of array that contains relocation information.
entry:  entry point for new kernel or purgatory.
dtb_mem:first and only argument to entry.

The number of arguments cannot be easily expanded, because this
function is also called from HVC_SOFT_RESTART, which preserves only
three arguments. Also, arm64_relocate_new_kernel is written in
assembly and called without a stack, so there is no place to spill extra
arguments in order to free registers.

Soon, we will need to pass more arguments: once we enable MMU we
will need to pass information about page tables.

Pass kimage to arm64_relocate_new_kernel, and teach it to get the
required fields from kimage.
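
The idea is the usual "pass one pointer, load fields by offset" pattern.
A rough user-space C sketch follows; struct demo_kimage and the DEMO_*
constants are hypothetical stand-ins (the real struct kimage layout is
different), and load_field() plays the role of
"ldr xN, [x0, #KIMAGE_...]".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real struct kimage has many more fields. */
struct demo_arch {
	uint64_t dtb_mem;
};

struct demo_kimage {
	uint64_t head;
	uint64_t start;
	struct demo_arch arch;
};

/* Offsets exported for assembly, in the style of asm-offsets.c. */
#define DEMO_KIMAGE_HEAD		offsetof(struct demo_kimage, head)
#define DEMO_KIMAGE_START		offsetof(struct demo_kimage, start)
#define DEMO_KIMAGE_ARCH_DTB_MEM	offsetof(struct demo_kimage, arch.dtb_mem)

/* Equivalent of "ldr xN, [base, #offset]": load a 64-bit field by offset. */
static uint64_t load_field(const void *base, size_t offset)
{
	uint64_t value;

	memcpy(&value, (const char *)base + offset, sizeof(value));
	return value;
}
```

Adding another argument later then only means adding a field and one
more offset constant, without changing the calling convention.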

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/asm-offsets.c |  7 +++
 arch/arm64/kernel/machine_kexec.c   |  6 --
 arch/arm64/kernel/relocate_kernel.S | 10 --
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..0c92e193f866 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -9,6 +9,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -153,6 +154,12 @@ int main(void)
   DEFINE(PTRAUTH_USER_KEY_APGA,offsetof(struct ptrauth_keys_user, apga));
   DEFINE(PTRAUTH_KERNEL_KEY_APIA,  offsetof(struct ptrauth_keys_kernel, apia));
   BLANK();
+#endif
+#ifdef CONFIG_KEXEC_CORE
+  DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
+  DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
+  DEFINE(KIMAGE_START, offsetof(struct kimage, start));
+  BLANK();
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index b150b65f0b84..2e734e4ae12e 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -83,6 +83,8 @@ static void kexec_list_flush(struct kimage *kimage)
 {
kimage_entry_t *entry;
 
+   __flush_dcache_area(kimage, sizeof(*kimage));
+
for (entry = &kimage->head; ; entry++) {
unsigned int flag;
void *addr;
@@ -198,8 +200,8 @@ void machine_kexec(struct kimage *kimage)
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-   cpu_soft_restart(kimage->arch.kern_reloc, kimage->head,
-kimage->start, kimage->arch.dtb_mem);
+   cpu_soft_restart(kimage->arch.kern_reloc, virt_to_phys(kimage),
+0, 0);
}
 
BUG(); /* Should never get here. */
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 718037bef560..36b4496524c3 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -27,9 +27,7 @@
  */
 SYM_CODE_START(arm64_relocate_new_kernel)
/* Setup the list loop variables. */
-   mov x18, x2 /* x18 = dtb address */
-   mov x17, x1 /* x17 = kimage_start */
-   mov x16, x0 /* x16 = kimage_head */
+   ldr x16, [x0, #KIMAGE_HEAD] /* x16 = kimage_head */
mov x14, xzr/* x14 = entry ptr */
mov x13, xzr/* x13 = copy dest */
raw_dcache_line_size x15, x1/* x15 = dcache line size */
@@ -63,12 +61,12 @@ SYM_CODE_START(arm64_relocate_new_kernel)
isb
 
/* Start new image. */
-   mov x0, x18
+   ldr x4, [x0, #KIMAGE_START] /* relocation start */
+   ldr x0, [x0, #KIMAGE_ARCH_DTB_MEM]  /* dtb address */
mov x1, xzr
mov x2, xzr
mov x3, xzr
-   br  x17
-
+   br  x4
 SYM_CODE_END(arm64_relocate_new_kernel)
 
 .align 3   /* To keep the 64-bit values below naturally aligned. */
-- 
2.25.1

[PATCH v12 03/17] arm64: hyp-stub: Move el1_sync into the vectors

2021-03-03 Thread Pavel Tatashin

From: James Morse 

The hyp-stub's el1_sync code doesn't do very much, so it can easily fit
in the vectors.

With this, all of the hyp-stub's behaviour is contained in its vectors.
This lets kexec and hibernate copy the hyp-stub when they need its
behaviour, instead of re-implementing it.
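
As a reading aid, the dispatch done by the stub's el1_sync handler can
be modelled in C as below. This is a sketch only: struct demo_regs and
demo_el1_sync() are hypothetical, and the DEMO_HVC_* values are assumed
to mirror the definitions in asm/virt.h rather than taken from this
patch.

```c
#include <stdint.h>

/* Assumed to mirror asm/virt.h; treat the numeric values as illustrative. */
enum {
	DEMO_HVC_SET_VECTORS	= 0,
	DEMO_HVC_SOFT_RESTART	= 1,
	DEMO_HVC_RESET_VECTORS	= 2,
};
#define DEMO_HVC_STUB_ERR	0xbadca11UL

/* Hypothetical register file: x0..x4 plus the state the stub touches. */
struct demo_regs {
	uint64_t x[5];
	uint64_t vbar_el2;
	uint64_t branch_target;
};

/* Returns 1 when the stub branches away ("br x4"), 0 when it does "eret". */
static int demo_el1_sync(struct demo_regs *r)
{
	switch (r->x[0]) {
	case DEMO_HVC_SET_VECTORS:
		r->vbar_el2 = r->x[1];		/* msr vbar_el2, x1 */
		r->x[0] = 0;
		return 0;
	case DEMO_HVC_SOFT_RESTART:
		/* Same order as the assembly: entry ends up in x4. */
		r->x[0] = r->x[2];
		r->x[2] = r->x[4];
		r->x[4] = r->x[1];
		r->x[1] = r->x[3];
		r->branch_target = r->x[4];	/* br x4, no return */
		return 1;
	case DEMO_HVC_RESET_VECTORS:
		r->x[0] = 0;			/* nothing to reset */
		return 0;
	default:
		r->x[0] = DEMO_HVC_STUB_ERR;	/* unknown call */
		return 0;
	}
}
```

For a soft restart called with x1 = entry and x2..x4 = arguments, the
model leaves the three arguments in x0..x2 and branches to the entry
point, which is the behaviour kexec and hibernate want to copy.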

Signed-off-by: James Morse 

[Fixed merging issues]

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/hyp-stub.S | 59 ++--
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index ff329c5c074d..d1a73d0f74e0 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -21,6 +21,34 @@ SYM_CODE_START_LOCAL(\label)
.align 7
b   \label
 SYM_CODE_END(\label)
+.endm
+
+.macro hyp_stub_el1_sync
+SYM_CODE_START_LOCAL(hyp_stub_el1_sync)
+   .align 7
+   cmp x0, #HVC_SET_VECTORS
+   b.ne2f
+   msr vbar_el2, x1
+   b   9f
+
+2: cmp x0, #HVC_SOFT_RESTART
+   b.ne3f
+   mov x0, x2
+   mov x2, x4
+   mov x4, x1
+   mov x1, x3
+   br  x4  // no return
+
+3: cmp x0, #HVC_RESET_VECTORS
+   beq 9f  // Nothing to reset!
+
+   /* Someone called kvm_call_hyp() against the hyp-stub... */
+   mov_q   x0, HVC_STUB_ERR
+   eret
+
+9: mov x0, xzr
+   eret
+SYM_CODE_END(hyp_stub_el1_sync)
 .endm
 
.text
@@ -39,7 +67,7 @@ SYM_CODE_START(__hyp_stub_vectors)
invalid_vector  hyp_stub_el2h_fiq_invalid   // FIQ EL2h
invalid_vector  hyp_stub_el2h_error_invalid // Error EL2h
 
-   ventry  el1_sync// Synchronous 64-bit EL1
+   hyp_stub_el1_sync   // Synchronous 64-bit EL1
invalid_vector  hyp_stub_el1_irq_invalid// IRQ 64-bit EL1
invalid_vector  hyp_stub_el1_fiq_invalid// FIQ 64-bit EL1
invalid_vector  hyp_stub_el1_error_invalid  // Error 64-bit EL1
@@ -55,35 +83,6 @@ SYM_CODE_END(__hyp_stub_vectors)
 # Check the __hyp_stub_vectors didn't overflow
 .org . - (__hyp_stub_vectors_end - __hyp_stub_vectors) + SZ_2K
 
-
-SYM_CODE_START_LOCAL(el1_sync)
-   cmp x0, #HVC_SET_VECTORS
-   b.ne1f
-   msr vbar_el2, x1
-   b   9f
-
-1: cmp x0, #HVC_VHE_RESTART
-   b.eqmutate_to_vhe
-
-2: cmp x0, #HVC_SOFT_RESTART
-   b.ne3f
-   mov x0, x2
-   mov x2, x4
-   mov x4, x1
-   mov x1, x3
-   br  x4  // no return
-
-3: cmp x0, #HVC_RESET_VECTORS
-   beq 9f  // Nothing to reset!
-
-   /* Someone called kvm_call_hyp() against the hyp-stub... */
-   mov_q   x0, HVC_STUB_ERR
-   eret
-
-9: mov x0, xzr
-   eret
-SYM_CODE_END(el1_sync)
-
 // nVHE? No way! Give me the real thing!
 SYM_CODE_START_LOCAL(mutate_to_vhe)
// Sanity check: MMU *must* be off
-- 
2.25.1

[PATCH v12 07/17] arm64: kexec: flush image and lists during kexec load time

2021-03-03 Thread Pavel Tatashin

Currently, during kexec load we copy the relocation function and
flush it. However, we can also flush the kexec relocation buffers, and
if the new kernel image is already in place (i.e. crash kernel), we can
also flush the new kernel image itself.

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 49 +++
 1 file changed, 23 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_kexec.c
index 90a335c74442..3a034bc25709 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -59,23 +59,6 @@ void machine_kexec_cleanup(struct kimage *kimage)
/* Empty routine needed to avoid build errors. */
 }
 
-int machine_kexec_post_load(struct kimage *kimage)
-{
-   void *reloc_code = page_to_virt(kimage->control_code_page);
-
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
-   kimage->arch.kern_reloc = __pa(reloc_code);
-   kexec_image_info(kimage);
-
-   /* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
-   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
-
-   return 0;
-}
-
 /**
  * machine_kexec_prepare - Prepare for a kexec reboot.
  *
@@ -152,6 +135,29 @@ static void kexec_segment_flush(const struct kimage *kimage)
}
 }
 
+int machine_kexec_post_load(struct kimage *kimage)
+{
+   void *reloc_code = page_to_virt(kimage->control_code_page);
+
+   /* If in place flush new kernel image, else flush lists and buffers */
+   if (kimage->head & IND_DONE)
+   kexec_segment_flush(kimage);
+   else
+   kexec_list_flush(kimage);
+
+   memcpy(reloc_code, arm64_relocate_new_kernel,
+  arm64_relocate_new_kernel_size);
+   kimage->arch.kern_reloc = __pa(reloc_code);
+   kexec_image_info(kimage);
+
+   /* Flush the reloc_code in preparation for its execution. */
+   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
+  arm64_relocate_new_kernel_size);
+
+   return 0;
+}
+
 /**
  * machine_kexec - Do the kexec reboot.
  *
@@ -169,13 +175,6 @@ void machine_kexec(struct kimage *kimage)
WARN(in_kexec_crash && (stuck_cpus || smp_crash_stop_failed()),
"Some CPUs may be stale, kdump will be unreliable.\n");
 
-   /* Flush the kimage list and its buffers. */
-   kexec_list_flush(kimage);
-
-   /* Flush the new image if already in place. */
-   if ((kimage != kexec_crash_image) && (kimage->head & IND_DONE))
-   kexec_segment_flush(kimage);
-
pr_info("Bye!\n");
 
local_daif_mask();
@@ -250,8 +249,6 @@ void arch_kexec_protect_crashkres(void)
 {
int i;
 
-   kexec_segment_flush(kexec_crash_image);
-
for (i = 0; i < kexec_crash_image->nr_segments; i++)
set_memory_valid(
__phys_to_virt(kexec_crash_image->segment[i].mem),
-- 
2.25.1

[PATCH v12 17/17] arm64: kexec: Remove cpu-reset.h

2021-03-03 Thread Pavel Tatashin

This header contains only cpu_soft_restart() which is never used directly
anymore. So, remove this header, and rename the __cpu_soft_restart()
helper to cpu_soft_restart().

Suggested-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/kexec.h|  6 ++
 arch/arm64/kernel/cpu-reset.S |  7 +++
 arch/arm64/kernel/cpu-reset.h | 30 --
 arch/arm64/kernel/machine_kexec.c |  6 ++
 4 files changed, 11 insertions(+), 38 deletions(-)
 delete mode 100644 arch/arm64/kernel/cpu-reset.h

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 5fc87b51f8a9..ee71ae3b93ed 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -90,6 +90,12 @@ static inline void crash_prepare_suspend(void) {}
 static inline void crash_post_resume(void) {}
 #endif
 
+#if defined(CONFIG_KEXEC_CORE)
+void cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
+ unsigned long arg0, unsigned long arg1,
+ unsigned long arg2);
+#endif
+
 #define ARCH_HAS_KIMAGE_ARCH
 
 struct kimage_arch {
diff --git a/arch/arm64/kernel/cpu-reset.S b/arch/arm64/kernel/cpu-reset.S
index 37721eb6f9a1..5d47d6c92634 100644
--- a/arch/arm64/kernel/cpu-reset.S
+++ b/arch/arm64/kernel/cpu-reset.S
@@ -16,8 +16,7 @@
 .pushsection.idmap.text, "awx"
 
 /*
- * __cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2) - Helper for
- * cpu_soft_restart.
+ * cpu_soft_restart(el2_switch, entry, arg0, arg1, arg2)
  *
  * @el2_switch: Flag to indicate a switch to EL2 is needed.
  * @entry: Location to jump to for soft reset.
@@ -29,7 +28,7 @@
  * branch to what would be the reset vector. It must be executed with the
  * flat identity mapping.
  */
-SYM_CODE_START(__cpu_soft_restart)
+SYM_CODE_START(cpu_soft_restart)
/* Clear sctlr_el1 flags. */
mrs x12, sctlr_el1
mov_q   x13, SCTLR_ELx_FLAGS
@@ -51,6 +50,6 @@ SYM_CODE_START(__cpu_soft_restart)
mov x1, x3  // arg1
mov x2, x4  // arg2
br  x8
-SYM_CODE_END(__cpu_soft_restart)
+SYM_CODE_END(cpu_soft_restart)
 
 .popsection
diff --git a/arch/arm64/kernel/cpu-reset.h b/arch/arm64/kernel/cpu-reset.h
deleted file mode 100644
index f6d95512fec6..
--- a/arch/arm64/kernel/cpu-reset.h
+++ /dev/null
@@ -1,30 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * CPU reset routines
- *
- * Copyright (C) 2015 Huawei Futurewei Technologies.
- */
-
-#ifndef _ARM64_CPU_RESET_H
-#define _ARM64_CPU_RESET_H
-
-#include 
-
-void __cpu_soft_restart(unsigned long el2_switch, unsigned long entry,
-   unsigned long arg0, unsigned long arg1, unsigned long arg2);
-
-static inline void __noreturn cpu_soft_restart(unsigned long entry,
-  unsigned long arg0,
-  unsigned long arg1,
-  unsigned long arg2)
-{
-   typeof(__cpu_soft_restart) *restart;
-
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
-
-   cpu_install_idmap();
-   restart(0, entry, arg0, arg1, arg2);
-   unreachable();
-}
-
-#endif
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index a1c9bee0cddd..ef7ba93f2bd6 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -23,8 +23,6 @@
 #include 
 #include 
 
-#include "cpu-reset.h"
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -197,10 +195,10 @@ void machine_kexec(struct kimage *kimage)
 * In kexec_file case, the kernel starts directly without purgatory.
 */
if (kimage->head & IND_DONE) {
-   typeof(__cpu_soft_restart) *restart;
+   typeof(cpu_soft_restart) *restart;
 
cpu_install_idmap();
-   restart = (void *)__pa_symbol(__cpu_soft_restart);
+   restart = (void *)__pa_symbol(cpu_soft_restart);
restart(is_hyp_callable(), kimage->start, kimage->arch.dtb_mem,
0, 0);
} else {
-- 
2.25.1

[PATCH v12 16/17] arm64: kexec: remove the pre-kexec PoC maintenance

2021-03-03 Thread Pavel Tatashin

Now that kexec does its relocations with the MMU enabled, we no longer
need to clean the relocation data to the PoC.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/kernel/machine_kexec.c | 40 ---
 1 file changed, 40 deletions(-)

diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5c8aefc66f3..a1c9bee0cddd 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -76,45 +76,6 @@ int machine_kexec_prepare(struct kimage *kimage)
return 0;
 }
 
-/**
- * kexec_list_flush - Helper to flush the kimage list and source pages to PoC.
- */
-static void kexec_list_flush(struct kimage *kimage)
-{
-   kimage_entry_t *entry;
-
-   __flush_dcache_area(kimage, sizeof(*kimage));
-
-   for (entry = &kimage->head; ; entry++) {
-   unsigned int flag;
-   void *addr;
-
-   /* flush the list entries. */
-   __flush_dcache_area(entry, sizeof(kimage_entry_t));
-
-   flag = *entry & IND_FLAGS;
-   if (flag == IND_DONE)
-   break;
-
-   addr = phys_to_virt(*entry & PAGE_MASK);
-
-   switch (flag) {
-   case IND_INDIRECTION:
-   /* Set entry point just before the new list page. */
-   entry = (kimage_entry_t *)addr - 1;
-   break;
-   case IND_SOURCE:
-   /* flush the source pages. */
-   __flush_dcache_area(addr, PAGE_SIZE);
-   break;
-   case IND_DESTINATION:
-   break;
-   default:
-   BUG();
-   }
-   }
-}
-
 /**
  * kexec_segment_flush - Helper to flush the kimage segments to PoC.
  */
@@ -200,7 +161,6 @@ int machine_kexec_post_load(struct kimage *kimage)
__flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
   reloc_size);
-   kexec_list_flush(kimage);
kexec_image_info(kimage);
 
return 0;
-- 
2.25.1
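
For readers following the removal of kexec_list_flush() in the patch above, a minimal userspace sketch of the entry-list walk it performed may help. This is a simplified model, not the kernel code: count_source_pages() is a hypothetical name, the IND_* values are assumed to match include/linux/kexec.h, and indirection pages are rejected rather than followed.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values assumed to match include/linux/kexec.h. */
#define IND_DESTINATION 0x1UL
#define IND_INDIRECTION 0x2UL
#define IND_DONE        0x4UL
#define IND_SOURCE      0x8UL
#define IND_FLAGS (IND_DESTINATION | IND_INDIRECTION | IND_DONE | IND_SOURCE)

typedef unsigned long kimage_entry_t;

/*
 * Hypothetical model: walk a flat entry list and count the source pages
 * that a flush would have to touch. Returns -1 if the list contains an
 * entry this model does not handle or has no IND_DONE terminator.
 */
static long count_source_pages(const kimage_entry_t *entry, size_t max_entries)
{
	long sources = 0;

	assert(entry != NULL);

	for (size_t i = 0; i < max_entries; i++) {
		unsigned long flag = entry[i] & IND_FLAGS;

		if (flag == IND_DONE)
			return sources;

		switch (flag) {
		case IND_SOURCE:
			/* The kernel helper flushed one page here. */
			sources++;
			break;
		case IND_DESTINATION:
			/* Destination entries carry no data to flush. */
			break;
		default:
			/* Indirection or malformed entry: not modelled. */
			return -1;
		}
	}

	return -1;
}
```

The real helper also flushed each list entry itself and followed IND_INDIRECTION entries to the next list page; both are left out here to keep the sketch self-contained.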

[PATCH v12 13/17] arm64: kexec: use ld script for relocation function

2021-03-03 Thread Pavel Tatashin

Currently, relocation code declares start and end variables
which are used to compute its size.

The better way to do this is to use ld script instead, and put relocation
function in its own section.
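
As a rough userspace illustration of the approach described above, the size of the code to copy can be derived from a pair of start/end symbols instead of a separately exported size variable. In this sketch reloc_start and reloc_end are hypothetical stand-ins for the linker-provided symbols, and reloc_blob holds arbitrary bytes, not real instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the bytes the linker would place in the dedicated section. */
static const unsigned char reloc_blob[] = {
	0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
};

/* Hypothetical stand-ins for the start/end symbols from the ld script. */
static const unsigned char *const reloc_start = reloc_blob;
static const unsigned char *const reloc_end = reloc_blob + sizeof(reloc_blob);

/*
 * Copy the relocation code into dst. The size is computed from the two
 * boundary symbols. Returns the number of bytes copied, or -1 if dst is
 * too small.
 */
static long copy_reloc(void *dst, size_t dst_size)
{
	ptrdiff_t reloc_size = reloc_end - reloc_start;

	assert(dst != NULL);
	assert(reloc_size >= 0);

	if ((size_t)reloc_size > dst_size)
		return -1;

	memcpy(dst, reloc_start, (size_t)reloc_size);
	return (long)reloc_size;
}
```

In the kernel the two symbols come from vmlinux.lds.S rather than from a C array, so no size constant has to be maintained by hand in the assembly file.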

Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/sections.h   |  1 +
 arch/arm64/kernel/machine_kexec.c   | 14 ++
 arch/arm64/kernel/relocate_kernel.S | 15 ++-
 arch/arm64/kernel/vmlinux.lds.S | 19 +++
 4 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/include/asm/sections.h 
b/arch/arm64/include/asm/sections.h
index 2f36b16a5b5d..31e459af89f6 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -20,5 +20,6 @@ extern char __exittext_begin[], __exittext_end[];
 extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
 extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+extern char __relocate_new_kernel_start[], __relocate_new_kernel_end[];
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index d5940b7889f8..f1451d807708 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -20,14 +20,11 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "cpu-reset.h"
 
-/* Global variables for the arm64_relocate_new_kernel routine. */
-extern const unsigned char arm64_relocate_new_kernel[];
-extern const unsigned long arm64_relocate_new_kernel_size;
-
 /**
  * kexec_image_info - For debugging output.
  */
@@ -157,6 +154,7 @@ static void *kexec_page_alloc(void *arg)
 int machine_kexec_post_load(struct kimage *kimage)
 {
void *reloc_code = page_to_virt(kimage->control_code_page);
+   long reloc_size;
struct trans_pgd_info info = {
.trans_alloc_page   = kexec_page_alloc,
.trans_alloc_arg= kimage,
@@ -177,14 +175,14 @@ int machine_kexec_post_load(struct kimage *kimage)
return rc;
}
 
-   memcpy(reloc_code, arm64_relocate_new_kernel,
-  arm64_relocate_new_kernel_size);
+   reloc_size = __relocate_new_kernel_end - __relocate_new_kernel_start;
+   memcpy(reloc_code, __relocate_new_kernel_start, reloc_size);
kimage->arch.kern_reloc = __pa(reloc_code);
 
/* Flush the reloc_code in preparation for its execution. */
-   __flush_dcache_area(reloc_code, arm64_relocate_new_kernel_size);
+   __flush_dcache_area(reloc_code, reloc_size);
flush_icache_range((uintptr_t)reloc_code, (uintptr_t)reloc_code +
-  arm64_relocate_new_kernel_size);
+  reloc_size);
kexec_list_flush(kimage);
kexec_image_info(kimage);
 
diff --git a/arch/arm64/kernel/relocate_kernel.S 
b/arch/arm64/kernel/relocate_kernel.S
index df023b82544b..7a600ba33ae1 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -15,6 +15,7 @@
 #include 
 #include 
 
+.pushsection".kexec_relocate.text", "ax"
 /*
  * arm64_relocate_new_kernel - Put a 2nd stage image in place and boot it.
  *
@@ -77,16 +78,4 @@ SYM_CODE_START(arm64_relocate_new_kernel)
mov x3, xzr
br  x4  /* Jumps from el1 */
 SYM_CODE_END(arm64_relocate_new_kernel)
-
-.align 3   /* To keep the 64-bit values below naturally aligned. */
-
-.Lcopy_end:
-.org   KEXEC_CONTROL_PAGE_SIZE
-
-/*
- * arm64_relocate_new_kernel_size - Number of bytes to copy to the
- * control_code_page.
- */
-.globl arm64_relocate_new_kernel_size
-arm64_relocate_new_kernel_size:
-   .quad   .Lcopy_end - arm64_relocate_new_kernel
+.popsection
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..0d9d5e6af66f 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -92,6 +93,16 @@ jiffies = jiffies_64;
 #define HIBERNATE_TEXT
 #endif
 
+#ifdef CONFIG_KEXEC_CORE
+#define KEXEC_TEXT \
+   . = ALIGN(SZ_4K);   \
+   __relocate_new_kernel_start = .;\
+   *(.kexec_relocate.text) \
+   __relocate_new_kernel_end = .;
+#else
+#define KEXEC_TEXT
+#endif
+
 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
 #define TRAMP_TEXT \
. = ALIGN(PAGE_SIZE);   \
@@ -152,6 +163,7 @@ SECTIONS
HYPERVISOR_TEXT
IDMAP_TEXT
HIBERNATE_TEXT
+   KEXEC_TEXT
TRAMP_TEXT
*(.fixup)
*(.gnu.warning)
@@ -336,3 +348,10 @@ ASSERT(swapper_pg_dir - reserved_pg_dir == 
RESERVED_SWAPPER_OFFSE

[PATCH v12 14/17] arm64: kexec: install a copy of the linear-map

2021-03-03 Thread Pavel Tatashin

To perform the kexec relocations with the MMU enabled, we need a copy
of the linear map.

Create one, and install it from the relocation code. This has to be done
from the assembly code as it will be idmapped with TTBR0. The kernel
runs in TTBR1, so can't use the break-before-make sequence on the mapping
it is executing from.

This makes no difference yet as the relocation code runs with the MMU
disabled.

Co-developed-by: James Morse 
Signed-off-by: Pavel Tatashin 
---
 arch/arm64/include/asm/assembler.h  | 19 +++
 arch/arm64/include/asm/kexec.h  |  2 ++
 arch/arm64/kernel/asm-offsets.c |  2 ++
 arch/arm64/kernel/hibernate-asm.S   | 20 
 arch/arm64/kernel/machine_kexec.c   | 16 ++--
 arch/arm64/kernel/relocate_kernel.S |  3 +++
 6 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h 
b/arch/arm64/include/asm/assembler.h
index 29061b76aab6..3ce8131ad660 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -425,6 +425,25 @@ USER(\label, icivau, \tmp2)// 
invalidate I line PoU
isb
.endm
 
+/*
+ * To prevent the possibility of old and new partial table walks being visible
+ * in the tlb, switch the ttbr to a zero page when we invalidate the old
+ * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
+ * Even switching to our copied tables will cause a changed output address at
+ * each stage of the walk.
+ */
+   .macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
+   phys_to_ttbr \tmp, \zero_page
+   msr ttbr1_el1, \tmp
+   isb
+   tlbivmalle1
+   dsb nsh
+   phys_to_ttbr \tmp, \page_table
+   offset_ttbr1 \tmp, \tmp2
+   msr ttbr1_el1, \tmp
+   isb
+   .endm
+
 /*
  * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
  */
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 305cf0840ed3..59ac166daf53 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -97,6 +97,8 @@ struct kimage_arch {
phys_addr_t dtb_mem;
phys_addr_t kern_reloc;
phys_addr_t el2_vectors;
+   phys_addr_t ttbr1;
+   phys_addr_t zero_page;
/* Core ELF header buffer */
void *elf_headers;
unsigned long elf_headers_mem;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 2e3278df1fc3..609362b5aa76 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -158,6 +158,8 @@ int main(void)
 #ifdef CONFIG_KEXEC_CORE
   DEFINE(KIMAGE_ARCH_DTB_MEM,  offsetof(struct kimage, arch.dtb_mem));
   DEFINE(KIMAGE_ARCH_EL2_VECTORS,  offsetof(struct kimage, 
arch.el2_vectors));
+  DEFINE(KIMAGE_ARCH_ZERO_PAGE,offsetof(struct kimage, 
arch.zero_page));
+  DEFINE(KIMAGE_ARCH_TTBR1,offsetof(struct kimage, arch.ttbr1));
   DEFINE(KIMAGE_HEAD,  offsetof(struct kimage, head));
   DEFINE(KIMAGE_START, offsetof(struct kimage, start));
   BLANK();
diff --git a/arch/arm64/kernel/hibernate-asm.S 
b/arch/arm64/kernel/hibernate-asm.S
index 8ccca660034e..a31e621ba867 100644
--- a/arch/arm64/kernel/hibernate-asm.S
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -15,26 +15,6 @@
 #include 
 #include 
 
-/*
- * To prevent the possibility of old and new partial table walks being visible
- * in the tlb, switch the ttbr to a zero page when we invalidate the old
- * records. D4.7.1 'General TLB maintenance requirements' in ARM DDI 0487A.i
- * Even switching to our copied tables will cause a changed output address at
- * each stage of the walk.
- */
-.macro break_before_make_ttbr_switch zero_page, page_table, tmp, tmp2
-   phys_to_ttbr \tmp, \zero_page
-   msr ttbr1_el1, \tmp
-   isb
-   tlbivmalle1
-   dsb nsh
-   phys_to_ttbr \tmp, \page_table
-   offset_ttbr1 \tmp, \tmp2
-   msr ttbr1_el1, \tmp
-   isb
-.endm
-
-
 /*
  * Resume from hibernate
  *
diff --git a/arch/arm64/kernel/machine_kexec.c 
b/arch/arm64/kernel/machine_kexec.c
index f1451d807708..c875ef522e53 100644
--- a/arch/arm64/kernel/machine_kexec.c
+++ b/arch/arm64/kernel/machine_kexec.c
@@ -153,6 +153,8 @@ static void *kexec_page_alloc(void *arg)
 
 int machine_kexec_post_load(struct kimage *kimage)
 {
+   int rc;
+   pgd_t *trans_pgd;
void *reloc_code = page_to_virt(kimage->control_code_page);
long reloc_size;
struct trans_pgd_info info = {
@@ -169,12 +171,22 @@ int machine_kexec_post_load(struct kimage *kimage)
 
kimage->arch.el2_vectors = 0;
if (is_hyp_callable()) {
-   int rc = trans_pgd_copy_el2_vectors(&info,
-   &kimage->arch.el2_vectors);
+   rc = trans_pgd_copy_el2_vectors(&info,
+

[PATCH] drm/radeon: fix copy of uninitialized variable back to userspace

2021-03-03 Thread Colin King

From: Colin Ian King 

Currently the ioctl command RADEON_INFO_SI_BACKEND_ENABLED_MASK can
copy back uninitialised data in value_tmp that pointer *value points
to. This can occur when rdev->family is less than CHIP_BONAIRE and
less than CHIP_TAHITI.  Fix this by adding in a missing -EINVAL
so that no invalid value is copied back to userspace.

Addresses-Coverity: ("Uninitialized scalar variable")
Cc: sta...@vger.kernel.org # 3.13+
Fixes: 439a1cfffe2c ("drm/radeon: expose render backend mask to the userspace")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/radeon/radeon_kms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/radeon/radeon_kms.c 
b/drivers/gpu/drm/radeon/radeon_kms.c
index 2479d6ab7a36..58876bb4ef2a 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -518,6 +518,7 @@ int radeon_info_ioctl(struct drm_device *dev, void *data, 
struct drm_file *filp)
*value = rdev->config.si.backend_enable_mask;
} else {
DRM_DEBUG_KMS("BACKEND_ENABLED_MASK is si+ only!\n");
+   return -EINVAL;
}
break;
case RADEON_INFO_MAX_SCLK:
-- 
2.30.0
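
The class of bug fixed by the patch above can be shown with a small userspace sketch. Everything here is hypothetical (enum family, info_backend_mask() and the mask values are invented for illustration); it only models the control flow: when no branch assigns the temporary, the function must return an error before the copy-out step.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chip families, ordered from oldest to newest. */
enum family { FAMILY_OLD, FAMILY_SI, FAMILY_CIK };

/*
 * Model of an info query: value_tmp is only assigned for supported
 * families. Returning -EINVAL in the default case guarantees the
 * copy-out below never reads an unassigned value_tmp.
 */
static int info_backend_mask(enum family fam, uint32_t *user_out)
{
	uint32_t value_tmp;

	assert(user_out != NULL);

	switch (fam) {
	case FAMILY_CIK:
		value_tmp = 0xf0;
		break;
	case FAMILY_SI:
		value_tmp = 0x0f;
		break;
	default:
		/* Without this return, garbage would be copied out. */
		return -EINVAL;
	}

	/* Stand-in for copy_to_user(). */
	memcpy(user_out, &value_tmp, sizeof(value_tmp));
	return 0;
}
```

With the early return in place, a caller on an unsupported family sees an error code and its output buffer is left untouched.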

Re: [PATCH 0/3] objtool: OBJTOOL_ARGS and --backup

2021-03-03 Thread Josh Poimboeuf

On Fri, Feb 26, 2021 at 11:57:42AM +0100, Peter Zijlstra wrote:
> Boris asked for an environment variable to have objtool preserve the original
> object file so that it becomes trivial to inspect what actually changed.

I might bikeshed the suffix ".o.orig" instead of ".obj".

Acked-by: Josh Poimboeuf 

-- 
Josh

Re: 5.11 regression: "ia64: add support for TIF_NOTIFY_SIGNAL" breaks ia64 boot

2021-03-03 Thread Jens Axboe

On 3/2/21 4:27 PM, Sergei Trofimovich wrote:
> On Tue, 2 Mar 2021 15:31:13 -0700
> Jens Axboe  wrote:
> 
>> On 3/2/21 3:07 PM, Sergei Trofimovich wrote:
>>> On Tue, 23 Feb 2021 08:08:30 +
>>> Sergei Trofimovich  wrote:
>>>   
 On Mon, 22 Feb 2021 17:43:58 -0700
 Jens Axboe  wrote:
  
> On 2/22/21 5:41 PM, Jens Axboe wrote:
>> On 2/22/21 5:34 PM, Jens Axboe wrote:  
>>> On 2/22/21 4:53 PM, Sergei Trofimovich wrote:  
 On Mon, 22 Feb 2021 16:34:50 -0700
 Jens Axboe  wrote:
  
> On 2/22/21 4:05 PM, Sergei Trofimovich wrote:  
>> Hia Jens!
>>
>> Tried 5.11 on rx3600 box and noticed it has
>> a problem handling init (5.10 booted fine):
>>
>> INIT: version 2.98 booting
>>
>>OpenRC 0.42.1 is starting up Gentoo Linux (ia64)
>>
>> mkdir `/run/openrc': Read-only file system
>> mkdir `/run/openrc/starting': No such file or directory
>> mkdir `/run/openrc/started': No such file or directory
>> mkdir `/run/openrc/stopping': No such file or directory
>> mkdir `/run/openrc/inactive': No such file or directory
>> mkdir `/run/openrc/wasinactive': No such file or directory
>> mkdir `/run/openrc/failed': No such file or directory
>> mkdir `/run/openrc/hotplugged': No such file or directory
>> mkdir `/run/openrc/daemons': No such file or directory
>> mkdir `/run[   14.595059] Kernel panic - not syncing: Attempted to 
>> kill init! exitcode=0x000b
>> [   14.599059] ---[ end Kernel panic - not syncing: Attempted to 
>> kill init! exitcode=0x000b ]---
>>
>> I suspect we build bad signal stack frame for userspace.
>>
>> With a bit of #define DEBUG_SIG 1 enabled the signals are SIGCHLD:
>>
>> [   34.969771] SIG deliver (gendepends.sh:69): sig=17 
>> sp=6f6aeaa0 ip=a0040740 handler=4b4c59b6
>> [   34.969948] SIG deliver (init:1): sig=17 sp=6f1ccc50 
>> ip=a0040740 handler=4638b9e5
>> [   34.969948] SIG deliver (gendepends.sh:69): sig=17 
>> sp=6f6adf90 ip=a0040740 handler=4b4c59b6
>> [   34.973948] SIG deliver (init:1): sig=17 sp=6f1cc140 
>> ip=a0040740 handler=4638b9e5
>> [   34.973948] Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x000b
>> [   34.973948] SIG deliver (gendepends.sh:69): sig=17 
>> sp=6f6ad480 ip=a0040740 handler=4b4c59b6
>> [   34.973948] ---[ end Kernel panic - not syncing: Attempted to 
>> kill init! exitcode=0x000b ]---
>>
>> Bisect points at:
>>
>> commit b269c229b0e89aedb7943c06673b56b6052cf5e5
>> Author: Jens Axboe 
>> Date:   Fri Oct 9 14:49:43 2020 -0600
>>
>> ia64: add support for TIF_NOTIFY_SIGNAL
>>
>> Wire up TIF_NOTIFY_SIGNAL handling for ia64.
>>
>> Cc: linux-i...@vger.kernel.org
>> [axboe: added fixes from Mike Rapoport ]
>> Signed-off-by: Jens Axboe 
>>
>> diff --git a/arch/ia64/include/asm/thread_info.h 
>> b/arch/ia64/include/asm/thread_info.h
>> index 64a1011f6812..51d20cb37706 100644
>> --- a/arch/ia64/include/asm/thread_info.h
>> +++ b/arch/ia64/include/asm/thread_info.h
>> @@ -103,6 +103,7 @@ struct thread_info {
>>  #define TIF_SYSCALL_TRACE  2   /* syscall trace active */
>>  #define TIF_SYSCALL_AUDIT  3   /* syscall auditing active */
>>  #define TIF_SINGLESTEP 4   /* restore singlestep on 
>> return to user mode */
>> +#define TIF_NOTIFY_SIGNAL  5   /* signal notification exist 
>> */
>>  #define TIF_NOTIFY_RESUME  6   /* resumption notification 
>> requested */
>>  #define TIF_MEMDIE 17  /* is terminating due to OOM 
>> killer */
>>  #define TIF_MCA_INIT   18  /* this task is processing 
>> MCA or INIT */
>> @@ -115,6 +116,7 @@ struct thread_info {
>>  #define _TIF_SINGLESTEP(1 << TIF_SINGLESTEP)
>>  #define _TIF_SYSCALL_TRACEAUDIT
>> (_TIF_SYSCALL_TRACE|_TIF_SYSCALL_AUDIT|_TIF_SINGLESTEP)
>>  #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
>> +#define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL)
>>  #define _TIF_SIGPENDING(1 << TIF_SIGPENDING)
>>  #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)
>>  #define _TIF_MCA_INIT  (1 << TIF_MCA_INIT)
>> @@ -124,7 +126,7 @@ struct thread_info {
>>
>>  /* "work to do on user-return"

Re: [PATCH] Squashfs: fix xattr id and id lookup sanity checks

2021-03-03 Thread Andrew Morton

On Mon, 1 Mar 2021 07:27:57 + (GMT) Phillip Lougher 
 wrote:

> The checks for maximum metadata block size is
> missing SQUASHFS_BLOCK_OFFSET (the two byte length
> count).

What are the user visible consequences of this bug?

> Cc: sta...@vger.kernel.org
> Signed-off-by: Phillip Lougher 

Fixes: f37aa4c7366e23f ("squashfs: add more sanity checks in id lookup")

yes?

RE: [PATCH v8 1/6] arm64: hyperv: Add Hyper-V hypercall and register access utilities

2021-03-03 Thread Sunil Muthuswamy

> +}
> +EXPORT_SYMBOL_GPL(hv_do_fast_hypercall8);
> +
> +
nit: Extra line, here and few other places

> +u64 hv_get_vpreg(u32 msr)
> +{
> + struct hv_get_vp_registers_input*input;
> + struct hv_get_vp_registers_output   *output;
> + u64 result;
It seems like the code is using both spaces and tabs to align variable names.
Can we stick to one or the other, preferably spaces?

> +
> + /*
> +  * Allocate a power of 2 size so alignment to that size is
> +  * guaranteed, since the hypercall input and output areas
> +  * must not cross a page boundary.
> +  */
> + input = kzalloc(roundup_pow_of_two(sizeof(input->header) +
> + sizeof(input->element[0])), GFP_ATOMIC);
> + output = kmalloc(roundup_pow_of_two(sizeof(*output)), GFP_ATOMIC);
> +
Check for null from these malloc routines? Here and in other places.

> + __hv_get_vpreg_128(msr, input, output);

+ * Linux-specific definitions for managing interactions with Microsoft's
+ * Hyper-V hypervisor. The definitions in this file are specific to
+ * the ARM64 architecture.  See include/asm-generic/mshyperv.h for
nit: Two spaces before 'See'. Here and in a couple of other places.
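
The alignment argument in the quoted comment (a power-of-two sized block that is aligned to its own size cannot cross a page boundary) can be checked with a small sketch. The helper names and the 4096-byte page size are assumptions for illustration, and the sketch assumes the allocator returns blocks naturally aligned to their power-of-two size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u

/* Round n up to the next power of two (n itself if already one). */
static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0 && n <= (SIZE_MAX >> 1) + 1);

	while (p < n)
		p <<= 1;
	return p;
}

/* True if [addr, addr + size) spans more than one page. */
static bool crosses_page(uintptr_t addr, size_t size)
{
	return (addr / PAGE_SIZE_ASSUMED) !=
	       ((addr + size - 1) / PAGE_SIZE_ASSUMED);
}
```

For any size-aligned address and a rounded size no larger than the page size, crosses_page() stays false, whereas an unrounded 24-byte block placed 8 bytes before a page end does cross.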

Re: [PATCH v3 1/2] x86/setup: consolidate early memory reservations

2021-03-03 Thread Baoquan He

On 03/02/21 at 05:17pm, Mike Rapoport wrote:
> On Tue, Mar 02, 2021 at 09:04:09PM +0800, Baoquan He wrote:
...
> > > +static void __init early_reserve_memory(void)
> > > +{
> > > + /*
> > > +  * Reserve the memory occupied by the kernel between _text and
> > > +  * __end_of_kernel_reserve symbols. Any kernel sections after the
> > > +  * __end_of_kernel_reserve symbol must be explicitly reserved with a
> > > +  * separate memblock_reserve() or they will be discarded.
> > > +  */
> > > + memblock_reserve(__pa_symbol(_text),
> > > +  (unsigned long)__end_of_kernel_reserve - (unsigned 
> > > long)_text);
> > > +
> > > + /*
> > > +  * Make sure page 0 is always reserved because on systems with
> > > +  * L1TF its contents can be leaked to user processes.
> > > +  */
> > > + memblock_reserve(0, PAGE_SIZE);
> > > +
> > > + early_reserve_initrd();
> > > +
> > > + if (efi_enabled(EFI_BOOT))
> > > + efi_memblock_x86_reserve_range();
> > > +
> > > + memblock_x86_reserve_range_setup_data();
> > 
> > This patch looks good to me, thanks for the effort.
> > 
> > While at it, wondering if we can rename the above function to
> > memblock_reserve_setup_data() just as its e820 counterpart
> > e820__reserve_setup_data(), adding 'x86' to a function under arch/x86
> > seems redundant.
> 
> I'd rather keep these names for now. First, it's easier to dig to them in
> the git history and second, I'm planning more changes in this area and these
> names are as good as FIXME: to remind what still needs to be checked :)

I see, thanks for the explanation.

Re: [PATCH v2] binfmt_misc: Fix possible deadlock in bm_register_write

2021-03-03 Thread Andrew Morton

On Sun, 28 Feb 2021 14:44:14 -0800 Lior Ribak  wrote:

> There is a deadlock in bm_register_write:
> First, in the beginning of the function, a lock is taken on the
> binfmt_misc root inode with inode_lock(d_inode(root))
> Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will
> call open_exec on the user-provided interpreter.
> open_exec will call a path lookup, and if the path lookup process
> includes the root of binfmt_misc, it will try to take a shared lock
> on its inode again, but it is already locked, and the code will
> get stuck in a deadlock
> 
> To reproduce the bug:
> $ echo ":i:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > 
> /proc/sys/fs/binfmt_misc/register
> 
> backtrace of where the lock occurs (#5):
> 0  schedule () at ./arch/x86/include/asm/current.h:15
> 1  0x81b51237 in rwsem_down_read_slowpath (sem=0x888003b202e0, 
> count=, state=state@entry=2) at kernel/locking/rwsem.c:992
> 2  0x81b5150a in __down_read_common (state=2, sem=) at 
> kernel/locking/rwsem.c:1213
> 3  __down_read (sem=) at kernel/locking/rwsem.c:1222
> 4  down_read (sem=) at kernel/locking/rwsem.c:1355
> 5  0x811ee22a in inode_lock_shared (inode=) at 
> ./include/linux/fs.h:783
> 6  open_last_lookups (op=0xc922fe34, file=0x888004098600, 
> nd=0xc922fd10) at fs/namei.c:3177
> 7  path_openat (nd=nd@entry=0xc922fd10, 
> op=op@entry=0xc922fe34, flags=flags@entry=65) at fs/namei.c:3366
> 8  0x811efe1c in do_filp_open (dfd=, 
> pathname=pathname@entry=0x8880031b9000, op=op@entry=0xc922fe34) 
> at fs/namei.c:3396
> 9  0x811e493f in do_open_execat (fd=fd@entry=-100, 
> name=name@entry=0x8880031b9000, flags=, flags@entry=0) at 
> fs/exec.c:913
> 10 0x811e4a92 in open_exec (name=) at fs/exec.c:948
> 11 0x8124aa84 in bm_register_write (file=, 
> buffer=, count=19, ppos=) at 
> fs/binfmt_misc.c:682
> 12 0x811decd2 in vfs_write (file=file@entry=0x888004098500, 
> buf=buf@entry=0xa758d0 ":i:E::ii::i:CF\n", count=count@entry=19, 
> pos=pos@entry=0xc922ff10) at fs/read_write.c:603
> 13 0x811defda in ksys_write (fd=, buf=0xa758d0 
> ":i:E::ii::i:CF\n", count=19) at fs/read_write.c:658
> 14 0x81b49813 in do_syscall_64 (nr=, 
> regs=0xc922ff58) at arch/x86/entry/common.c:46
> 15 0x81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
> 
> To solve the issue, the open_exec call is moved to before the write
> lock is taken by bm_register_write
> 

Looks good to me.

I assume this is an ancient bug and that a backport to -stable trees
(with a cc:stable) is warranted?
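
For illustration, the ordering problem described in the quoted report can be modelled with a toy non-recursive lock. This is only a sketch: struct toy_lock and all function names are hypothetical, and a second acquisition returns -EDEADLK instead of blocking so that the outcome can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy non-recursive lock: re-acquiring reports the would-be deadlock. */
struct toy_lock {
	bool held;
};

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Stand-in for a path lookup that briefly needs the directory lock. */
static int toy_lookup(struct toy_lock *dir)
{
	int rc = toy_lock_acquire(dir);

	if (rc)
		return rc;
	toy_lock_release(dir);
	return 0;
}

/* Problematic order: take the lock, then look up under it. */
static int register_lock_then_open(struct toy_lock *dir)
{
	int rc = toy_lock_acquire(dir);

	if (rc)
		return rc;
	rc = toy_lookup(dir);
	toy_lock_release(dir);
	return rc;
}

/* Fixed order: finish the lookup first, then take the lock. */
static int register_open_then_lock(struct toy_lock *dir)
{
	int rc = toy_lookup(dir);

	if (rc)
		return rc;
	rc = toy_lock_acquire(dir);
	if (rc)
		return rc;
	toy_lock_release(dir);
	return 0;
}
```

In the real code the second acquisition sleeps forever rather than returning an error, which is why moving the open before the lock is the fix.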

Re: [PATCH] KVM: LAPIC: Advancing the timer expiration on guest initiated write

2021-03-03 Thread Wanpeng Li

On Wed, 3 Mar 2021 at 01:16, Sean Christopherson  wrote:
>
> On Tue, Mar 02, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li 
> >
> > Advancing the timer expiration should only be necessary on guest initiated
> > writes. Now, we cancel the timer, clear .pending and clear
> > expired_tscdeadline at the same time during state restore.
>
> That last sentence is confusing.  kvm_apic_set_state() already clears
> .pending, by way of __start_apic_timer().  I think what you mean is:
>
>   When we cancel the timer and clear .pending during state restore, clear
>   expired_tscdeadline as well.

Good statement. :)

>
> With that,
>
> Reviewed-by: Sean Christopherson 
>
>
> Side topic, I think there's a theoretical bug where KVM could inject a
> spurious timer interrupt.  If KVM is using hrtimer, the hrtimer expires early
> due to an overzealous timer_advance_ns, and the guest writes MSR_TSCDEADLINE
> after the hrtimer expires but before the vCPU is kicked, then KVM will inject
> a spurious timer IRQ since the premature expiration should have been canceled
> by the guest's WRMSR.
>
> It could also cause KVM to soft hang the guest if the new
> lapic_timer.tscdeadline is written before apic_timer_expired() captures it in
> expired_tscdeadline.  In that case, KVM will wait for the new deadline, which
> could be far in the future.

The hrtimer_cancel() before setting new lapic_timer.tscdeadline in
kvm_set_lapic_tscdeadline_msr() will wait for the hrtimer callback
function to finish. Could it solve this issue?

Wanpeng

Broken kretprobe stack traces

2021-03-03 Thread Daniel Xu

Hi Masami,

Jakub reported a bug with kretprobe stack traces -- wondering if you've gotten
any bug reports related to stack traces being broken for kretprobes.

I think (can't prove) this used to work:

# bpftrace -e 'kretprobe:__tcp_retransmit_skb { @[kstack()] = count() }'
Attaching 1 probe...
^C

@[
kretprobe_trampoline+0
]: 1

fentry/fexit probes seem to work:

# bpftrace -e 'kretfunc:__tcp_retransmit_skb { @[kstack()] = count() }'
Attaching 1 probe...
^C

@[
ftrace_trampoline+10799
bpf_get_stackid_raw_tp+121
ftrace_trampoline+10799
__tun_chr_ioctl.isra.0.cold+33312
__tcp_retransmit_skb+5
tcp_send_loss_probe+254
tcp_write_timer_handler+394
tcp_write_timer+149
call_timer_fn+41
__run_timers+493
run_timer_softirq+25
__softirqentry_text_start+207
asm_call_sysvec_on_stack+18
do_softirq_own_stack+55
irq_exit_rcu+158
sysvec_apic_timer_interrupt+54
asm_sysvec_apic_timer_interrupt+18
]: 1
@[
ftrace_trampoline+10799
bpf_get_stackid_raw_tp+121
ftrace_trampoline+10799
__tun_chr_ioctl.isra.0.cold+33312
__tcp_retransmit_skb+5
  <...>

which makes me suspect it's a kprobe specific issue.

Thanks,
Daniel

[PATCH] crypto: mips/poly1305 - enable for all MIPS processors

2021-03-03 Thread Maciej W. Rozycki

The MIPS Poly1305 implementation is generic MIPS code written such as to 
support down to the original MIPS I and MIPS III ISA for the 32-bit and 
64-bit variant respectively.  Lift the current limitation then to enable 
code for MIPSr1 ISA or newer processors only and have it available for 
all MIPS processors.

Signed-off-by: Maciej W. Rozycki 
Fixes: a11d055e7a64 ("crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation")
Cc: sta...@vger.kernel.org # v5.5+
---
On Wed, 3 Mar 2021, Jason A. Donenfeld wrote:

> >> Would you mind sending this for 5.12 in an rc at some point, rather
> >> than waiting for 5.13? I'd like to see this backported to 5.10 and 5.4
> >> for OpenWRT.
> >
> > why is this so important for OpenWRT ? Just to select CRYPTO_POLY1305_MIPS
> > ?
> 
> Yes. The performance boost on Octeon is significant for WireGuard users.

 But that's the wrong fix for that purpose.  I've skimmed over that module 
and there's nothing MIPS64-specific there.  In fact it's plain generic 
MIPS assembly, with some R2 optimisations enabled where applicable but not 
necessary (and then R6 tweaks, but that's irrelevant here).

 As a matter of interest I have just built it successfully for a MIPS I 
DECstation configuration:

$ file arch/mips/crypto/poly1305-mips.ko
arch/mips/crypto/poly1305-mips.ko: ELF 32-bit LSB relocatable, MIPS, MIPS-I version 1 (SYSV), BuildID[sha1]=d36384d94f60ba7deff638ca8a24500120b45b56, not stripped
$ 

Patch included, please apply.

 So while your change is surely right, what you want is this really.

  Maciej
---
 arch/mips/crypto/Makefile |4 ++--
 crypto/Kconfig|2 +-
 drivers/net/Kconfig   |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

Index: linux/arch/mips/crypto/Makefile
===
--- linux.orig/arch/mips/crypto/Makefile
+++ linux/arch/mips/crypto/Makefile
@@ -12,8 +12,8 @@ AFLAGS_chacha-core.o += -O2 # needed to
 obj-$(CONFIG_CRYPTO_POLY1305_MIPS) += poly1305-mips.o
 poly1305-mips-y := poly1305-core.o poly1305-glue.o
 
-perlasm-flavour-$(CONFIG_CPU_MIPS32) := o32
-perlasm-flavour-$(CONFIG_CPU_MIPS64) := 64
+perlasm-flavour-$(CONFIG_32BIT) := o32
+perlasm-flavour-$(CONFIG_64BIT) := 64
 
 quiet_cmd_perlasm = PERLASM $@
   cmd_perlasm = $(PERL) $(<) $(perlasm-flavour-y) $(@)
Index: linux/crypto/Kconfig
===
--- linux.orig/crypto/Kconfig
+++ linux/crypto/Kconfig
@@ -772,7 +772,7 @@ config CRYPTO_POLY1305_X86_64
 
 config CRYPTO_POLY1305_MIPS
tristate "Poly1305 authenticator algorithm (MIPS optimized)"
-   depends on CPU_MIPS32 || (CPU_MIPS64 && 64BIT)
+   depends on MIPS
select CRYPTO_ARCH_HAVE_LIB_POLY1305
 
 config CRYPTO_MD4
Index: linux/drivers/net/Kconfig
===
--- linux.orig/drivers/net/Kconfig
+++ linux/drivers/net/Kconfig
@@ -92,7 +92,7 @@ config WIREGUARD
select CRYPTO_POLY1305_ARM if ARM
select CRYPTO_CURVE25519_NEON if ARM && KERNEL_MODE_NEON
select CRYPTO_CHACHA_MIPS if CPU_MIPS32_R2
-   select CRYPTO_POLY1305_MIPS if CPU_MIPS32 || (CPU_MIPS64 && 64BIT)
+   select CRYPTO_POLY1305_MIPS if MIPS
help
  WireGuard is a secure, fast, and easy to use replacement for IPSec
  that uses modern cryptography and clever networking tricks. It's

[GIT PULL] Misc 5.12 fixes

2021-03-03 Thread Jens Axboe

Hi Linus,

Two misc fixes that don't belong in other branches:

- Fix a regression with ia64 signals, introduced by the
  TIF_NOTIFY_SIGNAL change in 5.11.

- Fix the current swapfile regression from this merge window.

Please pull!


The following changes since commit 7a7fd0de4a9804299793e564a555a49c1fc924cb:

  Merge branch 'kmap-conversion-for-5.12' of 
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux (2021-03-01 11:24:18 
-0800)

are available in the Git repository at:

  git://git.kernel.dk/linux-block.git tags/misc-5.12-2021-03-02

for you to fetch changes up to caf6912f3f4af7232340d500a4a2008f81b93f14:

  swap: fix swapfile read/write offset (2021-03-02 17:25:46 -0700)


misc-5.12-2021-03-02


Jens Axboe (2):
  ia64: don't call handle_signal() unless there's actually a signal queued
  swap: fix swapfile read/write offset

 arch/ia64/kernel/signal.c |  3 ++-
 include/linux/swap.h  |  1 +
 mm/page_io.c  |  5 -
 mm/swapfile.c | 13 +
 4 files changed, 16 insertions(+), 6 deletions(-)

-- 
Jens Axboe

Re: [PATCH 05/13] rcu/nocb: Use the rcuog CPU's ->nocb_timer

2021-03-03 Thread Paul E. McKenney

On Tue, Feb 23, 2021 at 01:10:03AM +0100, Frederic Weisbecker wrote:
> Currently each offline rdp has its own nocb_timer armed when the
> nocb_gp wakeup must be deferred. This layout has many drawbacks,
> compared to a solution based on a single timer per rdp group:
> 
> * There are a lot of timers to maintain.
> 
> * The per-rdp ->nocb_lock must be held to queue and cancel the timer
>   and this lock can already be quite contended.
> 
> * One timer firing doesn't cancel the other timers in the same group:
>   - These other timers can thus cause spurious wakeups
>   - Each rdp that queued a timer must lock both ->nocb_lock and then
> ->nocb_gp_lock upon exit from the kernel to idle/user/guest mode.
> 
> * We can't cancel all of them if we detect an unflushed bypass in
>   nocb_gp_wait(). In fact currently we only ever cancel the nocb_timer
>   of the leader group.
> 
> * The leader group's nocb_timer is cancelled without locking ->nocb_lock
>   in nocb_gp_wait().  This currently appears to be safe but is an
>   accident waiting to happen.
> 
> * Since the timer acquires ->nocb_lock, it requires extra care in the
>   NOCB (de-)offloading process, requiring that it be either enabled or
>   disabled and flushed.
> 
> This commit instead uses the rcuog kthread's CPU's ->nocb_timer instead.
> It is protected by nocb_gp_lock, which is _way_ less contended and
> remains so even after this change.  As a matter of fact, the nocb_timer
> almost never fires and the deferred wakeup is mostly carried out upon
> idle/user/guest entry.  Now the early check performed at this point in
> do_nocb_deferred_wakeup() is done on rdp_gp->nocb_defer_wakeup, which
> is of course racy.  However, this raciness is harmless because we only
> need the guarantee that the timer is queued if we were the last one to
> queue it.  Any other situation (another CPU has queued it and we either
> see it or not) is fine.
> 
> This solves all the issues listed above.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Josh Triplett 
> Cc: Lai Jiangshan 
> Cc: Joel Fernandes 
> Cc: Neeraj Upadhyay 
> Cc: Boqun Feng 

I pulled in the previous three (2-4/13) with the usual commit-log wordsmithing,
thank you!  And I could not resist wordsmithing above.

I do very much like the general approach, but a few questions below.

The first question is of course: Did you try this with lockdep enabled?  ;-)

> ---
>  kernel/rcu/tree.h|   1 -
>  kernel/rcu/tree_plugin.h | 142 +--
>  2 files changed, 78 insertions(+), 65 deletions(-)
> 
> diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> index 71821d59d95c..b280a843bd2c 100644
> --- a/kernel/rcu/tree.h
> +++ b/kernel/rcu/tree.h
> @@ -257,7 +257,6 @@ struct rcu_data {
>  };
>  
>  /* Values for nocb_defer_wakeup field in struct rcu_data. */
> -#define RCU_NOCB_WAKE_OFF-1
>  #define RCU_NOCB_WAKE_NOT0
>  #define RCU_NOCB_WAKE1
>  #define RCU_NOCB_WAKE_FORCE  2
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 587df271d640..847636d3e93d 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -33,10 +33,6 @@ static inline bool rcu_current_is_nocb_kthread(struct 
> rcu_data *rdp)
>   return false;
>  }
>  
> -static inline bool rcu_running_nocb_timer(struct rcu_data *rdp)
> -{
> - return (timer_curr_running(&rdp->nocb_timer) && !in_irq());
> -}
>  #else
>  static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp)
>  {
> @@ -48,11 +44,6 @@ static inline bool rcu_current_is_nocb_kthread(struct 
> rcu_data *rdp)
>   return false;
>  }
>  
> -static inline bool rcu_running_nocb_timer(struct rcu_data *rdp)
> -{
> - return false;
> -}
> -
>  #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
>  
>  static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> @@ -72,8 +63,7 @@ static bool rcu_rdp_is_offloaded(struct rcu_data *rdp)
> rcu_lockdep_is_held_nocb(rdp) ||
> (rdp == this_cpu_ptr(&rcu_data) &&
>  !(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible())) ||
> -   rcu_current_is_nocb_kthread(rdp) ||
> -   rcu_running_nocb_timer(rdp)),
> +   rcu_current_is_nocb_kthread(rdp)),
>   "Unsafe read of RCU_NOCB offloaded state"
>   );
>  
> @@ -1702,43 +1692,50 @@ bool rcu_is_nocb_cpu(int cpu)
>   return false;
>  }
>  
> -/*
> - * Kick the GP kthread for this NOCB group.  Caller holds ->nocb_lock
> - * and this function releases it.
> - */
> -static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
> -  unsigned long flags)
> - __releases(rdp->nocb_lock)
> +static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
> +struct rcu_data *rdp,
> +bool force, unsigned long flags)
> + __releases(rdp_gp->nocb_gp_lock)
>  {
>   bool needwake = false;
> - struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
>  
> - lockdep_assert_held(&rdp->n

Re: [PATCH 11/13] rcu/nocb: Only cancel nocb timer if not polling

2021-03-03 Thread Paul E. McKenney

On Tue, Feb 23, 2021 at 01:10:09AM +0100, Frederic Weisbecker wrote:
> No need to disarm the nocb_timer if rcu_nocb is polling because it
> shouldn't be armed either.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Josh Triplett 
> Cc: Lai Jiangshan 
> Cc: Joel Fernandes 
> Cc: Neeraj Upadhyay 
> Cc: Boqun Feng 

OK, so it does make sense to move that del_timer() under the following
"if" statement, then.  ;-)

> ---
>  kernel/rcu/tree_plugin.h | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 9da67b0d3997..d8b50ff40e4b 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -2186,18 +2186,18 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
>   my_rdp->nocb_gp_gp = needwait_gp;
>   my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
>   if (bypass) {
> - raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
> - // Avoid race with first bypass CB.
> - if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
> - WRITE_ONCE(my_rdp->nocb_defer_wakeup, 
> RCU_NOCB_WAKE_NOT);
> - del_timer(&my_rdp->nocb_timer);
> - }
>   if (!rcu_nocb_poll) {
> + raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
> + // Avoid race with first bypass CB.
> + if (my_rdp->nocb_defer_wakeup > RCU_NOCB_WAKE_NOT) {
> + WRITE_ONCE(my_rdp->nocb_defer_wakeup, 
> RCU_NOCB_WAKE_NOT);
> + del_timer(&my_rdp->nocb_timer);
> + }
>   // At least one child with non-empty ->nocb_bypass, so 
> set
>   // timer in order to avoid stranding its callbacks.
>   mod_timer(&my_rdp->nocb_bypass_timer, j + 2);
> + raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, 
> flags);
>   }
> - raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
>   }
>   if (rcu_nocb_poll) {
>   /* Polling, so trace if first poll in the series. */
> -- 
> 2.25.1
>

Re: [PATCH 0/2] tracing: Detect unsafe dereferencing of pointers from trace events

2021-03-03 Thread Peter Chen

On 21-03-02 09:56:05, Steven Rostedt wrote:
> On Tue, 2 Mar 2021 16:23:55 +0800
> Peter Chen  wrote:
> 
> s it looks like it uses %pa which IIUC from the printk code, it
> > > >> dereferences the pointer to find its virtual address. The event has
> > > >> this as the field:
> > > >>
> > > >> __field(struct cdns3_trb *, start_trb_addr)
> > > >>
> > > >> Assigns it with:
> > > >>
> > > >> __entry->start_trb_addr = req->trb;
> > > >>
> > > >> And prints that with %pa, which will dereference pointer at the time of
> > > >> reading, where the address in question may no longer be around. That
> > > >> looks to me as a potential bug.  
> > 
> > Steven, thanks for reporting. Do you mind sending patch to fix it?
> > If you have no time to do it, I will do it later.
> > 
> 
> I would have already fixed it, but I wasn't exactly sure how this is used.
> 
> In Documentation/core-api/printk-formats.rst we have:
> 
>Physical address types phys_addr_t
>--
> 
>::
> 
>%pa[p]  0x01234567 or 0x0123456789abcdef
> 
>For printing a phys_addr_t type (and its derivatives, such as
>resource_size_t) which can vary based on build options, regardless of the
>width of the CPU data path.
> 
> So it looks like it is only used for the size of the pointer.
> 
> I guess something like this might work:
> 
> diff --git a/drivers/usb/cdns3/cdns3-trace.h b/drivers/usb/cdns3/cdns3-trace.h
> index 8648c7a7a9dd..d3b8624fc427 100644
> --- a/drivers/usb/cdns3/cdns3-trace.h
> +++ b/drivers/usb/cdns3/cdns3-trace.h
> @@ -214,7 +214,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
>   __field(int, no_interrupt)
>   __field(int, start_trb)
>   __field(int, end_trb)
> - __field(struct cdns3_trb *, start_trb_addr)
> + __field(phys_addr_t, start_trb_addr)
>   __field(int, flags)
>   __field(unsigned int, stream_id)
>   ),
> @@ -230,7 +230,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
>   __entry->no_interrupt = req->request.no_interrupt;
>   __entry->start_trb = req->start_trb;
>   __entry->end_trb = req->end_trb;
> - __entry->start_trb_addr = req->trb;
> + __entry->start_trb_addr = *(const phys_addr_t *)req->trb;
>   __entry->flags = req->flags;
>   __entry->stream_id = req->request.stream_id;
>   ),
> @@ -244,7 +244,7 @@ DECLARE_EVENT_CLASS(cdns3_log_request,
>   __entry->status,
>   __entry->start_trb,
>   __entry->end_trb,
> - __entry->start_trb_addr,
> + /* %pa dereferences */ &__entry->start_trb_addr,
>   __entry->flags,
>   __entry->stream_id
>   )
> 
> 
> Can you please test it? I don't have the hardware, but I also want to make
> sure I don't break anything.
> 

Hi Steve,

Regarding this issue, I have one question:
- If the virtual address is got from dma_alloc_coherent, can't we print
this address using %pa to get its physical address (the same with DMA address),
or its DMA address using %pad? req->trb is the virtual address got from
dma_alloc_coherent. And what's the logic for this "unsafe dereference" warning?
Thanks.

-- 

Thanks,
Peter Chen

Re: Why do kprobes and uprobes singlestep?

2021-03-03 Thread Alexei Starovoitov

On Tue, Mar 2, 2021 at 1:02 PM Andy Lutomirski  wrote:
>
>
> > On Mar 2, 2021, at 12:24 PM, Alexei Starovoitov 
> >  wrote:
> >
> > On Tue, Mar 2, 2021 at 10:38 AM Andy Lutomirski  wrote:
> >>
> >> Is there something like a uprobe test suite?  How maintained /
> >> actively used is uprobe?
> >
> > uprobe+bpf is heavily used in production.
> > selftests/bpf has only one test for it though.
> >
> > Why are you asking?
>
> Because the integration with the x86 entry code is a mess, and I want to know 
> whether to mark it BROKEN or how to make sure that any cleanups actually work.

Any test case to repro the issue you found?
Is it a bug or just messy code?
Nowadays a good chunk of popular applications (python, mysql, etc) has
USDTs in them.
Issues reported with bcc:
https://github.com/iovisor/bcc/issues?q=is%3Aissue+USDT
Similar thing with bpftrace.
Both standard USDT and semaphore based are used in the wild.
uprobe for containers has been a long standing feature request.
If you can improve uprobe performance that would be awesome.
That's another thing that people report often. We optimized it a bit.
More can be done.

Re: [PATCH 10/13] rcu/nocb: Delete bypass_timer upon nocb_gp wakeup

2021-03-03 Thread Paul E. McKenney

On Tue, Feb 23, 2021 at 01:10:08AM +0100, Frederic Weisbecker wrote:
> A NOCB-gp wake up can safely delete the nocb_bypass_timer. nocb_gp_wait()
> is going to check again the bypass state and rearm the bypass timer if
> necessary.
> 
> Signed-off-by: Frederic Weisbecker 
> Cc: Josh Triplett 
> Cc: Lai Jiangshan 
> Cc: Joel Fernandes 
> Cc: Neeraj Upadhyay 
> Cc: Boqun Feng 

Given that you delete this code a couple of patches later in this series,
why not just leave it out entirely?  ;-)

Thanx, Paul

> ---
>  kernel/rcu/tree_plugin.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index b62ad79bbda5..9da67b0d3997 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -1711,6 +1711,8 @@ static bool __wake_nocb_gp(struct rcu_data *rdp_gp,
>   del_timer(&rdp_gp->nocb_timer);
>   }
>  
> + del_timer(&rdp_gp->nocb_bypass_timer);
> +
>   if (force || READ_ONCE(rdp_gp->nocb_gp_sleep)) {
>   WRITE_ONCE(rdp_gp->nocb_gp_sleep, false);
>   needwake = true;
> -- 
> 2.25.1
>

Re: [PATCH] recordmcount: Fix the wrong use of w* in arm64_is_fake_mcount()

2021-03-03 Thread Li Huafei





On 2021/3/3 6:30, Steven Rostedt wrote:

On Thu, 25 Feb 2021 16:01:17 +
Will Deacon  wrote:


On Thu, Feb 25, 2021 at 09:44:26AM -0500, Steven Rostedt wrote:

This requires an acked-by from one of the ARM64 maintainers.

-- Steve


On Thu, 25 Feb 2021 22:07:47 +0800
Li Huafei  wrote:
   

When cross-compiling the kernel, the endian of the target machine and
the local machine may not match, at this time the recordmcount tool
needs byte reversal when processing elf's variables to get the correct
value. w* callback function is used to solve this problem, w is used for
4-byte variable processing, while w8 is used for 8-byte.

arm64_is_fake_mcount() is used to filter '_mcount' relocations that are
not used by ftrace. In arm64_is_fake_mcount(), rp->info is 8 bytes in
size, but w is used. This causes arm64_is_fake_mcount() to get the wrong
type of relocation when we cross-compile the arm64_be kernel image on an
x86_le machine, and all valid '_mcount' is filtered out. The
recordmcount tool does not collect any mcount function call locations.
At kernel startup, the following ftrace log is seen:

ftrace: No functions to be traced?

and thus ftrace cannot be used.

Using w8 to get the value of rp->r_info will fix the problem.

Fixes: ea0eada45632 ("recordmcount: only record relocation of type
R_AARCH64_CALL26 on arm64")
Signed-off-by: Li Huafei 
---
  scripts/recordmcount.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index b9c2ee7ab43f..cce12e1971d8 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -438,7 +438,7 @@ static int arm_is_fake_mcount(Elf32_Rel const *rp)
  
  static int arm64_is_fake_mcount(Elf64_Rel const *rp)

  {
-   return ELF64_R_TYPE(w(rp->r_info)) != R_AARCH64_CALL26;
+   return ELF64_R_TYPE(w8(rp->r_info)) != R_AARCH64_CALL26;


Acked-by: Will Deacon 

But you know you could avoid these sorts of problems by moving to little
endian along with everybody else? ;)



I just realized that I received this patch twice, and thought it was the
same patch! Chen was three days ahead of you, so he gets the credit ;-)

  https://lore.kernel.org/r/20210222135840.56250-1-chenjun...@huawei.com

-- Steve
.



That's fine, thanks Steve and Will!

Huafei

Re: [PATCH] MIPS: BMIPS: Reserve exception base to prevent corruption

2021-03-03 Thread Florian Fainelli




On 3/2/2021 3:54 PM, Thomas Bogendoerfer wrote:
> On Mon, Mar 01, 2021 at 08:19:38PM -0800, Florian Fainelli wrote:
>> BMIPS is one of the few platforms that do change the exception base.
>> After commit 2dcb39645441 ("memblock: do not start bottom-up allocations
>> with kernel_end") we started seeing BMIPS boards fail to boot with the
>> built-in FDT being corrupted.
>>
>> Before the cited commit, early allocations would be in the [kernel_end,
>> RAM_END] range, but after commit they would be within [RAM_START +
>> PAGE_SIZE, RAM_END].
>>
>> The custom exception base handler that is installed by
>> bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the
>> memory region allocated by unflatten_and_copy_device_tree() thus
>> corrupting the FDT used by the kernel.
>>
>> To fix this, we need to perform an early reservation of the custom
>> exception that is going to be installed and this needs to happen at
>> plat_mem_setup() time to ensure that unflatten_and_copy_device_tree()
>> finds a space that is suitable, away from reserved memory.
>>
>> Huge thanks to Serge for analysing and proposing a solution to this
>> issue.
>>
>> Fixes: 2dcb39645441 ("memblock: do not start bottom-up allocations 
>> with kernel_end")
>> Debugged-by: Serge Semin 
>> Reported-by: Kamal Dasu 
>> Signed-off-by: Florian Fainelli 
>> ---
>> Thomas,
>>
>> This is intended as a stop-gap solution for 5.12-rc1 and to be picked up
>> by the stable team for 5.11. We should find a safer way to avoid these
>> problems for 5.13 maybe.
> 
> let's try to make it in one go. How about reserving vector space in
> cpu_probe, if it's known there and leave the rest to trap_init() ?
> 
> Below patch got a quick test on IP22 (real hardware) and malta (qemu).
> Not sure, if I got all BMIPS parts correct, so please check/test.

Works for me here:

Tested-by: Florian Fainelli 

Thanks!

> BTW. do we really need to EXPORT_SYMBOL ebase ?

It seems like MIPS KVM support can be built as a module which is why
ebase was exported to modules with
878edf014e29de38c49153aba20273fbc9ae31af ("MIPS: KVM: Restore host EBase
from ebase variable")?
-- 
Florian

Re: [PATCH 01/13] rcu/nocb: Fix potential missed nocb_timer rearm

2021-03-03 Thread Frederic Weisbecker

On Tue, Mar 02, 2021 at 10:17:29AM -0800, Paul E. McKenney wrote:
> On Tue, Mar 02, 2021 at 01:34:44PM +0100, Frederic Weisbecker wrote:
> 
> OK, how about if I queue a temporary commit (shown below) that just
> calls out the first scenario so that I can start testing, and you get
> me more detail on the second scenario?  I can then update the commit.

Sure, meanwhile here is an attempt for a nocb_bypass_timer based
scenario, it's overly hairy and perhaps I picture more power
in the hands of callbacks advancing on nocb_cb_wait() than it
really has:

0.  CPU 0's ->nocb_cb_kthread just called rcu_do_batch() and
executed all the ready callbacks. Its segcblist is now
entirely empty. It's preempted while calling local_bh_enable().

1.  A new callback is enqueued on CPU 0 with IRQs enabled. So
the ->nocb_gp_kthread for CPUs 0-2 is awakened. Then a storm
of callback enqueues follows on CPU 0 and even reaches the
bypass queue. Note that ->nocb_gp_kthread is also associated
with CPU 0.

2.  CPU 0 queues one last bypass callback.

3.  The ->nocb_gp_kthread wakes up and associates a grace period
with the whole queue of regular callbacks on CPU 0. It also
tries to flush the bypass queue of CPU 0 but the bypass lock
is contended due to the concurrent enqueuing on the previous
step 2, so the flush fails.

4.  This ->nocb_gp_kthread arms its ->nocb_bypass_timer and goes
to sleep waiting for the end of this future grace period.

5.  This grace period elapses before the ->nocb_bypass_timer timer
fires. This is normally improbable given that the timer is set
for only two jiffies, but timers can be delayed.  Besides, it
is possible that the kernel was built with
CONFIG_RCU_STRICT_GRACE_PERIOD=y.

6.  The grace period ends, so rcu_gp_kthread awakens the
->nocb_gp_kthread but it doesn't get a chance to run on a CPU
for a while.

7.  CPU 0's ->nocb_cb_kthread gets back to the CPU after its preemption.
As it notices the new completed grace period, it advances the callbacks
and executes them. Then it gets preempted again on local_bh_enable().

8.  A new callback enqueued on CPU 0 itself flushes the bypass queue
because CPU 0's ->nocb_nobypass_count < nocb_nobypass_lim_per_jiffy.

9.  CPUs from other ->nocb_gp_kthread groups (above CPU 2) initiate and
elapse a few grace periods. CPU 0's ->nocb_gp_kthread still hasn't
got an opportunity to run on a CPU and its ->nocb_bypass_timer still
hasn't fired.

10. CPU 0's ->nocb_cb_kthread wakes up from preemption. It notices the
new grace periods that have elapsed, advances all the callbacks and
executes them. Then it goes to sleep waiting for invocable callbacks.

11. CPU 0 enqueues a new callback with interrupts disabled, and
defers awakening its ->nocb_gp_kthread even though ->nocb_gp_sleep
is actually false. It therefore queues its rcu_data structure's
->nocb_timer. At this point, CPU 0's rdp->nocb_defer_wakeup is
RCU_NOCB_WAKE.

12. The ->nocb_bypass_timer finally fires! It doesn't wake up
->nocb_gp_kthread because it is actually awake already.
But it cancels CPU 0's ->nocb_timer armed at 11. Yet it doesn't
re-initialize CPU 0's ->nocb_defer_wakeup which stays with the
stale RCU_NOCB_WAKE value. So CPU 0's ->nocb_defer_wakeup and
its ->nocb_timer are now desynchronized.

13. The ->nocb_gp_kthread finally runs. It cancels the ->nocb_bypass_timer
which has already fired. It sees the new callback on CPU 0 and
associates it with a new grace period, then sleeps on it.

14. The grace period elapses, rcu_gp_kthread wakes up ->nocb_gp_kthread
which wakes up CPU 0's ->nocb_cb_kthread which runs the callback.
Both ->nocb_gp_kthread and CPU 0's ->nocb_cb_kthread now wait for new
callbacks.

15. CPU 0 enqueues another callback, again with interrupts
disabled so it must queue a timer for a deferred wakeup. However
the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE which
incorrectly indicates that a timer is already queued.  Instead,
CPU 0's ->nocb_timer was cancelled in 12.  CPU 0 therefore fails
to queue the ->nocb_timer.

16. CPU 0 has its pending callback and it may go unnoticed until
some other CPU ever wakes up ->nocb_gp_kthread or CPU 0 ever
calls an explicit deferred wakeup, for example, during idle entry.

Thanks.

[PATCH v8 1/1] fpga: dfl: afu: harden port enable logic

2021-03-03 Thread Russ Weight

Port enable is not complete until ACK = 0. Change
__afu_port_enable() to guarantee that the enable process
is complete by polling for ACK == 0.

Signed-off-by: Russ Weight 
Reviewed-by: Tom Rix 
Reviewed-by: Matthew Gerlach 
Acked-by: Wu Hao 
---
v8:
  - Rebased to 5.12-rc1 (there were no conflicts)
v7:
  - Added Acked-by tag from Wu Hao
v6:
  - Fixed the dev_warn statement, which had "__func__" embedded in the
string instead of treated as a parameter to the format string.
v5:
  - Added Reviewed-by tag to commit message
v4:
  - Added a dev_warn() call for the -EINVAL case of afu_port_err_clear()
  - Modified dev_err() message in __afu_port_disable() to say "disable"
instead of "reset"
v3:
  - afu_port_err_clear() changed to prioritize port_enable failure over
a detected mismatch in port errors.
  - reorganized code in port_reset() to be more readable.
v2:
  - Fixed typo in commit message
---
 drivers/fpga/dfl-afu-error.c | 10 ++
 drivers/fpga/dfl-afu-main.c  | 33 +++--
 drivers/fpga/dfl-afu.h   |  2 +-
 3 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/drivers/fpga/dfl-afu-error.c b/drivers/fpga/dfl-afu-error.c
index c4691187cca9..ab7be6217368 100644
--- a/drivers/fpga/dfl-afu-error.c
+++ b/drivers/fpga/dfl-afu-error.c
@@ -52,7 +52,7 @@ static int afu_port_err_clear(struct device *dev, u64 err)
struct dfl_feature_platform_data *pdata = dev_get_platdata(dev);
struct platform_device *pdev = to_platform_device(dev);
void __iomem *base_err, *base_hdr;
-   int ret = -EBUSY;
+   int enable_ret = 0, ret = -EBUSY;
u64 v;
 
base_err = dfl_get_feature_ioaddr_by_id(dev, PORT_FEATURE_ID_ERROR);
@@ -96,18 +96,20 @@ static int afu_port_err_clear(struct device *dev, u64 err)
v = readq(base_err + PORT_FIRST_ERROR);
writeq(v, base_err + PORT_FIRST_ERROR);
} else {
+   dev_warn(dev, "%s: received 0x%llx, expected 0x%llx\n",
+__func__, v, err);
ret = -EINVAL;
}
 
/* Clear mask */
__afu_port_err_mask(dev, false);
 
-   /* Enable the Port by clear the reset */
-   __afu_port_enable(pdev);
+   /* Enable the Port by clearing the reset */
+   enable_ret = __afu_port_enable(pdev);
 
 done:
mutex_unlock(&pdata->lock);
-   return ret;
+   return enable_ret ? enable_ret : ret;
 }
 
 static ssize_t errors_show(struct device *dev, struct device_attribute *attr,
diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
index 753cda4b2568..77dadaae5b8f 100644
--- a/drivers/fpga/dfl-afu-main.c
+++ b/drivers/fpga/dfl-afu-main.c
@@ -21,6 +21,9 @@
 
 #include "dfl-afu.h"
 
+#define RST_POLL_INVL 10 /* us */
+#define RST_POLL_TIMEOUT 1000 /* us */
+
 /**
  * __afu_port_enable - enable a port by clear reset
  * @pdev: port platform device.
@@ -32,7 +35,7 @@
  *
  * The caller needs to hold lock for protection.
  */
-void __afu_port_enable(struct platform_device *pdev)
+int __afu_port_enable(struct platform_device *pdev)
 {
struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
void __iomem *base;
@@ -41,7 +44,7 @@ void __afu_port_enable(struct platform_device *pdev)
WARN_ON(!pdata->disable_count);
 
if (--pdata->disable_count != 0)
-   return;
+   return 0;
 
base = dfl_get_feature_ioaddr_by_id(&pdev->dev, PORT_FEATURE_ID_HEADER);
 
@@ -49,10 +52,20 @@ void __afu_port_enable(struct platform_device *pdev)
v = readq(base + PORT_HDR_CTRL);
v &= ~PORT_CTRL_SFTRST;
writeq(v, base + PORT_HDR_CTRL);
-}
 
-#define RST_POLL_INVL 10 /* us */
-#define RST_POLL_TIMEOUT 1000 /* us */
+   /*
+* HW clears the ack bit to indicate that the port is fully out
+* of reset.
+*/
+   if (readq_poll_timeout(base + PORT_HDR_CTRL, v,
+  !(v & PORT_CTRL_SFTRST_ACK),
+  RST_POLL_INVL, RST_POLL_TIMEOUT)) {
+   dev_err(&pdev->dev, "timeout, failure to enable device\n");
+   return -ETIMEDOUT;
+   }
+
+   return 0;
+}
 
 /**
  * __afu_port_disable - disable a port by hold reset
@@ -86,7 +99,7 @@ int __afu_port_disable(struct platform_device *pdev)
if (readq_poll_timeout(base + PORT_HDR_CTRL, v,
   v & PORT_CTRL_SFTRST_ACK,
   RST_POLL_INVL, RST_POLL_TIMEOUT)) {
-   dev_err(&pdev->dev, "timeout, fail to reset device\n");
+   dev_err(&pdev->dev, "timeout, failure to disable device\n");
return -ETIMEDOUT;
}
 
@@ -111,9 +124,9 @@ static int __port_reset(struct platform_device *pdev)
 
ret = __afu_port_disable(pdev);
-   if (!ret)
-   __afu_port_enable(pdev);
+   if (ret)
+   return ret;
 
-   return ret;
+   return __afu_port_enable(pdev);

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 1879 matches

Mail list logo