Re: [PATCH v3 5/9] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

2021-03-04 Thread Xu, Like

On 2021/3/5 0:12, Sean Christopherson wrote:

On Thu, Mar 04, 2021, Xu, Like wrote:

Hi Sean,

Thanks for your detailed review on the patch set.

On 2021/3/4 0:58, Sean Christopherson wrote:

On Wed, Mar 03, 2021, Like Xu wrote:

@@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct 
kvm_vcpu *vcpu,
return true;
   }
+/*
+ * Check if the requested depth value is supported
+ * based on the bits [0:7] of the guest cpuid.1c.eax.
+ */
+static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
+{
+   struct kvm_cpuid_entry2 *best;
+
+   best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
+   if (best && depth && !(depth % 8))

This is still wrong, it fails to weed out depth > 64.

How come? The test cases depth = {65, 127, 128} get #GP as expected.

@depth is a u64, throw in a number that is a multiple of 8 and >= 520, and the
"(1ULL << (depth / 8 - 1))" will trigger undefined behavior due to shifting
beyond the capacity of a ULL / u64.


Extra:

When we say "undefined behavior" for shifting beyond the width of a ULL,
do you mean that the actual behavior depends on the machine, architecture,
or compiler?




Adding the "< 64" check would also allow dropping the " & 0xff" since the check
would ensure the shift doesn't go beyond bit 7.  I'm not sure the cleverness is
worth shaving a cycle, though.


Finally how about:

    if (best && depth && (depth < 65) && !(depth & 7))
        return best->eax & BIT_ULL(depth / 8 - 1);

    return false;

Do you see any room for optimization?




Not that this is a hot path, but it's probably worth double checking that the
compiler generates simple code for "depth % 8", e.g. that it becomes "depth & 7".

Emm, the "%" operation is quite common in kernel code.

So is "&" :-)  I was just pointing out that the compiler should optimize this,
and it did.


if (best && depth && !(depth % 8))
    10659:   48 85 c0    test   rax,rax
    1065c:   74 c7   je 10625 
    1065e:   4d 85 e4    test   r12,r12
    10661:   74 c2   je 10625 
    10663:   41 f6 c4 07 test   r12b,0x7
    10667:   75 bc   jne    10625 

It looks like the compiler does the right thing.
Do you see any room for optimization?


+   return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

Actually, looking at this again, I would explicitly use BIT() instead of 1ULL
(or BIT_ULL), since the shift must be 7 or less.


+
+   return false;
+}
+




Re: [PATCH 2/2] MIPS: Loongson64: Move loongson_system_configuration to loongson.h

2021-03-04 Thread Jiaxun Yang




On 2021/3/4 7:00 PM, Qing Zhang wrote:

The purpose of separating loongson_system_configuration from boot_param.h
is to keep the other structures consistent with the firmware.

Signed-off-by: Jiaxun Yang 
Signed-off-by: Qing Zhang 




Acked-by: Jiaxun Yang 



- Jiaxun


---
  .../include/asm/mach-loongson64/boot_param.h   | 18 --
  .../include/asm/mach-loongson64/loongson.h | 18 ++
  drivers/irqchip/irq-loongson-liointc.c |  2 +-
  3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h 
b/arch/mips/include/asm/mach-loongson64/boot_param.h
index 1c1cdf57137e..035b1a69e2d0 100644
--- a/arch/mips/include/asm/mach-loongson64/boot_param.h
+++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
@@ -198,24 +198,6 @@ enum loongson_bridge_type {
VIRTUAL = 3
  };
  
-struct loongson_system_configuration {

-   u32 nr_cpus;
-   u32 nr_nodes;
-   int cores_per_node;
-   int cores_per_package;
-   u16 boot_cpu_id;
-   u16 reserved_cpus_mask;
-   enum loongson_cpu_type cputype;
-   enum loongson_bridge_type bridgetype;
-   u64 restart_addr;
-   u64 poweroff_addr;
-   u64 suspend_addr;
-   u64 vgabios_addr;
-   u32 dma_mask_bits;
-   u64 workarounds;
-   void (*early_config)(void);
-};
-
  extern struct efi_memory_map_loongson *loongson_memmap;
  extern struct loongson_system_configuration loongson_sysconf;
  
diff --git a/arch/mips/include/asm/mach-loongson64/loongson.h b/arch/mips/include/asm/mach-loongson64/loongson.h

index ac1c20e172a2..6189deb188cf 100644
--- a/arch/mips/include/asm/mach-loongson64/loongson.h
+++ b/arch/mips/include/asm/mach-loongson64/loongson.h
@@ -12,6 +12,24 @@
  #include 
  #include 
  
+/* machine-specific boot configuration */

+struct loongson_system_configuration {
+   u32 nr_cpus;
+   u32 nr_nodes;
+   int cores_per_node;
+   int cores_per_package;
+   u16 boot_cpu_id;
+   u16 reserved_cpus_mask;
+   enum loongson_cpu_type cputype;
+   enum loongson_bridge_type bridgetype;
+   u64 restart_addr;
+   u64 poweroff_addr;
+   u64 suspend_addr;
+   u64 vgabios_addr;
+   u32 dma_mask_bits;
+   u64 workarounds;
+   void (*early_config)(void);
+};
  
  /* machine-specific reboot/halt operation */

  extern void mach_prepare_reboot(void);
diff --git a/drivers/irqchip/irq-loongson-liointc.c 
b/drivers/irqchip/irq-loongson-liointc.c
index 09b91b81851c..249566a23cc4 100644
--- a/drivers/irqchip/irq-loongson-liointc.c
+++ b/drivers/irqchip/irq-loongson-liointc.c
@@ -16,7 +16,7 @@
  #include 
  #include 
  
-#include 

+#include 
  
  #define LIOINTC_CHIP_IRQ	32

  #define LIOINTC_NUM_PARENT 4



Re: [PATCH 1/2] MIPS: Loongson64: Remove unused sysconf members

2021-03-04 Thread Jiaxun Yang




On 2021/3/4 7:00 PM, Qing Zhang wrote:

We don't need them anymore; they are uniform on all Loongson64 systems
and have been fixed in the DeviceTree. loongson3_platform_init is replaced
with DTS + driver.

Signed-off-by: Jiaxun Yang 
Signed-off-by: Qing Zhang 


Acked-by: Jiaxun Yang 

Hmm, why does it come with my sign-off?
I assume it's my patch from somewhere off the tree?

Thanks.

- Jiaxun


---
  .../include/asm/mach-loongson64/boot_param.h  |  9 
  arch/mips/loongson64/Makefile |  2 +-
  arch/mips/loongson64/env.c| 20 -
  arch/mips/loongson64/platform.c   | 42 ---
  4 files changed, 1 insertion(+), 72 deletions(-)
  delete mode 100644 arch/mips/loongson64/platform.c

diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h 
b/arch/mips/include/asm/mach-loongson64/boot_param.h
index 4592841b6b0c..1c1cdf57137e 100644
--- a/arch/mips/include/asm/mach-loongson64/boot_param.h
+++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
@@ -207,20 +207,11 @@ struct loongson_system_configuration {
u16 reserved_cpus_mask;
enum loongson_cpu_type cputype;
enum loongson_bridge_type bridgetype;
-   u64 ht_control_base;
-   u64 pci_mem_start_addr;
-   u64 pci_mem_end_addr;
-   u64 pci_io_base;
u64 restart_addr;
u64 poweroff_addr;
u64 suspend_addr;
u64 vgabios_addr;
u32 dma_mask_bits;
-   char ecname[32];
-   u32 nr_uarts;
-   struct uart_device uarts[MAX_UARTS];
-   u32 nr_sensors;
-   struct sensor_device sensors[MAX_SENSORS];
u64 workarounds;
void (*early_config)(void);
  };
diff --git a/arch/mips/loongson64/Makefile b/arch/mips/loongson64/Makefile
index cc76944b1a9d..e806280bbb85 100644
--- a/arch/mips/loongson64/Makefile
+++ b/arch/mips/loongson64/Makefile
@@ -2,7 +2,7 @@
  #
  # Makefile for Loongson-3 family machines
  #
-obj-$(CONFIG_MACH_LOONGSON64) += cop2-ex.o platform.o dma.o \
+obj-$(CONFIG_MACH_LOONGSON64) += cop2-ex.o dma.o \
setup.o init.o env.o time.o reset.o \
  
  obj-$(CONFIG_SMP)	+= smp.o

diff --git a/arch/mips/loongson64/env.c b/arch/mips/loongson64/env.c
index 51a5d050a94c..1821d461b606 100644
--- a/arch/mips/loongson64/env.c
+++ b/arch/mips/loongson64/env.c
@@ -95,7 +95,6 @@ void __init prom_init_env(void)
loongson_freqctrl[1] = 0x900010001fe001d0;
loongson_freqctrl[2] = 0x900020001fe001d0;
loongson_freqctrl[3] = 0x900030001fe001d0;
-   loongson_sysconf.ht_control_base = 0x9EFDFB00;
loongson_sysconf.workarounds = WORKAROUND_CPUFREQ;
break;
case Legacy_3B:
@@ -118,7 +117,6 @@ void __init prom_init_env(void)
loongson_freqctrl[1] = 0x900020001fe001d0;
loongson_freqctrl[2] = 0x900040001fe001d0;
loongson_freqctrl[3] = 0x900060001fe001d0;
-   loongson_sysconf.ht_control_base = 0x90001EFDFB00;
loongson_sysconf.workarounds = WORKAROUND_CPUHOTPLUG;
break;
default:
@@ -136,9 +134,6 @@ void __init prom_init_env(void)
loongson_sysconf.cores_per_node - 1) /
loongson_sysconf.cores_per_node;
  
-	loongson_sysconf.pci_mem_start_addr = eirq_source->pci_mem_start_addr;

-   loongson_sysconf.pci_mem_end_addr = eirq_source->pci_mem_end_addr;
-   loongson_sysconf.pci_io_base = eirq_source->pci_io_start_addr;
loongson_sysconf.dma_mask_bits = eirq_source->dma_mask_bits;
if (loongson_sysconf.dma_mask_bits < 32 ||
loongson_sysconf.dma_mask_bits > 64)
@@ -153,23 +148,8 @@ void __init prom_init_env(void)
loongson_sysconf.poweroff_addr, loongson_sysconf.restart_addr,
loongson_sysconf.vgabios_addr);
  
-	memset(loongson_sysconf.ecname, 0, 32);

-   if (esys->has_ec)
-   memcpy(loongson_sysconf.ecname, esys->ec_name, 32);
loongson_sysconf.workarounds |= esys->workarounds;
  
-	loongson_sysconf.nr_uarts = esys->nr_uarts;

-   if (esys->nr_uarts < 1 || esys->nr_uarts > MAX_UARTS)
-   loongson_sysconf.nr_uarts = 1;
-   memcpy(loongson_sysconf.uarts, esys->uarts,
-   sizeof(struct uart_device) * loongson_sysconf.nr_uarts);
-
-   loongson_sysconf.nr_sensors = esys->nr_sensors;
-   if (loongson_sysconf.nr_sensors > MAX_SENSORS)
-   loongson_sysconf.nr_sensors = 0;
-   if (loongson_sysconf.nr_sensors)
-   memcpy(loongson_sysconf.sensors, esys->sensors,
-   sizeof(struct sensor_device) * 
loongson_sysconf.nr_sensors);
pr_info("CpuClock = %u\n", cpu_clock_freq);
  
  	/* Read the ID of PCI host bridge to detect bridge type */

diff --git a/arch/mips/loongson64/platform.c b/arch/mips/loongson64/platform.c
deleted file mode 100644
index 9674ae1361a8..
--- 

Re: [PATCH] clocksource: don't run watchdog forever

2021-03-04 Thread Feng Tang
On Thu, Mar 04, 2021 at 03:15:13PM +0100, Thomas Gleixner wrote:
> Feng,
> 
> On Thu, Mar 04 2021 at 15:43, Feng Tang wrote:
> > On Wed, Mar 03, 2021 at 04:50:31PM +0100, Thomas Gleixner wrote:
> >> Anything pre TSC_ADJUST wants the watchdog on. With TSC ADJUST available
> >> we can probably avoid it.
> >> 
> >> There is a caveat though. If the machine never goes idle then TSC adjust
> >> is not able to detect a potential wreckage. OTOH, most of the broken
> >> BIOSes tweak TSC only by a few cycles and that is usually detectable
> >> during boot. So we might be clever about it and schedule a check every
> >> hour when during the first 10 minutes a modification of TSC adjust is
> >> seen on any CPU.
> >
> > I don't have much experience with tsc_adjust, and try to understand it:
> > The 'modification of TSC' here has 2 cases: ? 
> > * First read of 'TSC_ADJUST' MSR of a just boot CPU returns non-zero
> > value
> 
> That's catching stupid BIOSes which set the TSC to random values during
> boot/reboot. That's a one off boot issue and not a real problem. The
> kernel fixes it up and is done with it. Nothing to care about.
> 
> > * Following read of 'TSC_ADJUST' doesn't equal to the 'tsc_adjust' value
> >   saved in per-cpu data.
> 
> That's where we catch broken BIOS/SMI implementations which try to
> "hide" the time wasted in BIOS/SMI by setting the TSC back to the value
> they saved on SMI entry. That was a popular BIOS "feature" some years
> ago, but it seems the BIOS tinkerers finally gave up on it.
 
Thanks for the detailed explanation! I understand now.

> >> Where is this TSC_DISABLE_WRITE bit again?
> 
> I'm serious about this. Once the kernel has taken over a CPU there is
> absolutely no reason for any context to write to the TSC/TSC_ADJUST
> register ever again. So having a mechanism to prevent writes would
> surely help to make the TSC more trustworthy.
> 
> > Also, does the patch ("x86/tsc: mark tsc reliable for qualified platforms")
> > need to wait till this caveat case is solved?
> 
> Yes, but that should be trivial to do. 

Ok, I see.

Thanks,
Feng

> 
> Thanks,
> 
> tglx


Re: [PATCH] [v2] Input: Add "Share" button to Microsoft Xbox One controller.

2021-03-04 Thread Chris Ye
Hi Bastien, just want to follow up again on this.  I've checked again
with the Xbox team that the "Share" button name is given for the product;
the HID usage profile and mapping to RECORD is what the Xbox team expects,
and they want the same mapping for USB.

Thanks!
Chris


On Tue, Mar 2, 2021 at 3:57 PM Chris Ye  wrote:
>
> Hi Bastien,
> The "Share button" is a name Microsoft calls it, it actually has
> HID descriptor defined in the bluetooth interface, which the HID usage
> is:
> consumer 0xB2:
> 0x05, 0x0C,//   Usage Page (Consumer)
> 0x0A, 0xB2, 0x00,  //   Usage (Record)
> Microsoft wants the same key code to be generated consistently for USB
> and bluetooth.
> Thanks!
> Chris
>
>
> On Tue, Mar 2, 2021 at 1:50 AM Bastien Nocera  wrote:
> >
> > On Thu, 2021-02-25 at 05:32 +, Chris Ye wrote:
> > > Add "Share" button input capability and input event mapping for
> > > Microsoft Xbox One controller.
> > > Fixed Microsoft Xbox One controller share button not working under USB
> > > connection.
> > >
> > > Signed-off-by: Chris Ye 
> > > ---
> > >  drivers/input/joystick/xpad.c | 9 -
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/input/joystick/xpad.c
> > > b/drivers/input/joystick/xpad.c
> > > index 9f0d07dcbf06..0c3374091aff 100644
> > > --- a/drivers/input/joystick/xpad.c
> > > +++ b/drivers/input/joystick/xpad.c
> > > @@ -79,6 +79,7 @@
> > >  #define MAP_DPAD_TO_BUTTONS(1 << 0)
> > >  #define MAP_TRIGGERS_TO_BUTTONS(1 << 1)
> > >  #define MAP_STICKS_TO_NULL (1 << 2)
> > > +#define MAP_SHARE_BUTTON   (1 << 3)
> > >  #define DANCEPAD_MAP_CONFIG(MAP_DPAD_TO_BUTTONS
> > > |  \
> > > MAP_TRIGGERS_TO_BUTTONS |
> > > MAP_STICKS_TO_NULL)
> > >
> > > @@ -130,6 +131,7 @@ static const struct xpad_device {
> > > { 0x045e, 0x02e3, "Microsoft X-Box One Elite pad", 0,
> > > XTYPE_XBOXONE },
> > > { 0x045e, 0x02ea, "Microsoft X-Box One S pad", 0, XTYPE_XBOXONE
> > > },
> > > { 0x045e, 0x0719, "Xbox 360 Wireless Receiver",
> > > MAP_DPAD_TO_BUTTONS, XTYPE_XBOX360W },
> > > +   { 0x045e, 0x0b12, "Microsoft X-Box One X pad",
> > > MAP_SHARE_BUTTON, XTYPE_XBOXONE },
> > > { 0x046d, 0xc21d, "Logitech Gamepad F310", 0, XTYPE_XBOX360 },
> > > { 0x046d, 0xc21e, "Logitech Gamepad F510", 0, XTYPE_XBOX360 },
> > > { 0x046d, 0xc21f, "Logitech Gamepad F710", 0, XTYPE_XBOX360 },
> > > @@ -862,6 +864,8 @@ static void xpadone_process_packet(struct usb_xpad
> > > *xpad, u16 cmd, unsigned char
> > > /* menu/view buttons */
> > > input_report_key(dev, BTN_START,  data[4] & 0x04);
> > > input_report_key(dev, BTN_SELECT, data[4] & 0x08);
> > > +   if (xpad->mapping & MAP_SHARE_BUTTON)
> > > +   input_report_key(dev, KEY_RECORD, data[22] & 0x01);
> > >
> > > /* buttons A,B,X,Y */
> > > input_report_key(dev, BTN_A,data[4] & 0x10);
> > > @@ -1669,9 +1673,12 @@ static int xpad_init_input(struct usb_xpad
> > > *xpad)
> > >
> > > /* set up model-specific ones */
> > > if (xpad->xtype == XTYPE_XBOX360 || xpad->xtype ==
> > > XTYPE_XBOX360W ||
> > > -   xpad->xtype == XTYPE_XBOXONE) {
> > > +   xpad->xtype == XTYPE_XBOXONE) {
> > > for (i = 0; xpad360_btn[i] >= 0; i++)
> > > input_set_capability(input_dev, EV_KEY,
> > > xpad360_btn[i]);
> > > +   if (xpad->mapping & MAP_SHARE_BUTTON) {
> > > +   input_set_capability(input_dev, EV_KEY,
> > > KEY_RECORD);
> >
> > Is there not a better keycode to use than "Record"? Should a "share"
> > keycode be added?
> >
> > I couldn't find a share button in the most recent USB HID Usage Tables:
> > https://www.usb.org/document-library/hid-usage-tables-121
> >
> > > +   }
> > > } else {
> > > for (i = 0; xpad_btn[i] >= 0; i++)
> > > input_set_capability(input_dev, EV_KEY,
> > > xpad_btn[i]);
> >
> >


Re: [PATCH] kbuild: rebuild GCC plugins when the compiler is upgraded

2021-03-04 Thread Josh Poimboeuf
On Thu, Mar 04, 2021 at 03:37:14PM -0800, Linus Torvalds wrote:
> On Thu, Mar 4, 2021 at 3:20 PM Kees Cook  wrote:
> >
> > This seems fine to me, but I want to make sure Josh has somewhere to
> > actually go with this. Josh, does this get you any closer?

No, this doesn't seem to help me at all.

> > It sounds like the plugins need to move to another location for
> > packaged kernels?
> 
> Well, it might be worth extending the stuff that gets installed with
> /lib/modules// with enough information and
> infrastructure to then build any external modules.

The gcc plugins live in scripts/, which get installed by "make
modules_install" already.  So the plugins' source and makefiles are in
/lib/modules//build/scripts/gcc-plugins.

So everything needed for building the plugins is already there.  We just
need the kernel makefiles to rebuild the plugins locally, when building
an external module.

-- 
Josh



[PATCH v2] f2fs: add sysfs nodes to get accumulated compression info

2021-03-04 Thread Daeho Jeong
From: Daeho Jeong 

Added acc_compr_inodes to show the accumulated compressed inode count and
acc_compr_blocks to show the accumulated secured block count with
compression in sysfs. These can be re-initialized to "0" by writing "0"
to either of them.

Signed-off-by: Daeho Jeong 
---
v2: thanks to kernel test robot , fixed compile issue
related to kernel config.
---
 Documentation/ABI/testing/sysfs-fs-f2fs | 13 +++
 fs/f2fs/checkpoint.c|  8 
 fs/f2fs/compress.c  |  4 +-
 fs/f2fs/data.c  |  2 +-
 fs/f2fs/f2fs.h  | 50 -
 fs/f2fs/file.c  |  8 ++--
 fs/f2fs/inode.c |  1 +
 fs/f2fs/super.c | 10 -
 fs/f2fs/sysfs.c | 45 ++
 include/linux/f2fs_fs.h |  4 +-
 10 files changed, 135 insertions(+), 10 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs 
b/Documentation/ABI/testing/sysfs-fs-f2fs
index cbeac1bebe2f..f4fc87503754 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -409,3 +409,16 @@ Description:   Give a way to change checkpoint merge 
daemon's io priority.
I/O priority "3". We can select the class between "rt" and "be",
and set the I/O priority within valid range of it. "," delimiter
is necessary in between I/O class and priority number.
+
+What:  /sys/fs/f2fs//acc_compr_inodes
+Date:  March 2021
+Contact:   "Daeho Jeong" 
+Description:   Show accumulated compressed inode count. If you write "0" here,
+   you can initialize acc_compr_inodes and acc_compr_blocks as "0".
+
+What:  /sys/fs/f2fs//acc_compr_blocks
+Date:  March 2021
+Contact:   "Daeho Jeong" 
+Description:   Show accumulated secured block count with compression.
+   If you write "0" here, you can initialize acc_compr_inodes and
+   acc_compr_blocks as "0".
diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 174a0819ad96..cd944a569162 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1514,6 +1514,14 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, 
struct cp_control *cpc)
seg_i->journal->info.kbytes_written = cpu_to_le64(kbytes_written);
 
if (__remain_node_summaries(cpc->reason)) {
+   /* Record compression statistics in the hot node summary */
+   spin_lock(&sbi->acc_compr_lock);
+   seg_i->journal->info.acc_compr_blocks =
+   cpu_to_le64(sbi->acc_compr_blocks);
+   seg_i->journal->info.acc_compr_inodes =
+   cpu_to_le32(sbi->acc_compr_inodes);
+   spin_unlock(&sbi->acc_compr_lock);
+
f2fs_write_node_summaries(sbi, start_blk);
start_blk += NR_CURSEG_NODE_TYPE;
}
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 77fa342de38f..9029e95f4ae4 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -1351,8 +1351,8 @@ static int f2fs_write_compressed_pages(struct 
compress_ctx *cc,
}
 
if (fio.compr_blocks)
-   f2fs_i_compr_blocks_update(inode, fio.compr_blocks - 1, false);
-   f2fs_i_compr_blocks_update(inode, cc->nr_cpages, true);
+   f2fs_i_compr_blocks_update(inode, fio.compr_blocks - 1, false, 
true);
+   f2fs_i_compr_blocks_update(inode, cc->nr_cpages, true, true);
 
set_inode_flag(cc->inode, FI_APPEND_WRITE);
if (cc->cluster_idx == 0)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index b9721c8f116c..d3afb9b0090e 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -2591,7 +2591,7 @@ int f2fs_do_write_data_page(struct f2fs_io_info *fio)
ClearPageError(page);
 
if (fio->compr_blocks && fio->old_blkaddr == COMPRESS_ADDR)
-   f2fs_i_compr_blocks_update(inode, fio->compr_blocks - 1, false);
+   f2fs_i_compr_blocks_update(inode, fio->compr_blocks - 1, false, 
true);
 
/* LFS mode write path */
	f2fs_outplace_write_data(&dn, fio);
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index e2d302ae3a46..a12edf5283cd 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1609,6 +1609,11 @@ struct f2fs_sb_info {
u64 sectors_written_start;
u64 kbytes_written;
 
+   /* For accumulated compression statistics */
+   u64 acc_compr_blocks;
+   u32 acc_compr_inodes;
+   spinlock_t acc_compr_lock;
+
/* Reference to checksum algorithm driver via cryptoapi */
struct crypto_shash *s_chksum_driver;
 
@@ -3985,6 +3990,43 @@ static inline int __init f2fs_init_compress_cache(void) 
{ return 0; }
 static inline void f2fs_destroy_compress_cache(void) { }
 #endif
 
+#define inc_acc_compr_inodes(inode)\
+   do {   

Re: [PATCH] MIPS: Add comment about CONFIG_MIPS32_O32 in loongson3_defconfig when build with Clang

2021-03-04 Thread Jiaxun Yang




On 2021/3/5 7:08 AM, Maciej W. Rozycki wrote:

On Thu, 4 Mar 2021, Tiezhu Yang wrote:


This is a known bug [2] with Clang, as Simon Atanasyan said,
"There is no plan on support O32 for MIPS64 due to lack of
resources".


  Huh?  Is that a joke?  From the o32 psABI's point of view a MIPS64 CPU is
exactly the same as a MIPS32 one (for whatever ISA revision), so there's
nothing to support there really other than the CPU/ISA name.


Clang treats MIPS32 as a different backend, so we may need some extra
effort


TBH it is a toolchain issue and a kernel workaround seems bogus.

From my point of view we can "s/mips64r2/mips32r2/" when doing syscall checks
to work around the Clang issue instead of disabling it for the kernel.

Thanks.

- Jiaxun



  As much as I dislike all the hacks the Clang community seems to come up
with for the shortcomings of their tool there has to be a saner workaround
available rather than forcibly disabling support for the o32 ABI with
CONFIG_64BIT kernels, but the report is missing the compiler invocation
line triggering the issue (V=1 perhaps?), which should be included with
any commit description anyway, so I can't suggest anything based on the
limited information provided.

   Maciej



Re: [PATCH] RTIC: selinux: ARM64: Move selinux_state to a separate page

2021-03-04 Thread Paul Moore
On Tue, Feb 16, 2021 at 5:19 AM Preeti Nagar  wrote:
>
> The changes introduce a new security feature, RunTime Integrity Check
> (RTIC), designed to protect Linux Kernel at runtime. The motivation
> behind these changes is:
> 1. The system protection offered by Security Enhancements(SE) for
> Android relies on the assumption of kernel integrity. If the kernel
> itself is compromised (by a perhaps as yet unknown future vulnerability),
> SE for Android security mechanisms could potentially be disabled and
> rendered ineffective.
> 2. Qualcomm Snapdragon devices use Secure Boot, which adds cryptographic
> checks to each stage of the boot-up process, to assert the authenticity
> of all secure software images that the device executes.  However, due to
> various vulnerabilities in SW modules, the integrity of the system can be
> compromised at any time after device boot-up, leading to un-authorized
> SW executing.
>
> The feature's idea is to move some sensitive kernel structures to a
> separate page and monitor further any unauthorized changes to these,
> from higher Exception Levels using stage 2 MMU. Moving these to a
> different page will help avoid getting page faults from un-related data.
> The mechanism we have been working on removes the write permissions for
> HLOS in the stage 2 page tables for the regions to be monitored, such
> that any modification attempts to these will lead to faults being
> generated and handled by handlers. If the protected assets are moved to
> a separate page, faults will be generated corresponding to change attempts
> to these assets only. If not moved to a separate page, write attempts to
> un-related data present on the monitored pages will also be generated.
>
> Using this feature, some sensitive variables of the kernel which are
> initialized after init or are updated rarely can also be protected from
> simple overwrites and attacks trying to modify these.
>
> Currently, the change moves selinux_state structure to a separate page.
> The page is 2MB aligned not 4K to avoid TLB related performance impact as,
> for some CPU core designs, the TLB does not cache 4K stage 2 (IPA to PA)
> mappings if the IPA comes from a stage 1 mapping. In future, we plan to
> move more security-related kernel assets to this page to enhance
> protection.
>
> Signed-off-by: Preeti Nagar 
> ---
> The RFC patch reviewed available at:
> https://lore.kernel.org/linux-security-module/1610099389-28329-1-git-send-email-pna...@codeaurora.org/
> ---
>  include/asm-generic/vmlinux.lds.h | 10 ++
>  include/linux/init.h  |  6 ++
>  security/Kconfig  | 11 +++
>  security/selinux/hooks.c  |  2 +-
>  4 files changed, 28 insertions(+), 1 deletion(-)

As long as we are only talking about moving the selinux_state struct
itself and none of the pointers inside I think we should be okay (the
access decision cache pointed to by selinux_state->avc could change
frequently).  Have you done any performance measurements of this
change?  Assuming they are not terrible, I have no objections to this
patch from a SELinux perspective.

-- 
paul moore
www.paul-moore.com


Re: [PATCH v3 RFC 14/14] mm: speedup page alloc for MPOL_PREFERRED_MANY by adding a NO_SLOWPATH gfp bit

2021-03-04 Thread Feng Tang
On Thu, Mar 04, 2021 at 01:59:40PM +0100, Michal Hocko wrote:
> On Thu 04-03-21 16:14:14, Feng Tang wrote:
> > On Wed, Mar 03, 2021 at 09:22:50AM -0800, Ben Widawsky wrote:
> > > On 21-03-03 18:14:30, Michal Hocko wrote:
> > > > On Wed 03-03-21 08:31:41, Ben Widawsky wrote:
> > > > > On 21-03-03 14:59:35, Michal Hocko wrote:
> > > > > > On Wed 03-03-21 21:46:44, Feng Tang wrote:
> > > > > > > On Wed, Mar 03, 2021 at 09:18:32PM +0800, Tang, Feng wrote:
> > > > > > > > On Wed, Mar 03, 2021 at 01:32:11PM +0100, Michal Hocko wrote:
> > > > > > > > > On Wed 03-03-21 20:18:33, Feng Tang wrote:
> > > > > > [...]
> > > > > > > > > > One thing I tried which can fix the slowness is:
> > > > > > > > > > 
> > > > > > > > > > +   gfp_mask &= ~(__GFP_DIRECT_RECLAIM | 
> > > > > > > > > > __GFP_KSWAPD_RECLAIM);
> > > > > > > > > > 
> > > > > > > > > > which explicitly clears the 2 kinds of reclaim. And I 
> > > > > > > > > > thought it's too
> > > > > > > > > > hacky and didn't mention it in the commit log.
> > > > > > > > > 
> > > > > > > > > Clearing __GFP_DIRECT_RECLAIM would be the right way to 
> > > > > > > > > achieve
> > > > > > > > > GFP_NOWAIT semantic. Why would you want to exclude kswapd as 
> > > > > > > > > well? 
> > > > > > > > 
> > > > > > > > When I tried gfp_mask &= ~__GFP_DIRECT_RECLAIM, the slowness 
> > > > > > > > couldn't
> > > > > > > > be fixed.
> > > > > > > 
> > > > > > > I just double checked by rerun the test, 'gfp_mask &= 
> > > > > > > ~__GFP_DIRECT_RECLAIM'
> > > > > > > can also accelerate the allocation much! though is still a little 
> > > > > > > slower than
> > > > > > > this patch. Seems I've messed some of the tries, and sorry for 
> > > > > > > the confusion!
> > > > > > > 
> > > > > > > Could this be used as the solution? or the adding another 
> > > > > > > fallback_nodemask way?
> > > > > > > but the latter will change the current API quite a bit.
> > > > > > 
> > > > > > I haven't got to the whole series yet. The real question is whether 
> > > > > > the
> > > > > > first attempt to enforce the preferred mask is a general win. I 
> > > > > > would
> > > > > > argue that it resembles the existing single node preferred memory 
> > > > > > policy
> > > > > > because that one doesn't push heavily on the preferred node either. 
> > > > > > So
> > > > > > dropping just the direct reclaim mode makes some sense to me.
> > > > > > 
> > > > > > IIRC this is something I was recommending in an early proposal of 
> > > > > > the
> > > > > > feature.
> > > > > 
> > > > > My assumption [FWIW] is that the usecases we've outlined for 
> > > > > multi-preferred
> > > > > would want more heavy pushing on the preference mask. However, maybe 
> > > > > the uapi
> > > > > could dictate how hard to try/not try.
> > > > 
> > > > What does that mean and what is the expectation from the kernel to be
> > > > more or less cast in stone?
> > > > 
> > > 
> > > (I'm not positive I've understood your question, so correct me if I
> > > misunderstood)
> > > 
> > > I'm not sure there is a stone-cast way to define it nor should we. At the 
> > > very
> > > least though, something in uapi that has a general mapping to GFP flags
> > > (specifically around reclaim) for the first round of allocation could make
> > > sense.
> > > 
> > > In my head there are 3 levels of request possible for multiple nodes:
> > > 1. BIND: Those nodes or die.
> > > 2. Preferred hard: Those nodes and I'm willing to wait. Fallback if 
> > > impossible.
> > > 3. Preferred soft: Those nodes but I don't want to wait.
> > > 
> > > Current UAPI in the series doesn't define a distinction between 2, and 3. 
> > > As I
> > > understand the change, Feng is defining the behavior to be #3, which 
> > > makes #2
> > > not an option. I sort of punted on defining it entirely, in the beginning.
> > 
> > As discussed earlier in the thread, one less hacky solution is to clear
> > __GFP_DIRECT_RECLAIM bit so that it won't go into direct reclaim, but still
> > wakeup the kswapd of target nodes and retry, which sits now between 
> > 'Preferred hard'
> > and 'Preferred soft' :)
> 
> Yes that is what I've had in mind when talking about a lightweight
> attempt.
> 
> > For current MPOL_PREFERRED, its semantic is also 'Preferred hard', that it
> 
> Did you mean to say prefer soft? Because the direct reclaim is attempted
> only when node reclaim is enabled.
> 
> > will check free memory of other nodes before entering slowpath waiting.
> 
> Yes, hence "soft" semantic.

Yes, it's the #3 item: 'Preferred soft' 

Thanks,
Feng

> -- 
> Michal Hocko
> SUSE Labs


Re: [f2fs-dev] [PATCH] f2fs: expose # of overprivision segments

2021-03-04 Thread Chao Yu

On 2021/3/5 1:50, Jaegeuk Kim wrote:

On 03/04, Chao Yu wrote:

On 2021/3/3 2:44, Jaegeuk Kim wrote:

On 03/02, Jaegeuk Kim wrote:

On 03/02, Chao Yu wrote:

On 2021/3/2 13:42, Jaegeuk Kim wrote:

This is useful when checking conditions during checkpoint=disable in Android.


This sysfs entry is readonly, how about putting this at
/sys/fs/f2fs//stat/?


Urg.. "stat" is a bit confused. I'll take a look a better ones.


Oh, I mean put it into "stat" directory, not "stat" entry, something like this:

/sys/fs/f2fs//stat/ovp_segments


I meant that too. Why is it like stat, since it's a geometry?


Hmm.. it feels a little bit weird to treat ovp_segments as a 'stat' class; one
reason is that ovp_segments is read-only, matching the read-only attribute of a stat.







Taking a look at other entries used in Android, I feel that this one can't be
in stat or any other location, since I worry about consistency with the
similar dirty/free segments. It seems it's not easy to clean up the existing
ones anymore.


Well, actually, the number of entries is still increasing continuously; the
result is that it becomes slower and harder to find the target entry name in
that directory.

IMO, once a new read-only entry is added to the "" directory, there is no
chance to relocate it due to interface compatibility. So I think this is the
only chance to put it in the appropriate place.


I know, but this will scatter that info into different places. I don't have a
big concern about finding a specific entry with this, though; how about making
symlinks to create a dir structure for your easy access? Or, using a script
would be an alternative way.


Yes, there should be some alternative ways to help access the f2fs sysfs
interface, but from a user's point of view, I'm not sure they can figure out
those ways.

For those fs meta stats, why not add a single entry to include all the info you
need rather than adding them one by one? E.g.:

/proc/fs/f2fs//super_block
/proc/fs/f2fs//checkpoint
/proc/fs/f2fs//nat_table
/proc/fs/f2fs//sit_table
...

Thanks,


Thanks,

Signed-off-by: Jaegeuk Kim 
---
fs/f2fs/sysfs.c | 8 
1 file changed, 8 insertions(+)

diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index e38a7f6921dd..254b6fa17406 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -91,6 +91,13 @@ static ssize_t free_segments_show(struct f2fs_attr *a,
(unsigned long long)(free_segments(sbi)));
}
+static ssize_t ovp_segments_show(struct f2fs_attr *a,
+   struct f2fs_sb_info *sbi, char *buf)
+{
+   return sprintf(buf, "%llu\n",
+   (unsigned long long)(overprovision_segments(sbi)));
+}
+
static ssize_t lifetime_write_kbytes_show(struct f2fs_attr *a,
struct f2fs_sb_info *sbi, char *buf)
{
@@ -629,6 +636,7 @@ F2FS_RW_ATTR(F2FS_SBI, f2fs_sb_info, node_io_flag, 
node_io_flag);
F2FS_RW_ATTR(CPRC_INFO, ckpt_req_control, ckpt_thread_ioprio, 
ckpt_thread_ioprio);
F2FS_GENERAL_RO_ATTR(dirty_segments);
F2FS_GENERAL_RO_ATTR(free_segments);
+F2FS_GENERAL_RO_ATTR(ovp_segments);


Did you miss adding a documentation entry in Documentation/ABI/testing/sysfs-fs-f2fs?


Yeah, thanks.



Thanks,


F2FS_GENERAL_RO_ATTR(lifetime_write_kbytes);
F2FS_GENERAL_RO_ATTR(features);
F2FS_GENERAL_RO_ATTR(current_reserved_blocks);




___
Linux-f2fs-devel mailing list
linux-f2fs-de...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

.


.



[PATCH] KVM: x86: Ensure deadline timer has truly expired before posting its IRQ

2021-03-04 Thread Sean Christopherson
When posting a deadline timer interrupt, open code the checks guarding
__kvm_wait_lapic_expire() in order to skip the lapic_timer_int_injected()
check in kvm_wait_lapic_expire().  The injection check will always fail
since the interrupt has not yet been injected.  Moving the call after
injection would also be wrong as that wouldn't actually delay delivery
of the IRQ if it is indeed sent via posted interrupt.

Fixes: 010fd37fddf6 ("KVM: LAPIC: Reduce world switch latency caused by 
timer_advance_ns")
Cc: sta...@vger.kernel.org
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/lapic.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 45d40bfacb7c..cb8ebfaccfb6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1642,7 +1642,16 @@ static void apic_timer_expired(struct kvm_lapic *apic, 
bool from_timer_fn)
}
 
if (kvm_use_posted_timer_interrupt(apic->vcpu)) {
-   kvm_wait_lapic_expire(vcpu);
+   /*
+* Ensure the guest's timer has truly expired before posting an
+* interrupt.  Open code the relevant checks to avoid querying
+* lapic_timer_int_injected(), which will be false since the
+* interrupt isn't yet injected.  Waiting until after injecting
+* is not an option since that won't help a posted interrupt.
+*/
+   if (vcpu->arch.apic->lapic_timer.expired_tscdeadline &&
+   vcpu->arch.apic->lapic_timer.timer_advance_ns)
+   __kvm_wait_lapic_expire(vcpu);
kvm_apic_inject_pending_timer_irqs(apic);
return;
}
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH] docs: livepatch: Fix a typo in the file shadow-vars.rst

2021-03-04 Thread Bhaskar Chowdhury


s/varibles/variables/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/livepatch/shadow-vars.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/livepatch/shadow-vars.rst 
b/Documentation/livepatch/shadow-vars.rst
index c05715aeafa4..8464866d18ba 100644
--- a/Documentation/livepatch/shadow-vars.rst
+++ b/Documentation/livepatch/shadow-vars.rst
@@ -165,7 +165,7 @@ In-flight parent objects

 Sometimes it may not be convenient or possible to allocate shadow
 variables alongside their parent objects.  Or a livepatch fix may
-require shadow varibles to only a subset of parent object instances.  In
+require shadow variables to only a subset of parent object instances.  In
 these cases, the klp_shadow_get_or_alloc() call can be used to attach
 shadow variables to parents already in-flight.

--
2.30.1



[PATCH] KVM: SVM: Connect 'npt' module param to KVM's internal 'npt_enabled'

2021-03-04 Thread Sean Christopherson
Directly connect the 'npt' param to the 'npt_enabled' variable so that
runtime adjustments to npt_enabled are reflected in sysfs.  Move the
!PAE restriction to a runtime check to ensure NPT is forced off if the
host is using 2-level paging, and add a comment explicitly stating why
NPT requires a 64-bit kernel or a kernel with PAE enabled.

Opportunistically switch the param to octal permissions.

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/svm/svm.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 54610270f66a..0ee74321461e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -115,13 +115,6 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_INVALID, .always = false },
 };
 
-/* enable NPT for AMD64 and X86 with PAE */
-#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
-bool npt_enabled = true;
-#else
-bool npt_enabled;
-#endif
-
 /*
  * These 2 parameters are used to config the controls for Pause-Loop Exiting:
  * pause_filter_count: On processors that support Pause filtering(indicated
@@ -170,9 +163,12 @@ module_param(pause_filter_count_shrink, ushort, 0444);
 static unsigned short pause_filter_count_max = KVM_SVM_DEFAULT_PLE_WINDOW_MAX;
 module_param(pause_filter_count_max, ushort, 0444);
 
-/* allow nested paging (virtualized MMU) for all guests */
-static int npt = true;
-module_param(npt, int, S_IRUGO);
+/*
+ * Use nested page tables by default.  Note, NPT may get forced off by
+ * svm_hardware_setup() if it's unsupported by hardware or the host kernel.
+ */
+bool npt_enabled = true;
+module_param_named(npt, npt_enabled, bool, 0444);
 
 /* allow nested virtualization in KVM/SVM */
 static int nested = true;
@@ -988,12 +984,17 @@ static __init int svm_hardware_setup(void)
goto err;
}
 
+   /*
+* KVM's MMU doesn't support using 2-level paging for itself, and thus
+* NPT isn't supported if the host is using 2-level paging since host
+* CR4 is unchanged on VMRUN.
+*/
+   if (!IS_ENABLED(CONFIG_X86_64) && !IS_ENABLED(CONFIG_X86_PAE))
+   npt_enabled = false;
+
if (!boot_cpu_has(X86_FEATURE_NPT))
npt_enabled = false;
 
-   if (npt_enabled && !npt)
-   npt_enabled = false;
-
kvm_configure_mmu(npt_enabled, get_max_npt_level(), PG_LEVEL_1G);
pr_info("kvm: Nested Paging %sabled\n", npt_enabled ? "en" : "dis");
 
-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH v3] selinux: measure state and policy capabilities

2021-03-04 Thread Lakshmi Ramasubramanian

On 3/4/21 5:45 PM, Paul Moore wrote:

On Thu, Mar 4, 2021 at 2:20 PM Lakshmi Ramasubramanian
 wrote:

On 2/12/21 8:37 AM, Lakshmi Ramasubramanian wrote:

Hi Paul,


SELinux stores the configuration state and the policy capabilities
in kernel memory.  Changes to this data at runtime would have an impact
on the security guarantees provided by SELinux.  Measuring this data
through IMA subsystem provides a tamper-resistant way for
an attestation service to remotely validate it at runtime.

Measure the configuration state and policy capabilities by calling
the IMA hook ima_measure_critical_data().



I have addressed your comments on the v2 patch for selinux measurement
using IMA. Could you please let me know if there are any other comments
that I need to address in this patch?


The merge window just closed earlier this week, and there were a
handful of bugs that needed to be addressed before I could look at
this patch.  If I don't get a chance to review this patch tonight, I
will try to get to it this weekend or early next week.



Thanks Paul.

 -lakshmi



Re: [PATCH v3 4/4] clk: rockchip: add clock controller for rk3568

2021-03-04 Thread Kever Yang



On 2021/3/1 2:47 PM, Elaine Zhang wrote:

Add the clock tree definition for the new rk3568 SoC.

Signed-off-by: Elaine Zhang 


Patch looks good to me.

Reviewed-by: Kever Yang 

Thanks,
- Kever

---
  drivers/clk/rockchip/Kconfig  |7 +
  drivers/clk/rockchip/Makefile |1 +
  drivers/clk/rockchip/clk-rk3568.c | 1726 +
  drivers/clk/rockchip/clk.h|   30 +-
  4 files changed, 1763 insertions(+), 1 deletion(-)
  create mode 100644 drivers/clk/rockchip/clk-rk3568.c

diff --git a/drivers/clk/rockchip/Kconfig b/drivers/clk/rockchip/Kconfig
index effd05032e85..2e31901f4213 100644
--- a/drivers/clk/rockchip/Kconfig
+++ b/drivers/clk/rockchip/Kconfig
@@ -85,4 +85,11 @@ config CLK_RK3399
default y
help
  Build the driver for RK3399 Clock Driver.
+
+config CLK_RK3568
+   tristate "Rockchip RK3568 clock controller support"
+   depends on (ARM64 || COMPILE_TEST)
+   default y
+   help
+ Build the driver for RK3568 Clock Driver.
  endif
diff --git a/drivers/clk/rockchip/Makefile b/drivers/clk/rockchip/Makefile
index a99e4d9bbae1..2b78f1247372 100644
--- a/drivers/clk/rockchip/Makefile
+++ b/drivers/clk/rockchip/Makefile
@@ -26,3 +26,4 @@ obj-$(CONFIG_CLK_RK3308)+= clk-rk3308.o
  obj-$(CONFIG_CLK_RK3328)+= clk-rk3328.o
  obj-$(CONFIG_CLK_RK3368)+= clk-rk3368.o
  obj-$(CONFIG_CLK_RK3399)+= clk-rk3399.o
+obj-$(CONFIG_CLK_RK3568)   += clk-rk3568.o
diff --git a/drivers/clk/rockchip/clk-rk3568.c 
b/drivers/clk/rockchip/clk-rk3568.c
new file mode 100644
index ..60913aa91897
--- /dev/null
+++ b/drivers/clk/rockchip/clk-rk3568.c
@@ -0,0 +1,1726 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2021 Rockchip Electronics Co. Ltd.
+ * Author: Elaine Zhang 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "clk.h"
+
+#define RK3568_GRF_SOC_STATUS0 0x580
+
+enum rk3568_pmu_plls {
+   ppll, hpll,
+};
+
+enum rk3568_plls {
+   apll, dpll, gpll, cpll, npll, vpll,
+};
+
+static struct rockchip_pll_rate_table rk3568_pll_rates[] = {
+   /* _mhz, _refdiv, _fbdiv, _postdiv1, _postdiv2, _dsmpd, _frac */
+   RK3036_PLL_RATE(220800, 1, 92, 1, 1, 1, 0),
+   RK3036_PLL_RATE(218400, 1, 91, 1, 1, 1, 0),
+   RK3036_PLL_RATE(216000, 1, 90, 1, 1, 1, 0),
+   RK3036_PLL_RATE(208800, 1, 87, 1, 1, 1, 0),
+   RK3036_PLL_RATE(206400, 1, 86, 1, 1, 1, 0),
+   RK3036_PLL_RATE(204000, 1, 85, 1, 1, 1, 0),
+   RK3036_PLL_RATE(201600, 1, 84, 1, 1, 1, 0),
+   RK3036_PLL_RATE(199200, 1, 83, 1, 1, 1, 0),
+   RK3036_PLL_RATE(192000, 1, 80, 1, 1, 1, 0),
+   RK3036_PLL_RATE(189600, 1, 79, 1, 1, 1, 0),
+   RK3036_PLL_RATE(18, 1, 75, 1, 1, 1, 0),
+   RK3036_PLL_RATE(170400, 1, 71, 1, 1, 1, 0),
+   RK3036_PLL_RATE(160800, 1, 67, 1, 1, 1, 0),
+   RK3036_PLL_RATE(16, 3, 200, 1, 1, 1, 0),
+   RK3036_PLL_RATE(158400, 1, 132, 2, 1, 1, 0),
+   RK3036_PLL_RATE(156000, 1, 130, 2, 1, 1, 0),
+   RK3036_PLL_RATE(153600, 1, 128, 2, 1, 1, 0),
+   RK3036_PLL_RATE(151200, 1, 126, 2, 1, 1, 0),
+   RK3036_PLL_RATE(148800, 1, 124, 2, 1, 1, 0),
+   RK3036_PLL_RATE(146400, 1, 122, 2, 1, 1, 0),
+   RK3036_PLL_RATE(144000, 1, 120, 2, 1, 1, 0),
+   RK3036_PLL_RATE(141600, 1, 118, 2, 1, 1, 0),
+   RK3036_PLL_RATE(14, 3, 350, 2, 1, 1, 0),
+   RK3036_PLL_RATE(139200, 1, 116, 2, 1, 1, 0),
+   RK3036_PLL_RATE(136800, 1, 114, 2, 1, 1, 0),
+   RK3036_PLL_RATE(134400, 1, 112, 2, 1, 1, 0),
+   RK3036_PLL_RATE(132000, 1, 110, 2, 1, 1, 0),
+   RK3036_PLL_RATE(129600, 1, 108, 2, 1, 1, 0),
+   RK3036_PLL_RATE(127200, 1, 106, 2, 1, 1, 0),
+   RK3036_PLL_RATE(124800, 1, 104, 2, 1, 1, 0),
+   RK3036_PLL_RATE(12, 1, 100, 2, 1, 1, 0),
+   RK3036_PLL_RATE(118800, 1, 99, 2, 1, 1, 0),
+   RK3036_PLL_RATE(110400, 1, 92, 2, 1, 1, 0),
+   RK3036_PLL_RATE(11, 3, 275, 2, 1, 1, 0),
+   RK3036_PLL_RATE(100800, 1, 84, 2, 1, 1, 0),
+   RK3036_PLL_RATE(10, 3, 250, 2, 1, 1, 0),
+   RK3036_PLL_RATE(91200, 1, 76, 2, 1, 1, 0),
+   RK3036_PLL_RATE(81600, 1, 68, 2, 1, 1, 0),
+   RK3036_PLL_RATE(8, 3, 200, 2, 1, 1, 0),
+   RK3036_PLL_RATE(7, 3, 350, 4, 1, 1, 0),
+   RK3036_PLL_RATE(69600, 1, 116, 4, 1, 1, 0),
+   RK3036_PLL_RATE(6, 1, 100, 4, 1, 1, 0),
+   RK3036_PLL_RATE(59400, 1, 99, 4, 1, 1, 0),
+   RK3036_PLL_RATE(5, 1, 125, 6, 1, 1, 0),
+   RK3036_PLL_RATE(40800, 1, 68, 2, 2, 1, 0),
+   RK3036_PLL_RATE(31200, 1, 78, 6, 1, 1, 0),
+   RK3036_PLL_RATE(21600, 1, 72, 4, 2, 1, 0),
+   RK3036_PLL_RATE(2, 1, 100, 3, 4, 1, 0),
+   RK3036_PLL_RATE(14850, 1, 99, 4, 4, 1, 0),
+   

Re: [PATCH v3 2/4] clk: rockchip: add dt-binding header for rk3568

2021-03-04 Thread Kever Yang



On 2021/3/1 2:47 PM, Elaine Zhang wrote:

Add the dt-bindings header for the rk3568, that gets shared between
the clock controller and the clock references in the dts.
Add softreset ID for rk3568.

Signed-off-by: Elaine Zhang 


Patch looks good to me.

Reviewed-by: Kever Yang 

Thanks,
- Kever

---
  include/dt-bindings/clock/rk3568-cru.h | 926 +
  1 file changed, 926 insertions(+)
  create mode 100644 include/dt-bindings/clock/rk3568-cru.h

diff --git a/include/dt-bindings/clock/rk3568-cru.h 
b/include/dt-bindings/clock/rk3568-cru.h
new file mode 100644
index ..d29890865150
--- /dev/null
+++ b/include/dt-bindings/clock/rk3568-cru.h
@@ -0,0 +1,926 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2021 Rockchip Electronics Co. Ltd.
+ * Author: Elaine Zhang 
+ */
+
+#ifndef _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
+#define _DT_BINDINGS_CLK_ROCKCHIP_RK3568_H
+
+/* pmucru-clocks indices */
+
+/* pmucru plls */
+#define PLL_PPLL   1
+#define PLL_HPLL   2
+
+/* pmucru clocks */
+#define XIN_OSC0_DIV   4
+#define CLK_RTC_32K    5
+#define CLK_PMU    6
+#define CLK_I2C0   7
+#define CLK_RTC32K_FRAC    8
+#define CLK_UART0_DIV  9
+#define CLK_UART0_FRAC 10
+#define SCLK_UART0 11
+#define DBCLK_GPIO0    12
+#define CLK_PWM0   13
+#define CLK_CAPTURE_PWM0_NDFT  14
+#define CLK_PMUPVTM    15
+#define CLK_CORE_PMUPVTM   16
+#define CLK_REF24M 17
+#define XIN_OSC0_USBPHY0_G 18
+#define CLK_USBPHY0_REF    19
+#define XIN_OSC0_USBPHY1_G 20
+#define CLK_USBPHY1_REF    21
+#define XIN_OSC0_MIPIDSIPHY0_G 22
+#define CLK_MIPIDSIPHY0_REF    23
+#define XIN_OSC0_MIPIDSIPHY1_G 24
+#define CLK_MIPIDSIPHY1_REF    25
+#define CLK_WIFI_DIV   26
+#define CLK_WIFI_OSC0  27
+#define CLK_WIFI   28
+#define CLK_PCIEPHY0_DIV   29
+#define CLK_PCIEPHY0_OSC0  30
+#define CLK_PCIEPHY0_REF   31
+#define CLK_PCIEPHY1_DIV   32
+#define CLK_PCIEPHY1_OSC0  33
+#define CLK_PCIEPHY1_REF   34
+#define CLK_PCIEPHY2_DIV   35
+#define CLK_PCIEPHY2_OSC0  36
+#define CLK_PCIEPHY2_REF   37
+#define CLK_PCIE30PHY_REF_M    38
+#define CLK_PCIE30PHY_REF_N    39
+#define CLK_HDMI_REF   40
+#define XIN_OSC0_EDPPHY_G  41
+#define PCLK_PDPMU 42
+#define PCLK_PMU   43
+#define PCLK_UART0 44
+#define PCLK_I2C0  45
+#define PCLK_GPIO0 46
+#define PCLK_PMUPVTM   47
+#define PCLK_PWM0  48
+#define CLK_PDPMU  49
+#define SCLK_32K_IOE   50
+
+#define CLKPMU_NR_CLKS (SCLK_32K_IOE + 1)
+
+/* cru-clocks indices */
+
+/* cru plls */
+#define PLL_APLL   1
+#define PLL_DPLL   2
+#define PLL_CPLL   3
+#define PLL_GPLL   4
+#define PLL_VPLL   5
+#define PLL_NPLL   6
+
+/* cru clocks */
+#define CPLL_333M  9
+#define ARMCLK 10
+#define USB480M11
+#define ACLK_CORE_NIU2BUS  18
+#define CLK_CORE_PVTM  19
+#define CLK_CORE_PVTM_CORE 20
+#define CLK_CORE_PVTPLL    21
+#define CLK_GPU_SRC22
+#define CLK_GPU_PRE_NDFT   23
+#define CLK_GPU_PRE_MUX24
+#define ACLK_GPU_PRE   25
+#define PCLK_GPU_PRE   26
+#define CLK_GPU    27
+#define CLK_GPU_NP5    28
+#define PCLK_GPU_PVTM  29
+#define CLK_GPU_PVTM   30
+#define CLK_GPU_PVTM_CORE  31
+#define CLK_GPU_PVTPLL 32
+#define CLK_NPU_SRC33
+#define CLK_NPU_PRE_NDFT   34
+#define CLK_NPU    35
+#define CLK_NPU_NP5    36
+#define HCLK_NPU_PRE   37
+#define PCLK_NPU_PRE   38
+#define ACLK_NPU_PRE   39
+#define ACLK_NPU   40
+#define HCLK_NPU   41
+#define PCLK_NPU_PVTM  42
+#define CLK_NPU_PVTM   43
+#define CLK_NPU_PVTM_CORE  44
+#define CLK_NPU_PVTPLL 45
+#define CLK_DDRPHY1X_SRC   46
+#define CLK_DDRPHY1X_HWFFC_SRC 47
+#define CLK_DDR1X  48
+#define CLK_MSCH   49
+#define CLK24_DDRMON   50
+#define ACLK_GIC_AUDIO 51
+#define HCLK_GIC_AUDIO 52
+#define HCLK_SDMMC_BUFFER  53
+#define DCLK_SDMMC_BUFFER  54
+#define ACLK_GIC600    55
+#define ACLK_SPINLOCK  56
+#define HCLK_I2S0_8CH  57
+#define HCLK_I2S1_8CH  58
+#define HCLK_I2S2_2CH  59
+#define HCLK_I2S3_2CH  60
+#define CLK_I2S0_8CH_TX_SRC    61
+#define CLK_I2S0_8CH_TX_FRAC   62
+#define MCLK_I2S0_8CH_TX   63
+#define I2S0_MCLKOUT_TX    64
+#define CLK_I2S0_8CH_RX_SRC    65
+#define CLK_I2S0_8CH_RX_FRAC   66
+#define MCLK_I2S0_8CH_RX   67
+#define I2S0_MCLKOUT_RX    68

Re: [PATCH v3 3/4] clk: rockchip: support more core div setting

2021-03-04 Thread Kever Yang



On 2021/3/1 2:47 PM, Elaine Zhang wrote:

Use arrays to support more core independent div settings.
A55 supports each core to work at different frequencies, and each core
has an independent divider control.

Signed-off-by: Elaine Zhang 


Patch looks good to me.

Reviewed-by: Kever Yang 

Thanks,
- Kever

---
  drivers/clk/rockchip/clk-cpu.c| 53 +--
  drivers/clk/rockchip/clk-px30.c   |  7 ++--
  drivers/clk/rockchip/clk-rk3036.c |  7 ++--
  drivers/clk/rockchip/clk-rk3128.c |  7 ++--
  drivers/clk/rockchip/clk-rk3188.c |  7 ++--
  drivers/clk/rockchip/clk-rk3228.c |  7 ++--
  drivers/clk/rockchip/clk-rk3288.c |  7 ++--
  drivers/clk/rockchip/clk-rk3308.c |  7 ++--
  drivers/clk/rockchip/clk-rk3328.c |  7 ++--
  drivers/clk/rockchip/clk-rk3368.c | 14 
  drivers/clk/rockchip/clk-rk3399.c | 14 
  drivers/clk/rockchip/clk-rv1108.c |  7 ++--
  drivers/clk/rockchip/clk.h| 24 +++---
  13 files changed, 94 insertions(+), 74 deletions(-)

diff --git a/drivers/clk/rockchip/clk-cpu.c b/drivers/clk/rockchip/clk-cpu.c
index fa9027fb1920..47288197c9d7 100644
--- a/drivers/clk/rockchip/clk-cpu.c
+++ b/drivers/clk/rockchip/clk-cpu.c
@@ -84,10 +84,10 @@ static unsigned long rockchip_cpuclk_recalc_rate(struct 
clk_hw *hw,
  {
struct rockchip_cpuclk *cpuclk = to_rockchip_cpuclk_hw(hw);
const struct rockchip_cpuclk_reg_data *reg_data = cpuclk->reg_data;
-   u32 clksel0 = readl_relaxed(cpuclk->reg_base + reg_data->core_reg);
+   u32 clksel0 = readl_relaxed(cpuclk->reg_base + reg_data->core_reg[0]);
  
-	clksel0 >>= reg_data->div_core_shift;

-   clksel0 &= reg_data->div_core_mask;
+   clksel0 >>= reg_data->div_core_shift[0];
+   clksel0 &= reg_data->div_core_mask[0];
return parent_rate / (clksel0 + 1);
  }
  
@@ -120,6 +120,7 @@ static int rockchip_cpuclk_pre_rate_change(struct rockchip_cpuclk *cpuclk,

const struct rockchip_cpuclk_rate_table *rate;
unsigned long alt_prate, alt_div;
unsigned long flags;
+   int i = 0;
  
  	/* check validity of the new rate */

rate = rockchip_get_cpuclk_settings(cpuclk, ndata->new_rate);
@@ -142,10 +143,10 @@ static int rockchip_cpuclk_pre_rate_change(struct 
rockchip_cpuclk *cpuclk,
if (alt_prate > ndata->old_rate) {
/* calculate dividers */
alt_div =  DIV_ROUND_UP(alt_prate, ndata->old_rate) - 1;
-   if (alt_div > reg_data->div_core_mask) {
+   if (alt_div > reg_data->div_core_mask[0]) {
pr_warn("%s: limiting alt-divider %lu to %d\n",
-   __func__, alt_div, reg_data->div_core_mask);
-   alt_div = reg_data->div_core_mask;
+   __func__, alt_div, reg_data->div_core_mask[0]);
+   alt_div = reg_data->div_core_mask[0];
}
  
  		/*

@@ -158,19 +159,17 @@ static int rockchip_cpuclk_pre_rate_change(struct 
rockchip_cpuclk *cpuclk,
pr_debug("%s: setting div %lu as alt-rate %lu > old-rate %lu\n",
 __func__, alt_div, alt_prate, ndata->old_rate);
  
-		writel(HIWORD_UPDATE(alt_div, reg_data->div_core_mask,

- reg_data->div_core_shift) |
-  HIWORD_UPDATE(reg_data->mux_core_alt,
-reg_data->mux_core_mask,
-reg_data->mux_core_shift),
-  cpuclk->reg_base + reg_data->core_reg);
-   } else {
-   /* select alternate parent */
-   writel(HIWORD_UPDATE(reg_data->mux_core_alt,
-reg_data->mux_core_mask,
-reg_data->mux_core_shift),
-  cpuclk->reg_base + reg_data->core_reg);
+   for (i = 0; i < reg_data->num_cores; i++) {
+   writel(HIWORD_UPDATE(alt_div, 
reg_data->div_core_mask[i],
+reg_data->div_core_shift[i]),
+  cpuclk->reg_base + reg_data->core_reg[i]);
+   }
}
+   /* select alternate parent */
+   writel(HIWORD_UPDATE(reg_data->mux_core_alt,
+reg_data->mux_core_mask,
+reg_data->mux_core_shift),
+  cpuclk->reg_base + reg_data->core_reg[0]);
  
  	spin_unlock_irqrestore(cpuclk->lock, flags);

return 0;
@@ -182,6 +181,7 @@ static int rockchip_cpuclk_post_rate_change(struct 
rockchip_cpuclk *cpuclk,
const struct rockchip_cpuclk_reg_data *reg_data = cpuclk->reg_data;
const struct rockchip_cpuclk_rate_table *rate;
unsigned long flags;
+   int i = 0;
  
  	rate = rockchip_get_cpuclk_settings(cpuclk, ndata->new_rate);

if (!rate) {
@@ -202,12 +202,17 @@ static int rockchip_cpuclk_post_rate_change(struct 
rockchip_cpuclk 

[PATCH] memstick: core: fix error return code of mspro_block_resume()

2021-03-04 Thread Jia-Ju Bai
When mspro_block_init_card() fails, mspro_block_resume() does not assign an
error return code.
To fix this bug, rc is assigned the return value of mspro_block_init_card()
and is then checked.

Reported-by: TOTE Robot 
Signed-off-by: Jia-Ju Bai 
---
 drivers/memstick/core/mspro_block.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/memstick/core/mspro_block.c 
b/drivers/memstick/core/mspro_block.c
index afb892e7ffc6..cf7fe0d58ee7 100644
--- a/drivers/memstick/core/mspro_block.c
+++ b/drivers/memstick/core/mspro_block.c
@@ -1382,7 +1382,8 @@ static int mspro_block_resume(struct memstick_dev *card)
 
new_msb->card = card;
memstick_set_drvdata(card, new_msb);
-   if (mspro_block_init_card(card))
+   rc = mspro_block_init_card(card);
+   if (rc)
goto out_free;
 
for (cnt = 0; new_msb->attr_group.attrs[cnt]
-- 
2.17.1



Re: linux-next: build failure after merge of the pinctrl tree

2021-03-04 Thread Stephen Rothwell
Hi,

On Fri, 5 Mar 2021 08:46:46 +0800 "jay...@rock-chips.com" 
 wrote:
>
> Thanks, and I think I missed upstreaming the changes;
> I have resent them in a new thread.
> 
> --- a/arch/arm64/Kconfig.platforms
> +++ b/arch/arm64/Kconfig.platforms
> @@ -155,9 +155,7 @@ config ARCH_REALTEK
>  config ARCH_ROCKCHIP
> bool "Rockchip Platforms"
> select ARCH_HAS_RESET_CONTROLLER
> -   select GPIOLIB
> select PINCTRL
> -   select PINCTRL_ROCKCHIP
> select PM
> select ROCKCHIP_TIMER
> help
> diff --git a/drivers/pinctrl/Kconfig b/drivers/pinctrl/Kconfig
> index b197d23324fb..970c18191f6f 100644
> --- a/drivers/pinctrl/Kconfig
> +++ b/drivers/pinctrl/Kconfig
> @@ -179,10 +179,14 @@ config PINCTRL_OXNAS
>  
>  config PINCTRL_ROCKCHIP
> tristate "Rockchip gpio and pinctrl driver"
> +   select GPIOLIB
> select PINMUX
> select GENERIC_PINCONF
> select GENERIC_IRQ_CHIP
> select MFD_SYSCON
> +   default ARCH_ROCKCHIP
> +   help
> + This support pinctrl and gpio driver for Rockchip SoCs.
>  
>  config PINCTRL_RZA1
> bool "Renesas RZ/A1 gpio and pinctrl driver"
> 
> 
>  
> From: Linus Walleij
> Date: 2021-03-05 08:43
> To: jay...@rock-chips.com
> CC: Stephen Rothwell; Linux Kernel Mailing List; Linux Next Mailing List
> Subject: Re: Re: linux-next: build failure after merge of the pinctrl tree
> On Fri, Mar 5, 2021 at 1:13 AM jay...@rock-chips.com
>  wrote:
>  
> > Could you show me the issue log ?  
>  
> It's attached to Stephen's original mail in this thread.

Sorry I lost the error message, but it was a reference to a symbol that
has no EXPORT_SYMBOL.  So building the driver as a module should show
the error.

-- 
Cheers,
Stephen Rothwell


pgpWqdu6DORVG.pgp
Description: OpenPGP digital signature


[PATCH] net: tehuti: fix error return code in bdx_probe()

2021-03-04 Thread Jia-Ju Bai
When bdx_read_mac() fails, bdx_probe() does not assign an error return code.
To fix this bug, err is assigned -EFAULT as the error return code.

Reported-by: TOTE Robot 
Signed-off-by: Jia-Ju Bai 
---
 drivers/net/ethernet/tehuti/tehuti.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/tehuti/tehuti.c 
b/drivers/net/ethernet/tehuti/tehuti.c
index b8f4f419173f..d054c6e83b1c 100644
--- a/drivers/net/ethernet/tehuti/tehuti.c
+++ b/drivers/net/ethernet/tehuti/tehuti.c
@@ -2044,6 +2044,7 @@ bdx_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
/*bdx_hw_reset(priv); */
if (bdx_read_mac(priv)) {
pr_err("load MAC address failed\n");
+   err = -EFAULT;
goto err_out_iomap;
}
SET_NETDEV_DEV(ndev, >dev);
-- 
2.17.1



[PATCH] arch: mips: bcm63xx: Spello fix in the file clk.c

2021-03-04 Thread Bhaskar Chowdhury



s/revelant/relevant/

Signed-off-by: Bhaskar Chowdhury 
---
 arch/mips/bcm63xx/clk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/bcm63xx/clk.c b/arch/mips/bcm63xx/clk.c
index 164115944a7f..5a3e325275d0 100644
--- a/arch/mips/bcm63xx/clk.c
+++ b/arch/mips/bcm63xx/clk.c
@@ -76,7 +76,7 @@ static struct clk clk_enet_misc = {
 };

 /*
- * Ethernet MAC clocks: only revelant on 6358, silently enable misc
+ * Ethernet MAC clocks: only relevant on 6358, silently enable misc
  * clocks
  */
 static void enetx_set(struct clk *clk, int enable)
--
2.30.1



[PATCH] input: s6sy761: fix coordinate read bit shift

2021-03-04 Thread Caleb Connolly
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

The touch coordinates are read by shifting a value left by 3;
this is incorrect and effectively causes the coordinates to
be half of the correct value.

Shift by 4 bits instead to report the correct value.

This matches downstream examples, and has been confirmed on my
device (OnePlus 7 Pro).

Signed-off-by: Caleb Connolly 
---
 drivers/input/touchscreen/s6sy761.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/touchscreen/s6sy761.c 
b/drivers/input/touchscreen/s6sy761.c
index b63d7fdf0cd2..85a1f465c097 100644
--- a/drivers/input/touchscreen/s6sy761.c
+++ b/drivers/input/touchscreen/s6sy761.c
@@ -145,8 +145,8 @@ static void s6sy761_report_coordinates(struct s6sy761_data 
*sdata,
u8 major = event[4];
u8 minor = event[5];
u8 z = event[6] & S6SY761_MASK_Z;
-   u16 x = (event[1] << 3) | ((event[3] & S6SY761_MASK_X) >> 4);
-   u16 y = (event[2] << 3) | (event[3] & S6SY761_MASK_Y);
+   u16 x = (event[1] << 4) | ((event[3] & S6SY761_MASK_X) >> 4);
+   u16 y = (event[2] << 4) | (event[3] & S6SY761_MASK_Y);
 
input_mt_slot(sdata->input, tid);
 
-- 
2.29.2


-BEGIN PGP SIGNATURE-
Version: ProtonMail

wsFmBAEBCAAQBQJgQZF1CRAFgzErGV9ktgAKCRAFgzErGV9ktroYD/9RVSpG
TNuHm0cz8tS/oPFxPxO6Y35p3IF7I77hv0/Qy8CDBgyiJ2pZYP5dOgMPp7NV
MbIYMlN0sjXsAhcm81eho4qp7r5Fnv5YdRoe8QaueRaBVqG5xeip//sdYsdP
lkSLLJqM7caXOZ2QaVZp4w9v7PpZvyGTdDeBtyhVwrRpuEcZraFBGhyfv4Xf
WvU/hKj+0cnKt+WpmnEtwBkkX4PDqm2yXQASm5HrHli5z8XSlaTO55jFcQRP
+PcV/uBEVm9yhi4qYGEZYFZ526IpIcB4vKJi5h3fYro3ye66GfT+zb4HHwA/
cRoEFfRQ6TX1NUBeawi5l7LnAXP3w/RMQ9HnULjiFgLI1a63EsucXGnan2gt
N0Ig0nnbD/c0UG9dsm5u7POM1JhDrX184Bvh3WPb4t0XfYRCIQVB4S+ElG3n
KoPLMWjOzfSxFQcEK4axpOYePDGfbqMYTNy+g+m/aQa40OnYem2Tp6koIpz4
tzzHgpKlrUuMe571jZFO+eIwIxuu/5yaE1qBANfaCIemDWxX2dXEeaoFCTtu
LdqkxaQuxqPaZORlbuRHFkaFEsk2iWZ6eOSNo6dBQ+qxbIHFj8xYqYkZh5Hn
g7zP0UpHbcTEggFgGB0AM6n3+qQXVR+EwtKr7vBWiSAR6p2T87MNflFWbvk5
1fCR4mi79Teaafgunw==
=hTKj
-END PGP SIGNATURE-



Re: [PATCH] squashfs: fix inode lookup sanity checks

2021-03-04 Thread Phillip Lougher

On 26/02/2021 09:29, Sean Nyekjaer wrote:

When mounting a squashfs image created without inode compression it
fails with: "unable to read inode lookup table"

It turns out that the BLOCK_OFFSET is missing when checking
the SQUASHFS_METADATA_SIZE against the actual size.

Fixes: eabac19e40c0 ("squashfs: add more sanity checks in inode lookup")
CC: sta...@vger.kernel.org
Signed-off-by: Sean Nyekjaer 


Acked-by: phil...@squashfs.org.uk.


---
  fs/squashfs/export.c  | 8 ++--
  fs/squashfs/squashfs_fs.h | 1 +
  2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/squashfs/export.c b/fs/squashfs/export.c
index eb02072d28dd..723763746238 100644
--- a/fs/squashfs/export.c
+++ b/fs/squashfs/export.c
@@ -152,14 +152,18 @@ __le64 *squashfs_read_inode_lookup_table(struct 
super_block *sb,
start = le64_to_cpu(table[n]);
end = le64_to_cpu(table[n + 1]);
  
-		if (start >= end || (end - start) > SQUASHFS_METADATA_SIZE) {

+   if (start >= end
+   || (end - start) >
+   (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
kfree(table);
return ERR_PTR(-EINVAL);
}
}
  
  	start = le64_to_cpu(table[indexes - 1]);

-   if (start >= lookup_table_start || (lookup_table_start - start) > 
SQUASHFS_METADATA_SIZE) {
+   if (start >= lookup_table_start ||
+   (lookup_table_start - start) >
+   (SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET)) {
kfree(table);
return ERR_PTR(-EINVAL);
}
diff --git a/fs/squashfs/squashfs_fs.h b/fs/squashfs/squashfs_fs.h
index 8d64edb80ebf..b3fdc8212c5f 100644
--- a/fs/squashfs/squashfs_fs.h
+++ b/fs/squashfs/squashfs_fs.h
@@ -17,6 +17,7 @@
  
  /* size of metadata (inode and directory) blocks */

  #define SQUASHFS_METADATA_SIZE8192
+#define SQUASHFS_BLOCK_OFFSET  2
  
  /* default size of block device I/O */

  #ifdef CONFIG_SQUASHFS_4K_DEVBLK_SIZE





[PATCH] crypto: allwinner: sun8i-ce: fix error return code in sun8i_ce_prng_generate()

2021-03-04 Thread Jia-Ju Bai
When dma_mapping_error() returns an error, no error return code of 
sun8i_ce_prng_generate() is assigned.
To fix this bug, err is assigned with -EFAULT as error return code.

Reported-by: TOTE Robot 
Signed-off-by: Jia-Ju Bai 
---
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-prng.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-prng.c 
b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-prng.c
index cfde9ee4356b..cd1baee424a1 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-prng.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-prng.c
@@ -99,6 +99,7 @@ int sun8i_ce_prng_generate(struct crypto_rng *tfm, const u8 
*src,
dma_iv = dma_map_single(ce->dev, ctx->seed, ctx->slen, DMA_TO_DEVICE);
if (dma_mapping_error(ce->dev, dma_iv)) {
dev_err(ce->dev, "Cannot DMA MAP IV\n");
+   err = -EFAULT;
goto err_iv;
}
 
-- 
2.17.1



[PATCH] thermal: Fix a typo in the file soctherm.c

2021-03-04 Thread Bhaskar Chowdhury


s/calibaration/calibration/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/thermal/tegra/soctherm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/tegra/soctherm.c b/drivers/thermal/tegra/soctherm.c
index 66e0639da4bf..8b8fbd49679b 100644
--- a/drivers/thermal/tegra/soctherm.c
+++ b/drivers/thermal/tegra/soctherm.c
@@ -2195,7 +2195,7 @@ static int tegra_soctherm_probe(struct platform_device 
*pdev)
if (err)
return err;

-   /* calculate tsensor calibaration data */
+   /* calculate tsensor calibration data */
for (i = 0; i < soc->num_tsensors; ++i) {
err = tegra_calc_tsensor_calib(>tsensors[i],
   _calib,
--
2.30.1



[PATCH] drm/nouveau/kms/nve4-nv108: Limit cursors to 128x128

2021-03-04 Thread Lyude Paul
While Kepler does technically support 256x256 cursors, it turns out that
Kepler actually has some additional requirements for scanout surfaces that
we're not enforcing correctly, which aren't present on Maxwell and later.
Cursor surfaces must always use small pages (4K), and overlay surfaces must
always use large pages (128K).

Fixing this correctly though will take a bit more work: as we'll need to
add some code in prepare_fb() to move cursor FBs in large pages to small
pages, and vice-versa for overlay FBs. So until we have the time to do
that, just limit cursor surfaces to 128x128 - a size small enough to always
default to small pages.

This means small ovlys are still broken on Kepler, but it is extremely
unlikely anyone cares about those anyway :).

Signed-off-by: Lyude Paul 
Fixes: d3b2f0f7921c ("drm/nouveau/kms/nv50-: Report max cursor size to 
userspace")
Cc:  # v5.11+
---
 drivers/gpu/drm/nouveau/dispnv50/disp.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c 
b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 196612addfd6..1c9c0cdf85db 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -2693,9 +2693,20 @@ nv50_display_create(struct drm_device *dev)
else
nouveau_display(dev)->format_modifiers = disp50xx_modifiers;
 
-   if (disp->disp->object.oclass >= GK104_DISP) {
+   /* FIXME: 256x256 cursors are supported on Kepler, however unlike 
Maxwell and later
+* generations Kepler requires that we use small pages (4K) for cursor 
scanout surfaces. The
+* proper fix for this is to teach nouveau to migrate fbs being used 
for the cursor plane to
+* small page allocations in prepare_fb(). When this is implemented, we 
should also force
+* large pages (128K) for ovly fbs in order to fix Kepler ovlys.
+* But until then, just limit cursors to 128x128 - which is small 
enough to avoid ever using
+* large pages.
+*/
+   if (disp->disp->object.oclass >= GM107_DISP) {
dev->mode_config.cursor_width = 256;
dev->mode_config.cursor_height = 256;
+   } else if (disp->disp->object.oclass >= GK104_DISP) {
+   dev->mode_config.cursor_width = 128;
+   dev->mode_config.cursor_height = 128;
} else {
dev->mode_config.cursor_width = 64;
dev->mode_config.cursor_height = 64;
-- 
2.29.2



RE: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

2021-03-04 Thread jingle
HI Dmitry:

1. Do you mean to let all devices skip the reset/sleep part of device
initialization?
2. The test team found that some old firmware will have errors (invalid
reports etc.), so ELAN can only guarantee the behavior of the newer parts.

Thanks
jingle

-Original Message-
From: 'Dmitry Torokhov' [mailto:dmitry.torok...@gmail.com] 
Sent: Friday, March 05, 2021 9:31 AM
To: jingle
Cc: 'linux-kernel'; 'linux-input'; 'phoenix'; 'dave.wang'; 'josh.chen'
Subject: Re: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

Hi Jingle,

On Fri, Mar 05, 2021 at 09:24:05AM +0800, jingle wrote:
> HI Dmitry:
> 
> In this case (in the newer parts behavior regarding need to reset 
> after powering them on), it is consistent with the original driver 
> behavior with any new or old device (be called 
> data->ops->initialize(client) : usleep(100) , etc.. , because this 
> times "data->quirks" is equal 0 at probe state.)

You misunderstood my question. I was asking what specifically, if anything,
was changed in the firmware to allow skipping reset/sleep part of device
initialization on newer parts during the resume process. Because if there were
no specific changes, I would say let's not do a quirk and change the driver
to skip reset on resume.

Thanks.

--
Dmitry



Re: [PATCH v6] i2c: virtio: add a virtio i2c frontend driver

2021-03-04 Thread Jie Deng



On 2021/3/4 14:06, Viresh Kumar wrote:

Please always supply version history; it is difficult to review otherwise.

I will add the history.

  drivers/i2c/busses/Kconfig  |  11 ++
  drivers/i2c/busses/Makefile |   3 +
  drivers/i2c/busses/i2c-virtio.c | 289 
  include/uapi/linux/virtio_i2c.h |  42 ++
  include/uapi/linux/virtio_ids.h |   1 +
  5 files changed, 346 insertions(+)
  create mode 100644 drivers/i2c/busses/i2c-virtio.c
  create mode 100644 include/uapi/linux/virtio_i2c.h

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 05ebf75..0860395 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -21,6 +21,17 @@ config I2C_ALI1535
  This driver can also be built as a module.  If so, the module
  will be called i2c-ali1535.
  
+config I2C_VIRTIO

+   tristate "Virtio I2C Adapter"
+   depends on VIRTIO

depends on I2C as well ?
No need for that. The dependency on I2C is already included in the Kconfig
in the parent directory.



+   help
+ If you say yes to this option, support will be included for the virtio
+ I2C adapter driver. The hardware can be emulated by any device model
+ software according to the virtio protocol.
+
+ This driver can also be built as a module. If so, the module
+ will be called i2c-virtio.
+
  config I2C_ALI1563
tristate "ALI 1563"
depends on PCI
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 615f35e..b88da08 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -6,6 +6,9 @@
  # ACPI drivers
  obj-$(CONFIG_I2C_SCMI)+= i2c-scmi.o
  
+# VIRTIO I2C host controller driver

+obj-$(CONFIG_I2C_VIRTIO)   += i2c-virtio.o
+

Not that it is important but I would have added it towards the end instead of at
the top of the file..

I'm OK to add it to the end.



  # PC SMBus host controller drivers
  obj-$(CONFIG_I2C_ALI1535) += i2c-ali1535.o
  obj-$(CONFIG_I2C_ALI1563) += i2c-ali1563.o
diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
new file mode 100644
index 000..98c0fcc
--- /dev/null
+++ b/drivers/i2c/busses/i2c-virtio.c
@@ -0,0 +1,289 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Virtio I2C Bus Driver
+ *
+ * The Virtio I2C Specification:
+ * 
https://raw.githubusercontent.com/oasis-tcs/virtio-spec/master/virtio-i2c.tex
+ *
+ * Copyright (c) 2021 Intel Corporation. All rights reserved.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 

I don't think above two are required here..


+#include 
+#include 
+#include 

same here.


+#include 

same here.

Will confirm and remove them if they are not required. Thank you.

+

And this blank line as well, since all are standard linux headers.

+#include 
+#include 
+
+/**
+ * struct virtio_i2c - virtio I2C data
+ * @vdev: virtio device for this controller
+ * @completion: completion of virtio I2C message
+ * @adap: I2C adapter for this controller
+ * @i2c_lock: lock for virtqueue processing
+ * @vq: the virtio virtqueue for communication
+ */
+struct virtio_i2c {
+   struct virtio_device *vdev;
+   struct completion completion;
+   struct i2c_adapter adap;
+   struct mutex i2c_lock;

i2c_ is redundant here. "lock" sounds good enough.

OK.

+   struct virtqueue *vq;
+};
+
+/**
+ * struct virtio_i2c_req - the virtio I2C request structure
+ * @out_hdr: the OUT header of the virtio I2C message
+ * @buf: the buffer into which data is read, or from which it's written
+ * @in_hdr: the IN header of the virtio I2C message
+ */
+struct virtio_i2c_req {
+   struct virtio_i2c_out_hdr out_hdr;
+   u8 *buf;
+   struct virtio_i2c_in_hdr in_hdr;
+};
+
+static void virtio_i2c_msg_done(struct virtqueue *vq)
+{
+   struct virtio_i2c *vi = vq->vdev->priv;
+
+   complete(>completion);
+}
+
+static int virtio_i2c_send_reqs(struct virtqueue *vq,
+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr)
+{
+   struct scatterlist *sgs[3], out_hdr, msg_buf, in_hdr;
+   int i, outcnt, incnt, err = 0;
+   u8 *buf;
+
+   for (i = 0; i < nr; i++) {
+   if (!msgs[i].len)
+   break;
+
+   /* Only 7-bit mode supported for this moment. For the address 
format,
+* Please check the Virtio I2C Specification.
+*/

Please use proper comment style.

will fix it.


 /*
  * Only 7-bit mode supported for now, check Virtio I2C
  * specification for format of "addr" field.
  */


+   reqs[i].out_hdr.addr = cpu_to_le16(msgs[i].addr << 1);
+
+   if (i != nr - 1)
+   reqs[i].out_hdr.flags |= VIRTIO_I2C_FLAGS_FAIL_NEXT;

Since flags field hasn't been touched anywhere, 

Re: [PATCH v3] selinux: measure state and policy capabilities

2021-03-04 Thread Paul Moore
On Thu, Mar 4, 2021 at 2:20 PM Lakshmi Ramasubramanian
 wrote:
> On 2/12/21 8:37 AM, Lakshmi Ramasubramanian wrote:
>
> Hi Paul,
>
> > SELinux stores the configuration state and the policy capabilities
> > in kernel memory.  Changes to this data at runtime would have an impact
> > on the security guarantees provided by SELinux.  Measuring this data
> > through IMA subsystem provides a tamper-resistant way for
> > an attestation service to remotely validate it at runtime.
> >
> > Measure the configuration state and policy capabilities by calling
> > the IMA hook ima_measure_critical_data().
> >
>
> I have addressed your comments on the v2 patch for selinux measurement
> using IMA. Could you please let me know if there are any other comments
> that I need to address in this patch?

The merge window just closed earlier this week, and there were a
handful of bugs that needed to be addressed before I could look at
this patch.  If I don't get a chance to review this patch tonight, I
will try to get to it this weekend or early next week.

-- 
paul moore
www.paul-moore.com


[PATCH] thermal: Fix couple of spellos in the file sun8i_thermal.c

2021-03-04 Thread Bhaskar Chowdhury



s/calibartion/calibration/
s/undocummented/undocumented/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/thermal/sun8i_thermal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/sun8i_thermal.c b/drivers/thermal/sun8i_thermal.c
index 8c80bd06dd9f..d9cd23cbb671 100644
--- a/drivers/thermal/sun8i_thermal.c
+++ b/drivers/thermal/sun8i_thermal.c
@@ -300,7 +300,7 @@ static int sun8i_ths_calibrate(struct ths_device *tmdev)
 * or 0x8xx, so they won't be away from the default value
 * for a lot.
 *
-* So here we do not return error if the calibartion data is
+* So here we do not return error if the calibration data is
 * not available, except the probe needs deferring.
 */
goto out;
@@ -418,7 +418,7 @@ static int sun8i_h3_thermal_init(struct ths_device *tmdev)
 }

 /*
- * Without this undocummented value, the returned temperatures would
+ * Without this undocumented value, the returned temperatures would
  * be higher than real ones by about 20C.
  */
 #define SUN50I_H6_CTRL0_UNK 0x002f
--
2.30.1



Re: [PATCH v3 1/4] dt-binding: clock: Document rockchip,rk3568-cru bindings

2021-03-04 Thread Kever Yang



On 2021/3/1 2:47 PM, Elaine Zhang wrote:

Document the device tree bindings of the rockchip Rk3568 SoC
clock driver in 
Documentation/devicetree/bindings/clock/rockchip,rk3568-cru.yaml.

Signed-off-by: Elaine Zhang 


Patch looks good to me.

Reviewed-by: Kever Yang 


Thanks,
- Kever

---
  .../bindings/clock/rockchip,rk3568-cru.yaml   | 60 +++
  1 file changed, 60 insertions(+)
  create mode 100644 
Documentation/devicetree/bindings/clock/rockchip,rk3568-cru.yaml

diff --git a/Documentation/devicetree/bindings/clock/rockchip,rk3568-cru.yaml 
b/Documentation/devicetree/bindings/clock/rockchip,rk3568-cru.yaml
new file mode 100644
index ..b2c26097827f
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/rockchip,rk3568-cru.yaml
@@ -0,0 +1,60 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/rockchip,rk3568-cru.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: ROCKCHIP rk3568 Family Clock Control Module Binding
+
+maintainers:
+  - Elaine Zhang 
+  - Heiko Stuebner 
+
+description: |
+  The RK3568 clock controller generates the clock and also implements a
+  reset controller for SoC peripherals.
+  (examples: provide SCLK_UART1\PCLK_UART1 and SRST_P_UART1\SRST_S_UART1 for 
UART module)
+  Each clock is assigned an identifier and client nodes can use this identifier
+  to specify the clock which they consume. All available clocks are defined as
+  preprocessor macros in the dt-bindings/clock/rk3568-cru.h headers and can be
+  used in device tree sources.
+
+properties:
+  compatible:
+enum:
+  - rockchip,rk3568-cru
+  - rockchip,rk3568-pmucru
+
+  reg:
+maxItems: 1
+
+  "#clock-cells":
+const: 1
+
+  "#reset-cells":
+const: 1
+
+required:
+  - compatible
+  - reg
+  - "#clock-cells"
+  - "#reset-cells"
+
+additionalProperties: false
+
+examples:
+  # Clock Control Module node:
+  - |
+pmucru: clock-controller@fdd0 {
+  compatible = "rockchip,rk3568-pmucru";
+  reg = <0xfdd0 0x1000>;
+  #clock-cells = <1>;
+  #reset-cells = <1>;
+};
+  - |
+cru: clock-controller@fdd2 {
+  compatible = "rockchip,rk3568-cru";
+  reg = <0xfdd2 0x1000>;
+  #clock-cells = <1>;
+  #reset-cells = <1>;
+};





Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-04 Thread Aili Yao
On Fri, 5 Mar 2021 09:30:16 +0800
Aili Yao  wrote:

> On Thu, 4 Mar 2021 15:57:20 -0800
> "Luck, Tony"  wrote:
> 
> > On Thu, Mar 04, 2021 at 02:45:24PM +0800, Aili Yao wrote:
> > > > > if your methods works, should it be like this?
> > > > > 
> > > > > 1582 pteval = 
> > > > > swp_entry_to_pte(make_hwpoison_entry(subpage));
> > > > > 1583 if (PageHuge(page)) {
> > > > > 1584 
> > > > > hugetlb_count_sub(compound_nr(page), mm);
> > > > > 1585 set_huge_swap_pte_at(mm, address,
> > > > > 1586  pvmw.pte, 
> > > > > pteval,
> > > > > 1587  
> > > > > vma_mmu_pagesize(vma));
> > > > > 1588 } else {
> > > > > 1589 dec_mm_counter(mm, 
> > > > > mm_counter(page));
> > > > > 1590 set_pte_at(mm, address, 
> > > > > pvmw.pte, pteval);
> > > > > 1591 }
> > > > > 
> > > > > the page fault check if it's a poison page using is_hwpoison_entry(),
> > > > > 
> > > > 
> > > > And if it works, do we need some locking mechanism before we call
> > > > walk_page_range();
> > > > if we lock, do we need to process the blocking interrupted error as
> > > > other places do?
> > > >   
> > > 
> > > And another thing:
> > > Do we need a call to flush_tlb_page(vma, address) to make the pte changes 
> > > into effect?  
> > 
> > Thanks for all the pointers.  I added them to the patch (see below).
> > [The pmd/pud cases may need some tweaking ... but just trying to get
> > the 4K page case working first]
> > 
> > I tried testing by skipping the call to memory_failure() and just
> > using this new code to search the page tables for current page and
> > marking it hwpoison (to simulate the case where 2nd process gets the
> > early return from memory_failure(). Something is still missing because I 
> > get:
> > 
> > [  481.911298] mce: pte_entry: matched pfn - mark poison & zap pte
> > [  481.917935] MCE: Killing einj_mem_uc: due to hardware memory 
> > corruption fault at 7fe64b33b400
> > [  482.933775] BUG: Bad page cache in process einj_mem_uc  pfn:408b6d6
> > [  482.940777] page:13ea6e96 refcount:3 mapcount:1 
> > mapping:e3a069d9 index:0x0 pfn:0x408b6d6
> > [  482.951355] memcg:94a809834000
> > [  482.955153] aops:shmem_aops ino:3c04
> > [  482.959142] flags: 
> > 0x97c0880015(locked|uptodate|lru|swapbacked|hwpoison)
> > [  482.967018] raw: 0097c0880015 94c80e93ec00 94c80e93ec00 
> > 94c80a9b25a8
> > [  482.975666] raw:   0003 
> > 94a809834000
> > [  482.984310] page dumped because: still mapped when deleted
> 
> From the walk, it seems we have got the virtual address, can we just send a 
> SIGBUS with it?

Is this walk proper for other memory-failure error cases? Can it be applied
to the if (p->mce_vaddr != (void __user *)-1l) branch
in kill_me_maybe()?

> > commit e5de44560b33e2d407704243566253a70f858a59
> > Author: Tony Luck 
> > Date:   Tue Mar 2 15:06:33 2021 -0800
> > 
> > x86/mce: Handle races between machine checks
> > 
> > When multiple CPUs hit the same poison memory there is a race. The
> > first CPU into memory_failure() atomically marks the page as poison
> > and continues processing to hunt down all the tasks that map this page
> > so that the virtual addresses can be marked not-present and SIGBUS
> > sent to the task that did the access.
> > 
> > Later CPUs get an early return from memory_failure() and may return
> > to user mode and access the poison again.
> > 
> > Add a new argument to memory_failure() so that it can indicate when
> > the race has been lost. Fix kill_me_maybe() to scan page tables in
> > this case to unmap pages.
> > 
> > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> > index 7962355436da..a52c6a772de2 100644
> > --- a/arch/x86/kernel/cpu/mce/core.c
> > +++ b/arch/x86/kernel/cpu/mce/core.c
> > @@ -28,8 +28,12 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> > +#include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -637,6 +641,7 @@ static int uc_decode_notifier(struct notifier_block 
> > *nb, unsigned long val,
> >  {
> > struct mce *mce = (struct mce *)data;
> > unsigned long pfn;
> > +   int already = 0;
> >  
> > if (!mce || !mce_usable_address(mce))
> > return NOTIFY_DONE;
> > @@ -646,8 +651,9 @@ static int uc_decode_notifier(struct notifier_block 
> > *nb, unsigned long val,
> > return NOTIFY_DONE;
> >  
> > pfn = mce->addr >> PAGE_SHIFT;
> > -   if (!memory_failure(pfn, 0)) {
> > -   set_mce_nospec(pfn, whole_page(mce));
> > +   if (!memory_failure(pfn, 

Re: [PATCH mm] kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations

2021-03-04 Thread Andrew Morton
On Thu, 4 Mar 2021 22:05:48 +0100 Alexander Potapenko  wrote:

> On Thu, Mar 4, 2021 at 9:53 PM Marco Elver  wrote:
> >
> > cache_alloc_debugcheck_after() performs checks on an object, including
> > adjusting the returned pointer. None of this should apply to KFENCE
> > objects. While for non-bulk allocations, the checks are skipped when we
> > allocate via KFENCE, for bulk allocations cache_alloc_debugcheck_after()
> > is called via cache_alloc_debugcheck_after_bulk().
> 
> @Andrew, is this code used by anyone?
> As far as I understand, it cannot be enabled by any config option, so
> nobody really tests it.
> If it is still needed, shall we promote #if DEBUGs in slab.c to a
> separate config option, or maybe this code can be safely removed?

It's all used:

#ifdef CONFIG_DEBUG_SLAB
#define DEBUG   1
#define STATS   1
#define FORCED_DEBUG1
#else
#define DEBUG   0
#define STATS   0
#define FORCED_DEBUG0
#endif



Re: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

2021-03-04 Thread 'Dmitry Torokhov'
Hi Jingle,

On Fri, Mar 05, 2021 at 09:24:05AM +0800, jingle wrote:
> HI Dmitry:
> 
> In this case (in the newer parts behavior regarding need to reset after
> powering them on), it is consistent with the original driver behavior with
> any new or old device
> (be called data->ops->initialize(client) : usleep(100) , etc.. , because
> this times "data->quirks" is equal 0 at probe state.) 

You misunderstood my question. I was asking what specifically, if
anything, was changed in the firmware to allow skipping reset/sleep part
of device initialization on newer parts during the resume process. Because
if there were no specific changes, I would say let's not do a quirk and
change the driver to skip reset on resume.

Thanks.

-- 
Dmitry


Re: [PATCH] mm,hwpoison: return -EBUSY when page already poisoned

2021-03-04 Thread Aili Yao
On Thu, 4 Mar 2021 15:57:20 -0800
"Luck, Tony"  wrote:

> On Thu, Mar 04, 2021 at 02:45:24PM +0800, Aili Yao wrote:
> > > > if your methods works, should it be like this?
> > > > 
> > > > 1582 pteval = 
> > > > swp_entry_to_pte(make_hwpoison_entry(subpage));
> > > > 1583 if (PageHuge(page)) {
> > > > 1584 
> > > > hugetlb_count_sub(compound_nr(page), mm);
> > > > 1585 set_huge_swap_pte_at(mm, address,
> > > > 1586  pvmw.pte, 
> > > > pteval,
> > > > 1587  
> > > > vma_mmu_pagesize(vma));
> > > > 1588 } else {
> > > > 1589 dec_mm_counter(mm, 
> > > > mm_counter(page));
> > > > 1590 set_pte_at(mm, address, pvmw.pte, 
> > > > pteval);
> > > > 1591 }
> > > > 
> > > > the page fault check if it's a poison page using is_hwpoison_entry(),
> > > > 
> > > 
> > > And if it works, do we need some locking mechanism before we call 
> > > walk_page_range();
> > > if we lock, do we need to process the blocking interrupted error as 
> > > other places do?
> > >   
> > 
> > And another thing:
> > Do we need a call to flush_tlb_page(vma, address) to make the pte changes 
> > into effect?  
> 
> Thanks for all the pointers.  I added them to the patch (see below).
> [The pmd/pud cases may need some tweaking ... but just trying to get
> the 4K page case working first]
> 
> I tried testing by skipping the call to memory_failure() and just
> using this new code to search the page tables for current page and
> marking it hwpoison (to simulate the case where 2nd process gets the
> early return from memory_failure(). Something is still missing because I get:
> 
> [  481.911298] mce: pte_entry: matched pfn - mark poison & zap pte
> [  481.917935] MCE: Killing einj_mem_uc: due to hardware memory 
> corruption fault at 7fe64b33b400
> [  482.933775] BUG: Bad page cache in process einj_mem_uc  pfn:408b6d6
> [  482.940777] page:13ea6e96 refcount:3 mapcount:1 
> mapping:e3a069d9 index:0x0 pfn:0x408b6d6
> [  482.951355] memcg:94a809834000
> [  482.955153] aops:shmem_aops ino:3c04
> [  482.959142] flags: 
> 0x97c0880015(locked|uptodate|lru|swapbacked|hwpoison)
> [  482.967018] raw: 0097c0880015 94c80e93ec00 94c80e93ec00 
> 94c80a9b25a8
> [  482.975666] raw:   0003 
> 94a809834000
> [  482.984310] page dumped because: still mapped when deleted

From the walk, it seems we have got the virtual address, can we just send a
SIGBUS with it?



> commit e5de44560b33e2d407704243566253a70f858a59
> Author: Tony Luck 
> Date:   Tue Mar 2 15:06:33 2021 -0800
> 
> x86/mce: Handle races between machine checks
> 
> When multiple CPUs hit the same poison memory there is a race. The
> first CPU into memory_failure() atomically marks the page as poison
> and continues processing to hunt down all the tasks that map this page
> so that the virtual addresses can be marked not-present and SIGBUS
> sent to the task that did the access.
> 
> Later CPUs get an early return from memory_failure() and may return
> to user mode and access the poison again.
> 
> Add a new argument to memory_failure() so that it can indicate when
> the race has been lost. Fix kill_me_maybe() to scan page tables in
> this case to unmap pages.
> 
> diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
> index 7962355436da..a52c6a772de2 100644
> --- a/arch/x86/kernel/cpu/mce/core.c
> +++ b/arch/x86/kernel/cpu/mce/core.c
> @@ -28,8 +28,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -637,6 +641,7 @@ static int uc_decode_notifier(struct notifier_block *nb, 
> unsigned long val,
>  {
>   struct mce *mce = (struct mce *)data;
>   unsigned long pfn;
> + int already = 0;
>  
>   if (!mce || !mce_usable_address(mce))
>   return NOTIFY_DONE;
> @@ -646,8 +651,9 @@ static int uc_decode_notifier(struct notifier_block *nb, 
> unsigned long val,
>   return NOTIFY_DONE;
>  
>   pfn = mce->addr >> PAGE_SHIFT;
> - if (!memory_failure(pfn, 0)) {
> - set_mce_nospec(pfn, whole_page(mce));
> + if (!memory_failure(pfn, 0, )) {
> + if (!already)
> + set_mce_nospec(pfn, whole_page(mce));
>   mce->kflags |= MCE_HANDLED_UC;
>   }
>  
> @@ -1248,6 +1254,79 @@ static void __mc_scan_banks(struct mce *m, struct 
> pt_regs *regs, struct mce *fin
>   *m = *final;
>  }
>  
> +static int pte_entry(pte_t *pte, unsigned long addr, unsigned long next, 
> struct mm_walk *walk)
> +{
> + 

Re: [PATCH 3/4] kbuild: check the minimum assembler version in Kconfig

2021-03-04 Thread Nick Desaulniers
On Wed, Mar 3, 2021 at 10:34 AM Masahiro Yamada  wrote:
>
> Documentation/process/changes.rst defines the minimum assembler version
> (binutils version), but we have never checked it in the build time.
>
> Kbuild never invokes 'as' directly because all assembly files in the
> kernel tree are *.S, hence must be preprocessed. I do not expect
> raw assembly source files (*.s) would be added to the kernel tree.
>
> Therefore, we always use $(CC) as the assembler driver, and commit
> aa824e0c962b ("kbuild: remove AS variable") removed 'AS'. However,
> we are still interested in the version of the assembler sitting behind.
>
> As usual, the --version option prints the version string.
>
>   $ as --version | head -n 1
>   GNU assembler (GNU Binutils for Ubuntu) 2.35.1
>
> But, we do not have $(AS). So, we can add the -Wa prefix so that
> $(CC) passes --version down to the backing assembler.
>
>   $ gcc -Wa,--version | head -n 1
>   gcc: fatal error: no input files
>   compilation terminated.
>
> OK, we need to input something to satisfy gcc.
>
>   $ gcc -Wa,--version -c -x assembler /dev/null -o /dev/null | head -n 1
>   GNU assembler (GNU Binutils for Ubuntu) 2.35.1
>
> The combination of Clang and GNU assembler works in the same way:
>
>   $ clang -no-integrated-as -Wa,--version -c -x assembler /dev/null -o 
> /dev/null | head -n 1
>   GNU assembler (GNU Binutils for Ubuntu) 2.35.1
>
> Clang with the integrated assembler fails like this:
>
>   $ clang -integrated-as -Wa,--version -c -x assembler /dev/null -o /dev/null 
> | head -n 1
>   clang: error: unsupported argument '--version' to option 'Wa,'

Was this a feature request to "please implement -Wa,--version for clang?" :-P
https://github.com/ClangBuiltLinux/linux/issues/1320

>
> With all this in my mind, I implemented scripts/as-version.sh.
>
>   $ scripts/as-version.sh gcc
>   GNU 23501
>   $ scripts/as-version.sh clang -no-integrated-as
>   GNU 23501
>   $ scripts/as-version.sh clang -integrated-as
>   LLVM 0
>
> Signed-off-by: Masahiro Yamada 
> ---
>
>  arch/Kconfig|  3 +-
>  init/Kconfig| 12 +++
>  scripts/Kconfig.include |  6 
>  scripts/as-version.sh   | 77 +
>  4 files changed, 96 insertions(+), 2 deletions(-)
>  create mode 100755 scripts/as-version.sh
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 2af10ebe5ed0..d7214f4ae1f7 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -631,8 +631,7 @@ config ARCH_SUPPORTS_LTO_CLANG_THIN
>  config HAS_LTO_CLANG
> def_bool y
> # Clang >= 11: https://github.com/ClangBuiltLinux/linux/issues/510
> -   depends on CC_IS_CLANG && CLANG_VERSION >= 11 && LD_IS_LLD
> -   depends on $(success,test $(LLVM_IAS) -eq 1)
> +   depends on CC_IS_CLANG && CLANG_VERSION >= 11 && LD_IS_LLD && 
> AS_IS_LLVM
> depends on $(success,$(NM) --help | head -n 1 | grep -qi llvm)
> depends on $(success,$(AR) --help | head -n 1 | grep -qi llvm)
> depends on ARCH_SUPPORTS_LTO_CLANG
> diff --git a/init/Kconfig b/init/Kconfig
> index 22946fe5ded9..f76e5a44e4fe 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -41,6 +41,18 @@ config CLANG_VERSION
> default $(cc-version) if CC_IS_CLANG
> default 0
>
> +config AS_IS_GNU
> +   def_bool $(success,test "$(as-name)" = GNU)
> +
> +config AS_IS_LLVM
> +   def_bool $(success,test "$(as-name)" = LLVM)
> +
> +config AS_VERSION
> +   int
> +   # If it is integrated assembler, the version is the same as Clang's 
> one.
> +   default CLANG_VERSION if AS_IS_LLVM
> +   default $(as-version)
> +
>  config LD_IS_BFD
> def_bool $(success,test "$(ld-name)" = BFD)
>
> diff --git a/scripts/Kconfig.include b/scripts/Kconfig.include
> index 58fdb5308725..0496efd6e117 100644
> --- a/scripts/Kconfig.include
> +++ b/scripts/Kconfig.include
> @@ -45,6 +45,12 @@ $(error-if,$(success,test -z "$(cc-info)"),Sorry$(comma) 
> this compiler is not su
>  cc-name := $(shell,set -- $(cc-info) && echo $1)
>  cc-version := $(shell,set -- $(cc-info) && echo $2)
>
> +# Get the assembler name, version, and error out if it is not supported.
> +as-info := $(shell,$(srctree)/scripts/as-version.sh $(CC) $(CLANG_FLAGS))
> +$(error-if,$(success,test -z "$(as-info)"),Sorry$(comma) this assembler is 
> not supported.)
> +as-name := $(shell,set -- $(as-info) && echo $1)
> +as-version := $(shell,set -- $(as-info) && echo $2)
> +
>  # Get the linker name, version, and error out if it is not supported.
>  ld-info := $(shell,$(srctree)/scripts/ld-version.sh $(LD))
>  $(error-if,$(success,test -z "$(ld-info)"),Sorry$(comma) this linker is not 
> supported.)
> diff --git a/scripts/as-version.sh b/scripts/as-version.sh
> new file mode 100755
> index ..205d8b9fc4d4
> --- /dev/null
> +++ b/scripts/as-version.sh
> @@ -0,0 +1,77 @@
> +#!/bin/sh
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Print the assembler name and its version in a 5 or 6-digit 

RE: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

2021-03-04 Thread jingle
HI Dmitry:

In this case (in the newer parts behavior regarding need to reset after
powering them on), it is consistent with the original driver behavior with
any new or old device
(be called data->ops->initialize(client) : usleep(100) , etc.. , because
this times "data->quirks" is equal 0 at probe state.) 

THANKS
JINGLE

-Original Message-
From: Dmitry Torokhov [mailto:dmitry.torok...@gmail.com] 
Sent: Friday, March 05, 2021 8:55 AM
To: jingle.wu
Cc: linux-kernel; linux-input; phoenix; dave.wang; josh.chen
Subject: Re: [PATCH] Input: elan_i2c - Reduce the resume time for new devices

Hi Jingle,

On Tue, Mar 02, 2021 at 09:04:57AM +0800, jingle.wu wrote:
> HI Dmitry:
> 
> So data->ops->initialize(client) essentially performs reset of the 
> controller (we may want to rename it even) and as far as I understand 
> you would want to avoid resetting the controller on newer devices, 
> right?
> 
> -> YES
> 
> My question is how the behavior of older devices differs from the new ones 
> (do they stay in an "undefined" state at power up) and whether it is 
> possible to determine if controller is in operating mode. For example, 
> what would happen on older devices if we call elan_query_product() 
> below without resetting the controller?
> 
> -> But there may be other problems, because ELAN can't test all the 
> -> older devices, so we use a quirk to separate this part.

OK, but could you please tell me what exactly was changed in the newer parts
behavior regarding need to reset after powering them on?

Thanks.

--
Dmitry



Re: [PATCH 1/2] units: Add the HZ_PER_KHZ macro

2021-03-04 Thread Andrew Morton
On Thu, 4 Mar 2021 13:41:27 +0100 Daniel Lezcano  
wrote:

> > Also, why make them signed types?  Negative Hz is physically
> > nonsensical.  If that upsets some code somewhere because it was dealing
> > with signed types then, well, that code needed fixing anyway.
> > 
> > Ditto MILLIWATT_PER_WATT and friends, sigh.
> 
> At first glance, converting to unsigned long should not hurt the
> users of this macro.
> 
> The current series introduces the macro and its usage while converting
> the existing type.
> 
> Is it ok if I send a separate series to change the units from L to UL?

That's the way to do it...


Re: [f2fs-dev] [PATCH] f2fs: fix a redundant call to f2fs_balance_fs if an error occurs

2021-03-04 Thread Chao Yu

On 2021/3/4 17:21, Colin King wrote:

From: Colin Ian King 

The uninitialized variable dn.node_changed does not get set when a
call to f2fs_get_node_page fails.  This uninitialized value gets used
in the call to f2fs_balance_fs(), which may or may not balance
dirty node and dentry pages depending on the uninitialized state of
the variable. Fix this by only calling f2fs_balance_fs if err is
not set.

Thanks to Jaegeuk Kim for suggesting an appropriate fix.

Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 2a3407607028 ("f2fs: call f2fs_balance_fs only when node was changed")
Signed-off-by: Colin Ian King 


Reviewed-by: Chao Yu 

Thanks,


Re: XDP socket rings, and LKMM litmus tests

2021-03-04 Thread Boqun Feng
On Thu, Mar 04, 2021 at 11:11:42AM -0500, Alan Stern wrote:
> On Thu, Mar 04, 2021 at 02:33:32PM +0800, Boqun Feng wrote:
> 
> > Right, I was thinking about something unrelated.. but how about the
> > following case:
> > 
> > local_v = 
> > r1 = READ_ONCE(*x); // f
> > 
> > if (r1 == 1) {
> > local_v =  // e
> > } else {
> > local_v =  // d
> > }
> > 
> > p = READ_ONCE(local_v); // g
> > 
> > r2 = READ_ONCE(*p);   // h
> > 
> > if r1 == 1, we definitely think we have:
> > 
> > f ->ctrl e ->rfi g ->addr h
> > 
> > , and if we treat ctrl;rfi as "to-r", then we have "f" happens before
> > "h". However compile can optimze the above as:
> > 
> > local_v = 
> > 
> > r1 = READ_ONCE(*x); // f
> > 
> > if (r1 != 1) {
> > local_v =  // d
> > }
> > 
> > p = READ_ONCE(local_v); // g
> > 
> > r2 = READ_ONCE(*p);   // h
> > 
> > , and when this gets executed, I don't think we have the guarantee that
> > "f" happens before "h", because the CPU can do an optimistic read for "g"
> > and "h".
> 
> In your example, which accesses are supposed to be to actual memory and 
> which to registers?  Also, remember that the memory model assumes the 

Given that we use READ_ONCE() on local_v, local_v should be a memory
location but only accessed by this thread.
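The compiler transformation quoted above can be modeled in plain C. The sketch below (single-threaded, with invented names `a`/`b`; the original example's right-hand sides were lost in the archive) shows that the original branch and the collapsed form leave `local_v` holding the same pointer for every `r1`, which is exactly why a compiler is free to rewrite one into the other and silently drop the `e ->rfi g` link the litmus test appears to contain:

```c
#include <assert.h>

static int a, b;

/* The shape of the original code: both arms of the branch store to local_v. */
static int *pick_original(int r1)
{
	int *local_v = &a;

	if (r1 == 1)
		local_v = &b;	/* e */
	else
		local_v = &a;	/* d */

	return local_v;		/* g: the READ_ONCE(local_v) in the example */
}

/* The transformed code: the taken-arm store is hoisted above the branch,
 * so only the else arm remains and the ctrl;rfi link from f disappears. */
static int *pick_optimized(int r1)
{
	int *local_v = &b;

	if (r1 != 1)
		local_v = &a;	/* d: only the else arm remains */

	return local_v;
}
```

Since the two functions are observably equivalent for a single thread, nothing in the C semantics stops the compiler from applying the rewrite.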

> hardware does not reorder loads if there is an address dependency 
> between them.
> 

Right, so "g" won't be reordered after "h".

> > Part of this is because, when we take plain accesses into consideration, we
> > won't guarantee a read-from or other relation exists if compiler
> > optimizations happen.
> > 
> > Maybe I'm missing something subtle, but just try to think through the
> > effect of making dep; rfi as "to-r".
> 
> Forget about local variables for the time being and just consider
> 
>   dep ; [Plain] ; rfi
> 
> For example:
> 
>   A: r1 = READ_ONCE(x);
>  y = r1;
>   B: r2 = READ_ONCE(y);
> 
> Should B be ordered after A?  I don't see how any CPU could hope to 
> execute B before A, but maybe I'm missing something.
> 

Agreed.

> There's another twist, connected with the fact that herd7 can't detect 
> control dependencies caused by unexecuted code.  If we have:
> 
>   A: r1 = READ_ONCE(x);
>   if (r1)
>   WRITE_ONCE(y, 5);
>   r2 = READ_ONCE(y);
>   B: WRITE_ONCE(z, r2);
> 
> then in executions where x == 0, herd7 doesn't see any control 
> dependency.  But CPUs do see control dependencies whenever there is a 
> conditional branch, whether the branch is taken or not, and so they will 
> never reorder B before A.
> 

Right, because B in this example is a write. But what if B is a read
that depends on r2, as in my example? Let y be a pointer to a memory
location, initialized to a valid value (pointing to a valid memory
location); your example then becomes:

A: r1 = READ_ONCE(x);
if (r1)
WRITE_ONCE(y, 5);
C: r2 = READ_ONCE(y);
B: r3 = READ_ONCE(*r2);

, then A doesn't have a control dependency to B, because A and B are a
read+read pair. So B can be reordered before A, right?

> One last thing to think about: My original assessment of Björn's problem 
> wasn't right, because the dep in (dep ; rfi) doesn't include control 
> dependencies.  Only data and address.  So I believe that the LKMM 

Ah, right. I was missing that part (ctrl is not in dep). So I guess my
example is pointless for the question we are discussing here ;-(

> wouldn't consider A to be ordered before B in this example even if x 
> was nonzero.

Yes, and similar to my example (changing B to a read).

I did try to run my example with herd and got confused: no matter
whether I treated dep; [Plain]; rfi as "to-r", I got the same result
telling me the reorder can happen. Now the reason is clear: this is a
ctrl; rfi, not a dep; rfi.

Thanks so much for walking with me on this ;-)

Regards,
Boqun

> 
> Alan


[PATCH v2 17/17] KVM: x86/mmu: WARN on NULL pae_root or lm_root, or bad shadow root level

2021-03-04 Thread Sean Christopherson
WARN if KVM is about to dereference a NULL pae_root or lm_root when
loading an MMU, and convert the BUG() on a bad shadow_root_level into a
WARN (now that errors are handled cleanly).  With nested NPT, botching
the level and sending KVM down the wrong path is all too easy, and the
on-demand allocation of pae_root and lm_root means bugs crash the host.
Obviously, KVM could unconditionally allocate the roots, but that's
arguably a worse failure mode as it would potentially corrupt the guest
instead of crashing it.
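The WARN-and-bail pattern the patch adopts can be sketched in userspace C (names and the level range are invented for illustration): instead of BUG(), which would crash the host, the code warns once and returns -EIO so the error propagates and, in KVM's case, only the offending guest is killed:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

static int warned;

/* Analog of the patched path: reject a bad shadow root level with a
 * one-shot warning and -EIO, rather than a fatal BUG(). */
static int load_root(int shadow_root_level)
{
	if (shadow_root_level < 3 || shadow_root_level > 5) {
		if (!warned) {	/* WARN_ONCE() analog */
			warned = 1;
			fprintf(stderr, "Bad TDP root level = %d\n",
				shadow_root_level);
		}
		return -EIO;
	}
	return 0;	/* root loaded successfully */
}
```

The caller sees a normal error return and can unwind, which is the "arguably better failure mode" the commit message argues for.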

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index bceff7d815c3..eb9dd8144fa5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3253,6 +3253,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true);
mmu->root_hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
+   if (WARN_ON_ONCE(!mmu->pae_root))
+   return -EIO;
+
for (i = 0; i < 4; ++i) {
WARN_ON_ONCE(mmu->pae_root[i]);
 
@@ -3262,8 +3265,10 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
   shadow_me_mask;
}
mmu->root_hpa = __pa(mmu->pae_root);
-   } else
-   BUG();
+   } else {
+   WARN_ONCE(1, "Bad TDP root level = %d\n", shadow_root_level);
+   return -EIO;
+   }
 
/* root_pgd is ignored for direct MMUs. */
mmu->root_pgd = 0;
@@ -3307,6 +3312,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
goto set_root_pgd;
}
 
+   if (WARN_ON_ONCE(!mmu->pae_root))
+   return -EIO;
+
/*
 * We shadow a 32 bit page table. This may be a legacy 2-level
 * or a PAE 3-level page table. In either case we need to be aware that
@@ -3316,6 +3324,9 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
+   if (WARN_ON_ONCE(!mmu->lm_root))
+   return -EIO;
+
mmu->lm_root[0] = __pa(mmu->pae_root) | pm_mask;
}
 
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 16/17] KVM: x86/mmu: Sync roots after MMU load iff load as successful

2021-03-04 Thread Sean Christopherson
For clarity, explicitly skip syncing roots if the MMU load failed
instead of relying on the !VALID_PAGE check in kvm_mmu_sync_roots().

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 4f66ca0f5f68..bceff7d815c3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4850,10 +4850,11 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
else
r = mmu_alloc_shadow_roots(vcpu);
write_unlock(&vcpu->kvm->mmu_lock);
+   if (r)
+   goto out;
 
kvm_mmu_sync_roots(vcpu);
-   if (r)
-   goto out;
+
kvm_mmu_load_pgd(vcpu);
static_call(kvm_x86_tlb_flush_current)(vcpu);
 out:
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 15/17] KVM: x86/mmu: Unexport MMU load/unload functions

2021-03-04 Thread Sean Christopherson
Unexport the MMU load and unload helpers now that they are no longer
used (incorrectly) in vendor code.

Opportunistically move the kvm_mmu_sync_roots() declaration into mmu.h,
it should not be exposed to vendor code.

No functional change intended.

Signed-off-by: Sean Christopherson 
---
 arch/x86/include/asm/kvm_host.h | 3 ---
 arch/x86/kvm/mmu.h  | 4 
 arch/x86/kvm/mmu/mmu.c  | 2 --
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6db60ea8ee5b..2da6c9f5935a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1592,9 +1592,6 @@ void kvm_update_dr7(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn);
 void __kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu);
-int kvm_mmu_load(struct kvm_vcpu *vcpu);
-void kvm_mmu_unload(struct kvm_vcpu *vcpu);
-void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
ulong roots_to_free);
 gpa_t translate_nested_gpa(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access,
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 72b0f66073dc..67e8c7c7a6ce 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -74,6 +74,10 @@ bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu);
 int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
u64 fault_address, char *insn, int insn_len);
 
+int kvm_mmu_load(struct kvm_vcpu *vcpu);
+void kvm_mmu_unload(struct kvm_vcpu *vcpu);
+void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu);
+
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
if (likely(vcpu->arch.mmu->root_hpa != INVALID_PAGE))
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index fa1aca21f6eb..4f66ca0f5f68 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4859,7 +4859,6 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
 out:
return r;
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_load);
 
 void kvm_mmu_unload(struct kvm_vcpu *vcpu)
 {
@@ -4868,7 +4867,6 @@ void kvm_mmu_unload(struct kvm_vcpu *vcpu)
kvm_mmu_free_roots(vcpu, &vcpu->arch.guest_mmu, KVM_MMU_ROOTS_ALL);
WARN_ON(VALID_PAGE(vcpu->arch.guest_mmu.root_hpa));
 }
-EXPORT_SYMBOL_GPL(kvm_mmu_unload);
 
 static bool need_remote_flush(u64 old, u64 new)
 {
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 13/17] KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch

2021-03-04 Thread Sean Christopherson
Defer reloading the MMU until after a successful EPTP switch.  The VMFUNC
instruction itself is executed in the previous EPTP context, so any side
effects, e.g. updating RIP, should occur in the old context.  Practically
speaking, this bug is benign as VMX doesn't touch the MMU when skipping
an emulated instruction, nor does queuing a single-step #DB.  No other
post-switch side effects exist.
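The deferred-request mechanism relied on here can be modeled with a minimal sketch (invented names; a toy version of the KVM_REQ_* pattern): instead of reloading immediately in the old context, the code sets a request bit that the normal vcpu-entry path services later, after all instruction side effects have completed:

```c
#include <assert.h>

static unsigned long requests;
#define REQ_MMU_RELOAD (1UL << 0)

static int reloads;	/* counts deferred reloads actually performed */

static void make_request(unsigned long req)
{
	requests |= req;	/* analog of kvm_make_request() */
}

/* The "normal path": pending requests are serviced on the next entry,
 * well after RIP etc. have been updated by instruction emulation. */
static void vcpu_enter(void)
{
	if (requests & REQ_MMU_RELOAD) {
		requests &= ~REQ_MMU_RELOAD;
		reloads++;	/* the MMU reload happens here */
	}
}
```

The request bit is idempotent, so setting it multiple times before the next entry still yields a single reload.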

Fixes: 41ab93727467 ("KVM: nVMX: Emulate EPTP switching for the L1 hypervisor")
Cc: sta...@vger.kernel.org
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/vmx/nested.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index fdd80dd8e781..81f609886c8b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -5473,16 +5473,11 @@ static int nested_vmx_eptp_switching(struct kvm_vcpu *vcpu,
if (!nested_vmx_check_eptp(vcpu, new_eptp))
return 1;
 
-   kvm_mmu_unload(vcpu);
mmu->ept_ad = accessed_dirty;
mmu->mmu_role.base.ad_disabled = !accessed_dirty;
vmcs12->ept_pointer = new_eptp;
-   /*
-* TODO: Check what's the correct approach in case
-* mmu reload fails. Currently, we just let the next
-* reload potentially fail
-*/
-   kvm_mmu_reload(vcpu);
+
+   kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
}
 
return 0;
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 14/17] KVM: x86: Defer the MMU unload to the normal path on an global INVPCID

2021-03-04 Thread Sean Christopherson
Defer unloading the MMU after an INVPCID until the instruction emulation
has completed, i.e. until after RIP has been updated.

On VMX, this is a benign bug as VMX doesn't touch the MMU when skipping
an emulated instruction.  However, on SVM, if nrip is disabled, the
emulator is used to skip an instruction, which would lead to fireworks
if the emulator were invoked without a valid MMU.

Fixes: eb4b248e152d ("kvm: vmx: Support INVPCID in shadow paging mode")
Cc: sta...@vger.kernel.org
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 828de7d65074..7b0adebec1ef 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11531,7 +11531,7 @@ int kvm_handle_invpcid(struct kvm_vcpu *vcpu, unsigned long type, gva_t gva)
 
fallthrough;
case INVPCID_TYPE_ALL_INCL_GLOBAL:
-   kvm_mmu_unload(vcpu);
+   kvm_make_request(KVM_REQ_MMU_RELOAD, vcpu);
return kvm_skip_emulated_instruction(vcpu);
 
default:
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 12/17] KVM: SVM: Don't strip the C-bit from CR2 on #PF interception

2021-03-04 Thread Sean Christopherson
Don't strip the C-bit from the faulting address on an intercepted #PF,
the address is a virtual address, not a physical address.

Fixes: 0ede79e13224 ("KVM: SVM: Clear C-bit from the page fault address")
Cc: sta...@vger.kernel.org
Cc: Brijesh Singh 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/svm/svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4769cf8bf2fd..dfc8fe231e8b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1907,7 +1907,7 @@ static int pf_interception(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
 
-   u64 fault_address = __sme_clr(svm->vmcb->control.exit_info_2);
+   u64 fault_address = svm->vmcb->control.exit_info_2;
u64 error_code = svm->vmcb->control.exit_info_1;
 
return kvm_handle_page_fault(vcpu, error_code, fault_address,
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 11/17] KVM: x86/mmu: Mark the PAE roots as decrypted for shadow paging

2021-03-04 Thread Sean Christopherson
Set the PAE roots used as decrypted to play nice with SME when KVM is
using shadow paging.  Explicitly skip setting the C-bit when loading
CR3 for PAE shadow paging, even though it's completely ignored by the
CPU.  The extra documentation is nice to have.

Note, there are several subtleties at play with NPT.  In addition to
legacy shadow paging, the PAE roots are used for SVM's NPT when either
KVM is 32-bit (uses PAE paging) or KVM is 64-bit and shadowing 32-bit
NPT.  However, 32-bit Linux, and thus KVM, doesn't support SME.  And
64-bit KVM can happily set the C-bit in CR3.  This also means that
keeping __sme_set(root) for 32-bit KVM when NPT is enabled is
conceptually wrong, but functionally ok since SME is 64-bit only.
Leave it as is to avoid unnecessary pollution.
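A minimal model of the SME address helpers may make the commit message concrete (the C-bit position here is illustrative; real kernels derive `sme_me_mask` from CPUID): `__sme_set()` ORs the encryption bit into a physical address and `__sme_clr()` masks it off. The patch's point is that a 32-bit PAE CR3 has no room for such a high bit, so the PDPTEs must instead live in a page that is explicitly decrypted:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed C-bit position, for illustration only. */
static const uint64_t sme_me_mask = 1ULL << 47;

static uint64_t __sme_set(uint64_t pa)	/* mark address as encrypted */
{
	return pa | sme_me_mask;
}

static uint64_t __sme_clr(uint64_t pa)	/* strip the encryption bit */
{
	return pa & ~sme_me_mask;
}
```

Since the helpers are pure bit operations, `__sme_set()` on a value whose encrypted form cannot be expressed (a 32-bit CR3) would silently truncate, which is why the shadow-paging path avoids it entirely.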

Fixes: d0ec49d4de90 ("kvm/x86/svm: Support Secure Memory Encryption within KVM")
Cc: sta...@vger.kernel.org
Cc: Brijesh Singh 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 24 +++-
 arch/x86/kvm/svm/svm.c |  7 +--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 09310c35fcf4..fa1aca21f6eb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "trace.h"
@@ -3377,7 +3378,10 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->lm_root))
return -EIO;
 
-   /* Unlike 32-bit NPT, the PDP table doesn't need to be in low mem. */
+   /*
+* Unlike 32-bit NPT, the PDP table doesn't need to be in low mem, and
+* doesn't need to be decrypted.
+*/
pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
if (!pae_root)
return -ENOMEM;
@@ -5264,6 +5268,8 @@ slot_handle_leaf(struct kvm *kvm, struct kvm_memory_slot *memslot,
 
 static void free_mmu_pages(struct kvm_mmu *mmu)
 {
+   if (!tdp_enabled && mmu->pae_root)
+   set_memory_encrypted((unsigned long)mmu->pae_root, 1);
free_page((unsigned long)mmu->pae_root);
free_page((unsigned long)mmu->lm_root);
 }
@@ -5301,6 +5307,22 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
for (i = 0; i < 4; ++i)
mmu->pae_root[i] = 0;
 
+   /*
+* CR3 is only 32 bits when PAE paging is used, thus it's impossible to
+* get the CPU to treat the PDPTEs as encrypted.  Decrypt the page so
+* that KVM's writes and the CPU's reads get along.  Note, this is
+* only necessary when using shadow paging, as 64-bit NPT can get at
+* the C-bit even when shadowing 32-bit NPT, and SME isn't supported
+* by 32-bit kernels (when KVM itself uses 32-bit NPT).
+*/
+   if (!tdp_enabled)
+   set_memory_decrypted((unsigned long)mmu->pae_root, 1);
+   else
+   WARN_ON_ONCE(shadow_me_mask);
+
+   for (i = 0; i < 4; ++i)
+   mmu->pae_root[i] = 0;
+
return 0;
 }
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 54610270f66a..4769cf8bf2fd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3908,15 +3908,18 @@ static void svm_load_mmu_pgd(struct kvm_vcpu *vcpu, unsigned long root,
struct vcpu_svm *svm = to_svm(vcpu);
unsigned long cr3;
 
-   cr3 = __sme_set(root);
if (npt_enabled) {
-   svm->vmcb->control.nested_cr3 = cr3;
+   svm->vmcb->control.nested_cr3 = __sme_set(root);
vmcb_mark_dirty(svm->vmcb, VMCB_NPT);
 
/* Loading L2's CR3 is handled by enter_svm_guest_mode.  */
if (!test_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail))
return;
cr3 = vcpu->arch.cr3;
+   } else if (vcpu->arch.mmu->shadow_root_level >= PT64_ROOT_4LEVEL) {
+   cr3 = __sme_set(root);
+   } else {
+   cr3 = root;
}
 
svm->vmcb->save.cr3 = cr3;
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 09/17] KVM: x86/mmu: Use '0' as the one and only value for an invalid PAE root

2021-03-04 Thread Sean Christopherson
Use '0' as the one and only value to denote an invalid pae_root, instead
of a mix of '0' and INVALID_PAGE.
Unlike root_hpa, the pae_roots hold permission bits and thus are
guaranteed to be non-zero.  Having to deal with both values leads to
bugs, e.g. failing to set back to INVALID_PAGE, warning on the wrong
value, etc...

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b82c1b0d6d6e..dbf7f0395e4b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3197,11 +3197,14 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
(mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
} else if (mmu->pae_root) {
-   for (i = 0; i < 4; ++i)
-   if (mmu->pae_root[i] != 0)
-   mmu_free_root_page(kvm,
-  &mmu->pae_root[i],
-  &invalid_list);
+   for (i = 0; i < 4; ++i) {
+   if (!mmu->pae_root[i])
+   continue;
+
+   mmu_free_root_page(kvm, &mmu->pae_root[i],
+  &invalid_list);
+   mmu->pae_root[i] = 0;
+   }
}
mmu->root_hpa = INVALID_PAGE;
mmu->root_pgd = 0;
@@ -3250,8 +3253,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
mmu->root_hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
for (i = 0; i < 4; ++i) {
-   WARN_ON_ONCE(mmu->pae_root[i] &&
-VALID_PAGE(mmu->pae_root[i]));
+   WARN_ON_ONCE(mmu->pae_root[i]);
 
root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
  i << 30, PT32_ROOT_LEVEL, true);
@@ -3316,7 +3318,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
}
 
for (i = 0; i < 4; ++i) {
-   WARN_ON_ONCE(mmu->pae_root[i] && VALID_PAGE(mmu->pae_root[i]));
+   WARN_ON_ONCE(mmu->pae_root[i]);
 
if (mmu->root_level == PT32E_ROOT_LEVEL) {
if (!(pdptrs[i] & PT_PRESENT_MASK)) {
@@ -3438,7 +3440,7 @@ void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
for (i = 0; i < 4; ++i) {
hpa_t root = vcpu->arch.mmu->pae_root[i];
 
-   if (root && VALID_PAGE(root)) {
+   if (root && !WARN_ON_ONCE(!VALID_PAGE(root))) {
root &= PT64_BASE_ADDR_MASK;
sp = to_shadow_page(root);
mmu_sync_children(vcpu, sp);
@@ -5296,7 +5298,7 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 
mmu->pae_root = page_address(page);
for (i = 0; i < 4; ++i)
-   mmu->pae_root[i] = INVALID_PAGE;
+   mmu->pae_root[i] = 0;
 
return 0;
 }
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 10/17] KVM: x86/mmu: Set the C-bit in the PDPTRs and LM pseudo-PDPTRs

2021-03-04 Thread Sean Christopherson
Set the C-bit in SPTEs that are set outside of the normal MMU flows,
specifically the PDPDTRs and the handful of special cased "LM root"
entries, all of which are shadow paging only.

Note, the direct-mapped-root PDPTR handling is needed for the scenario
where paging is disabled in the guest, in which case KVM uses a direct
mapped MMU even though TDP is disabled.

Fixes: d0ec49d4de90 ("kvm/x86/svm: Support Secure Memory Encryption within KVM")
Cc: sta...@vger.kernel.org
Cc: Brijesh Singh 
Cc: Tom Lendacky 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dbf7f0395e4b..09310c35fcf4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3257,7 +3257,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 
root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
  i << 30, PT32_ROOT_LEVEL, true);
-   mmu->pae_root[i] = root | PT_PRESENT_MASK;
+   mmu->pae_root[i] = root | PT_PRESENT_MASK |
+  shadow_me_mask;
}
mmu->root_hpa = __pa(mmu->pae_root);
} else
@@ -3310,7 +3311,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * or a PAE 3-level page table. In either case we need to be aware that
 * the shadow page table may be a PAE or a long mode page table.
 */
-   pm_mask = PT_PRESENT_MASK;
+   pm_mask = PT_PRESENT_MASK | shadow_me_mask;
if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 08/17] KVM: x86/mmu: Fix and unconditionally enable WARNs to detect PAE leaks

2021-03-04 Thread Sean Christopherson
Exempt NULL PAE roots from the check to detect leaks, since
kvm_mmu_free_roots() doesn't set them back to INVALID_PAGE.  Stop hiding
the WARNs to detect PAE root leaks behind MMU_WARN_ON, the hidden WARNs
obviously didn't do their job given the hilarious number of bugs that
could lead to PAE roots being leaked, not to mention the above false
positive.

Opportunistically delete a warning on root_hpa being valid, there's
nothing special about 4/5-level shadow pages that warrants a WARN.

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9fc2b46f8541..b82c1b0d6d6e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3250,7 +3250,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
mmu->root_hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
for (i = 0; i < 4; ++i) {
-   MMU_WARN_ON(VALID_PAGE(mmu->pae_root[i]));
+   WARN_ON_ONCE(mmu->pae_root[i] &&
+VALID_PAGE(mmu->pae_root[i]));
 
root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
  i << 30, PT32_ROOT_LEVEL, true);
@@ -3296,8 +3297,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * write-protect the guests page table root.
 */
if (mmu->root_level >= PT64_ROOT_4LEVEL) {
-   MMU_WARN_ON(VALID_PAGE(mmu->root_hpa));
-
root = mmu_alloc_root(vcpu, root_gfn, 0,
  mmu->shadow_root_level, false);
mmu->root_hpa = root;
@@ -3317,7 +3316,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
}
 
for (i = 0; i < 4; ++i) {
-   MMU_WARN_ON(VALID_PAGE(mmu->pae_root[i]));
+   WARN_ON_ONCE(mmu->pae_root[i] && VALID_PAGE(mmu->pae_root[i]));
 
if (mmu->root_level == PT32E_ROOT_LEVEL) {
if (!(pdptrs[i] & PT_PRESENT_MASK)) {
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 07/17] KVM: x86/mmu: Check PDPTRs before allocating PAE roots

2021-03-04 Thread Sean Christopherson
Check the validity of the PDPTRs before allocating any of the PAE roots,
otherwise a bad PDPTR will cause KVM to leak any previously allocated
roots.
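The shape of the fix is a general resource-management pattern: validate every input up front, so that when validation fails no allocation has happened yet and there is nothing to unwind. A hedged userspace sketch (all names invented; `entry_valid()` stands in for mmu_check_root()):

```c
#include <stdlib.h>

static int entry_valid(int e)
{
	return e >= 0;	/* stand-in for the real PDPTR present/reserved checks */
}

static int build_roots(const int *entries, int n, void **roots)
{
	int i;

	/* Pre-validate: if anything is bad, bail before allocating. */
	for (i = 0; i < n; i++)
		if (!entry_valid(entries[i]))
			return -1;

	for (i = 0; i < n; i++) {
		roots[i] = malloc(16);
		if (!roots[i]) {	/* unwind on allocation failure */
			while (i--)
				free(roots[i]);
			return -1;
		}
	}
	return 0;
}
```

The buggy version interleaved validate-then-allocate per entry and returned early on a bad entry, leaking the roots already allocated; hoisting the checks removes that failure mode.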

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7ebfbc77b050..9fc2b46f8541 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3269,7 +3269,7 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 {
struct kvm_mmu *mmu = vcpu->arch.mmu;
-   u64 pdptr, pm_mask;
+   u64 pdptrs[4], pm_mask;
gfn_t root_gfn, root_pgd;
hpa_t root;
int i;
@@ -3280,6 +3280,17 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
if (mmu_check_root(vcpu, root_gfn))
return 1;
 
+   if (mmu->root_level == PT32E_ROOT_LEVEL) {
+   for (i = 0; i < 4; ++i) {
+   pdptrs[i] = mmu->get_pdptr(vcpu, i);
+   if (!(pdptrs[i] & PT_PRESENT_MASK))
+   continue;
+
+   if (mmu_check_root(vcpu, pdptrs[i] >> PAGE_SHIFT))
+   return 1;
+   }
+   }
+
/*
 * Do we shadow a long mode page table? If so we need to
 * write-protect the guests page table root.
@@ -3309,14 +3320,11 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
MMU_WARN_ON(VALID_PAGE(mmu->pae_root[i]));
 
if (mmu->root_level == PT32E_ROOT_LEVEL) {
-   pdptr = mmu->get_pdptr(vcpu, i);
-   if (!(pdptr & PT_PRESENT_MASK)) {
+   if (!(pdptrs[i] & PT_PRESENT_MASK)) {
mmu->pae_root[i] = 0;
continue;
}
-   root_gfn = pdptr >> PAGE_SHIFT;
-   if (mmu_check_root(vcpu, root_gfn))
-   return 1;
+   root_gfn = pdptrs[i] >> PAGE_SHIFT;
}
 
root = mmu_alloc_root(vcpu, root_gfn, i << 30,
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 06/17] KVM: x86/mmu: Ensure MMU pages are available when allocating roots

2021-03-04 Thread Sean Christopherson
Hold the mmu_lock for write for the entire duration of allocating and
initializing an MMU's roots.  This ensures there are MMU pages available
and thus prevents root allocations from failing.  That in turn fixes a
bug where KVM would fail to free valid PAE roots if one of the later
roots failed to allocate.

Add a comment to make_mmu_pages_available() to call out that the limit
is a soft limit, e.g. KVM will temporarily exceed the threshold if a
page fault allocates multiple shadow pages and there was only one page
"available".

Note, KVM _still_ leaks the PAE roots if the guest PDPTR checks fail.
This will be addressed in a future commit.
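The "soft limit" idea from the added comment can be sketched as a toy page accounting model (names and the limit invented): the check only guarantees at least one page is available before work starts, and a single operation may then consume several pages and temporarily exceed the limit, which is tolerated rather than tracked exactly:

```c
#include <assert.h>

#define LIMIT 8		/* arbitrary, like the default mmu page limit */

static int used;

static int pages_available(void)
{
	return LIMIT - used;
}

/* Soft check: succeed if at least one page is available. */
static int make_pages_available(void)
{
	return pages_available() > 0 ? 0 : -1;	/* -ENOSPC analog */
}

/* One operation may consume several pages (e.g. four PAE roots),
 * temporarily pushing 'used' past LIMIT -- the soft part. */
static int alloc_roots(int n)
{
	if (make_pages_available())
		return -1;
	used += n;
	return 0;
}
```

Being exact here would require knowing up front how many pages each path consumes, which, as the comment says, is more trouble than it's worth.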

Cc: Ben Gardon 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 50 +++---
 arch/x86/kvm/mmu/tdp_mmu.c | 23 --
 2 files changed, 25 insertions(+), 48 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dd9d5cc13a46..7ebfbc77b050 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2403,6 +2403,15 @@ static int make_mmu_pages_available(struct kvm_vcpu *vcpu)
 
kvm_mmu_zap_oldest_mmu_pages(vcpu->kvm, KVM_REFILL_PAGES - avail);
 
+   /*
+* Note, this check is intentionally soft, it only guarantees that one
+* page is available, while the caller may end up allocating as many as
+* four pages, e.g. for PAE roots or for 5-level paging.  Temporarily
+* exceeding the (arbitrary by default) limit will not harm the host,
+* being too aggressive may unnecessarily kill the guest, and getting an
+* exact count is far more trouble than it's worth, especially in the
+* page fault paths.
+*/
if (!kvm_mmu_available_pages(vcpu->kvm))
return -ENOSPC;
return 0;
@@ -3220,16 +3229,9 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva,
 {
struct kvm_mmu_page *sp;
 
-   write_lock(&vcpu->kvm->mmu_lock);
-
-   if (make_mmu_pages_available(vcpu)) {
-   write_unlock(&vcpu->kvm->mmu_lock);
-   return INVALID_PAGE;
-   }
sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL);
++sp->root_count;
 
-   write_unlock(&vcpu->kvm->mmu_lock);
return __pa(sp->spt);
 }
 
@@ -3242,16 +3244,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 
if (is_tdp_mmu_enabled(vcpu->kvm)) {
root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu);
-
-   if (!VALID_PAGE(root))
-   return -ENOSPC;
mmu->root_hpa = root;
} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
-   root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level,
- true);
-
-   if (!VALID_PAGE(root))
-   return -ENOSPC;
+   root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level, true);
mmu->root_hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
for (i = 0; i < 4; ++i) {
@@ -3259,8 +3254,6 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 
root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
  i << 30, PT32_ROOT_LEVEL, true);
-   if (!VALID_PAGE(root))
-   return -ENOSPC;
mmu->pae_root[i] = root | PT_PRESENT_MASK;
}
mmu->root_hpa = __pa(mmu->pae_root);
@@ -3296,8 +3289,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 
root = mmu_alloc_root(vcpu, root_gfn, 0,
  mmu->shadow_root_level, false);
-   if (!VALID_PAGE(root))
-   return -ENOSPC;
mmu->root_hpa = root;
goto set_root_pgd;
}
@@ -3316,6 +3307,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 
for (i = 0; i < 4; ++i) {
MMU_WARN_ON(VALID_PAGE(mmu->pae_root[i]));
+
if (mmu->root_level == PT32E_ROOT_LEVEL) {
pdptr = mmu->get_pdptr(vcpu, i);
if (!(pdptr & PT_PRESENT_MASK)) {
@@ -3329,8 +3321,6 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 
root = mmu_alloc_root(vcpu, root_gfn, i << 30,
  PT32_ROOT_LEVEL, false);
-   if (!VALID_PAGE(root))
-   return -ENOSPC;
mmu->pae_root[i] = root | pm_mask;
}
 
@@ -3394,14 +3384,6 @@ static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
return 0;
 }
 
-static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
-{
-   if (vcpu->arch.mmu->direct_map)
-   return mmu_alloc_direct_roots(vcpu);
-   else
-   return mmu_alloc_shadow_roots(vcpu);
-}
-
 void kvm_mmu_sync_roots(struct kvm_vcpu *vcpu)
 

[PATCH v2 05/17] KVM: x86/mmu: Allocate pae_root and lm_root pages in dedicated helper

2021-03-04 Thread Sean Christopherson
Move the on-demand allocation of the pae_root and lm_root pages, used by
nested NPT for 32-bit L1s, into a separate helper.  This will allow a
future patch to hold mmu_lock while allocating the non-special roots so
that make_mmu_pages_available() can be checked once at the start of root
allocation, and thus avoid having to deal with failure in the middle of
root allocation.

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 84 +++---
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7cb5fb5d2d4d..dd9d5cc13a46 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3308,38 +3308,10 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * the shadow page table may be a PAE or a long mode page table.
 */
pm_mask = PT_PRESENT_MASK;
-   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
+   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
-   /*
-* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
-* tables are allocated and initialized at root creation as there is no
-* equivalent level in the guest's NPT to shadow.  Allocate the tables
-* on demand, as running a 32-bit L1 VMM is very rare.  Unlike 32-bit
-* NPT, the PDP table doesn't need to be in low mem.  Preallocate the
-* pages so that the PAE roots aren't leaked on failure.
-*/
-   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL &&
-   (!mmu->pae_root || !mmu->lm_root)) {
-   u64 *lm_root, *pae_root;
-
-   if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->lm_root))
-   return -EIO;
-
-   pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (!pae_root)
-   return -ENOMEM;
-
-   lm_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (!lm_root) {
-   free_page((unsigned long)pae_root);
-   return -ENOMEM;
-   }
-
-   mmu->pae_root = pae_root;
-   mmu->lm_root = lm_root;
-
-   lm_root[0] = __pa(mmu->pae_root) | pm_mask;
+   mmu->lm_root[0] = __pa(mmu->pae_root) | pm_mask;
}
 
for (i = 0; i < 4; ++i) {
@@ -3373,6 +3345,55 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
return 0;
 }
 
+static int mmu_alloc_special_roots(struct kvm_vcpu *vcpu)
+{
+   struct kvm_mmu *mmu = vcpu->arch.mmu;
+   u64 *lm_root, *pae_root;
+
+   /*
+* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
+* tables are allocated and initialized at root creation as there is no
+* equivalent level in the guest's NPT to shadow.  Allocate the tables
+* on demand, as running a 32-bit L1 VMM on 64-bit KVM is very rare.
+*/
+   if (mmu->direct_map || mmu->root_level >= PT64_ROOT_4LEVEL ||
+   mmu->shadow_root_level < PT64_ROOT_4LEVEL)
+   return 0;
+
+   /*
+* This mess only works with 4-level paging and needs to be updated to
+* work with 5-level paging.
+*/
+   if (WARN_ON_ONCE(mmu->shadow_root_level != PT64_ROOT_4LEVEL))
+   return -EIO;
+
+   if (mmu->pae_root && mmu->lm_root)
+   return 0;
+
+   /*
+* The special roots should always be allocated in concert.  Yell and
+* bail if KVM ends up in a state where only one of the roots is valid.
+*/
+   if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->lm_root))
+   return -EIO;
+
+   /* Unlike 32-bit NPT, the PDP table doesn't need to be in low mem. */
+   pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!pae_root)
+   return -ENOMEM;
+
+   lm_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!lm_root) {
+   free_page((unsigned long)pae_root);
+   return -ENOMEM;
+   }
+
+   mmu->pae_root = pae_root;
+   mmu->lm_root = lm_root;
+
+   return 0;
+}
+
 static int mmu_alloc_roots(struct kvm_vcpu *vcpu)
 {
if (vcpu->arch.mmu->direct_map)
@@ -4820,6 +4841,9 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
int r;
 
r = mmu_topup_memory_caches(vcpu, !vcpu->arch.mmu->direct_map);
+   if (r)
+   goto out;
+   r = mmu_alloc_special_roots(vcpu);
if (r)
goto out;
r = mmu_alloc_roots(vcpu);
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 04/17] KVM: x86/mmu: Allocate the lm_root before allocating PAE roots

2021-03-04 Thread Sean Christopherson
Allocate lm_root before the PAE roots so that the PAE roots aren't
leaked if the memory allocation for the lm_root happens to fail.

Note, KVM can still leak PAE roots if mmu_check_root() fails on a guest's
PDPTR, or if mmu_alloc_root() fails due to MMU pages not being available.
Those issues will be fixed in future commits.
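As an aside, the ordering fix described above is an instance of a common pattern: allocate into locals and commit them to the owning structure only once every allocation has succeeded, unwinding earlier allocations on failure. A minimal userspace sketch of the pattern (the struct and names are illustrative, not KVM's):

```c
#include <assert.h>
#include <stdlib.h>

struct roots {
	void *pae_root;
	void *lm_root;
};

/*
 * Allocate both pages or neither: keep results in locals, free the
 * earlier allocation if a later one fails, and only then publish
 * them, so nothing leaks and the struct is never half-initialized.
 */
static int alloc_roots(struct roots *r)
{
	void *pae_root, *lm_root;

	lm_root = calloc(1, 4096);
	if (!lm_root)
		return -1;

	pae_root = calloc(1, 4096);
	if (!pae_root) {
		free(lm_root);	/* unwind the first allocation */
		return -1;
	}

	r->pae_root = pae_root;
	r->lm_root = lm_root;
	return 0;
}
```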

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 64 --
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c4f8e59f596c..7cb5fb5d2d4d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3308,21 +3308,38 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * the shadow page table may be a PAE or a long mode page table.
 */
pm_mask = PT_PRESENT_MASK;
-   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
+   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
-   /*
-* Allocate the page for the PDPTEs when shadowing 32-bit NPT
-* with 64-bit only when needed.  Unlike 32-bit NPT, it doesn't
-* need to be in low mem.  See also lm_root below.
-*/
-   if (!mmu->pae_root) {
-   WARN_ON_ONCE(!tdp_enabled);
+   /*
+* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
+* tables are allocated and initialized at root creation as there is no
+* equivalent level in the guest's NPT to shadow.  Allocate the tables
+* on demand, as running a 32-bit L1 VMM is very rare.  Unlike 32-bit
+* NPT, the PDP table doesn't need to be in low mem.  Preallocate the
+* pages so that the PAE roots aren't leaked on failure.
+*/
+   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL &&
+   (!mmu->pae_root || !mmu->lm_root)) {
+   u64 *lm_root, *pae_root;
 
-   mmu->pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (!mmu->pae_root)
-   return -ENOMEM;
+   if (WARN_ON_ONCE(!tdp_enabled || mmu->pae_root || mmu->lm_root))
+   return -EIO;
+
+   pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!pae_root)
+   return -ENOMEM;
+
+   lm_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!lm_root) {
+   free_page((unsigned long)pae_root);
+   return -ENOMEM;
}
+
+   mmu->pae_root = pae_root;
+   mmu->lm_root = lm_root;
+
+   lm_root[0] = __pa(mmu->pae_root) | pm_mask;
}
 
for (i = 0; i < 4; ++i) {
@@ -3344,30 +3361,11 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
return -ENOSPC;
mmu->pae_root[i] = root | pm_mask;
}
-   mmu->root_hpa = __pa(mmu->pae_root);
-
-   /*
-* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
-* tables are allocated and initialized at MMU creation as there is no
-* equivalent level in the guest's NPT to shadow.  Allocate the tables
-* on demand, as running a 32-bit L1 VMM is very rare.  The PDP is
-* handled above (to share logic with PAE), deal with the PML4 here.
-*/
-   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
-   if (mmu->lm_root == NULL) {
-   u64 *lm_root;
-
-   lm_root = (void*)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (!lm_root)
-   return -ENOMEM;
-
-   lm_root[0] = __pa(mmu->pae_root) | pm_mask;
-
-   mmu->lm_root = lm_root;
-   }
 
+   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL)
mmu->root_hpa = __pa(mmu->lm_root);
-   }
+   else
+   mmu->root_hpa = __pa(mmu->pae_root);
 
 set_root_pgd:
mmu->root_pgd = root_pgd;
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 03/17] KVM: x86/mmu: Capture 'mmu' in a local variable when allocating roots

2021-03-04 Thread Sean Christopherson
Grab 'mmu' and do s/vcpu->arch.mmu/mmu to shorten line lengths and yield
smaller diffs when moving code around in future cleanup without forcing
the new code to use the same ugly pattern.

No functional change intended.

Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 58 ++
 1 file changed, 30 insertions(+), 28 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2ed3fac1244e..c4f8e59f596c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3235,7 +3235,8 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva,
 
 static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 {
-   u8 shadow_root_level = vcpu->arch.mmu->shadow_root_level;
+   struct kvm_mmu *mmu = vcpu->arch.mmu;
+   u8 shadow_root_level = mmu->shadow_root_level;
hpa_t root;
unsigned i;
 
@@ -3244,42 +3245,43 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 
if (!VALID_PAGE(root))
return -ENOSPC;
-   vcpu->arch.mmu->root_hpa = root;
+   mmu->root_hpa = root;
} else if (shadow_root_level >= PT64_ROOT_4LEVEL) {
root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level,
  true);
 
if (!VALID_PAGE(root))
return -ENOSPC;
-   vcpu->arch.mmu->root_hpa = root;
+   mmu->root_hpa = root;
} else if (shadow_root_level == PT32E_ROOT_LEVEL) {
for (i = 0; i < 4; ++i) {
-   MMU_WARN_ON(VALID_PAGE(vcpu->arch.mmu->pae_root[i]));
+   MMU_WARN_ON(VALID_PAGE(mmu->pae_root[i]));
 
root = mmu_alloc_root(vcpu, i << (30 - PAGE_SHIFT),
  i << 30, PT32_ROOT_LEVEL, true);
if (!VALID_PAGE(root))
return -ENOSPC;
-   vcpu->arch.mmu->pae_root[i] = root | PT_PRESENT_MASK;
+   mmu->pae_root[i] = root | PT_PRESENT_MASK;
}
-   vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
+   mmu->root_hpa = __pa(mmu->pae_root);
} else
BUG();
 
/* root_pgd is ignored for direct MMUs. */
-   vcpu->arch.mmu->root_pgd = 0;
+   mmu->root_pgd = 0;
 
return 0;
 }
 
 static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 {
+   struct kvm_mmu *mmu = vcpu->arch.mmu;
u64 pdptr, pm_mask;
gfn_t root_gfn, root_pgd;
hpa_t root;
int i;
 
-   root_pgd = vcpu->arch.mmu->get_guest_pgd(vcpu);
+   root_pgd = mmu->get_guest_pgd(vcpu);
root_gfn = root_pgd >> PAGE_SHIFT;
 
if (mmu_check_root(vcpu, root_gfn))
@@ -3289,14 +3291,14 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * Do we shadow a long mode page table? If so we need to
 * write-protect the guests page table root.
 */
-   if (vcpu->arch.mmu->root_level >= PT64_ROOT_4LEVEL) {
-   MMU_WARN_ON(VALID_PAGE(vcpu->arch.mmu->root_hpa));
+   if (mmu->root_level >= PT64_ROOT_4LEVEL) {
+   MMU_WARN_ON(VALID_PAGE(mmu->root_hpa));
 
root = mmu_alloc_root(vcpu, root_gfn, 0,
- vcpu->arch.mmu->shadow_root_level, false);
+ mmu->shadow_root_level, false);
if (!VALID_PAGE(root))
return -ENOSPC;
-   vcpu->arch.mmu->root_hpa = root;
+   mmu->root_hpa = root;
goto set_root_pgd;
}
 
@@ -3306,7 +3308,7 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * the shadow page table may be a PAE or a long mode page table.
 */
pm_mask = PT_PRESENT_MASK;
-   if (vcpu->arch.mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
+   if (mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
/*
@@ -3314,21 +3316,21 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * with 64-bit only when needed.  Unlike 32-bit NPT, it doesn't
 * need to be in low mem.  See also lm_root below.
 */
-   if (!vcpu->arch.mmu->pae_root) {
+   if (!mmu->pae_root) {
WARN_ON_ONCE(!tdp_enabled);
 
-   vcpu->arch.mmu->pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (!vcpu->arch.mmu->pae_root)
+   mmu->pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!mmu->pae_root)
return -ENOMEM;
}
}
 
for (i = 0; i < 4; ++i) {
-   

[PATCH v2 02/17] KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit

2021-03-04 Thread Sean Christopherson
Allocate the so called pae_root page on-demand, along with the lm_root
page, when shadowing 32-bit NPT with 64-bit NPT, i.e. when running a
32-bit L1.  KVM currently only allocates the page when NPT is disabled,
or when L0 is 32-bit (using PAE paging).

Note, there is an existing memory leak involving the MMU roots, as KVM
fails to free the PAE roots on failure.  This will be addressed in a
future commit.

Fixes: ee6268ba3a68 ("KVM: x86: Skip pae_root shadow allocation if tdp enabled")
Fixes: b6b80c78af83 ("KVM: x86/mmu: Allocate PAE root array when using SVM's 32-bit NPT")
Cc: sta...@vger.kernel.org
Reviewed-by: Ben Gardon 
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 44 --
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0987cc1d53eb..2ed3fac1244e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3187,14 +3187,14 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
(mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
mmu_free_root_page(kvm, &mmu->root_hpa, &invalid_list);
-   } else {
+   } else if (mmu->pae_root) {
for (i = 0; i < 4; ++i)
if (mmu->pae_root[i] != 0)
mmu_free_root_page(kvm,
   &mmu->pae_root[i],
   &invalid_list);
-   mmu->root_hpa = INVALID_PAGE;
}
+   mmu->root_hpa = INVALID_PAGE;
mmu->root_pgd = 0;
}
 
@@ -3306,9 +3306,23 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
 * the shadow page table may be a PAE or a long mode page table.
 */
pm_mask = PT_PRESENT_MASK;
-   if (vcpu->arch.mmu->shadow_root_level == PT64_ROOT_4LEVEL)
+   if (vcpu->arch.mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
pm_mask |= PT_ACCESSED_MASK | PT_WRITABLE_MASK | PT_USER_MASK;
 
+   /*
+* Allocate the page for the PDPTEs when shadowing 32-bit NPT
+* with 64-bit only when needed.  Unlike 32-bit NPT, it doesn't
+* need to be in low mem.  See also lm_root below.
+*/
+   if (!vcpu->arch.mmu->pae_root) {
+   WARN_ON_ONCE(!tdp_enabled);
+
+   vcpu->arch.mmu->pae_root = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+   if (!vcpu->arch.mmu->pae_root)
+   return -ENOMEM;
+   }
+   }
+
for (i = 0; i < 4; ++i) {
MMU_WARN_ON(VALID_PAGE(vcpu->arch.mmu->pae_root[i]));
if (vcpu->arch.mmu->root_level == PT32E_ROOT_LEVEL) {
@@ -3331,21 +3345,19 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
vcpu->arch.mmu->root_hpa = __pa(vcpu->arch.mmu->pae_root);
 
/*
-* If we shadow a 32 bit page table with a long mode page
-* table we enter this path.
+* When shadowing 32-bit or PAE NPT with 64-bit NPT, the PML4 and PDP
+* tables are allocated and initialized at MMU creation as there is no
+* equivalent level in the guest's NPT to shadow.  Allocate the tables
+* on demand, as running a 32-bit L1 VMM is very rare.  The PDP is
+* handled above (to share logic with PAE), deal with the PML4 here.
 */
if (vcpu->arch.mmu->shadow_root_level == PT64_ROOT_4LEVEL) {
if (vcpu->arch.mmu->lm_root == NULL) {
-   /*
-* The additional page necessary for this is only
-* allocated on demand.
-*/
-
u64 *lm_root;
 
lm_root = (void*)get_zeroed_page(GFP_KERNEL_ACCOUNT);
-   if (lm_root == NULL)
-   return 1;
+   if (!lm_root)
+   return -ENOMEM;
 
lm_root[0] = __pa(vcpu->arch.mmu->pae_root) | pm_mask;
 
@@ -5248,9 +5260,11 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 * while the PDP table is a per-vCPU construct that's allocated at MMU
 * creation.  When emulating 32-bit mode, cr3 is only 32 bits even on
 * x86_64.  Therefore we need to allocate the PDP table in the first
-* 4GB of memory, which happens to fit the DMA32 zone.  Except for
-* SVM's 32-bit NPT support, TDP paging doesn't use PAE paging and can
-* skip allocating the PDP table.
+* 4GB of memory, which happens to fit the DMA32 zone.  TDP paging
+* generally doesn't use PAE paging and can skip 

[PATCH v2 01/17] KVM: nSVM: Set the shadow root level to the TDP level for nested NPT

2021-03-04 Thread Sean Christopherson
Override the shadow root level in the MMU context when configuring
NPT for shadowing nested NPT.  The level is always tied to the TDP level
of the host, not whatever level the guest happens to be using.

Fixes: 096586fda522 ("KVM: nSVM: Correctly set the shadow NPT root level in its MMU role")
Cc: sta...@vger.kernel.org
Signed-off-by: Sean Christopherson 
---
 arch/x86/kvm/mmu/mmu.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c462062d36aa..0987cc1d53eb 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4618,12 +4618,17 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, u32 cr0, u32 cr4, u32 efer,
struct kvm_mmu *context = &vcpu->arch.guest_mmu;
union kvm_mmu_role new_role = kvm_calc_shadow_npt_root_page_role(vcpu);
 
-   context->shadow_root_level = new_role.base.level;
-
__kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base, false, false);
 
-   if (new_role.as_u64 != context->mmu_role.as_u64)
+   if (new_role.as_u64 != context->mmu_role.as_u64) {
shadow_mmu_init_context(vcpu, context, cr0, cr4, efer, new_role);
+
+   /*
+* Override the level set by the common init helper, nested TDP
+* always uses the host's TDP configuration.
+*/
+   context->shadow_root_level = new_role.base.level;
+   }
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_npt_mmu);
 
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 00/17] KVM: x86/mmu: Lots of bug fixes

2021-03-04 Thread Sean Christopherson
Fix nested NPT (nSVM) with 32-bit L1 and SME with shadow paging, which
are completely broken.  Opportunistically fix theoretical bugs related to
prematurely reloading/unloading the MMU.

If nNPT is enabled, L1 can crash the host simply by using 32-bit NPT to
trigger a null pointer dereference on pae_root.

SME with shadow paging (including nNPT) fails to set the C-bit in the
shadow pages that don't go through standard MMU flows (PDPTPRs and the
PML4 used by nNPT to shadow legacy NPT).  It also fails to account for
CR3[63:32], and thus the C-bit, being ignored outside of 64-bit mode.

Patches 01 and 02 fix the null pointer bugs.

Patches 03-09 fix mostly-benign related memory leaks.

Patches 10-12 fix the SME shadow paging bugs, which are also what led me to
the nNPT null pointer bugs.

Patches 13 and 14 fix theoretical bugs with PTP_SWITCH and INVPCID that
I found when auditing flows that touch the MMU context.

Patches 14-17 do additional clean up to hopefully make it harder to
introduce bugs in the future.

On the plus side, I finally understand why KVM supports shadowing 2-level
page tables with 4-level page tables...

Based on kvm/queue, commit fe5f0041c026 ("KVM/SVM: Move vmenter.S exception
fixups out of line").  The null pointer fixes cherry-pick cleanly onto
kvm/master, haven't tried the other bug fixes (I doubt they're worth
backporting even though I tagged 'em with stable).

v2:
  - Collect a review from Ben (did not include his review of patch 03
since the patch and its direct dependencies were changed).
  - Move pae_root and lm_root allocation to a separate helper to avoid
sleeping via get_zeroed_page() while holding mmu_lock.
  - Add a patch to grab 'mmu' in a local variable.
  - Remove the BUILD_BUG_ON() in make_mmu_pages_available() since the
final check wouldn't actually guarantee 4 pages were "available".
Instead, add a comment about the limit being soft.

v1:
  - https://lkml.kernel.org/r/20210302184540.2829328-1-sea...@google.com
 
Sean Christopherson (17):
  KVM: nSVM: Set the shadow root level to the TDP level for nested NPT
  KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with
64-bit
  KVM: x86/mmu: Capture 'mmu' in a local variable when allocating roots
  KVM: x86/mmu: Allocate the lm_root before allocating PAE roots
  KVM: x86/mmu: Allocate pae_root and lm_root pages in dedicated helper
  KVM: x86/mmu: Ensure MMU pages are available when allocating roots
  KVM: x86/mmu: Check PDPTRs before allocating PAE roots
  KVM: x86/mmu: Fix and unconditionally enable WARNs to detect PAE leaks
  KVM: x86/mmu: Use '0' as the one and only value for an invalid PAE
root
  KVM: x86/mmu: Set the C-bit in the PDPTRs and LM pseudo-PDPTRs
  KVM: x86/mmu: Mark the PAE roots as decrypted for shadow paging
  KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
  KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch
  KVM: x86: Defer the MMU unload to the normal path on an global INVPCID
  KVM: x86/mmu: Unexport MMU load/unload functions
  KVM: x86/mmu: Sync roots after MMU load iff load as successful
  KVM: x86/mmu: WARN on NULL pae_root or lm_root, or bad shadow root
level

 arch/x86/include/asm/kvm_host.h |   3 -
 arch/x86/kvm/mmu.h  |   4 +
 arch/x86/kvm/mmu/mmu.c  | 273 
 arch/x86/kvm/mmu/tdp_mmu.c  |  23 +--
 arch/x86/kvm/svm/svm.c  |   9 +-
 arch/x86/kvm/vmx/nested.c   |   9 +-
 arch/x86/kvm/x86.c  |   2 +-
 7 files changed, 192 insertions(+), 131 deletions(-)

-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH] perf pmu: Validate raw event with sysfs exported format bits

2021-03-04 Thread Andi Kleen
> Single event:
> 
>   # ./perf stat -e cpu/r031234/ -a -- sleep 1
>   WARNING: event config '31234' not valid (bits 16 17 not supported by kernel)!

Just noticed that again. Can you please print the original event as 
string in the message? While it's obvious with rXXX which one it is, 
it may not be obvious with offsetted fields (like umask=0x121212),
and hard to find in a long command line.

-Andi


Re: [PATCH net-next v2 3/3] net: phy: broadcom: Allow BCM54210E to configure APD

2021-03-04 Thread Vladimir Oltean
On Tue, Mar 02, 2021 at 07:37:34PM -0800, Florian Fainelli wrote:
> Took a while but for the 54210E reference board here are the numbers,
> your mileage will vary depending on the supplies, regulator efficiency
> and PCB design around the PHY obviously:
> 
> BMCR.PDOWN:   86.12 mW
> auto-power down:  77.84 mW

Quite curious that the APD power is lower than the normal BMCR.PDOWN
value. As far as my understanding goes, when in APD mode, the PHY even
wakes up from time to time to send pulses to the link partner?

> auto-power-down, DLL disabled:  30.83 mW

The jump from simple APD to APD with DLL disabled is pretty big.
Correct me if I'm wrong, but there's an intermediary step which was not
measured, where the CLK125 is disabled but the internal DLL (Delay
Locked Loop?) is still enabled. I think powering off the internal DLL
also implies powering off the CLK125 pin, at least that's how the PHY
driver treats things at the moment. But we don't know if the huge
reduction in power is due just to CLK125 or the DLL (it's more likely
it's due to both, in equal amounts).

Anyway, it's great to have some results which tell us exactly what is
worthwhile and what isn't. In other news, I've added the BCM5464 to the
list of PHYs with APD and I didn't see any issues thus far.

> IDDQ-low power:9.85 mW (requires a RESETn toggle)
> IDDQ with soft recovery:  10.75 mW
> 
> Interestingly, the 50212E that I am using requires writing the PDOWN bit
> and only that bit (not a RMW) in order to get in a correct state, both
> LEDs keep flashing when that happens, fixes coming.
> 
> When net-next opens back up I will submit patches to support IDDQ with
> soft recovery since that is clearly much better than the standard power
> down and it does not require a RESETn toggle.

Iddq must be the quiescent supply current, isn't it (but in that case,
I'm a bit confused to not see a value in mA)? Is it an actual operating
mode (I don't see anything about that mentioned in the BCM5464 sheet)
and if it is, what is there exactly to support?


linux-next: manual merge of the tty tree with the powerpc-fixes tree

2021-03-04 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tty tree got a conflict in:

  drivers/tty/hvc/hvcs.c

between commit:

  386a966f5ce7 ("vio: make remove callback return void")

from the powerpc-fixes tree and commit:

  fb8d350c291c ("tty: hvc, drop unneeded forward declarations")

from the tty tree.

I fixed it up (they both removed the forward declaration of
hvcs_remove(), but the latter removed more) and can carry the fix as
necessary. This is now fixed as far as linux-next is concerned, but any
non trivial conflicts should be mentioned to your upstream maintainer
when your tree is submitted for merging.  You may also want to consider
cooperating with the maintainer of the conflicting tree to minimise any
particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell


pgp0ie7aP4o0S.pgp
Description: OpenPGP digital signature


Re: [PATCH v2] sched: Optimize __calc_delta.

2021-03-04 Thread Josh Don
On Thu, Mar 4, 2021 at 9:34 AM Nick Desaulniers  wrote:
>
>
> Hi Josh, Thanks for helping get this patch across the finish line.
> Would you mind updating the commit message to point to
> https://bugs.llvm.org/show_bug.cgi?id=20197?

Sure thing, just saw that it got marked as a dup.

Peter, since you've already pulled the patch, can you modify the
commit message directly? Nick also recommended dropping the
punctuation in the commit oneline.

> >  #include 
> > +#include 
>
> This hunk of the patch is curious.  I assume that bitops.h is needed
> for fls(); if so, why not #include it in kernel/sched/fair.c?
> Otherwise this potentially hurts compile time for all TUs that include
> kernel/sched/sched.h.

bitops.h is already included in sched.h via another include, so this
was just meant to make it more explicit. Motivation for putting it
here vs. fair.c was 325ea10c080940.
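For readers following along, the kernel's fls() ("find last set") returns the 1-based index of the most significant set bit, with fls(0) == 0. A userspace approximation using a GCC/Clang builtin, purely for illustration (the real definitions live in the per-arch asm/bitops.h):

```c
#include <assert.h>

/* 1-based index of the most significant set bit; 0 when no bit is set.
 * __builtin_clz() is undefined for 0, hence the guard. */
static int fls_sketch(unsigned int x)
{
	return x ? 32 - __builtin_clz(x) : 0;
}
```

For example, fls_sketch(1) is 1 and fls_sketch(0x80000000u) is 32, matching the kernel helper's documented behavior.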


mm/filemap.c:2409:9: warning: stack frame size of 2704 bytes in function 'filemap_read'

2021-03-04 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
head:   3cb60ee6323968b694208c4cbd56a7176396e931
commit: 87fa0f3eb267eed966ee194907bc15376c1b758f mm/filemap: rename generic_file_buffered_read to filemap_read
date:   8 days ago
config: powerpc-randconfig-r023-20210304 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project eec7f8f7b1226be422a76542cb403d02538f453a)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=87fa0f3eb267eed966ee194907bc15376c1b758f
git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 87fa0f3eb267eed966ee194907bc15376c1b758f
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from mm/filemap.c:20:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :224:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:557:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)  readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from mm/filemap.c:20:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :228:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:558:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)  readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from mm/filemap.c:20:
   In file included from include/linux/kernel_stat.h:9:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:619:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~
   arch/powerpc/include/asm/io.h:616:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :232:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:559:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
   ~^
   In file included from mm/filemap.c:20:
   In file included from include

Re: [PATCH v8 4/5] dma-buf: system_heap: Add drm pagepool support to system heap

2021-03-04 Thread kernel test robot
Hi John,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.12-rc1]
[cannot apply to linux/master drm-intel/for-linux-next drm-tip/drm-tip]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/John-Stultz/Generic-page-pool-deferred-freeing-for-system-dmabuf-heap/20210305-072137
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 3cb60ee6323968b694208c4cbd56a7176396e931
config: openrisc-randconfig-p001-20210304 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/6a9bf19a9ed9e5058a11a3e3217530fdf2675e0c
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review John-Stultz/Generic-page-pool-deferred-freeing-for-system-dmabuf-heap/20210305-072137
git checkout 6a9bf19a9ed9e5058a11a3e3217530fdf2675e0c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=openrisc

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   or1k-linux-ld: arch/openrisc/kernel/entry.o: in function 
`_external_irq_handler':
   (.text+0x804): undefined reference to `printk'
   (.text+0x804): relocation truncated to fit: R_OR1K_INSN_REL_26 against 
undefined symbol `printk'
   or1k-linux-ld: drivers/dma-buf/heaps/system_heap.o: in function 
`system_heap_create':
>> system_heap.c:(.text+0x15c): undefined reference to `drm_page_pool_init'
   system_heap.c:(.text+0x15c): relocation truncated to fit: R_OR1K_INSN_REL_26 
against undefined symbol `drm_page_pool_init'
>> or1k-linux-ld: system_heap.c:(.text+0x16c): undefined reference to 
>> `drm_page_pool_init'
   system_heap.c:(.text+0x16c): relocation truncated to fit: R_OR1K_INSN_REL_26 
against undefined symbol `drm_page_pool_init'
   or1k-linux-ld: system_heap.c:(.text+0x17c): undefined reference to 
`drm_page_pool_init'
   system_heap.c:(.text+0x17c): relocation truncated to fit: R_OR1K_INSN_REL_26 
against undefined symbol `drm_page_pool_init'
   or1k-linux-ld: drivers/dma-buf/heaps/system_heap.o: in function 
`system_heap_dma_buf_release':
>> system_heap.c:(.text+0xf78): undefined reference to `drm_page_pool_add'
   system_heap.c:(.text+0xf78): relocation truncated to fit: R_OR1K_INSN_REL_26 
against undefined symbol `drm_page_pool_add'
   or1k-linux-ld: drivers/dma-buf/heaps/system_heap.o: in function 
`system_heap_allocate':
>> system_heap.c:(.text+0x11f8): undefined reference to `drm_page_pool_remove'
   system_heap.c:(.text+0x11f8): relocation truncated to fit: 
R_OR1K_INSN_REL_26 against undefined symbol `drm_page_pool_remove'

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for DRM_PAGE_POOL
   Depends on HAS_IOMEM && DRM
   Selected by
   - DMABUF_HEAPS_SYSTEM && DMABUF_HEAPS


"cppcheck warnings: (new ones prefixed by >>)"
>> drivers/dma-buf/heaps/system_heap.c:290:2: warning: int result is returned 
>> as long value. If the return value is long to avoid loss of information, 
>> then you have loss of information. [truncLongCastReturn]
return 1 << pool->order;
^
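The cppcheck complaint above is about the shift being performed in int and widened only afterwards: `1 << pool->order` is computed in 32-bit int arithmetic, so the result is converted to the wider return type too late, and orders of 31 or more overflow before the conversion. A hedged sketch of the issue and the usual fix (the function names are illustrative, not the driver's):

```c
#include <assert.h>

/* Shift performed in int first: overflows/truncates for order >= 31,
 * even though the return type is wider.  Kept only to illustrate the
 * warning; do not call it with large orders. */
static unsigned long pool_pages_bad(unsigned int order)
{
	return 1 << order;	/* int arithmetic, widened too late */
}

/* Shift in the destination width from the start. */
static unsigned long pool_pages_good(unsigned int order)
{
	return 1UL << order;
}
```

Both functions agree for small orders (e.g. order 12 yields 4096); only the `1UL` form remains well-defined as the order grows toward the width of unsigned long.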

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH] Input: elan_i2c - Reduce the resume time for new dev ices

2021-03-04 Thread Dmitry Torokhov
Hi Jingle,

On Tue, Mar 02, 2021 at 09:04:57AM +0800, jingle.wu wrote:
> HI Dmitry:
> 
> So data->ops->initialize(client) essentially performs reset of the
> controller (we may want to rename it even) and as far as I understand
> you would want to avoid resetting the controller on newer devices,
> right?
> 
> -> YES
> 
> My question is how behavior of older devices differ from the new ones
> (are they stay in "undefined" state at power up) and whether it is
> possible to determine if controller is in operating mode. For example,
> what would happen on older devices if we call elan_query_product() below
> without resetting the controller?
> 
-> But there may be other problems, because ELAN can't test all the older
-> devices, so use quirk to divide this part.

OK, but could you please tell me what exactly was changed in the newer
parts behavior regarding need to reset after powering them on?

Thanks.

-- 
Dmitry


Re: [PATCH v3 05/11] x86/entry: Convert ret_from_fork to C

2021-03-04 Thread Brian Gerst
On Thu, Mar 4, 2021 at 2:16 PM Andy Lutomirski  wrote:
>
> ret_from_fork is written in asm, slightly differently, for x86_32 and
> x86_64.  Convert it to C.
>
> This is a straight conversion without any particular cleverness.  As a
> further cleanup, the code that sets up the ret_from_fork argument registers
> could be adjusted to put the arguments in the correct registers.

An alternative would be to stash the function pointer and argument in
the pt_regs of the new kthread task, and test for PF_KTHREAD instead.
That would eliminate the need to pass those two values to the C
version of ret_from_fork().

> This will cause the ORC unwinder to find pt_regs even for kernel threads on
> x86_64.  This seems harmless.
>
> The 32-bit comment above the now-deleted schedule_tail_wrapper was
> obsolete: the encode_frame_pointer mechanism (see copy_thread()) solves the
> same problem more cleanly.
>
> Cc: Josh Poimboeuf 
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/entry/common.c  | 23 ++
>  arch/x86/entry/entry_32.S| 51 +---
>  arch/x86/entry/entry_64.S| 33 +
>  arch/x86/include/asm/switch_to.h |  2 +-
>  arch/x86/kernel/process.c|  2 +-
>  arch/x86/kernel/process_32.c |  2 +-
>  arch/x86/kernel/unwind_orc.c |  2 +-
>  7 files changed, 43 insertions(+), 72 deletions(-)
>
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 95776f16c1cb..ef1c65938a6b 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -214,6 +214,29 @@ SYSCALL_DEFINE0(ni_syscall)
> return -ENOSYS;
>  }
>
> +void ret_from_fork(struct task_struct *prev,
> +  int (*kernel_thread_fn)(void *),
> +  void *kernel_thread_arg,
> +  struct pt_regs *user_regs);
> +
> +__visible void noinstr ret_from_fork(struct task_struct *prev,
> +int (*kernel_thread_fn)(void *),
> +void *kernel_thread_arg,
> +struct pt_regs *user_regs)
> +{
> +   instrumentation_begin();
> +
> +   schedule_tail(prev);
> +
> +   if (kernel_thread_fn) {

This should have an unlikely(), as kernel threads should be the rare case.

--
Brian Gerst


Re: [PATCH v10 0/9] Add support for x509 certs with NIST P384/256/192 keys

2021-03-04 Thread Stefan Berger

Herbert,

   you can take patches 1-8. 9 will not apply without Nayna's series as 
mentioned in the patch.


Regards,
   Stefan


On 3/4/21 7:51 PM, Stefan Berger wrote:

From: Stefan Berger 

This series of patches adds support for x509 certificates signed by a CA
that uses NIST P384, P256 or P192 keys for signing. It also adds support
for certificates where the public key is one of these types of keys. The
math for ECDSA signature verification is added, as well as the math for
the fast mmod operation for NIST P384.

Since self-signed certificates are verified upon loading, the following
script can be used for testing of NIST P256 keys:

k=$(keyctl newring test @u)

while :; do
for hash in sha1 sha224 sha256 sha384 sha512; do
openssl req \
-x509 \
-${hash} \
-newkey ec \
-pkeyopt ec_paramgen_curve:prime256v1 \
-keyout key.pem \
-days 365 \
-subj '/CN=test' \
-nodes \
-outform der \
-out cert.der
keyctl padd asymmetric testkey $k < cert.der
if [ $? -ne 0 ]; then
echo "ERROR"
exit 1
fi
done
done

Ecdsa support also works with restricted keyrings where an RSA key is used
to sign a NIST P384/256/192 key. Scripts for testing are here:

https://github.com/stefanberger/eckey-testing

The ECDSA signature verification will be used by IMA Appraisal where ECDSA
file signatures stored in RPM packages will use substantially less space
than if RSA signatures were to be used.

Further, a patch is added that allows kernel modules to be signed with a NIST
p384 key.

Stefan and Saulo

v9->v10:
   - rearranged order of patches to have crypto patches first
   - moved hunk from patch 3 to patch 2 to avoid compilation warning due to
 unused symbol

v8->v9:
   - Appended Saulo's patches
   - Appended patch to support kernel modules signed with NIST p384 key. This
 patch requires Nayna's series here: https://lkml.org/lkml/2021/2/18/856

v7->v8:
   - patch 3/4: Do not determine key algo using parse_OID in public_key.c
 but do this when parsing the certificate. This addresses an issue
 with certain build configurations where OID_REGISTRY is not available
 as 'Reported-by: kernel test robot '.

v6->v7:
   - Moved some OID definitions to patch 1 for bisectability
   - Applied R-b's
   
v5->v6:

   - moved ecdsa code into its own module ecdsa_generic built from ecdsa.c
   - added script-generated test vectors for NIST P256 & P192 and all hashes
   - parsing of OID that contain header with new parse_oid()

v4->v5:
   - registering crypto support under names ecdsa-nist-p256/p192 following
 Herbert Xu's suggestion in other thread
   - appended IMA ECDSA support patch

v3->v4:
   - split off of ecdsa crypto part; registering akcipher as "ecdsa" and
 deriving used curve from digits in parsed key

v2->v3:
   - patch 2 now includes linux/scatterlist.h

v1->v2:
   - using faster vli_sub rather than newly added vli_mod_fast to 'reduce'
 result
   - rearranged switch statements to follow after RSA
   - 3rd patch from 1st posting is now 1st patch


Saulo Alessandre (4):
   crypto: Add NIST P384 curve parameters
   crypto: Add math to support fast NIST P384
   ecdsa: Register NIST P384 and extend test suite
   x509: Add OID for NIST P384 and extend parser for it

Stefan Berger (5):
   crypto: Add support for ECDSA signature verification
   x509: Detect sm2 keys by their parameters OID
   x509: Add support for parsing x509 certs with ECDSA keys
   ima: Support EC keys for signature verification
   certs: Add support for using elliptic curve keys for signing modules

  certs/Kconfig |  22 ++
  certs/Makefile|  14 +
  crypto/Kconfig|  10 +
  crypto/Makefile   |   6 +
  crypto/asymmetric_keys/pkcs7_parser.c |   4 +
  crypto/asymmetric_keys/public_key.c   |   4 +-
  crypto/asymmetric_keys/x509_cert_parser.c |  49 ++-
  crypto/asymmetric_keys/x509_public_key.c  |   4 +-
  crypto/ecc.c  | 281 +-
  crypto/ecc.h  |  31 +-
  crypto/ecc_curve_defs.h   |  32 ++
  crypto/ecdsa.c| 400 
  crypto/ecdsasignature.asn1|   4 +
  crypto/testmgr.c  |  18 +
  crypto/testmgr.h  | 424 ++
  include/crypto/ecdh.h |   1 +
  include/keys/asymmetric-type.h|   6 +
  include/linux/oid_registry.h  |  10 +-
  lib/oid_registry.c|  13 +
  security/integrity/digsig_asymmetric.c|  30 +-
  20 

[PATCH v6] ARM: Implement SLS mitigation

2021-03-04 Thread Jian Cai
This patch adds CONFIG_HARDEN_SLS_ALL, which can be used to turn on
-mharden-sls=all. This mitigates the straight-line speculation
vulnerability: speculative execution of the instruction following some
unconditional jumps. Note that -mharden-sls= has the other options
listed below; this config turns on the strongest one.

all: enable all mitigations against Straight Line Speculation that are 
implemented.
none: disable all mitigations against Straight Line Speculation.
retbr: enable the mitigation against Straight Line Speculation for RET and BR 
instructions.
blr: enable the mitigation against Straight Line Speculation for BLR 
instructions.
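For context, the retbr mitigation works by inserting a speculation-barrier
sequence after RET and BR instructions, and blr works by routing BLR calls
through thunks — which is why the linker-script hunks below collect the
compiler-generated .text.__llvm_slsblr_thunk_* sections. Roughly
(illustrative sketch based on the LLVM reviews linked above; the exact
sequence is chosen by the compiler):

    ret            // before
                   // after -mharden-sls=retbr:
    ret
    dsb sy         // barrier: straight-line speculation past
    isb            // the RET cannot make forward progress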

Links:
https://reviews.llvm.org/D93221
https://reviews.llvm.org/D81404
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/downloads/straight-line-speculation
https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/frequently-asked-questions#SLS2

Suggested-by: Manoj Gupta 
Suggested-by: Nick Desaulniers 
Suggested-by: Nathan Chancellor 
Suggested-by: David Laight 
Suggested-by: Will Deacon 
Suggested-by: Russell King 
Suggested-by: Linus Walleij 
Signed-off-by: Jian Cai 
---

Changes v1 -> v2:
 Update the description and patch based on Nathan and David's comments.

Changes v2 -> v3:
  Modify linker scripts as Nick suggested to address boot failure
  (verified with qemu). Added more details in Kconfig.hardening
  description. Disable the config by default.

Changes v3 -> v4:
  Address Nathan's comment and replace def_bool with depends on in
  HARDEN_SLS_ALL.

Changes v4 -> v5:
  Removed "default n" and made the description target independent in
  Kconfig.hardening.

Changes v5 -> v6:
  Add select HARDEN_SLS_ALL under config HARDEN_BRANCH_PREDICTOR. This
  turns on HARDEN_SLS_ALL by default where applicable.


 arch/arm/Makefile  | 4 
 arch/arm/include/asm/vmlinux.lds.h | 4 
 arch/arm/kernel/vmlinux.lds.S  | 1 +
 arch/arm/mm/Kconfig| 1 +
 arch/arm64/Makefile| 4 
 arch/arm64/kernel/vmlinux.lds.S| 5 +
 security/Kconfig.hardening | 8 
 7 files changed, 27 insertions(+)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index dad5502ecc28..54f9b5ff9682 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -48,6 +48,10 @@ CHECKFLAGS   += -D__ARMEL__
 KBUILD_LDFLAGS += -EL
 endif
 
+ifeq ($(CONFIG_HARDEN_SLS_ALL), y)
+KBUILD_CFLAGS  += -mharden-sls=all
+endif
+
 #
 # The Scalar Replacement of Aggregates (SRA) optimization pass in GCC 4.9 and
 # later may result in code being generated that handles signed short and signed
diff --git a/arch/arm/include/asm/vmlinux.lds.h 
b/arch/arm/include/asm/vmlinux.lds.h
index 4a91428c324d..c7f9717511ca 100644
--- a/arch/arm/include/asm/vmlinux.lds.h
+++ b/arch/arm/include/asm/vmlinux.lds.h
@@ -145,3 +145,7 @@
__edtcm_data = .;   \
}   \
. = __dtcm_start + SIZEOF(.data_dtcm);
+
+#define SLS_TEXT   \
+   ALIGN_FUNCTION();   \
+   *(.text.__llvm_slsblr_thunk_*)
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index f7f4620d59c3..e71f2bc97bae 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -63,6 +63,7 @@ SECTIONS
.text : {   /* Real text segment*/
_stext = .; /* Text and read-only data  */
ARM_TEXT
+   SLS_TEXT
}
 
 #ifdef CONFIG_DEBUG_ALIGN_RODATA
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 35f43d0aa056..bdb63e7b1bec 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -837,6 +837,7 @@ config HARDEN_BRANCH_PREDICTOR
bool "Harden the branch predictor against aliasing attacks" if EXPERT
depends on CPU_SPECTRE
default y
+   select HARDEN_SLS_ALL
help
   Speculation attacks against some high-performance processors rely
   on being able to manipulate the branch predictor for a victim
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 5b84aec31ed3..e233bfbdb1c2 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -34,6 +34,10 @@ $(warning LSE atomics not supported by binutils)
   endif
 endif
 
+ifeq ($(CONFIG_HARDEN_SLS_ALL), y)
+KBUILD_CFLAGS  += -mharden-sls=all
+endif
+
 cc_has_k_constraint := $(call try-run,echo \
'int main(void) {   \
asm volatile("and w0, w0, %w0" :: "K" (4294967295));\
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7eea7888bb02..d5c892605862 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -103,6 

[PATCH v10 4/9] ecdsa: Register NIST P384 and extend test suite

2021-03-04 Thread Stefan Berger
From: Saulo Alessandre 

* crypto/ecdsa.c
  - add ecdsa_nist_p384_init_tfm
  - register and unregister P384 tfm

* crypto/testmgr.c
  - add test vector for P384 on vector of tests

* crypto/testmgr.h
  - add test vector params for P384(sha1, sha224, sha256, sha384
and sha512)

Signed-off-by: Saulo Alessandre 
Tested-by: Stefan Berger 
---
 crypto/ecdsa.c   |  33 +-
 crypto/testmgr.c |   6 ++
 crypto/testmgr.h | 157 +++
 3 files changed, 195 insertions(+), 1 deletion(-)

diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c
index 04fbb3d2abc5..8cce80e4154e 100644
--- a/crypto/ecdsa.c
+++ b/crypto/ecdsa.c
@@ -145,7 +145,7 @@ static int _ecdsa_verify(struct ecc_ctx *ctx, const u64 
*hash,
 
/* res.x = res.x mod n (if res.x > order) */
if (unlikely(vli_cmp(res.x, curve->n, ndigits) == 1))
-   /* faster alternative for NIST p256 & p192 */
+   /* faster alternative for NIST p384, p256 & p192 */
vli_sub(res.x, res.x, curve->n, ndigits);
 
if (!vli_cmp(res.x, r, ndigits))
@@ -289,6 +289,28 @@ static unsigned int ecdsa_max_size(struct crypto_akcipher 
*tfm)
return ctx->pub_key.ndigits << ECC_DIGITS_TO_BYTES_SHIFT;
 }
 
+static int ecdsa_nist_p384_init_tfm(struct crypto_akcipher *tfm)
+{
+   struct ecc_ctx *ctx = akcipher_tfm_ctx(tfm);
+
+   return ecdsa_ecc_ctx_init(ctx, ECC_CURVE_NIST_P384);
+}
+
+static struct akcipher_alg ecdsa_nist_p384 = {
+   .verify = ecdsa_verify,
+   .set_pub_key = ecdsa_set_pub_key,
+   .max_size = ecdsa_max_size,
+   .init = ecdsa_nist_p384_init_tfm,
+   .exit = ecdsa_exit_tfm,
+   .base = {
+   .cra_name = "ecdsa-nist-p384",
+   .cra_driver_name = "ecdsa-nist-p384-generic",
+   .cra_priority = 100,
+   .cra_module = THIS_MODULE,
+   .cra_ctxsize = sizeof(struct ecc_ctx),
+   },
+};
+
 static int ecdsa_nist_p256_init_tfm(struct crypto_akcipher *tfm)
 {
struct ecc_ctx *ctx = akcipher_tfm_ctx(tfm);
@@ -345,8 +367,16 @@ static int ecdsa_init(void)
	ret = crypto_register_akcipher(&ecdsa_nist_p256);
if (ret)
goto nist_p256_error;
+
+   ret = crypto_register_akcipher(&ecdsa_nist_p384);
+   if (ret)
+   goto nist_p384_error;
+
return 0;
 
+nist_p384_error:
+   crypto_unregister_akcipher(&ecdsa_nist_p256);
+
 nist_p256_error:
if (ecdsa_nist_p192_registered)
	crypto_unregister_akcipher(&ecdsa_nist_p192);
@@ -358,6 +388,7 @@ static void ecdsa_exit(void)
if (ecdsa_nist_p192_registered)
	crypto_unregister_akcipher(&ecdsa_nist_p192);
	crypto_unregister_akcipher(&ecdsa_nist_p256);
+   crypto_unregister_akcipher(&ecdsa_nist_p384);
 }
 
 subsys_initcall(ecdsa_init);
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 2607602f9de5..08f85719338a 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -4925,6 +4925,12 @@ static const struct alg_test_desc alg_test_descs[] = {
.suite = {
.akcipher = __VECS(ecdsa_nist_p256_tv_template)
}
+   }, {
+   .alg = "ecdsa-nist-p384",
+   .test = alg_test_akcipher,
+   .suite = {
+   .akcipher = __VECS(ecdsa_nist_p384_tv_template)
+   }
}, {
.alg = "ecrdsa",
.test = alg_test_akcipher,
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 2adcc0dc0bdd..e63a760aca2d 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -833,6 +833,163 @@ static const struct akcipher_testvec 
ecdsa_nist_p256_tv_template[] = {
},
 };
 
+static const struct akcipher_testvec ecdsa_nist_p384_tv_template[] = {
+   {
+   .key = /* secp384r1(sha1) */
+   "\x04\x89\x25\xf3\x97\x88\xcb\xb0\x78\xc5\x72\x9a\x14\x6e\x7a\xb1"
+   "\x5a\xa5\x24\xf1\x95\x06\x9e\x28\xfb\xc4\xb9\xbe\x5a\x0d\xd9\x9f"
+   "\xf3\xd1\x4d\x2d\x07\x99\xbd\xda\xa7\x66\xec\xbb\xea\xba\x79\x42"
+   "\xc9\x34\x89\x6a\xe7\x0b\xc3\xf2\xfe\x32\x30\xbe\xba\xf9\xdf\x7e"
+   "\x4b\x6a\x07\x8e\x26\x66\x3f\x1d\xec\xa2\x57\x91\x51\xdd\x17\x0e"
+   "\x0b\x25\xd6\x80\x5c\x3b\xe6\x1a\x98\x48\x91\x45\x7a\x73\xb0\xc3"
+   "\xf1",
+   .key_len = 97,
+   .params =
+   "\x30\x10\x06\x07\x2a\x86\x48\xce\x3d\x02\x01\x06\x05\x2b\x81\x04"
+   "\x00\x22",
+   .param_len = 18,
+   .m =
+   "\x12\x55\x28\xf0\x77\xd5\xb6\x21\x71\x32\x48\xcd\x28\xa8\x25\x22"
+   "\x3a\x69\xc1\x93",
+   .m_size = 20,
+   .algo = OID_id_ecdsa_with_sha1,
+   .c =
+   "\x30\x66\x02\x31\x00\xf5\x0f\x24\x4c\x07\x93\x6f\x21\x57\x55\x07"
+   "\x20\x43\x30\xde\xa0\x8d\x26\x8e\xae\x63\x3f\xbc\x20\x3a\xc6\xf1"
+   "\x32\x3c\xce\x70\x2b\x78\xf1\x4c\x26\xe6\x5b\x86\xcf\xec\x7c\x7e"
+   "\xd0\x87\xd7\xd7\x6e\x02\x31\x00\xcd\xbb\x7e\x81\x5d\x8f\x63\xc0"
+   

[PATCH v10 8/9] x509: Add OID for NIST P384 and extend parser for it

2021-03-04 Thread Stefan Berger
From: Saulo Alessandre 

* crypto/asymmetric_keys/x509_cert_parser.c
  - prepare x509 parser to load nist_secp384r1

* include/linux/oid_registry.h
  - add OID_id_secp384r1

Signed-off-by: Saulo Alessandre 
Tested-by: Stefan Berger 
---
 crypto/asymmetric_keys/x509_cert_parser.c | 3 +++
 include/linux/oid_registry.h  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index f5d547c6dfb5..526c6a407e07 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -510,6 +510,9 @@ int x509_extract_key_data(void *context, size_t hdrlen,
case OID_id_prime256v1:
ctx->cert->pub->pkey_algo = "ecdsa-nist-p256";
break;
+   case OID_id_secp384r1:
+   ctx->cert->pub->pkey_algo = "ecdsa-nist-p384";
+   break;
default:
return -ENOPKG;
}
diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
index 3583908cf1ca..d656450dfc66 100644
--- a/include/linux/oid_registry.h
+++ b/include/linux/oid_registry.h
@@ -64,6 +64,7 @@ enum OID {
 
OID_certAuthInfoAccess, /* 1.3.6.1.5.5.7.1.1 */
OID_sha1,   /* 1.3.14.3.2.26 */
+   OID_id_secp384r1,   /* 1.3.132.0.34 */
OID_sha256, /* 2.16.840.1.101.3.4.2.1 */
OID_sha384, /* 2.16.840.1.101.3.4.2.2 */
OID_sha512, /* 2.16.840.1.101.3.4.2.3 */
-- 
2.29.2



[PATCH v10 3/9] crypto: Add math to support fast NIST P384

2021-03-04 Thread Stefan Berger
From: Saulo Alessandre 

* crypto/ecc.c
  - add vli_mmod_fast_384
  - change some routines to pass ecc_curve forward until vli_mmod_fast

* crypto/ecc.h
  - add ECC_CURVE_NIST_P384_DIGITS
  - change ECC_MAX_DIGITS to P384 size

Signed-off-by: Saulo Alessandre 
Tested-by: Stefan Berger 
---
 crypto/ecc.c | 266 +--
 crypto/ecc.h |   3 +-
 2 files changed, 194 insertions(+), 75 deletions(-)

diff --git a/crypto/ecc.c b/crypto/ecc.c
index f6cef5a7942d..c125576cda6b 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -778,18 +778,133 @@ static void vli_mmod_fast_256(u64 *result, const u64 
*product,
}
 }
 
+#define SL32OR32(x32, y32) (((u64)x32 << 32) | y32)
+#define AND64H(x64)  (x64 & 0xffFFffFF00000000ull)
+#define AND64L(x64)  (x64 & 0x00000000ffFFffFFull)
+
+/* Computes result = product % curve_prime
+ * from "Mathematical routines for the NIST prime elliptic curves"
+ */
+static void vli_mmod_fast_384(u64 *result, const u64 *product,
+   const u64 *curve_prime, u64 *tmp)
+{
+   int carry;
+   const unsigned int ndigits = 6;
+
+   /* t */
+   vli_set(result, product, ndigits);
+
+   /* s1 */
+   tmp[0] = 0; // 0 || 0
+   tmp[1] = 0; // 0 || 0
+   tmp[2] = SL32OR32(product[11], (product[10]>>32));  //a22||a21
+   tmp[3] = product[11]>>32;   // 0 ||a23
+   tmp[4] = 0; // 0 || 0
+   tmp[5] = 0; // 0 || 0
+   carry = vli_lshift(tmp, tmp, 1, ndigits);
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* s2 */
+   tmp[0] = product[6];//a13||a12
+   tmp[1] = product[7];//a15||a14
+   tmp[2] = product[8];//a17||a16
+   tmp[3] = product[9];//a19||a18
+   tmp[4] = product[10];   //a21||a20
+   tmp[5] = product[11];   //a23||a22
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* s3 */
+   tmp[0] = SL32OR32(product[11], (product[10]>>32));  //a22||a21
+   tmp[1] = SL32OR32(product[6], (product[11]>>32));   //a12||a23
+   tmp[2] = SL32OR32(product[7], (product[6])>>32);//a14||a13
+   tmp[3] = SL32OR32(product[8], (product[7]>>32));//a16||a15
+   tmp[4] = SL32OR32(product[9], (product[8]>>32));//a18||a17
+   tmp[5] = SL32OR32(product[10], (product[9]>>32));   //a20||a19
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* s4 */
+   tmp[0] = AND64H(product[11]);   //a23|| 0
+   tmp[1] = (product[10]<<32); //a20|| 0
+   tmp[2] = product[6];//a13||a12
+   tmp[3] = product[7];//a15||a14
+   tmp[4] = product[8];//a17||a16
+   tmp[5] = product[9];//a19||a18
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* s5 */
+   tmp[0] = 0; //  0|| 0
+   tmp[1] = 0; //  0|| 0
+   tmp[2] = product[10];   //a21||a20
+   tmp[3] = product[11];   //a23||a22
+   tmp[4] = 0; //  0|| 0
+   tmp[5] = 0; //  0|| 0
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* s6 */
+   tmp[0] = AND64L(product[10]);   // 0 ||a20
+   tmp[1] = AND64H(product[10]);   //a21|| 0
+   tmp[2] = product[11];   //a23||a22
+   tmp[3] = 0; // 0 || 0
+   tmp[4] = 0; // 0 || 0
+   tmp[5] = 0; // 0 || 0
+   carry += vli_add(result, result, tmp, ndigits);
+
+   /* d1 */
+   tmp[0] = SL32OR32(product[6], (product[11]>>32));   //a12||a23
+   tmp[1] = SL32OR32(product[7], (product[6]>>32));//a14||a13
+   tmp[2] = SL32OR32(product[8], (product[7]>>32));//a16||a15
+   tmp[3] = SL32OR32(product[9], (product[8]>>32));//a18||a17
+   tmp[4] = SL32OR32(product[10], (product[9]>>32));   //a20||a19
+   tmp[5] = SL32OR32(product[11], (product[10]>>32));  //a22||a21
+   carry -= vli_sub(result, result, tmp, ndigits);
+
+   /* d2 */
+   tmp[0] = (product[10]<<32); //a20|| 0
+   tmp[1] = SL32OR32(product[11], (product[10]>>32));  //a22||a21
+   tmp[2] = (product[11]>>32); // 0 ||a23
+   tmp[3] = 0; // 0 || 0
+   tmp[4] = 0; // 0 || 0
+   tmp[5] = 0; // 0 || 0
+   carry -= vli_sub(result, result, tmp, ndigits);
+
+   /* d3 */
+   tmp[0] = 0; // 0 || 0
+   tmp[1] = AND64H(product[11]);   //a23|| 0
+   tmp[2] = product[11]>>32;   // 0 ||a23
+   tmp[3] = 0; // 0 || 0
+   tmp[4] = 0; // 0 || 0
+   tmp[5] = 0; // 0 || 0
+   carry -= vli_sub(result, result, tmp, ndigits);
+
+   if (carry < 0) {
+   do {
+   carry += vli_add(result, result, curve_prime, ndigits);
+   } while (carry < 0);
+   } else {
+   while (carry || vli_cmp(curve_prime, result, ndigits) != 1)
+  

[PATCH v10 2/9] crypto: Add NIST P384 curve parameters

2021-03-04 Thread Stefan Berger
From: Saulo Alessandre 

* crypto/ecc_curve_defs.h
  - add nist_p384 params

* include/crypto/ecdh.h
  - add ECC_CURVE_NIST_P384

* crypto/ecc.c
  - change ecc_get_curve to accept nist_p384

Signed-off-by: Saulo Alessandre 
Tested-by: Stefan Berger 
---
 crypto/ecc.c|  2 ++
 crypto/ecc_curve_defs.h | 32 
 include/crypto/ecdh.h   |  1 +
 3 files changed, 35 insertions(+)

diff --git a/crypto/ecc.c b/crypto/ecc.c
index 25e79fd70566..f6cef5a7942d 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -50,6 +50,8 @@ const struct ecc_curve *ecc_get_curve(unsigned int curve_id)
	return fips_enabled ? NULL : &nist_p192;
case ECC_CURVE_NIST_P256:
	return &nist_p256;
+   case ECC_CURVE_NIST_P384:
+   return &nist_p384;
default:
return NULL;
}
diff --git a/crypto/ecc_curve_defs.h b/crypto/ecc_curve_defs.h
index 69be6c7d228f..b327732f6ef5 100644
--- a/crypto/ecc_curve_defs.h
+++ b/crypto/ecc_curve_defs.h
@@ -54,4 +54,36 @@ static struct ecc_curve nist_p256 = {
.b = nist_p256_b
 };
 
+/* NIST P-384 */
+static u64 nist_p384_g_x[] = { 0x3A545E3872760AB7ull, 0x5502F25DBF55296Cull,
+   0x59F741E082542A38ull, 0x6E1D3B628BA79B98ull,
+   0x8Eb1C71EF320AD74ull, 0xAA87CA22BE8B0537ull };
+static u64 nist_p384_g_y[] = { 0x7A431D7C90EA0E5Full, 0x0A60B1CE1D7E819Dull,
+   0xE9DA3113B5F0B8C0ull, 0xF8F41DBD289A147Cull,
+   0x5D9E98BF9292DC29ull, 0x3617DE4A96262C6Full };
+static u64 nist_p384_p[] = { 0x00000000FFFFFFFFull, 0xFFFFFFFF00000000ull,
+   0xFFFFFFFFFFFFFFFEull, 0xFFFFFFFFFFFFFFFFull,
+   0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull };
+static u64 nist_p384_n[] = { 0xECEC196ACCC52973ull, 0x581A0DB248B0A77Aull,
+   0xC7634D81F4372DDFull, 0xFFFFFFFFFFFFFFFFull,
+   0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull };
+static u64 nist_p384_a[] = { 0x00000000FFFFFFFCull, 0xFFFFFFFF00000000ull,
+   0xFFFFFFFFFFFFFFFEull, 0xFFFFFFFFFFFFFFFFull,
+   0xFFFFFFFFFFFFFFFFull, 0xFFFFFFFFFFFFFFFFull };
+static u64 nist_p384_b[] = { 0x2a85c8edd3ec2aefull, 0xc656398d8a2ed19dull,
+   0x0314088f5013875aull, 0x181d9c6efe814112ull,
+   0x988e056be3f82d19ull, 0xb3312fa7e23ee7e4ull };
+static struct ecc_curve nist_p384 = {
+   .name = "nist_384",
+   .g = {
+   .x = nist_p384_g_x,
+   .y = nist_p384_g_y,
+   .ndigits = 6,
+   },
+   .p = nist_p384_p,
+   .n = nist_p384_n,
+   .a = nist_p384_a,
+   .b = nist_p384_b
+};
+
 #endif
diff --git a/include/crypto/ecdh.h b/include/crypto/ecdh.h
index a5b805b5526d..e4ba1de961e4 100644
--- a/include/crypto/ecdh.h
+++ b/include/crypto/ecdh.h
@@ -25,6 +25,7 @@
 /* Curves IDs */
 #define ECC_CURVE_NIST_P1920x0001
 #define ECC_CURVE_NIST_P2560x0002
+#define ECC_CURVE_NIST_P3840x0003
 
 /**
  * struct ecdh - define an ECDH private key
-- 
2.29.2



[for-linus][PATCH 1/3] tracing: Fix memory leak in __create_synth_event()

2021-03-04 Thread Steven Rostedt
From: Vamshi K Sthambamkadi 

kmemleak report:
unreferenced object 0xc5a6f708 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
00 c1 3d 60 14 83 1f 8a  ..=`
  backtrace:
[] __kmalloc_track_caller+0x2a6/0x460
[<7d3d60a6>] kstrndup+0x37/0x70
[<45a0e739>] argv_split+0x1c/0x120
[] __create_synth_event+0x192/0xb00
[<0708b8a3>] create_synth_event+0xbb/0x150
[<3d1941e1>] create_dyn_event+0x5c/0xb0
[<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
[<04deb2ef>] dyn_event_write+0x10/0x20
[<8779ac95>] vfs_write+0xa9/0x3c0
[] ksys_write+0x89/0xc0
[] __ia32_sys_write+0x15/0x20
[<7ce02d85>] __do_fast_syscall_32+0x45/0x80
[] do_fast_syscall_32+0x29/0x60
[<2467454a>] do_SYSENTER_32+0x15/0x20
[<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc
unreferenced object 0xc5a6f078 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
08 f7 a6 c5 00 00 00 00  
  backtrace:
[] __kmalloc+0x2b6/0x470
[] argv_split+0x82/0x120
[] __create_synth_event+0x192/0xb00
[<0708b8a3>] create_synth_event+0xbb/0x150
[<3d1941e1>] create_dyn_event+0x5c/0xb0
[<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
[<04deb2ef>] dyn_event_write+0x10/0x20
[<8779ac95>] vfs_write+0xa9/0x3c0
[] ksys_write+0x89/0xc0
[] __ia32_sys_write+0x15/0x20
[<7ce02d85>] __do_fast_syscall_32+0x45/0x80
[] do_fast_syscall_32+0x29/0x60
[<2467454a>] do_SYSENTER_32+0x15/0x20
[<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc

In __create_synth_event(), while iterating over the field/type arguments,
argv_split() returns an array of at least 2 elements even when zero
arguments (argc=0) are passed, e.g. when there is a double delimiter or
the string ends with a delimiter.

To fix, call argv_free() even when argc=0.

Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos

Signed-off-by: Vamshi K Sthambamkadi 
Signed-off-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace_events_synth.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_synth.c 
b/kernel/trace/trace_events_synth.c
index 2979a96595b4..8d71e6c83f10 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -1225,8 +1225,10 @@ static int __create_synth_event(const char *name, const 
char *raw_fields)
goto err;
}
 
-   if (!argc)
+   if (!argc) {
+   argv_free(argv);
continue;
+   }
 
n_fields_this_loop = 0;
consumed = 0;
-- 
2.30.0




[for-linus][PATCH 2/3] tracing: Skip selftests if tracing is disabled

2021-03-04 Thread Steven Rostedt
From: "Steven Rostedt (VMware)" 

If tracing is disabled for some reason (traceoff_on_warning, command line,
etc), the ftrace selftests are guaranteed to fail, as their results are
defined by trace data in the ring buffers. If the ring buffers are turned
off, the tests will fail, due to lack of data.

Because tracing being disabled is for a specific reason (warning, user
decided to, etc), it does not make sense to enable tracing to run the self
tests, as the test output may corrupt the reason for the tracing to be
disabled.

Instead, simply skip the self tests and report that they are being skipped
due to tracing being disabled.

Signed-off-by: Steven Rostedt (VMware) 
---
 kernel/trace/trace.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index e295c413580e..eccb4e1187cc 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1929,6 +1929,12 @@ static int run_tracer_selftest(struct tracer *type)
if (!selftests_can_run)
return save_selftest(type);
 
+   if (!tracing_is_on()) {
+   pr_warn("Selftest for tracer %s skipped due to tracing disabled\n",
+   type->name);
+   return 0;
+   }
+
/*
 * Run a selftest on this tracer.
 * Here we reset the trace buffer, and set the current
-- 
2.30.0




[for-linus][PATCH 0/3] More tracing fixes for 5.12

2021-03-04 Thread Steven Rostedt
Functional fix:

 - Memory leak in __create_synth_event()

 - Skip selftests if tracing is disabled

Non functional fix:

 - Fix stale comment about the trace_event_call flags.

Steven Rostedt (VMware) (2):
  tracing: Skip selftests if tracing is disabled
  tracing: Fix comment about the trace_event_call flags

Vamshi K Sthambamkadi (1):
  tracing: Fix memory leak in __create_synth_event()


 include/linux/trace_events.h  | 11 ++-
 kernel/trace/trace.c  |  6 ++
 kernel/trace/trace_events_synth.c |  4 +++-
 3 files changed, 11 insertions(+), 10 deletions(-)


[for-linus][PATCH 3/3] tracing: Fix comment about the trace_event_call flags

2021-03-04 Thread Steven Rostedt
From: "Steven Rostedt (VMware)" 

In the declaration of the struct trace_event_call, the flags has the bits
defined in the comment above it. But these bits are also defined by the
TRACE_EVENT_FL_* enums just above the declaration of the struct. As the
comment about the flags in the struct has become stale and incorrect, just
replace it with a reference to the TRACE_EVENT_FL_* enum above.

Signed-off-by: Steven Rostedt (VMware) 
---
 include/linux/trace_events.h | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 7077fec653bb..28e7af1406f2 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -349,15 +349,8 @@ struct trace_event_call {
struct event_filter *filter;
void*mod;
void*data;
-   /*
-*   bit 0: filter_active
-*   bit 1: allow trace by non root (cap any)
-*   bit 2: failed to apply filter
-*   bit 3: trace internal event (do not enable)
-*   bit 4: Event was enabled by module
-*   bit 5: use call filter rather than file filter
-*   bit 6: Event is a tracepoint
-*/
+
+   /* See the TRACE_EVENT_FL_* flags above */
int flags; /* static flags of different events */
 
 #ifdef CONFIG_PERF_EVENTS
-- 
2.30.0




[PATCH v10 7/9] ima: Support EC keys for signature verification

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

Add support for IMA signature verification for EC keys. Since SHA-type
hashes can be used by both RSA and ECDSA signature schemes, we need to
look at the key and derive from it which signature scheme to use. Since
this applies to all types of keys, we change the selection of the
encoding type to be driven by the key's signature scheme rather than by
the hash type.

Cc: Dmitry Kasatkin 
Cc: linux-integr...@vger.kernel.org
Cc: David Howells 
Cc: keyri...@vger.kernel.org
Signed-off-by: Stefan Berger 
Reviewed-by: Vitaly Chikunov 
Reviewed-by: Tianjia Zhang 
Acked-by: Mimi Zohar 

---
v7->v8:
  - use strncmp to check for 'ecdsa-' to match 'ecdsa-nist-p192' and
'ecdsa-nist-p256' key types; previously they were just 'ecdsa'
---
 include/keys/asymmetric-type.h |  6 ++
 security/integrity/digsig_asymmetric.c | 30 --
 2 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/include/keys/asymmetric-type.h b/include/keys/asymmetric-type.h
index a29d3ff2e7e8..c432fdb8547f 100644
--- a/include/keys/asymmetric-type.h
+++ b/include/keys/asymmetric-type.h
@@ -72,6 +72,12 @@ const struct asymmetric_key_ids *asymmetric_key_ids(const 
struct key *key)
return key->payload.data[asym_key_ids];
 }
 
+static inline
+const struct public_key *asymmetric_key_public_key(const struct key *key)
+{
+   return key->payload.data[asym_crypto];
+}
+
 extern struct key *find_asymmetric_key(struct key *keyring,
   const struct asymmetric_key_id *id_0,
   const struct asymmetric_key_id *id_1,
diff --git a/security/integrity/digsig_asymmetric.c 
b/security/integrity/digsig_asymmetric.c
index a662024b4c70..23240d793b07 100644
--- a/security/integrity/digsig_asymmetric.c
+++ b/security/integrity/digsig_asymmetric.c
@@ -84,6 +84,7 @@ int asymmetric_verify(struct key *keyring, const char *sig,
 {
struct public_key_signature pks;
struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
+   const struct public_key *pk;
struct key *key;
int ret;
 
@@ -105,23 +106,20 @@ int asymmetric_verify(struct key *keyring, const char 
*sig,
	memset(&pks, 0, sizeof(pks));
 
pks.hash_algo = hash_algo_name[hdr->hash_algo];
-   switch (hdr->hash_algo) {
-   case HASH_ALGO_STREEBOG_256:
-   case HASH_ALGO_STREEBOG_512:
-   /* EC-RDSA and Streebog should go together. */
-   pks.pkey_algo = "ecrdsa";
-   pks.encoding = "raw";
-   break;
-   case HASH_ALGO_SM3_256:
-   /* SM2 and SM3 should go together. */
-   pks.pkey_algo = "sm2";
-   pks.encoding = "raw";
-   break;
-   default:
-   pks.pkey_algo = "rsa";
+
+   pk = asymmetric_key_public_key(key);
+   pks.pkey_algo = pk->pkey_algo;
+   if (!strcmp(pk->pkey_algo, "rsa"))
pks.encoding = "pkcs1";
-   break;
-   }
+   else if (!strncmp(pk->pkey_algo, "ecdsa-", 6))
+   /* ecdsa-nist-p192 etc. */
+   pks.encoding = "x962";
+   else if (!strcmp(pk->pkey_algo, "ecrdsa") ||
+  !strcmp(pk->pkey_algo, "sm2"))
+   pks.encoding = "raw";
+   else
+   return -ENOPKG;
+
pks.digest = (u8 *)data;
pks.digest_size = datalen;
pks.s = hdr->sig;
-- 
2.29.2



[PATCH v10 9/9] certs: Add support for using elliptic curve keys for signing modules

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

This patch adds support for using elliptic curve keys for signing
modules. It uses a NIST P384 (secp384r1) key if the user chooses an
elliptic curve key and will have ECDSA support built into the kernel.

Note: A developer choosing an ECDSA key for signing modules has to
manually delete the signing key (rm certs/signing_key.*) when falling
back to building an older kernel version that only supports RSA keys,
since modules signed with the ECDSA key will otherwise not be usable
when that older kernel runs.

Signed-off-by: Stefan Berger 
Reviewed-by: Mimi Zohar 

---

v8->v9:
 - Automatically select CONFIG_ECDSA for built-in ECDSA support
 - Added help documentation

This patch builds on top Nayna's series for 'kernel build support for
loading the kernel module signing key'.
- https://lkml.org/lkml/2021/2/18/856
---
 certs/Kconfig | 22 ++
 certs/Makefile| 14 ++
 crypto/asymmetric_keys/pkcs7_parser.c |  4 
 3 files changed, 40 insertions(+)

diff --git a/certs/Kconfig b/certs/Kconfig
index 48675ad319db..919db43ce80b 100644
--- a/certs/Kconfig
+++ b/certs/Kconfig
@@ -15,6 +15,28 @@ config MODULE_SIG_KEY
  then the kernel will automatically generate the private key and
  certificate as described in 
Documentation/admin-guide/module-signing.rst
 
+choice
+   prompt "Type of module signing key to be generated"
+   default MODULE_SIG_KEY_TYPE_RSA
+   help
+The type of module signing key to be generated. This option
+does not apply if a PKCS#11 URI is used.
+
+config MODULE_SIG_KEY_TYPE_RSA
+   bool "RSA"
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
+   help
+Use an RSA key for module signing.
+
+config MODULE_SIG_KEY_TYPE_ECDSA
+   bool "ECDSA"
+   select CRYPTO_ECDSA
+   depends on MODULE_SIG || IMA_APPRAISE_MODSIG
+   help
+Use an elliptic curve key (NIST P384) for module signing.
+
+endchoice
+
 config SYSTEM_TRUSTED_KEYRING
bool "Provide system-wide ring of trusted keys"
depends on KEYS
diff --git a/certs/Makefile b/certs/Makefile
index 3fe6b73786fa..c487d7021c54 100644
--- a/certs/Makefile
+++ b/certs/Makefile
@@ -69,6 +69,18 @@ else
 SIGNER = -signkey $(obj)/signing_key.key
 endif # CONFIG_IMA_APPRAISE_MODSIG
 
+X509TEXT=$(shell openssl x509 -in $(CONFIG_MODULE_SIG_KEY) -text)
+
+# Support user changing key type
+ifdef CONFIG_MODULE_SIG_KEY_TYPE_ECDSA
+keytype_openssl = -newkey ec -pkeyopt ec_paramgen_curve:secp384r1
+$(if $(findstring ecdsa-with-,$(X509TEXT)),,$(shell rm -f $(CONFIG_MODULE_SIG_KEY)))
+endif
+
+ifdef CONFIG_MODULE_SIG_KEY_TYPE_RSA
+$(if $(findstring rsaEncryption,$(X509TEXT)),,$(shell rm -f $(CONFIG_MODULE_SIG_KEY)))
+endif
+
 $(obj)/signing_key.pem: $(obj)/x509.genkey
@$(kecho) "###"
	@$(kecho) "### Now generating an X.509 key pair to be used for signing modules."
@@ -86,12 +98,14 @@ ifeq ($(CONFIG_IMA_APPRAISE_MODSIG),y)
-batch -x509 -config $(obj)/x509.genkey \
-outform PEM -out $(CA_KEY) \
-keyout $(CA_KEY) -extensions ca_ext \
+   $(keytype_openssl) \
$($(quiet)redirect_openssl)
 endif # CONFIG_IMA_APPRAISE_MODSIG
$(Q)openssl req -new -nodes -utf8 \
-batch -config $(obj)/x509.genkey \
-outform PEM -out $(obj)/signing_key.csr \
-keyout $(obj)/signing_key.key -extensions myexts \
+   $(keytype_openssl) \
$($(quiet)redirect_openssl)
$(Q)openssl x509 -req -days 36500 -in $(obj)/signing_key.csr \
-outform PEM -out $(obj)/signing_key.crt $(SIGNER) \
diff --git a/crypto/asymmetric_keys/pkcs7_parser.c b/crypto/asymmetric_keys/pkcs7_parser.c
index 967329e0a07b..2546ec6a0505 100644
--- a/crypto/asymmetric_keys/pkcs7_parser.c
+++ b/crypto/asymmetric_keys/pkcs7_parser.c
@@ -269,6 +269,10 @@ int pkcs7_sig_note_pkey_algo(void *context, size_t hdrlen,
ctx->sinfo->sig->pkey_algo = "rsa";
ctx->sinfo->sig->encoding = "pkcs1";
break;
+   case OID_id_ecdsa_with_sha256:
+   ctx->sinfo->sig->pkey_algo = "ecdsa";
+   ctx->sinfo->sig->encoding = "x962";
+   break;
default:
printk("Unsupported pkey algo: %u\n", ctx->last_oid);
return -ENOPKG;
-- 
2.29.2
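The Makefile hunk above decides whether to regenerate the signing key by running `openssl x509 -text` and using `$(findstring ...)` on the output: an `ecdsa-with-*` signature algorithm means the existing key is ECDSA, `rsaEncryption` means RSA, and a mismatch with the configured type deletes the key. A rough shell sketch of the same classification, with the openssl output simulated as canned strings (no real key or openssl invocation involved):

```shell
#!/bin/sh
# Sketch of the Makefile's $(findstring ...) checks: classify a key by
# the algorithm text that `openssl x509 -text` would print.  The input
# strings are canned here; detect_keytype() is illustrative only.
detect_keytype() {
	case "$1" in
	*ecdsa-with-*)    echo ecdsa ;;
	*rsaEncryption*)  echo rsa ;;
	*)                echo unknown ;;
	esac
}

ecdsa_text='Signature Algorithm: ecdsa-with-SHA384'
rsa_text='Public Key Algorithm: rsaEncryption'

detect_keytype "$ecdsa_text"   # prints: ecdsa
detect_keytype "$rsa_text"     # prints: rsa
```

In the real Makefile the mismatch case triggers `rm -f $(CONFIG_MODULE_SIG_KEY)` so the key is regenerated with the newly selected type.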



[PATCH v10 1/9] crypto: Add support for ECDSA signature verification

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

Add support for parsing the parameters of a NIST P256 or NIST P192 key.
Enable signature verification using these keys. The new module is
enabled with CONFIG_ECDSA:
  Elliptic Curve Digital Signature Algorithm (NIST P192, P256 etc.)
  is a NIST cryptographic standard algorithm. Only signature verification
  is implemented.

Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: linux-cry...@vger.kernel.org
Signed-off-by: Stefan Berger 
Acked-by: Jarkko Sakkinen 

---
v8->v9:
  - unregister nist_p192 curve if nist_p256 cannot be registered
---
 crypto/Kconfig   |  10 +
 crypto/Makefile  |   6 +
 crypto/ecc.c |  13 +-
 crypto/ecc.h |  28 +++
 crypto/ecdsa.c   | 369 +++
 crypto/ecdsasignature.asn1   |   4 +
 crypto/testmgr.c |  12 ++
 crypto/testmgr.h | 267 +
 include/linux/oid_registry.h |   6 +-
 9 files changed, 703 insertions(+), 12 deletions(-)
 create mode 100644 crypto/ecdsa.c
 create mode 100644 crypto/ecdsasignature.asn1

diff --git a/crypto/Kconfig b/crypto/Kconfig
index a367fcfeb5d4..a31df40591f5 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -247,6 +247,16 @@ config CRYPTO_ECDH
help
  Generic implementation of the ECDH algorithm
 
+config CRYPTO_ECDSA
+   tristate "ECDSA (NIST P192, P256 etc.) algorithm"
+   select CRYPTO_ECC
+   select CRYPTO_AKCIPHER
+   select ASN1
+   help
+ Elliptic Curve Digital Signature Algorithm (NIST P192, P256 etc.)
+ is a NIST cryptographic standard algorithm. Only signature verification
+ is implemented.
+
 config CRYPTO_ECRDSA
tristate "EC-RDSA (GOST 34.10) algorithm"
select CRYPTO_ECC
diff --git a/crypto/Makefile b/crypto/Makefile
index b279483fba50..982066c6bdfb 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -50,6 +50,12 @@ sm2_generic-y += sm2.o
 
 obj-$(CONFIG_CRYPTO_SM2) += sm2_generic.o
 
+$(obj)/ecdsasignature.asn1.o: $(obj)/ecdsasignature.asn1.c $(obj)/ecdsasignature.asn1.h
+$(obj)/ecdsa.o: $(obj)/ecdsasignature.asn1.h
+ecdsa_generic-y += ecdsa.o
+ecdsa_generic-y += ecdsasignature.asn1.o
+obj-$(CONFIG_CRYPTO_ECDSA) += ecdsa_generic.o
+
 crypto_acompress-y := acompress.o
 crypto_acompress-y += scompress.o
 obj-$(CONFIG_CRYPTO_ACOMP2) += crypto_acompress.o
diff --git a/crypto/ecc.c b/crypto/ecc.c
index c80aa25994a0..25e79fd70566 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -42,7 +42,7 @@ typedef struct {
u64 m_high;
 } uint128_t;
 
-static inline const struct ecc_curve *ecc_get_curve(unsigned int curve_id)
+const struct ecc_curve *ecc_get_curve(unsigned int curve_id)
 {
switch (curve_id) {
/* In FIPS mode only allow P256 and higher */
@@ -54,6 +54,7 @@ static inline const struct ecc_curve *ecc_get_curve(unsigned int curve_id)
return NULL;
}
 }
+EXPORT_SYMBOL(ecc_get_curve);
 
 static u64 *ecc_alloc_digits_space(unsigned int ndigits)
 {
@@ -1281,16 +1282,6 @@ void ecc_point_mult_shamir(const struct ecc_point *result,
 }
 EXPORT_SYMBOL(ecc_point_mult_shamir);
 
-static inline void ecc_swap_digits(const u64 *in, u64 *out,
-  unsigned int ndigits)
-{
-   const __be64 *src = (__force __be64 *)in;
-   int i;
-
-   for (i = 0; i < ndigits; i++)
-   out[i] = be64_to_cpu(src[ndigits - 1 - i]);
-}
-
 static int __ecc_is_key_valid(const struct ecc_curve *curve,
  const u64 *private_key, unsigned int ndigits)
 {
diff --git a/crypto/ecc.h b/crypto/ecc.h
index d4e546b9ad79..2ea86dfb5cf7 100644
--- a/crypto/ecc.h
+++ b/crypto/ecc.h
@@ -33,6 +33,8 @@
 
 #define ECC_DIGITS_TO_BYTES_SHIFT 3
 
+#define ECC_MAX_BYTES (ECC_MAX_DIGITS << ECC_DIGITS_TO_BYTES_SHIFT)
+
 /**
  * struct ecc_point - elliptic curve point in affine coordinates
  *
@@ -70,6 +72,32 @@ struct ecc_curve {
u64 *b;
 };
 
+/**
+ * ecc_swap_digits() - Copy ndigits from big endian array to native array
+ *
+ * @in:   input array
+ * @out:  output array
+ * @ndigits:  number of digits to copy
+ */
+static inline void ecc_swap_digits(const u64 *in, u64 *out,
+  unsigned int ndigits)
+{
+   const __be64 *src = (__force __be64 *)in;
+   int i;
+
+   for (i = 0; i < ndigits; i++)
+   out[i] = be64_to_cpu(src[ndigits - 1 - i]);
+}
+
+/**
+ * ecc_get_curve()  - Get a curve given its curve_id
+ *
+ * @curve_id:  Id of the curve
+ *
+ * Returns pointer to the curve data, NULL if curve is not available
+ */
+const struct ecc_curve *ecc_get_curve(unsigned int curve_id);
+
 /**
  * ecc_is_key_valid() - Validate a given ECDH private key
  *
diff --git a/crypto/ecdsa.c b/crypto/ecdsa.c
new file mode 100644
index ..04fbb3d2abc5
--- /dev/null
+++ b/crypto/ecdsa.c
@@ -0,0 +1,369 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (c) 2021 IBM Corporation
+ *
+ * 

[PATCH v10 0/9] Add support for x509 certs with NIST P384/256/192 keys

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

This series of patches adds support for x509 certificates signed by a CA
that uses NIST P384, P256 or P192 keys for signing. It also adds support for
certificates where the public key is one of this type of a key. The math
for ECDSA signature verification is also added as well as the math for fast
mmod operation for NIST P384.

Since self-signed certificates are verified upon loading, the following
script can be used for testing of NIST P256 keys:

k=$(keyctl newring test @u)

while :; do
for hash in sha1 sha224 sha256 sha384 sha512; do
openssl req \
-x509 \
-${hash} \
-newkey ec \
-pkeyopt ec_paramgen_curve:prime256v1 \
-keyout key.pem \
-days 365 \
-subj '/CN=test' \
-nodes \
-outform der \
-out cert.der
keyctl padd asymmetric testkey $k < cert.der
if [ $? -ne 0 ]; then
echo "ERROR"
exit 1
fi
done
done

ECDSA support also works with restricted keyrings where an RSA key is used
to sign a NIST P384/256/192 key. Scripts for testing are here:

https://github.com/stefanberger/eckey-testing

The ECDSA signature verification will be used by IMA Appraisal where ECDSA
file signatures stored in RPM packages will use substantially less space
than if RSA signatures were to be used.

Further, a patch is added that allows kernel modules to be signed with a NIST
p384 key.

   Stefan and Saulo

v9->v10:
  - rearranged order of patches to have crypto patches first
  - moved hunk from patch 3 to patch 2 to avoid compilation warning due to
unused symbol

v8->v9:
  - Appended Saulo's patches
  - Appended patch to support kernel modules signed with NIST p384 key. This
patch requires Nayna's series here: https://lkml.org/lkml/2021/2/18/856

v7->v8:
  - patch 3/4: Do not determine key algo using parse_OID in public_key.c
but do this when parsing the certificate. This addresses an issue
with certain build configurations where OID_REGISTRY is not available
as 'Reported-by: kernel test robot '.

v6->v7:
  - Moved some OID definitions to patch 1 for bisectability
  - Applied R-b's
  
v5->v6:
  - moved ecdsa code into its own module ecdsa_generic built from ecdsa.c
  - added script-generated test vectors for NIST P256 & P192 and all hashes
  - parsing of OIDs that contain a header with the new parse_oid()

v4->v5:
  - registering crypto support under names ecdsa-nist-p256/p192 following
Herbert Xu's suggestion in another thread
  - appended IMA ECDSA support patch

v3->v4:
  - split off of ecdsa crypto part; registering akcipher as "ecdsa" and
deriving used curve from digits in parsed key

v2->v3:
  - patch 2 now includes linux/scatterlist.h

v1->v2:
  - using faster vli_sub rather than newly added vli_mod_fast to 'reduce'
result
  - rearranged switch statements to follow after RSA
  - 3rd patch from 1st posting is now 1st patch


Saulo Alessandre (4):
  crypto: Add NIST P384 curve parameters
  crypto: Add math to support fast NIST P384
  ecdsa: Register NIST P384 and extend test suite
  x509: Add OID for NIST P384 and extend parser for it

Stefan Berger (5):
  crypto: Add support for ECDSA signature verification
  x509: Detect sm2 keys by their parameters OID
  x509: Add support for parsing x509 certs with ECDSA keys
  ima: Support EC keys for signature verification
  certs: Add support for using elliptic curve keys for signing modules

 certs/Kconfig |  22 ++
 certs/Makefile|  14 +
 crypto/Kconfig|  10 +
 crypto/Makefile   |   6 +
 crypto/asymmetric_keys/pkcs7_parser.c |   4 +
 crypto/asymmetric_keys/public_key.c   |   4 +-
 crypto/asymmetric_keys/x509_cert_parser.c |  49 ++-
 crypto/asymmetric_keys/x509_public_key.c  |   4 +-
 crypto/ecc.c  | 281 +-
 crypto/ecc.h  |  31 +-
 crypto/ecc_curve_defs.h   |  32 ++
 crypto/ecdsa.c| 400 
 crypto/ecdsasignature.asn1|   4 +
 crypto/testmgr.c  |  18 +
 crypto/testmgr.h  | 424 ++
 include/crypto/ecdh.h |   1 +
 include/keys/asymmetric-type.h|   6 +
 include/linux/oid_registry.h  |  10 +-
 lib/oid_registry.c|  13 +
 security/integrity/digsig_asymmetric.c|  30 +-
 20 files changed, 1256 insertions(+), 107 deletions(-)
 create mode 100644 crypto/ecdsa.c
 create mode 100644 crypto/ecdsasignature.asn1

-- 
2.29.2



[PATCH v10 5/9] x509: Detect sm2 keys by their parameters OID

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

Detect whether a key is an sm2 type of key by its OID in the parameters
array rather than assuming that everything under OID_id_ecPublicKey
is sm2, which is not the case.

Cc: David Howells 
Cc: keyri...@vger.kernel.org
Signed-off-by: Stefan Berger 
Reviewed-by: Tianjia Zhang 
---
 crypto/asymmetric_keys/x509_cert_parser.c | 12 +++-
 include/linux/oid_registry.h  |  1 +
 lib/oid_registry.c| 13 +
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 52c9b455fc7d..1621ceaf5c95 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -459,6 +459,7 @@ int x509_extract_key_data(void *context, size_t hdrlen,
  const void *value, size_t vlen)
 {
struct x509_parse_context *ctx = context;
+   enum OID oid;
 
ctx->key_algo = ctx->last_oid;
switch (ctx->last_oid) {
@@ -470,7 +471,16 @@ int x509_extract_key_data(void *context, size_t hdrlen,
ctx->cert->pub->pkey_algo = "ecrdsa";
break;
case OID_id_ecPublicKey:
-   ctx->cert->pub->pkey_algo = "sm2";
+   if (parse_OID(ctx->params, ctx->params_size, &oid) != 0)
+   return -EBADMSG;
+
+   switch (oid) {
+   case OID_sm2:
+   ctx->cert->pub->pkey_algo = "sm2";
+   break;
+   default:
+   return -ENOPKG;
+   }
break;
default:
return -ENOPKG;
diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
index b504e2f36b25..f32d91895e4d 100644
--- a/include/linux/oid_registry.h
+++ b/include/linux/oid_registry.h
@@ -121,6 +121,7 @@ enum OID {
 };
 
 extern enum OID look_up_OID(const void *data, size_t datasize);
+extern int parse_OID(const void *data, size_t datasize, enum OID *oid);
 extern int sprint_oid(const void *, size_t, char *, size_t);
 extern int sprint_OID(enum OID, char *, size_t);
 
diff --git a/lib/oid_registry.c b/lib/oid_registry.c
index f7ad43f28579..508e0b34b5f0 100644
--- a/lib/oid_registry.c
+++ b/lib/oid_registry.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "oid_registry_data.c"
 
 MODULE_DESCRIPTION("OID Registry");
@@ -92,6 +93,18 @@ enum OID look_up_OID(const void *data, size_t datasize)
 }
 EXPORT_SYMBOL_GPL(look_up_OID);
 
+int parse_OID(const void *data, size_t datasize, enum OID *oid)
+{
+   const unsigned char *v = data;
+
+   if (datasize < 2 || v[0] != ASN1_OID || v[1] != datasize - 2)
+   return -EBADMSG;
+
+   *oid = look_up_OID(data + 2, datasize - 2);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(parse_OID);
+
 /*
  * sprint_OID - Print an Object Identifier into a buffer
  * @data: The encoded OID to print
-- 
2.29.2



[PATCH v10 6/9] x509: Add support for parsing x509 certs with ECDSA keys

2021-03-04 Thread Stefan Berger
From: Stefan Berger 

This patch adds support for parsing of x509 certificates that contain
ECDSA keys, such as NIST P256, that have been signed by a CA using any
of the current SHA hash algorithms.

Cc: David Howells 
Cc: keyri...@vger.kernel.org
Signed-off-by: Stefan Berger 

---

v7->v8:
 - do not detect key algo using parse_OID() in public_key.c but set
   pkey_algo to the key type 'ecdsa-nist-p192/256' when parsing cert
---
 crypto/asymmetric_keys/public_key.c   |  4 ++-
 crypto/asymmetric_keys/x509_cert_parser.c | 34 ++-
 crypto/asymmetric_keys/x509_public_key.c  |  4 ++-
 include/linux/oid_registry.h  |  2 ++
 4 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/crypto/asymmetric_keys/public_key.c b/crypto/asymmetric_keys/public_key.c
index 788a4ba1e2e7..4fefb219bfdc 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -85,7 +86,8 @@ int software_key_determine_akcipher(const char *encoding,
return n >= CRYPTO_MAX_ALG_NAME ? -EINVAL : 0;
}
 
-   if (strcmp(encoding, "raw") == 0) {
+   if (strcmp(encoding, "raw") == 0 ||
+   strcmp(encoding, "x962") == 0) {
strcpy(alg_name, pkey->pkey_algo);
return 0;
}
diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 1621ceaf5c95..f5d547c6dfb5 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -227,6 +227,26 @@ int x509_note_pkey_algo(void *context, size_t hdrlen,
ctx->cert->sig->hash_algo = "sha224";
goto rsa_pkcs1;
 
+   case OID_id_ecdsa_with_sha1:
+   ctx->cert->sig->hash_algo = "sha1";
+   goto ecdsa;
+
+   case OID_id_ecdsa_with_sha224:
+   ctx->cert->sig->hash_algo = "sha224";
+   goto ecdsa;
+
+   case OID_id_ecdsa_with_sha256:
+   ctx->cert->sig->hash_algo = "sha256";
+   goto ecdsa;
+
+   case OID_id_ecdsa_with_sha384:
+   ctx->cert->sig->hash_algo = "sha384";
+   goto ecdsa;
+
+   case OID_id_ecdsa_with_sha512:
+   ctx->cert->sig->hash_algo = "sha512";
+   goto ecdsa;
+
case OID_gost2012Signature256:
ctx->cert->sig->hash_algo = "streebog256";
goto ecrdsa;
@@ -255,6 +275,11 @@ int x509_note_pkey_algo(void *context, size_t hdrlen,
ctx->cert->sig->encoding = "raw";
ctx->algo_oid = ctx->last_oid;
return 0;
+ecdsa:
+   ctx->cert->sig->pkey_algo = "ecdsa";
+   ctx->cert->sig->encoding = "x962";
+   ctx->algo_oid = ctx->last_oid;
+   return 0;
 }
 
 /*
@@ -276,7 +301,8 @@ int x509_note_signature(void *context, size_t hdrlen,
 
if (strcmp(ctx->cert->sig->pkey_algo, "rsa") == 0 ||
strcmp(ctx->cert->sig->pkey_algo, "ecrdsa") == 0 ||
-   strcmp(ctx->cert->sig->pkey_algo, "sm2") == 0) {
+   strcmp(ctx->cert->sig->pkey_algo, "sm2") == 0 ||
+   strcmp(ctx->cert->sig->pkey_algo, "ecdsa") == 0) {
/* Discard the BIT STRING metadata */
if (vlen < 1 || *(const u8 *)value != 0)
return -EBADMSG;
@@ -478,6 +504,12 @@ int x509_extract_key_data(void *context, size_t hdrlen,
case OID_sm2:
ctx->cert->pub->pkey_algo = "sm2";
break;
+   case OID_id_prime192v1:
+   ctx->cert->pub->pkey_algo = "ecdsa-nist-p192";
+   break;
+   case OID_id_prime256v1:
+   ctx->cert->pub->pkey_algo = "ecdsa-nist-p256";
+   break;
default:
return -ENOPKG;
}
diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index ae450eb8be14..3d45161b271a 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -129,7 +129,9 @@ int x509_check_for_self_signed(struct x509_certificate *cert)
}
 
ret = -EKEYREJECTED;
-   if (strcmp(cert->pub->pkey_algo, cert->sig->pkey_algo) != 0)
+   if (strcmp(cert->pub->pkey_algo, cert->sig->pkey_algo) != 0 &&
+   (strncmp(cert->pub->pkey_algo, "ecdsa-", 6) != 0 ||
+strcmp(cert->sig->pkey_algo, "ecdsa") != 0))
goto out;
 
ret = public_key_verify_signature(cert->pub, cert->sig);
diff --git a/include/linux/oid_registry.h b/include/linux/oid_registry.h
index f32d91895e4d..3583908cf1ca 100644
--- a/include/linux/oid_registry.h
+++ b/include/linux/oid_registry.h
@@ -20,6 +20,8 @@ enum OID {
OID_id_dsa_with_sha1,   /* 1.2.840.10030.4.3 */
OID_id_dsa, /* 

[PATCH] ASoC: codecs: lpass-wsa-macro: fix RX MIX input controls

2021-03-04 Thread Jonathan Marek
Attempting to use the RX MIX path at 48kHz plays at 96kHz, because these
controls are incorrectly toggling the first bit of the register, which
is part of the FS_RATE field.

Fix the problem by using the same method used by the "WSA RX_MIX EC0_MUX"
control, which is to use SND_SOC_NOPM as the register and use an enum in
the shift field instead.

Fixes: 2c4066e5d428 ("ASoC: codecs: lpass-wsa-macro: add dapm widgets and route")
Signed-off-by: Jonathan Marek 
---
 sound/soc/codecs/lpass-wsa-macro.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/sound/soc/codecs/lpass-wsa-macro.c b/sound/soc/codecs/lpass-wsa-macro.c
index f399f4dff5511..bd2561f9fb9fa 100644
--- a/sound/soc/codecs/lpass-wsa-macro.c
+++ b/sound/soc/codecs/lpass-wsa-macro.c
@@ -1211,14 +1211,16 @@ static int wsa_macro_enable_mix_path(struct snd_soc_dapm_widget *w,
 struct snd_kcontrol *kcontrol, int event)
 {
	struct snd_soc_component *component = snd_soc_dapm_to_component(w->dapm);
-   u16 gain_reg;
+   u16 path_reg, gain_reg;
int val;
 
-   switch (w->reg) {
-   case CDC_WSA_RX0_RX_PATH_MIX_CTL:
+   switch (w->shift) {
+   case WSA_MACRO_RX_MIX0:
+   path_reg = CDC_WSA_RX0_RX_PATH_MIX_CTL;
gain_reg = CDC_WSA_RX0_RX_VOL_MIX_CTL;
break;
-   case CDC_WSA_RX1_RX_PATH_MIX_CTL:
+   case WSA_MACRO_RX_MIX1:
+   path_reg = CDC_WSA_RX1_RX_PATH_MIX_CTL;
gain_reg = CDC_WSA_RX1_RX_VOL_MIX_CTL;
break;
default:
@@ -1231,7 +1233,7 @@ static int wsa_macro_enable_mix_path(struct snd_soc_dapm_widget *w,
snd_soc_component_write(component, gain_reg, val);
break;
case SND_SOC_DAPM_POST_PMD:
-   snd_soc_component_update_bits(component, w->reg,
+   snd_soc_component_update_bits(component, path_reg,
  CDC_WSA_RX_PATH_MIX_CLK_EN_MASK,
  CDC_WSA_RX_PATH_MIX_CLK_DISABLE);
break;
@@ -2068,14 +2070,14 @@ static const struct snd_soc_dapm_widget wsa_macro_dapm_widgets[] = {
SND_SOC_DAPM_MUX("WSA_RX0 INP0", SND_SOC_NOPM, 0, 0, 
_prim_inp0_mux),
SND_SOC_DAPM_MUX("WSA_RX0 INP1", SND_SOC_NOPM, 0, 0, 
_prim_inp1_mux),
SND_SOC_DAPM_MUX("WSA_RX0 INP2", SND_SOC_NOPM, 0, 0, 
_prim_inp2_mux),
-   SND_SOC_DAPM_MUX_E("WSA_RX0 MIX INP", CDC_WSA_RX0_RX_PATH_MIX_CTL,
-  0, 0, _mix_mux, wsa_macro_enable_mix_path,
+   SND_SOC_DAPM_MUX_E("WSA_RX0 MIX INP", SND_SOC_NOPM, WSA_MACRO_RX_MIX0,
+  0, _mix_mux, wsa_macro_enable_mix_path,
   SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
SND_SOC_DAPM_MUX("WSA_RX1 INP0", SND_SOC_NOPM, 0, 0, 
_prim_inp0_mux),
SND_SOC_DAPM_MUX("WSA_RX1 INP1", SND_SOC_NOPM, 0, 0, 
_prim_inp1_mux),
SND_SOC_DAPM_MUX("WSA_RX1 INP2", SND_SOC_NOPM, 0, 0, 
_prim_inp2_mux),
-   SND_SOC_DAPM_MUX_E("WSA_RX1 MIX INP", CDC_WSA_RX1_RX_PATH_MIX_CTL,
-  0, 0, _mix_mux, wsa_macro_enable_mix_path,
+   SND_SOC_DAPM_MUX_E("WSA_RX1 MIX INP", SND_SOC_NOPM, WSA_MACRO_RX_MIX1,
+  0, _mix_mux, wsa_macro_enable_mix_path,
   SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
 
SND_SOC_DAPM_MIXER_E("WSA_RX INT0 MIX", SND_SOC_NOPM, 0, 0, NULL, 0,
-- 
2.26.1



Re: Re: linux-next: build failure after merge of the pinctrl tree

2021-03-04 Thread Linus Walleij
On Fri, Mar 5, 2021 at 1:13 AM jay...@rock-chips.com
 wrote:

> Could you show me the issue log ?

It's attached to Stephen's original mail in this thread.

Yours,
Linus Walleij


RE: [PATCH V3 4/5] dt-bindings: mmc: fsl-imx-esdhc: add clock bindings

2021-03-04 Thread Peng Fan
Hi Rob,

> Subject: Re: [PATCH V3 4/5] dt-bindings: mmc: fsl-imx-esdhc: add clock
> bindings
> 
> On Wed, Feb 24, 2021 at 9:23 PM  wrote:
> >
> > From: Peng Fan 
> >
> > Add clock bindings for fsl-imx-esdhc yaml
> >
> > Signed-off-by: Peng Fan 
> > ---
> >  .../devicetree/bindings/mmc/fsl-imx-esdhc.yaml| 11
> +++
> >  1 file changed, 11 insertions(+)
> 
> Looks like this landed in linux-next and introduces warnings:

Patches 2 and 3 have not been picked up by Shawn yet. Once patches 2 and 3
are picked up, there will be no warnings.

Thanks,
Peng.

> 
> /builds/robherring/linux-dt-bindings/Documentation/devicetree/bindings/clock/imx8qxp-lpcg.example.dt.yaml:
> mmc@5b01: clock-names:1: 'ahb' was expected  From schema:
> /builds/robherring/linux-dt-bindings/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml
> /builds/robherring/linux-dt-bindings/Documentation/devicetree/bindings/clock/imx8qxp-lpcg.example.dt.yaml:
> mmc@5b01: clock-names:2: 'per' was expected  From schema:
> /builds/robherring/linux-dt-bindings/Documentation/devicetree/bindings/mmc/fsl-imx-esdhc.yaml


Re: [RFC PATCH 05/10] vfio: Create a vfio_device from vma lookup

2021-03-04 Thread Jason Gunthorpe
On Thu, Mar 04, 2021 at 05:07:31PM -0700, Alex Williamson wrote:
> On Thu, 4 Mar 2021 19:16:33 -0400
> Jason Gunthorpe  wrote:
> 
> > On Thu, Mar 04, 2021 at 02:37:57PM -0700, Alex Williamson wrote:
> > 
> > > Therefore unless a bus driver opts-out by replacing vm_private_data, we
> > > can identify participating vmas by the vm_ops and have flags indicating
> > > if the vma maps device memory such that vfio_get_device_from_vma()
> > > should produce a device reference.  The vfio IOMMU backends would also
> > > consume this, ie. if they get a valid vfio_device from the vma, use the
> > > pfn_base field directly.  vfio_vm_ops would wrap the bus driver
> > > callbacks and provide reference counting on open/close to release this
> > > object.  
> > 
> > > I'm not thrilled with a vfio_device_ops callback plumbed through
> > > vfio-core to do vma-to-pfn translation, so I thought this might be a
> > > better alternative.  Thanks,  
> > 
> > Maybe you could explain why, because I'm looking at this idea and
> > thinking it looks very complicated compared to a simple driver op
> > callback?
> 
> vfio-core needs to export a vfio_vma_to_pfn() which I think assumes the
> caller has already used vfio_device_get_from_vma(), but should still
> validate the vma is one from a vfio device before calling this new
> vfio_device_ops callback.

Huh? Validate? Why?

Something like this in the IOMMU stuff:

   struct vfio_device *vfio = vfio_device_get_from_vma(vma)

   if (!vfio->vma_to_pfn)
return -EINVAL;
   vfio->ops->vma_to_pfn(vfio, vma, offset_from_vma_start)

Is fine, why do we need to over complicate?

I don't need to check that the vma belongs to the vfio because it is
the API contract that the caller will guarantee that.

This is the kernel, I can (and do!) check these sorts of things by
code inspection when working on stuff - I can look at every
implementation and every call site to prove these things.

IMHO doing an expensive check like that is a style of defensive
programming the kernel community frowns upon.

> vfio-pci needs to validate the vm_pgoff value falls within a BAR
> region, mask off the index and get the pci_resource_start() for the
> BAR index.

It needs to do the same thing fault() already does, which is currently
one line..

> Then we need a solution for how vfio_device_get_from_vma() determines
> whether to grant a device reference for a given vma, where that vma may
> map something other than device memory. Are you imagining that we hand
> out device references independently and vfio_vma_to_pfn() would return
> an errno for vm_pgoff values that don't map device memory and the IOMMU
> driver would release the reference?

That seems a reasonable place to start

> prevent using unmap_mapping_range().  The IOMMU backend, once it has a
> vfio_device via vfio_device_get_from_vma() can know the format of
> vm_private_data, cast it as a vfio_vma_private_data and directly use
> base_pfn, accomplishing the big point.  They're all operating in the
> agreed upon vm_private_data format.  Thanks,

If we force all drivers into a mandatory (!) vm_private_data format
then every driver has to be audited and updated before the new pfn
code can be done. If any driver in the future makes a mistake here
(and omitting the unique vm_private_data magics is a very easy mistake
to make) then it will cause a kernel crash in an obscure scenario.

It is the "design the API to be hard to use wrong" philosophy.

Jason


Re: [PATCH] arm64: dts: ls1028a: add interrupt to Root Complex Event Collector

2021-03-04 Thread Shawn Guo
On Tue, Feb 09, 2021 at 01:52:59AM +0100, Michael Walle wrote:
> The legacy interrupt INT_A is hardwired to the event collector. RCEC is
> basically supported starting with v5.11. Having a correct interrupt will
> make RCEC at least probe correctly.
> 
> There are still issues with how RCEC is implemented in the RCiEP on the
> LS1028A. RCEC will report an error, but it cannot find the correct
> subdevice.
> 
> Signed-off-by: Michael Walle 

Applied, thanks.


RE: Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support

2021-03-04 Thread Daejun Park
Hi Bean,

> > +
> > +static inline int ufshpb_get_read_id(struct ufshpb_lu *hpb)
> > +{
> > +   if (++hpb->cur_read_id >= MAX_HPB_READ_ID)
> > +   hpb->cur_read_id = 0;
> > +   return hpb->cur_read_id;
> > +}
> > +
> > +static int ufshpb_execute_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd,
> > + struct ufshpb_req *pre_req, int read_id)
> > +{
> > +   struct scsi_device *sdev = cmd->device;
> > +   struct request_queue *q = sdev->request_queue;
> > +   struct request *req;
> > +   struct scsi_request *rq;
> > +   struct bio *bio = pre_req->bio;
> > +
> > +   pre_req->hpb = hpb;
> > +   pre_req->wb.lpn = sectors_to_logical(cmd->device,
> > +blk_rq_pos(cmd->request));
> > +   pre_req->wb.len = sectors_to_logical(cmd->device,
> > +blk_rq_sectors(cmd->request));
> > +   if (ufshpb_pre_req_add_bio_page(hpb, q, pre_req))
> > +   return -ENOMEM;
> > +
> > +   req = pre_req->req;
> > +
> > +   /* 1. request setup */
> > +   blk_rq_append_bio(req, &bio);
> > +   req->rq_disk = NULL;
> > +   req->end_io_data = (void *)pre_req;
> > +   req->end_io = ufshpb_pre_req_compl_fn;
> > +
> > +   /* 2. scsi_request setup */
> > +   rq = scsi_req(req);
> > +   rq->retries = 1;
> > +
> > +   ufshpb_set_write_buf_cmd(rq->cmd, pre_req->wb.lpn, pre_req->wb.len,
> > +read_id);
> > +   rq->cmd_len = scsi_command_size(rq->cmd);
> > +
> > +   if (blk_insert_cloned_request(q, req) != BLK_STS_OK)
> > +   return -EAGAIN;
> > +
> > +   hpb->stats.pre_req_cnt++;
> > +
> > +   return 0;
> > +}
> > +
> > +static int ufshpb_issue_pre_req(struct ufshpb_lu *hpb, struct scsi_cmnd *cmd,
> > +   int *read_id)
> > +{
> > +   struct ufshpb_req *pre_req;
> > +   struct request *req = NULL;
> > +   struct bio *bio = NULL;
> > +   unsigned long flags;
> > +   int _read_id;
> > +   int ret = 0;
> > +
> > +   req = blk_get_request(cmd->device->request_queue,
> > + REQ_OP_SCSI_OUT | REQ_SYNC, BLK_MQ_REQ_NOWAIT);
> > +   if (IS_ERR(req))
> > +   return -EAGAIN;
> > +
> > +   bio = bio_alloc(GFP_ATOMIC, 1);
> > +   if (!bio) {
> > +   blk_put_request(req);
> > +   return -EAGAIN;
> > +   }
> > +
> > +   spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> > +   pre_req = ufshpb_get_pre_req(hpb);
> > +   if (!pre_req) {
> > +   ret = -EAGAIN;
> > +   goto unlock_out;
> > +   }
> > +   _read_id = ufshpb_get_read_id(hpb);
> > +   spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> > +
> > +   pre_req->req = req;
> > +   pre_req->bio = bio;
> > +
> > +   ret = ufshpb_execute_pre_req(hpb, cmd, pre_req, _read_id);
> > +   if (ret)
> > +   goto free_pre_req;
> > +
> > +   *read_id = _read_id;
> > +
> > +   return ret;
> > +free_pre_req:
> > +   spin_lock_irqsave(&hpb->rgn_state_lock, flags);
> > +   ufshpb_put_pre_req(hpb, pre_req);
> > +unlock_out:
> > +   spin_unlock_irqrestore(&hpb->rgn_state_lock, flags);
> > +   bio_put(bio);
> > +   blk_put_request(req);
> > +   return ret;
> > +}
> > +
> >  /*
> >   * This function will set up HPB read command using host-side L2P
> > map data.
> > - * In HPB v1.0, maximum size of HPB read command is 4KB.
> >   */
> > -void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
> > +int ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
> >  {
> > struct ufshpb_lu *hpb;
> > struct ufshpb_region *rgn;
> > @@ -291,26 +560,27 @@ void ufshpb_prep(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
> > u64 ppn;
> > unsigned long flags;
> > int transfer_len, rgn_idx, srgn_idx, srgn_offset;
> > +   int read_id = 0;
> > int err = 0;
> >  
> > hpb = ufshpb_get_hpb_data(cmd->device);
> > if (!hpb)
> > -   return;
> > +   return -ENODEV;
> >  
> > if (ufshpb_get_state(hpb) != HPB_PRESENT) {
> > dev_notice(&hpb->sdev_ufs_lu->sdev_dev,
> >"%s: ufshpb state is not PRESENT",
> > __func__);
> > -   return;
> > +   return -ENODEV;
> > }
> >  
> > if (!ufshpb_is_write_or_discard_cmd(cmd) &&
> > !ufshpb_is_read_cmd(cmd))
> > -   return;
> > +   return 0;
> >  
> > transfer_len = sectors_to_logical(cmd->device,
> >   blk_rq_sectors(cmd->request));
> > if (unlikely(!transfer_len))
> > -   return;
> > +   return 0;
> >  
> > lpn = sectors_to_logical(cmd->device, blk_rq_pos(cmd->request));
> >   

[PATCH] clk: use clk_core_enable_lock() a bit more

2021-03-04 Thread Rasmus Villemoes
Use clk_core_enable_lock() and clk_core_disable_lock() in a few places
rather than open-coding them.

Signed-off-by: Rasmus Villemoes 
---
 drivers/clk/clk.c | 18 +++---
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 5052541a0986..8c1ed844b97e 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -2078,12 +2078,8 @@ static void clk_change_rate(struct clk_core *core)
return;
 
if (core->flags & CLK_SET_RATE_UNGATE) {
-   unsigned long flags;
-
clk_core_prepare(core);
-   flags = clk_enable_lock();
-   clk_core_enable(core);
-   clk_enable_unlock(flags);
+   clk_core_enable_lock(core);
}
 
if (core->new_parent && core->new_parent != core->parent) {
@@ -2116,11 +2112,7 @@ static void clk_change_rate(struct clk_core *core)
core->rate = clk_recalc(core, best_parent_rate);
 
if (core->flags & CLK_SET_RATE_UNGATE) {
-   unsigned long flags;
-
-   flags = clk_enable_lock();
-   clk_core_disable(core);
-   clk_enable_unlock(flags);
+   clk_core_disable_lock(core);
clk_core_unprepare(core);
}
 
@@ -3564,8 +3556,6 @@ static int __clk_core_init(struct clk_core *core)
 * reparenting clocks
 */
if (core->flags & CLK_IS_CRITICAL) {
-   unsigned long flags;
-
ret = clk_core_prepare(core);
if (ret) {
pr_warn("%s: critical clk '%s' failed to prepare\n",
@@ -3573,9 +3563,7 @@ static int __clk_core_init(struct clk_core *core)
goto out;
}
 
-   flags = clk_enable_lock();
-   ret = clk_core_enable(core);
-   clk_enable_unlock(flags);
+   ret = clk_core_enable_lock(core);
if (ret) {
pr_warn("%s: critical clk '%s' failed to enable\n",
   __func__, core->name);
-- 
2.29.2



RE: Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support

2021-03-04 Thread Daejun Park
Hi, Can Guo

> > diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
> > index 2546e7a1ac4f..00fb519406cf 100644
> > --- a/drivers/scsi/ufs/ufs-sysfs.c
> > +++ b/drivers/scsi/ufs/ufs-sysfs.c
> > @@ -841,6 +841,7 @@ out:
> > \
> >  static DEVICE_ATTR_RO(_name)
> > 
> >  UFS_ATTRIBUTE(boot_lun_enabled, _BOOT_LU_EN);
> > +UFS_ATTRIBUTE(max_data_size_hpb_single_cmd, _MAX_HPB_SINGLE_CMD);
> >  UFS_ATTRIBUTE(current_power_mode, _POWER_MODE);
> >  UFS_ATTRIBUTE(active_icc_level, _ACTIVE_ICC_LVL);
> >  UFS_ATTRIBUTE(ooo_data_enabled, _OOO_DATA_EN);
> > @@ -864,6 +865,7 @@ UFS_ATTRIBUTE(wb_cur_buf, _CURR_WB_BUFF_SIZE);
> > 
> >  static struct attribute *ufs_sysfs_attributes[] = {
> >  &dev_attr_boot_lun_enabled.attr,
> > +&dev_attr_max_data_size_hpb_single_cmd.attr,
> >  &dev_attr_current_power_mode.attr,
> >  &dev_attr_active_icc_level.attr,
> >  &dev_attr_ooo_data_enabled.attr,
> > diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
> > index 957763db1006..e0b748777a1b 100644
> > --- a/drivers/scsi/ufs/ufs.h
> > +++ b/drivers/scsi/ufs/ufs.h
> > @@ -123,12 +123,13 @@ enum flag_idn {
> >  QUERY_FLAG_IDN_WB_BUFF_FLUSH_EN = 0x0F,
> >  QUERY_FLAG_IDN_WB_BUFF_FLUSH_DURING_HIBERN8 = 0x10,
> >  QUERY_FLAG_IDN_HPB_RESET= 0x11,
> > +QUERY_FLAG_IDN_HPB_EN= 0x12,
>  
> Also add this flag to sysfs?

Sure, I will do.

Thanks,
Daejun


Re: linux-next: rebase of the scsi-mkp tree

2021-03-04 Thread Stephen Rothwell
Hi James,

On Thu, 04 Mar 2021 16:21:20 -0800 James Bottomley  wrote:
>
> On Fri, 2021-03-05 at 11:04 +1100, Stephen Rothwell wrote:
> > 
> > I notice that you have rebased the scsi-mkp tree.  Unfortunately James
> > has already merged part of the old version of the scsi-mkp tree into
> > the scsi tree so that commits f69d02e37a85..39ae3edda325 in the scsi-
> > mkp tree are the same patches as commits fe07bfda2fb9..100d21c4ff29
> > in the scsi tree.  
> 
> It's just the flux from Linus announcing he's screwed up -rc1 and we
> shouldn't base on it ... it should all be fixed soon.

Thanks.

-- 
Cheers,
Stephen Rothwell




Re: linux-next: Fixes tag needs some work in the block tree

2021-03-04 Thread Jens Axboe
On 3/4/21 4:52 PM, Stephen Rothwell wrote:
> Hi all,
> 
> In commit
> 
>   284e4cdb0c0b ("nvme-hwmon: Return error code when registration fails")
> 
> Fixes tag
> 
>   Fixes: ec420cdcfab4 ("nvme/hwmon: rework to avoid devm allocation")
> 
> has these problem(s):
> 
>   - Target SHA1 does not exist
> 
> Maybe you meant
> 
> Fixes: ed7770f66286 ("nvme-hwmon: rework to avoid devm allocation")

Christoph, since there's multiple commits with issues, mind resending
a fixed branch? Then I'll drop the one I pulled today.


-- 
Jens Axboe
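A stale Fixes: SHA like the one reported above can be caught before posting by resolving it against the target tree. A minimal sketch (the `check_fixes` helper is hypothetical; the SHAs are the ones from this report, so run it inside the tree the patch targets):

```shell
#!/bin/sh
# Verify that each SHA named in a Fixes: tag resolves to a commit
# in the current tree; a tag that fails here will trip the
# linux-next checks the same way.
check_fixes() {
    if git rev-parse --quiet --verify "$1^{commit}" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "bad: $1 does not exist in this tree"
    fi
}

check_fixes ec420cdcfab4   # the SHA reported as missing
check_fixes ed7770f66286   # Stephen's suggested replacement
```

Running this from the branch being submitted would have flagged `ec420cdcfab4` immediately.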



RE: Re: [PATCH v26 4/4] scsi: ufs: Add HPB 2.0 support

2021-03-04 Thread Daejun Park
Hi Bean,

> > +
> > +static inline void ufshpb_put_pre_req(struct ufshpb_lu *hpb,
> > + struct ufshpb_req *pre_req)
> > +{
> > +   pre_req->req = NULL;
> > +   pre_req->bio = NULL;
> > +   list_add_tail(&pre_req->list_req, &hpb->lh_pre_req_free);
> > +   hpb->num_inflight_pre_req--;
> > +}
> > +
> > +static void ufshpb_pre_req_compl_fn(struct request *req,
> > blk_status_t error)
> > +{
> > +   struct ufshpb_req *pre_req = (struct ufshpb_req *)req->end_io_data;
> > +   struct ufshpb_lu *hpb = pre_req->hpb;
> > +   unsigned long flags;
> > +   struct scsi_sense_hdr sshdr;
> > +
> > +   if (error) {
> > +   dev_err(>sdev_ufs_lu->sdev_dev, "block status
> > %d", error);
> > +   scsi_normalize_sense(pre_req->sense, SCSI_SENSE_BUFFERSIZE,
> > +&sshdr);
> > +   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
> > +   "code %x sense_key %x asc %x ascq %x",
> > +   sshdr.response_code,
> > +   sshdr.sense_key, sshdr.asc, sshdr.ascq);
> > +   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
> > +   "byte4 %x byte5 %x byte6 %x additional_len %x",
> > +   sshdr.byte4, sshdr.byte5,
> > +   sshdr.byte6, sshdr.additional_length);
> > +   }
>  
>  
> How can you print out sense_key and sense code here? sense code will
> not be copied to pre_req->sense. you should directly use
> scsi_request->sense or let pre_req->sense point to scsi_request->sense.

OK, I will fix it.

> You update the new version patch so quickly. In another word, I am
> wondering if you tested your patch before submitting?

I will check more carefully for the next patch.

Thanks,
Daejun


Re: [PATCH] pinctrl: qcom: lpass lpi: use default pullup/strength values

2021-03-04 Thread Jonathan Marek

On 3/4/21 7:05 PM, Bjorn Andersson wrote:

On Thu 04 Mar 13:48 CST 2021, Jonathan Marek wrote:


If these fields are not set in dts, the driver will use these variables
uninitialized to set the fields. Not only will it set garbage values for
these fields, but it can overflow into other fields and break those.

In the current sm8250 dts, the dmic01 entries do not have a pullup setting,
and might not work without this change.



Perhaps you didn't see it, but Dan reported this a few days back. So
unless you object I would suggest that we include:



I did not see it. But feel free to add tags.


Reported-by: kernel test robot 
Reported-by: Dan Carpenter 


Reviewed-by: Bjorn Andersson 

Regards,
Bjorn


Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver")
Signed-off-by: Jonathan Marek 
---
  drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
index 369ee20a7ea95..2f19ab4db7208 100644
--- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
+++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
@@ -392,7 +392,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
  unsigned long *configs, unsigned int nconfs)
  {
struct lpi_pinctrl *pctrl = dev_get_drvdata(pctldev->dev);
-   unsigned int param, arg, pullup, strength;
+   unsigned int param, arg, pullup = LPI_GPIO_BIAS_DISABLE, strength = 2;
bool value, output_enabled = false;
const struct lpi_pingroup *g;
unsigned long sval;
--
2.26.1



Re: linux-next: rebase of the scsi-mkp tree

2021-03-04 Thread James Bottomley
On Fri, 2021-03-05 at 11:04 +1100, Stephen Rothwell wrote:
> Hi Martin,
> 
> I notice that you have rebased the scsi-mkp tree.  Unfortunately James
> has already merged part of the old version of the scsi-mkp tree into
> the scsi tree so that commits f69d02e37a85..39ae3edda325 in the scsi-
> mkp tree are the same patches as commits fe07bfda2fb9..100d21c4ff29
> in the scsi tree.

It's just the flux from Linus announcing he's screwed up -rc1 and we
shouldn't base on it ... it should all be fixed soon.

James





[PATCH] x86: kprobes: orc: Fix ORC walks in kretprobes

2021-03-04 Thread Daniel Xu
Getting a stack trace from inside a kretprobe used to work with frame
pointer stack walks. After the default unwinder was switched to ORC,
stack traces broke because ORC did not know how to skip the
`kretprobe_trampoline` "frame".

Frame based stack walks used to work with kretprobes because
`kretprobe_trampoline` does not set up a new call frame. Thus, the frame
pointer based unwinder could walk directly to the kretprobe'd caller.

For example, this stack is walked incorrectly with ORC + kretprobe:

# bpftrace -e 'kretprobe:do_nanosleep { @[kstack] = count() }'
Attaching 1 probe...
^C

@[
kretprobe_trampoline+0
]: 1

After this patch, the stack is walked correctly:

# bpftrace -e 'kretprobe:do_nanosleep { @[kstack] = count() }'
Attaching 1 probe...
^C

@[
kretprobe_trampoline+0
__x64_sys_nanosleep+150
do_syscall_64+51
entry_SYSCALL_64_after_hwframe+68
]: 12

Fixes: fc72ae40e303 ("x86/unwind: Make CONFIG_UNWINDER_ORC=y the default in kconfig for 64-bit")
Signed-off-by: Daniel Xu 
---
 arch/x86/kernel/unwind_orc.c | 53 +++-
 kernel/kprobes.c |  8 +++---
 2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 2a1d47f47eee..1b88d75e2e9e 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -1,7 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0-only
+#include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -77,9 +79,11 @@ static struct orc_entry *orc_module_find(unsigned long ip)
 }
 #endif
 
-#ifdef CONFIG_DYNAMIC_FTRACE
+#if defined(CONFIG_DYNAMIC_FTRACE) || defined(CONFIG_KRETPROBES)
 static struct orc_entry *orc_find(unsigned long ip);
+#endif
 
+#ifdef CONFIG_DYNAMIC_FTRACE
 /*
  * Ftrace dynamic trampolines do not have orc entries of their own.
  * But they are copies of the ftrace entries that are static and
@@ -117,6 +121,43 @@ static struct orc_entry *orc_ftrace_find(unsigned long ip)
 }
 #endif
 
+#ifdef CONFIG_KRETPROBES
+static struct orc_entry *orc_kretprobe_find(void)
+{
+   kprobe_opcode_t *correct_ret_addr = NULL;
+   struct kretprobe_instance *ri = NULL;
+   struct llist_node *node;
+
+   node = current->kretprobe_instances.first;
+   while (node) {
+   ri = container_of(node, struct kretprobe_instance, llist);
+
+   if ((void *)ri->ret_addr != &kretprobe_trampoline) {
+   /*
+* This is the real return address. Any other
+* instances associated with this task are for
+* other calls deeper on the call stack
+*/
+   correct_ret_addr = ri->ret_addr;
+   break;
+   }
+
+   node = node->next;
+   }
+
+   if (!correct_ret_addr)
+   return NULL;
+
+   return orc_find((unsigned long)correct_ret_addr);
+}
+#else
+static struct orc_entry *orc_kretprobe_find(void)
+{
+   return NULL;
+}
+#endif
+
 /*
  * If we crash with IP==0, the last successfully executed instruction
  * was probably an indirect function call with a NULL function pointer,
@@ -148,6 +189,16 @@ static struct orc_entry *orc_find(unsigned long ip)
if (ip == 0)
return &null_orc_entry;
 
+   /*
+* Kretprobe lookup -- must occur before vmlinux addresses as
+* kretprobe_trampoline is in the symbol table.
+*/
+   if (ip == (unsigned long)&kretprobe_trampoline) {
+   orc = orc_kretprobe_find();
+   if (orc)
+   return orc;
+   }
+
/* For non-init vmlinux addresses, use the fast lookup table: */
if (ip >= LOOKUP_START_IP && ip < LOOKUP_STOP_IP) {
unsigned int idx, start, stop;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 745f08fdd7a6..334c23d33451 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1895,10 +1895,6 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
BUG_ON(1);
 
 found:
-   /* Unlink all nodes for this frame. */
-   current->kretprobe_instances.first = node->next;
-   node->next = NULL;
-
/* Run them..  */
while (first) {
ri = container_of(first, struct kretprobe_instance, llist);
@@ -1917,6 +1913,10 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
recycle_rp_inst(ri);
}
 
+   /* Unlink all nodes for this frame. */
+   current->kretprobe_instances.first = node->next;
+   node->next = NULL;
+
return (unsigned long)correct_ret_addr;
 }
 NOKPROBE_SYMBOL(__kretprobe_trampoline_handler)
-- 
2.30.1



Re: [RFC PATCH 05/10] vfio: Create a vfio_device from vma lookup

2021-03-04 Thread Alex Williamson
On Thu, 4 Mar 2021 19:16:33 -0400
Jason Gunthorpe  wrote:

> On Thu, Mar 04, 2021 at 02:37:57PM -0700, Alex Williamson wrote:
> 
> > Therefore unless a bus driver opts-out by replacing vm_private_data, we
> > can identify participating vmas by the vm_ops and have flags indicating
> > if the vma maps device memory such that vfio_get_device_from_vma()
> > should produce a device reference.  The vfio IOMMU backends would also
> > consume this, ie. if they get a valid vfio_device from the vma, use the
> > pfn_base field directly.  vfio_vm_ops would wrap the bus driver
> > callbacks and provide reference counting on open/close to release this
> > object.  
> 
> > I'm not thrilled with a vfio_device_ops callback plumbed through
> > vfio-core to do vma-to-pfn translation, so I thought this might be a
> > better alternative.  Thanks,  
> 
> Maybe you could explain why, because I'm looking at this idea and
> thinking it looks very complicated compared to a simple driver op
> callback?

vfio-core needs to export a vfio_vma_to_pfn() which I think assumes the
caller has already used vfio_device_get_from_vma(), but should still
validate the vma is one from a vfio device before calling this new
vfio_device_ops callback.  vfio-pci needs to validate the vm_pgoff
value falls within a BAR region, mask off the index and get the
pci_resource_start() for the BAR index.

Then we need a solution for how vfio_device_get_from_vma() determines
whether to grant a device reference for a given vma, where that vma may
map something other than device memory.  Are you imagining that we hand
out device references independently and vfio_vma_to_pfn() would return
an errno for vm_pgoff values that don't map device memory and the IOMMU
driver would release the reference?

It all seems rather ad-hoc.
 
> The implementation of such an op for vfio_pci is one line trivial, why
> do we need allocated memory and a entire shim layer instead? 
> 
> Shim layers are bad.

The only thing here I'd consider a shim layer is overriding vm_ops,
which just seemed like a cleaner and simpler solution than exporting
open/close functions and validating the bus driver installs them, and
the error path should they not.

> We still need a driver op of some kind because only the driver can
> convert a pg_off into a PFN. Remember the big point here is to remove
> the sketchy follow_pte()...

The bus driver simply writes the base_pfn value in the example
structure I outlined in its .mmap callback.  I'm just looking for an
alternative place to store our former vm_pgoff in a way that doesn't
prevent using unmap_mapping_range().  The IOMMU backend, once it has a
vfio_device via vfio_device_get_from_vma() can know the format of
vm_private_data, cast it as a vfio_vma_private_data and directly use
base_pfn, accomplishing the big point.  They're all operating in the
agreed upon vm_private_data format.  Thanks,

Alex



Re: [PATCH] pinctrl: qcom: lpass lpi: use default pullup/strength values

2021-03-04 Thread Bjorn Andersson
On Thu 04 Mar 13:48 CST 2021, Jonathan Marek wrote:

> If these fields are not set in dts, the driver will use these variables
> uninitialized to set the fields. Not only will it set garbage values for
> these fields, but it can overflow into other fields and break those.
> 
> In the current sm8250 dts, the dmic01 entries do not have a pullup setting,
> and might not work without this change.
> 

Perhaps you didn't see it, but Dan reported this a few days back. So
unless you object I would suggest that we include:

Reported-by: kernel test robot 
Reported-by: Dan Carpenter 


Reviewed-by: Bjorn Andersson 

Regards,
Bjorn

> Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver")
> Signed-off-by: Jonathan Marek 
> ---
>  drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
> index 369ee20a7ea95..2f19ab4db7208 100644
> --- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
> +++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
> @@ -392,7 +392,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, unsigned int group,
> unsigned long *configs, unsigned int nconfs)
>  {
>   struct lpi_pinctrl *pctrl = dev_get_drvdata(pctldev->dev);
> - unsigned int param, arg, pullup, strength;
> + unsigned int param, arg, pullup = LPI_GPIO_BIAS_DISABLE, strength = 2;
>   bool value, output_enabled = false;
>   const struct lpi_pingroup *g;
>   unsigned long sval;
> -- 
> 2.26.1
> 


Re: [PATCH] crypto: ccp - Don't initialize SEV support without the SEV feature

2021-03-04 Thread Brijesh Singh


On 3/3/21 4:31 PM, Tom Lendacky wrote:
> From: Tom Lendacky 
>
> If SEV has been disabled (e.g. through BIOS), the driver probe will still
> issue SEV firmware commands. The SEV INIT firmware command will return an
> error in this situation, but the error code is a general error code that
> doesn't highlight the exact reason.
>
> Add a check for X86_FEATURE_SEV in sev_dev_init() and emit a meaningful
> message and skip attempting to initialize the SEV firmware if the feature
> is not enabled. Since building the SEV code is dependent on X86_64, adding
> the check won't cause any build problems.
>
> Cc: John Allen 
> Cc: Brijesh Singh 
> Signed-off-by: Tom Lendacky 


Reviewed-by: Brijesh Singh 

> ---
>  drivers/crypto/ccp/sev-dev.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
> index 476113e12489..b9fc8d7aca73 100644
> --- a/drivers/crypto/ccp/sev-dev.c
> +++ b/drivers/crypto/ccp/sev-dev.c
> @@ -21,6 +21,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -971,6 +972,11 @@ int sev_dev_init(struct psp_device *psp)
>   struct sev_device *sev;
>   int ret = -ENOMEM;
>  
> + if (!boot_cpu_has(X86_FEATURE_SEV)) {
> + dev_info_once(dev, "SEV: memory encryption not enabled by 
> BIOS\n");
> + return 0;
> + }
> +
>   sev = devm_kzalloc(dev, sizeof(*sev), GFP_KERNEL);
>   if (!sev)
>   goto e_err;

